ethOS 1.2.9 brings a few changes which break my auto-restart script for ethOS 1.2.7. Since 1.2.9 contains improved GPU crash detection, I rewrote the restart script to use the built-in detection mechanisms. For the required cron job please see my initial post which is available here.
As long as the DRY_RUN variable is set to true, the script won’t take any action; it just logs what it would do.
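To illustrate the dry-run idea, here is a minimal sketch of how such a guard could look in Bash. It is not the actual script: the helper names log and run are hypothetical, and the only things taken from this post are the DRY_RUN variable and the /tmp/rigcheck.log path.

```shell
#!/bin/bash
# Hypothetical sketch of a dry-run guard; "log" and "run" are illustrative
# helper names, not functions from the real rigcheck script.

DRY_RUN=true                                # set to false to really execute commands
LOG_FILE="${LOG_FILE:-/tmp/rigcheck.log}"   # log path mentioned in the post

log() {
    # Print a timestamped message to the console and append it to the log file.
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" | tee -a "$LOG_FILE"
}

run() {
    # Execute the given command only when DRY_RUN is not "true";
    # otherwise just log what would have been executed.
    if [ "$DRY_RUN" = "true" ]; then
        log "DRY RUN: would execute: $*"
    else
        log "Executing: $*"
        "$@"
    fi
}
```

With a guard like this, a call such as run minestart only produces a log line until DRY_RUN is switched to false, which makes it easy to watch the script's decisions for a few days before letting it act.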
I have been running a crypto-currency mining rig on the Linux-based ethOS distro for quite some time now. While I realize that ethOS is problematic license-wise, it’s still a great distro to get a mining rig up and running in almost no time. The Nvidia GPUs in my rig are well tuned to operate at their optimum cost/hashrate ratio. However, due to bugs in the miner and/or the GPU drivers, every few days one of the GPUs stops mining. Sometimes ethOS is able to recover the GPU and gets it back to mining, but sometimes it doesn’t seem to detect the crashed/hanging mining process at all. This is why I added a cron job that runs every 15 minutes and checks if all GPUs are still mining. If not, the miners will be restarted using the minestart commands provided by ethOS.
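The check described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names count_stalled_gpus and check_rig are hypothetical, and the per-GPU hashrates are passed in as a plain space-separated string, because where the numbers come from on ethOS is left open here.

```shell
#!/bin/bash
# Hypothetical sketch: decide whether a restart is needed based on a
# space-separated list of per-GPU hashrates (e.g. "30.1 0.0 29.8").
# How the hashrates are obtained on ethOS is deliberately not shown.

count_stalled_gpus() {
    # Count the fields whose numeric value is zero or less.
    # Non-numeric fields evaluate to 0 in awk and therefore count as stalled.
    echo "$1" | awk '{ for (i = 1; i <= NF; i++) if ($i + 0 <= 0) n++ } END { print n + 0 }'
}

check_rig() {
    # Return 0 if every GPU reports a positive hashrate, 1 otherwise.
    local stalled
    stalled=$(count_stalled_gpus "$1")
    if [ "$stalled" -gt 0 ]; then
        echo "$stalled GPU(s) not mining, restart needed"
        return 1
    fi
    return 0
}
```

A caller could then do something like check_rig "$hashes" || minestart, so the miners are only restarted when at least one GPU reports no hashrate.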
The cron job starts the Bash script below. If it detects a problem, it writes to the console and additionally to /tmp/rigcheck.log. It’s been running smoothly on my ethOS v1.2.7 mining rig. I recommend putting it in /home/ethos/rigcheck.sh. Don’t forget to add execute permissions using:
chmod +x /home/ethos/rigcheck.sh
The cron job can be created like this:
cat << EOF | sudo tee /etc/cron.d/rigcheck
*/15 * * * * root /home/ethos/rigcheck.sh
EOF
Thanks to this script, crashed or hanging miners will be restarted fairly quickly and my rig’s pool-reported hashrate stopped dropping in such situations.