Commit d0279bc9 authored by Andreas Hamacher

Merge branch 'generic_nvidia_smi_err' into 'master'

adding a script to catch a generic nvidia-smi error

See merge request !587
parents 4e903713 861fff11
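#!/bin/bash
# Runs nvidia-smi and fails if any line of its output contains "ERR".
function check_gpu_generic_err() {
    #echo ">>> Checking for generic GPU errors >>>>>>>>>>>>>>>>>>>>>>>>>"
    MASSIVEDOC="Runs nvidia-smi to see if the GPUs have any errors. Errors may indicate faulty hardware"
    # The subshell inverts grep's status: 0 when no "ERR" is found,
    # 1 when grep matches (the matching lines are also printed).
    /usr/bin/nvidia-smi | ( ! grep ERR )
    ret=$?
    if [[ $ret -ne 0 ]]
    then
        # die is not defined in this file; it is expected to come from the caller.
        die 1 " $FUNCNAME ERROR GPU errors"
        return 1
    else
        return 0
    fi
}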
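
The inverted-grep idiom used in the check can be tried without a GPU. The sketch below is not part of the commit: `scan_for_err` is a hypothetical helper, and the sample lines are made up and do not reproduce real nvidia-smi output. It only illustrates how the exit status behaves.

```shell
#!/bin/bash
# Hypothetical helper (not part of the commit): applies the same inverted-grep
# test to whatever arrives on stdin, so it can be exercised without nvidia-smi.
function scan_for_err() {
    # grep exits 0 on a match; "!" flips that, so a match yields status 1.
    ( ! grep ERR )
}

# Made-up sample text standing in for nvidia-smi output.
printf 'fan 30%%\ntemp 45C\n' | scan_for_err
echo "clean input -> $?"

printf 'fan ERR!\ntemp 45C\n' | scan_for_err
echo "input with ERR -> $?"
```

The first call reports status 0 and the second reports status 1 after printing the matching line. Because only the last stage of a pipeline sets `$?` by default, the same idiom also reports 0 when the upstream command produces no output at all, which may be worth keeping in mind if nvidia-smi itself fails to run.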