Lmod Warning:
-------------------------------------------------------------------------------
The following dependent module(s) are not currently loaded: curl/8.4.0
(required by: htslib/1.16)
-------------------------------------------------------------------------------

The following have been reloaded with a version change:
  1) curl/8.4.0 => curl/8.17.0     2) openssl/3.0.7 => openssl/3.6.0

2025-11-17 17:24:01.487314: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:25:40.626109: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:25:40.635764: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2025-11-17 17:25:40.665534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:89:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2025-11-17 17:25:40.665628: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:25:41.206435: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:25:41.899443: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:25:42.164714: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2025-11-17 17:25:42.587976: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2025-11-17 17:25:43.143799: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2025-11-17 17:25:43.339424: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2025-11-17 17:25:43.529843: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:25:43.531075: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2025-11-17 17:25:43.531863: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-11-17 17:25:43.535133: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:25:43.535642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:89:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2025-11-17 17:25:43.535683: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:25:43.535710: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:25:43.535733: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:25:43.535756: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2025-11-17 17:25:43.535777: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2025-11-17 17:25:43.535799: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2025-11-17 17:25:43.535820: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2025-11-17 17:25:43.535862: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:25:43.536674: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2025-11-17 17:25:43.536724: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:25:46.033474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2025-11-17 17:25:46.033573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2025-11-17 17:25:46.033588: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2025-11-17 17:25:46.034884: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10064 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5)
2025-11-17 17:25:57.061250: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2025-11-17 17:25:57.062038: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2500115000 Hz
2025-11-17 17:25:58.483713: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:26:00.121920: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:26:00.129561: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:26:13.511800: W tensorflow/stream_executor/gpu/asm_compiler.cc:63] Running ptxas --version returned 256
2025-11-17 17:26:13.595650: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 256, output: 
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
/home/users/shouvikm/miniconda3/envs/bpnet/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py:595: UserWarning: Input dict contained keys ['coordinates', 'jitters', 'index', 'status', 'rev_comp'] which did not match any model input. They will be ignored by the model.
  [n for n in tensors.keys() if n not in ref_input_names])
/home/users/shouvikm/miniconda3/envs/bpnet/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py:595: UserWarning: Input dict contained keys ['coordinates'] which did not match any model input. They will be ignored by the model.
  [n for n in tensors.keys() if n not in ref_input_names])
2025-11-17 17:32:03.598289: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2025-11-17 17:32:15.874332: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:33:10.467840: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:33:10.476160: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2025-11-17 17:33:10.507680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:89:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2025-11-17 17:33:10.507776: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:33:10.514321: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:33:10.514401: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:33:10.517644: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2025-11-17 17:33:10.519972: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2025-11-17 17:33:10.525304: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2025-11-17 17:33:10.528382: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2025-11-17 17:33:10.530514: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:33:10.531477: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2025-11-17 17:33:10.531909: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-11-17 17:33:10.534511: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:33:10.535051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:89:00.0 name: NVIDIA GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2025-11-17 17:33:10.535097: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:33:10.535125: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:33:10.535150: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:33:10.535173: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2025-11-17 17:33:10.535197: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2025-11-17 17:33:10.535220: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2025-11-17 17:33:10.535244: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2025-11-17 17:33:10.535267: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:33:10.536087: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2025-11-17 17:33:10.536139: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:33:11.013909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2025-11-17 17:33:11.014006: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2025-11-17 17:33:11.014020: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2025-11-17 17:33:11.015270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10064 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2080 Ti, pci bus id: 0000:89:00.0, compute capability: 7.5)
/home/users/shouvikm/miniconda3/envs/bpnet/lib/python3.7/site-packages/tensorflow/python/keras/layers/core.py:1059: UserWarning: bpnet.model.arch is not loaded, but a Lambda layer uses it. It may cause errors.
  , UserWarning)
batch:   0%|          | 0/40 [00:00<?, ?it/s]
2025-11-17 17:33:12.554294: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2025-11-17 17:33:12.554822: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2500115000 Hz
2025-11-17 17:33:12.814063: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2025-11-17 17:33:13.170403: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2025-11-17 17:33:13.172727: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2025-11-17 17:33:14.537275: W tensorflow/stream_executor/gpu/asm_compiler.cc:63] Running ptxas --version returned 256
2025-11-17 17:33:14.629656: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: ptxas exited with non-zero error code 256, output: 
Relying on driver to perform ptx compilation. 
Modify $PATH to customize ptxas location.
This message will be only logged once.
/home/users/shouvikm/miniconda3/envs/bpnet/lib/python3.7/site-packages/tensorflow/python/keras/engine/functional.py:595: UserWarning: Input dict contained keys ['coordinates', 'true_profiles', 'true_logcounts', 'rev_comp'] which did not match any model input. They will be ignored by the model.
  [n for n in tensors.keys() if n not in ref_input_names])
batch: 100%|██████████| 40/40 [00:33<00:00,  1.19it/s]
100%|██████████| 2518/2518 [00:02<00:00, 942.11it/s]
100%|██████████| 2518/2518 [00:02<00:00, 947.67it/s]
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
2025-11-17 17:34:22.644461: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2025-11-17 17:34:46.014691: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:34:46.019463: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2025-11-17 17:34:46.028821: E tensorflow/stream_executor/cuda/cuda_driver.cc:328] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2025-11-17 17:34:46.028882: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host: sh03-13n05.int
2025-11-17 17:34:46.028911: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname: sh03-13n05.int
2025-11-17 17:34:46.029015: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:200] libcuda reported version is: 550.163.1
2025-11-17 17:34:46.029066: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:204] kernel reported version is: 550.163.1
2025-11-17 17:34:46.029086: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:310] kernel version seems to match DSO: 550.163.1
2025-11-17 17:34:46.029450: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-11-17 17:34:46.031848: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2025-11-17 17:34:46.084065: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:196] None of the MLIR optimization passes are enabled (registered 0 passes)
2025-11-17 17:34:46.100471: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2500115000 Hz
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
slurmstepd: error: NVML: Failed to get Compute running process count(15): GPU is lost
slurmstepd: error: NVML: Failed to get usage(15): GPU is lost
RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
/home/users/shouvikm/miniconda3/envs/bpnet/lib/python3.7/site-packages/tensorflow/python/keras/layers/core.py:1059: UserWarning: bpnet.model.arch is not loaded, but a Lambda layer uses it. It may cause errors.
  , UserWarning)
