I’m trying to train EfficientDet on the CBIS-DDSM mammogram dataset. The Roboflow dataset I created has 3.4k training images, 75 validation images and 300 test images, resized to 512x512 from their native dimensions. When I adapted the Roboflow-EfficientDet-v2 Colab notebook to this task with the default parameters, the ONNX save failed after each epoch and I ran out of RAM after 23 epochs. I then tried reducing the batch size to 4 and then to 2, but in each case I now get DataLoader worker errors during the first epoch:
Epoch: 1/100. Iteration: 447/861. Cls loss: 0.54293. Reg loss: 0.67819. Batch loss: 1.22112 Total loss: 1.51785
52% 447/861 [01:47<01:34, 4.37it/s]
error Traceback (most recent call last)
/content/drive/MyDrive/Colab_Notebooks/Medicine/ObjectDetection/EfficientDet/Monk_Object_Detection/4_efficientdet/lib/train_detector.py in Train(self, num_epochs, model_output_dir)
258 epoch_loss =
259 progress_bar = tqdm(self.system_dict["local"]["training_generator"])
--> 260 for iter, data in enumerate(progress_bar):
261 try:
262 self.system_dict["local"]["optimizer"].zero_grad()
5 frames
/usr/local/lib/python3.7/dist-packages/torch/_utils.py in reraise(self)
459 # instantiate since we don't know how to
460 raise RuntimeError(msg) from None
--> 461 raise exception
462
463
error: Caught error in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "Monk_Object_Detection/4_efficientdet/lib/src/dataset.py", line 47, in __getitem__
img = self.load_image(idx)
File "Monk_Object_Detection/4_efficientdet/lib/src/dataset.py", line 58, in load_image
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.6.0) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
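As far as I understand, the `!_src.empty()` assertion means `cvtColor` received an empty image, which usually happens when `cv2.imread` returned `None` for a missing or undecodable file. Below is a minimal sketch of a check that could be run over the image paths before training. The helper name `find_unreadable_images` and its `reader` parameter are hypothetical and not part of the Monk or Roboflow code. The paths are assumed to be collected separately, for example from the file names in the annotation file.

```python
def find_unreadable_images(paths, reader=None):
    """Return the paths (in input order) that the reader fails to load.

    Hypothetical helper: `reader` defaults to cv2.imread, which returns
    None instead of raising when a file is missing or cannot be decoded.
    """
    if reader is None:
        import cv2  # imported lazily so the helper works with a custom reader
        reader = cv2.imread

    bad = []
    for path in paths:
        img = reader(str(path))
        # Treat both None and zero-sized arrays as unreadable images.
        if img is None or getattr(img, "size", 1) == 0:
            bad.append(str(path))
    return bad
```

Calling `find_unreadable_images(list_of_image_paths)` with the default reader would list the files that OpenCV cannot load. An empty result would suggest the problem lies elsewhere, such as in how the dataset class builds its paths.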
Separately, I tried the Roboflow-TensorFlow2-Object-Detection.ipynb Colab notebook on the same dataset with EfficientDet-D0 and ran aground at this step:
!python /content/models/research/object_detection/model_main_tf2.py \
--pipeline_config_path={pipeline_file} \
--model_dir={model_dir} \
--alsologtostderr \
--num_train_steps={num_steps} \
--sample_1_of_n_eval_examples=1 \
--num_eval_steps={num_eval_steps}
which terminates with:
Node: 'EfficientDet-D0/model/stem_conv2d/Conv2D'
DNN library is not found.
[[{{node EfficientDet-D0/model/stem_conv2d/Conv2D}}]] [Op:__inference__dummy_computation_fn_32318]
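My understanding is that "DNN library is not found" generally points at cuDNN failing to load, often because the cuDNN installed in the runtime does not match the version TensorFlow was built against. Below is a minimal sketch for collecting the relevant versions. The function name `describe_tf_gpu_setup` is hypothetical, and it assumes a TensorFlow version that provides `tf.sysconfig.get_build_info()` (2.3 or later, as far as I know).

```python
def describe_tf_gpu_setup():
    """Collect the TensorFlow version, visible GPUs, and build-time CUDA/cuDNN versions."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow": None}

    # Build info describes what TF was compiled against, not what is installed.
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
        "built_with_cuda": build.get("cuda_version"),
        "built_with_cudnn": build.get("cudnn_version"),
    }
```

Printing the returned dictionary in the Colab session and comparing `built_with_cudnn` against the cuDNN actually installed in the runtime might show whether a version mismatch is involved.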
Suggestions appreciated!