RF-DETR data

Hi! I am currently using RF-DETR for a single-class object detection task. Most of my images have ground truth (GT) bounding box labels for that class, and a smaller number are empty scenes that do not contain the object and therefore have no GT annotations, so the dataset is imbalanced toward images with the object. I was wondering whether DETR architectures (and object detectors in general) use all images during training, including those without GT bounding box annotations, and whether they learn from them. Or does RF-DETR only train on the annotated images in single-class object detection tasks? Thank you so much for the clarification!

Object detectors generally use all images during training. Images containing no annotations are useful to learn what IS NOT an example of the object.
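
To check how many such images a dataset actually contains, something like the following could work. It is a minimal sketch that assumes COCO-style annotations (a dict with `images` and `annotations` lists); the function name `count_empty_images` and the toy dataset are made up for illustration and are not part of RF-DETR.

```python
def count_empty_images(coco):
    """Return (num_empty, num_total) for a COCO-style annotation dict."""
    # IDs of every image that has at least one annotation.
    annotated_ids = {ann["image_id"] for ann in coco.get("annotations", [])}
    image_ids = [img["id"] for img in coco.get("images", [])]
    num_empty = sum(1 for image_id in image_ids if image_id not in annotated_ids)
    return num_empty, len(image_ids)


# Toy dataset (hypothetical): images 2 and 4 have no annotations.
coco = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}],
    "annotations": [
        {"image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
        {"image_id": 1, "category_id": 1, "bbox": [50, 60, 20, 20]},
        {"image_id": 3, "category_id": 1, "bbox": [5, 5, 15, 15]},
    ],
}

num_empty, num_total = count_empty_images(coco)
print(f"{num_empty} of {num_total} images have no annotations")
```

With a real dataset, the dict would typically come from `json.load` on the annotation file.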

Thank you so much for the quick reply! Is the imbalance a big problem in that case? And does learning from the background images help with that?

Another issue I had was with multiple predictions. I thought that one of the main strengths of DETRs was that they don't need NMS, but I still get multiple predictions for the same object, especially at lower confidence thresholds. I was wondering why this might be and how to address it, especially when evaluating the model (I am using it for smoke detection). Thank you so much for the help!
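
This question is not answered in the thread, but one generic post-processing option is to raise the confidence threshold and, if overlapping boxes remain, apply a greedy IoU-based suppression on top of the model output. The sketch below is plain Python and is not part of the RF-DETR API; the function names, the `(x1, y1, x2, y2)` box format, and both threshold values are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """Return indices of kept detections, highest score first."""
    # Step 1: drop low-confidence predictions, then sort by score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thresh),
        key=lambda i: scores[i],
        reverse=True,
    )
    # Step 2: greedily keep a box only if it does not overlap a kept box.
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


# Hypothetical predictions: boxes 0, 1 and 3 cover roughly the same region.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (11, 11, 49, 49)]
scores = [0.9, 0.6, 0.8, 0.2]
print(filter_detections(boxes, scores))  # [0, 2]
```

For evaluation, the thresholds would need to be tuned on a validation set rather than taken from this example.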

Hey mila88! I agree that null images (images that intentionally have no detections) are good to have. This article mentions using them as well. As for balance, I've heard that null images can make up as much as 10% of the dataset and you'll be good to go, which is still a large imbalance. So yes, you should be fine with a small number of them in the dataset.
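
As a rough worked example of the 10% figure, the snippet below computes how many null images fit alongside a given number of annotated images. It assumes the 10% is measured against the whole dataset (annotated plus null images), which is one reading of the rule of thumb above; the helper name `max_null_images` is hypothetical.

```python
def max_null_images(num_annotated, null_percent=10):
    """Largest null-image count that stays at or below null_percent of the dataset."""
    if not 0 <= null_percent < 100:
        raise ValueError("null_percent must be in [0, 100)")
    # Solve n / (num_annotated + n) <= null_percent / 100 for integer n.
    return (null_percent * num_annotated) // (100 - null_percent)


print(max_null_images(900))   # 100 null images -> 100 / 1000 = 10%
print(max_null_images(1000))  # 111 null images -> 111 / 1111, just under 10%
```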

This is very helpful, thanks a lot!
