Hello everyone,
Thank you to the Roboflow team for providing such a complete platform.
I was trying out RF-DETR to see whether this promising model could perform better than YOLO11 for my specific use case: tracking padel balls in videos of people playing the game.
I was surprised by the validation metrics shown during training on universe.roboflow, as they were about 20% higher than those I had obtained with a YOLO11m.
After training finished, I uploaded an image to test the model's performance and obtained this correct result:
But when I load the model in my own code with:
from inference import get_model

model = get_model(model_id=model_id, api_key=api_key)
# Rest of the inference code
...
results = model.infer(frame_rgb, conf=0.2)[0]
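In case the surrounding code matters, here is a minimal version of what I am running. The file name and the explicit BGR to RGB conversion are just my own choices, and the model_id/api_key values are placeholders for my actual project:

```python
import cv2
import supervision as sv
from inference import get_model

model_id = "my-padel-project/1"  # placeholder for my project/version
api_key = "MY_API_KEY"           # placeholder

model = get_model(model_id=model_id, api_key=api_key)

frame_bgr = cv2.imread("padel_frame.jpg")               # OpenCV loads images as BGR
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # not sure this conversion is needed (see question below)

results = model.infer(frame_rgb, conf=0.2)[0]

# Draw whatever the model returns so I can compare visually with the app
detections = sv.Detections.from_inference(results)
annotated = sv.BoxAnnotator().annotate(scene=frame_bgr.copy(), detections=detections)
sv.plot_image(annotated)
```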
I obtain no prediction. I can't share two media files in one post, so just imagine the top image without the bounding box.
This is odd because on the app the same image was detected with very high confidence. I have seen similar posts where app predictions differed from serverless ones, but none where the discrepancy was this big.
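While waiting for an answer, I was planning to rule out my local setup by sending the exact same image to the hosted endpoint and comparing the raw responses (assuming the inference_sdk client and the serverless URL below are the right way to reach it; please correct me if not):

```python
from inference_sdk import InferenceHTTPClient

# Assumption: this is the hosted/serverless endpoint the app uses; the URL may differ
# model_id and api_key are the same values used above
client = InferenceHTTPClient(api_url="https://serverless.roboflow.com", api_key=api_key)

hosted_result = client.infer("padel_frame.jpg", model_id=model_id)
print(hosted_result["predictions"])
```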
I understand that the training and inference code are proprietary and will not be shared, but it would be of huge help if you could shed some light on the techniques you use, so I can get closer to my desired result.
- Does the model automatically resize the image to the size it was trained on, or is this something I need to do manually?
- Is the server automatically using SAHI or another tiling method when producing the result?
- Last but not least, I assume the model expects RGB images, correct? If I load images directly through cv2 they are BGR, which is the format supervision.plot_image() expects, so I am not 100% sure which format I should pass when inferring (see the quick comparison I sketched below).
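For what it's worth, this is the comparison I intend to run for the resizing and color-format questions. The 560x560 resize is just a guess at RF-DETR's training resolution, I don't know whether the resize or the color conversion is actually necessary, and I am assuming the response exposes a .predictions list with class_name and confidence fields:

```python
import cv2

# model loaded with get_model(...) as in the snippet above

frame_bgr = cv2.imread("padel_frame.jpg")
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

candidates = {
    "bgr": frame_bgr,
    "rgb": frame_rgb,
    "rgb_resized_560": cv2.resize(frame_rgb, (560, 560)),  # guessed training resolution
}

# Run the same frame through the model in each format and print what comes back
for name, img in candidates.items():
    preds = model.infer(img, conf=0.2)[0].predictions
    print(name, [(p.class_name, round(p.confidence, 3)) for p in preds])
```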
Once again, thank you for your time.
