Project Type: Object Detection
Operating System & Browser: MacOS / Chrome
Project Universe Link or Workspace/Project ID: Private (N/A)
I’ve gotten to the point of applying some of my models to their real applications, where time is an important factor. I thought that by switching from the RF-DETR Base model to RF-DETR Nano (on top of using more compressed images), I would cut processing time by an order of magnitude, but after comparing the two, the Nano model working on the smaller images is not faster at all! What is the point of a nano model if it doesn’t process images faster in exchange for lower accuracy? Is there something I don’t understand about where the speed of object detection models comes from, and if so, which model has the fastest processing time?
Hi @Erik_Broberg!
Your hypothesis is valid: per Roboflow’s RF-DETR benchmarking, RF-DETR Nano has lower latency than RF-DETR Base.
This leads me to believe the latency you’re experiencing is originating from another step in your pipeline. To help me troubleshoot, do you mind sharing further details on your pipeline and hardware configuration?
Sure. The function I’ve defined does an image conversion (~0.002 s/image) so that my input image is compatible with the model, and then runs a relatively simple workflow containing my model (0.633 s/image). This function iterates through 180 20 KB images each time it runs, for a total of 120 s per batch, run on a local server. Is the workflow time so high because it’s getting reconfigured every time I call it? If so, why wouldn’t the workflow be configured once, when it’s initially loaded onto my server?
Now, even when I just use the client.infer function (not the workflow), it still takes 0.6 seconds per image with a YOLO-NAS Nano or RF-DETR Nano model. Why is the inference time consistently slow for both of these? Could it have to do with the max_batch_size or max_concurrent_requests parameters of InferenceConfiguration? If so, how can and should I set these parameters?
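(For anyone hitting the same wall: before tuning client parameters, it helps to time each stage of the loop separately to see where the 0.6 s actually goes. A minimal sketch of that kind of timing harness is below; the `convert` and `run_model` functions here are hypothetical stand-ins for the real conversion step and `client.infer` call, not the actual pipeline from this thread.)

```python
import time

def time_per_image(fn, images):
    """Time a callable over a list of images; return mean seconds per image."""
    start = time.perf_counter()
    for img in images:
        fn(img)
    return (time.perf_counter() - start) / len(images)

# Hypothetical stages standing in for the real pipeline:
def convert(img):
    # image conversion step (~0.002 s/image in the post)
    return img

def run_model(img):
    # placeholder for the client.infer(...) / workflow call
    return img

images = list(range(180))  # stand-in for the 180 images per batch
print(f"convert: {time_per_image(convert, images):.4f} s/image")
print(f"model:   {time_per_image(run_model, images):.4f} s/image")
```

If the model stage dominates even with a nano model, the bottleneck is usually in preprocessing (e.g. resizing) or hardware, not in client configuration.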
I found out why, and it’s my fault for not understanding; this page gets to the heart of my issue. It wasn’t a problem with my pipeline or hardware: because the images I was running on are so small (~135x135), the default 640x640 rescale applied to match my training images increased processing time by over 20x (0.03 s → 0.6 s). Since I am running this on a Mac, I am also limited to an unimpressive CPU compared to a GPU. Now these times seem correct compared to what I expected.
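(The arithmetic checks out: upscaling to the default input size multiplies the pixel count per image by roughly the observed slowdown.)

```python
# Pixel-count ratio between the model's default input size and the
# actual source image size; roughly tracks the observed ~20x slowdown.
native = 135 * 135    # ~135x135 source images
resized = 640 * 640   # default 640x640 model input
print(round(resized / native, 1))  # → 22.5x more pixels per image
```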
TL;DR: make sure you understand resizing, and carefully select resize dimensions based on your situation.