Project Type: Object Detection
Operating System & Browser: MacOS / Chrome
Project Universe Link or Workspace/Project ID: Private (N/A)
I’ve gotten to the point of applying some of my models to their real applications, where time is an important factor. I thought that by switching from the RF-DETR Base model to RF-DETR Nano (on top of using more compressed images), I would cut processing time by an order of magnitude, but after comparing the two, the Nano model working on the smaller images is not faster at all! What is the point of a nano model if it doesn’t process images faster in exchange for lower accuracy? Is there something I don’t understand about where the speed of object detection models comes from, and if so, which model has the fastest processing time?
Hi @Erik_Broberg!
Your hypothesis is valid: per Roboflow’s RF-DETR benchmarking, RF-DETR Nano has lower latency than RF-DETR Base.
This leads me to believe the latency you’re experiencing is originating from another step in your pipeline. To help me troubleshoot, do you mind sharing further details on your pipeline and hardware configuration?
Sure. The function I’ve defined does an image conversion (~0.002 s/image) so that my input image is compatible with the model, and then runs a relatively simple workflow containing my model (0.633 s/image). This function iterates through 180 20 KB images each time it runs, for a total of 120 s per batch, run on a local server. Is the workflow time so high because it’s getting reconfigured every time I call it? If so, why wouldn’t the workflow be configured once, when it’s initially loaded onto my server?
Now, even when I just use the client.infer function (not the workflow), it still takes 0.6 seconds per image with a YOLO-NAS Nano or RF-DETR Nano model. Why is the inference time consistently slow for both of these? Could it have to do with the max_batch_size or max_concurrent_requests parameters of InferenceConfiguration? If so, how can and should I set these parameters?
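(For anyone hitting the same wall: before tuning client parameters, it helps to time each stage of the loop separately to see where the 0.6 s actually goes. A minimal sketch of that kind of timing harness is below; the `convert` and `run_model` functions here are hypothetical stand-ins for the real conversion step and `client.infer` call, not the actual pipeline from this thread.)

```python
import time

def time_per_image(fn, images):
    """Time a callable over a list of images; return mean seconds per image."""
    start = time.perf_counter()
    for img in images:
        fn(img)
    return (time.perf_counter() - start) / len(images)

# Hypothetical stages standing in for the real pipeline:
def convert(img):
    # image conversion step (~0.002 s/image in the post)
    return img

def run_model(img):
    # placeholder for the client.infer(...) / workflow call
    return img

images = list(range(180))  # stand-in for the 180 images per batch
print(f"convert: {time_per_image(convert, images):.4f} s/image")
print(f"model:   {time_per_image(run_model, images):.4f} s/image")
```

If the model stage dominates even with a nano model, the bottleneck is usually in preprocessing (e.g. resizing) or hardware, not in client configuration.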
I found out why, and it’s my fault for not understanding; this page gets to the heart of my issue. It wasn’t a problem with my pipeline or hardware: because the images I was running on are so small (~135x135), the default 640x640 rescale applied to match my training images increased processing time by over 20x (0.03 s → 0.6 s). Since I am running this on a Mac, I am also limited to an unimpressive CPU compared to a GPU. Now these times seem correct compared to what I expected.
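(The arithmetic checks out: upscaling to the default input size multiplies the pixel count per image by roughly the observed slowdown.)

```python
# Pixel-count ratio between the model's default input size and the
# actual source image size; roughly tracks the observed ~20x slowdown.
native = 135 * 135    # ~135x135 source images
resized = 640 * 640   # default 640x640 model input
print(round(resized / native, 1))  # → 22.5x more pixels per image
```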
TL;DR: make sure you understand resizing, and carefully select resize dimensions based on your situation.