I am trying to see if a 2-stage detector would be a better option for me than a 1-stage detector, since I have small objects contained within a larger object.
The second stage of the detector would need to run 20-50 inferences per frame quickly (so that the overall FPS stays high). So I am wondering - does Roboflow Train give me a tiny model if I have a tiny input space?
One option for the second stage would be an object detector with a 128x32 input; the other is a 32x32 image classifier. I can’t imagine needing a million parameters to fit a model in that case.
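For scale, here’s a back-of-the-envelope parameter count for a hypothetical tiny 32x32 classifier (the architecture, channel widths, and 20-class head are all made up for illustration - this isn’t anything Roboflow Train produces). It comes out well under a million parameters:

```python
# Rough parameter count for a hypothetical 3-conv-layer CNN on 32x32 input.
# Architecture is invented for illustration only.
def conv_params(in_ch, out_ch, k):
    # weights (in_ch * out_ch * k * k) plus one bias per output channel
    return in_ch * out_ch * k * k + out_ch

layers = [
    conv_params(3, 16, 3),    # 32x32 input
    conv_params(16, 32, 3),   # after one 2x2 pool -> 16x16
    conv_params(32, 64, 3),   # after another pool -> 8x8
]
# global-average-pool to 64 features, then a linear head for (say) 20 symbol classes
head = 64 * 20 + 20
total = sum(layers) + head
print(total)  # ~25k parameters
```

So even a very modest CNN has plenty of capacity for a 32x32 crop - the question is really whether the training pipeline will emit one that small.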
We don’t have support for this type of use-case built into Roboflow Train, but the product is made to be completely interoperable with whatever model you choose, so you can customize things however you like: use it for dataset management (and probably for the first stage, where you need better accuracy) & use your own custom model for your 2nd-stage classifier.
In general I would recommend a single-stage model that does both the localization and classification (though I would need to know more about your use-case). We’ve seen tiling be really effective for small objects.
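If you want to experiment with tiling, here’s a minimal sketch of the idea (tile size and overlap are arbitrary placeholders - tune them so your smallest objects fit comfortably inside one tile):

```python
import numpy as np

def tile_image(img, tile=640, overlap=64):
    """Split an HxWxC image array into overlapping tiles.

    Returns a list of (y, x, tile_array) so detections found in a tile
    can be shifted by (y, x) back into full-image coordinates.
    Edge tiles may be smaller than `tile`.
    """
    stride = tile - overlap
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, max(h - overlap, 1), stride):
        for x in range(0, max(w - overlap, 1), stride):
            tiles.append((y, x, img[y:y + tile, x:x + tile]))
    return tiles

tiles = tile_image(np.zeros((1000, 1500, 3), dtype=np.uint8))
print(len(tiles))  # 6 tiles for a 1000x1500 image at these settings
```

The overlap is there so an object sitting on a tile boundary still appears whole in at least one tile; you then de-duplicate overlapping detections (e.g. with NMS) after mapping them back.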
here’s the workspace: Sign in to Roboflow
(that workspace has a few variants of what I’m doing.)
I am trying to detect a “region” that has “data” on it in the form of symbols drawn on the paper.
I tried one version where I detect the symbols drawn on the paper (since standard object detection only gives axis-aligned boxes) … but then I have to do a lot of work to reconstruct the regions from that (and it’s prone to error if two regions are side by side or if a symbol is detected incorrectly - how do you know which region the symbol belongs to?)
So now I’m trying to detect the region directly using the yolov5-obb model. That way I get the orientation of the region with, hopefully, a tight bounding box. That leaves somehow getting the symbols out of the region, which is what the 2-stage approach is for. The 2-stage approach would be unnecessary if I could train one model to detect both the larger regions and the tiny symbols, but so far I’ve seen that the larger object dominates the training and then the smaller objects aren’t detected as well.
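To make the 2-stage idea concrete, here’s the coordinate math I have in mind as a numpy sketch: take an oriented box from stage 1, and map any symbol found in the cropped/deskewed patch back into full-image coordinates. (The counter-clockwise sign convention for theta is my assumption - yolov5-obb’s actual angle convention may differ, so this would need checking against the model’s output format.)

```python
import numpy as np

def obb_corners(cx, cy, w, h, theta):
    """Four corners of an oriented box (theta in radians, assumed CCW)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    local = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                      [w / 2,  h / 2], [-w / 2,  h / 2]])
    return local @ R.T + np.array([cx, cy])

def crop_to_image(pt, cx, cy, w, h, theta):
    """Map a point from the deskewed crop frame (origin at the crop's
    top-left corner) back into original-image coordinates."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    local = np.asarray(pt, dtype=float) - np.array([w / 2, h / 2])
    return local @ R.T + np.array([cx, cy])
```

With this, a symbol detected by the stage-2 model at crop coordinates (u, v) lands back at its true position in the page image, and region membership is free - each crop belongs to exactly one region.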