Object Detection: Model Identifying Bolt Heads as Full Bolts

I’m going through the Roboflow “getting started” tutorial for hardware bolts. A “bolt” should be the entire assembly: hex head + threaded body. It’s kind of a tricky objective once you dig into it…

While the model successfully detects complete bolts, it also has two major issues:

  1. Partial Misclassification: It often identifies just the hex head portion of a bolt and incorrectly labels it as a full “bolt.”
  2. Redundant Detections: Sometimes, it creates two overlapping boxes for the same physical bolt: one for just the head and another for the entire bolt (head+threads), both labeled “bolt.”
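For the redundant-detection case specifically, a common post-processing trick is to drop a box that is almost entirely contained inside a larger, higher-confidence box of the same class. Here is a minimal sketch (the detection dict layout and the 0.9 containment threshold are my own assumptions, not anything Roboflow-specific):

```python
def suppress_nested(dets, contain_thresh=0.9):
    """Drop a detection whose box is almost entirely contained inside a
    higher-confidence box of the same class (e.g. a head-only box sitting
    inside a full-bolt box). Boxes are (x1, y1, x2, y2)."""
    keep = []
    # Visit highest-confidence detections first so they win ties.
    for d in sorted(dets, key=lambda d: -d["conf"]):
        box = d["box"]
        area = (box[2] - box[0]) * (box[3] - box[1])
        contained = False
        for k in keep:
            kb = k["box"]
            ix1, iy1 = max(box[0], kb[0]), max(box[1], kb[1])
            ix2, iy2 = min(box[2], kb[2]), min(box[3], kb[3])
            inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
            # If ~all of this box lies inside an already-kept box, drop it.
            if area and inter / area >= contain_thresh:
                contained = True
                break
        if not contained:
            keep.append(d)
    return keep
```

Note this only masks the symptom at inference time; the training-side fix is consistent annotation of full bolts so the model stops predicting head-only boxes in the first place.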

How can we best address this through annotation and training strategies, particularly within a platform like Roboflow?

It’s intuitive to imagine why the model has a hard time with double labels in this kind of situation, with bolts jumbled together, lying across each other, sometimes standing on end so that only the hex head is visible, etc. I’m just curious what the best practices are in ML for handling these kinds of scenarios in general.

I guess maybe one strategy would be to just annotate the heads and exclude the threaded body?

Curious what the community thinks about problems like these in general.

I like your idea about just focusing on the heads. You would end up with the opposite problem though - with only threads showing, that bolt would not be identified.

If you’re going with object detection, make sure your bounding boxes surround the entire bolt, including hidden parts that are covered by other bolts. If it’s instance segmentation, it’s just the visible part you’re after.
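For illustration, in COCO-style annotations (used here only as a common interchange format; your platform’s export may differ, and all the numbers are made up), the distinction looks roughly like this:

```python
# Hypothetical COCO-style records for one partly occluded bolt.

# Object detection: the bbox covers the FULL bolt, occluded threads included.
detection_annotation = {
    "category_id": 1,            # "bolt"
    "bbox": [120, 40, 60, 180],  # x, y, width, height of the whole bolt
}

# Instance segmentation: the polygon traces only the VISIBLE pixels.
segmentation_annotation = {
    "category_id": 1,
    "segmentation": [[120, 40, 180, 40, 180, 95, 120, 95]],  # visible head only
    "bbox": [120, 40, 60, 55],   # bbox shrinks to the visible region
}
```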

That being said, providing some more information might help people troubleshoot with you. You could share a link to your dataset or even some sample images. A big one to share: what’s the end goal? If you’re trying to count, then you’re always going to miss bolts underneath anyway, but maybe you just want to confirm you see at least 20 bolts in the image to know the stock is not low. Or, if it’s a side view of a filled container, you could train the model to estimate more generally how low the inventory is and ignore individual detections.
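If the goal really is a coarse stock check rather than an exact count, that logic can be as simple as thresholding the number of confident detections. A minimal sketch (the `min_count` of 20 echoes the example above; the 0.5 confidence cutoff and the detection dict layout are illustrative assumptions):

```python
def stock_ok(detections, min_count=20, conf_thresh=0.5):
    """Return True if at least `min_count` bolts are detected with
    confidence above `conf_thresh` -- a coarse 'stock is not low' check.
    Each detection is assumed to be a dict with a "conf" key."""
    count = sum(1 for d in detections if d["conf"] >= conf_thresh)
    return count >= min_count
```

With a check like this, occasional missed bolts at the bottom of the pile matter much less than they would for exact counting.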