I recently created a model using Roboflow’s services, trained entirely on synthetic data (project ID: bottle-identifier). The task is to recognize retail bottles of alcohol. However, when I evaluated the model on real photos of a store shelf, the results were unsatisfactory.
Could anyone advise on where I might have gone wrong? Is it possible that synthetic data alone is insufficient for this purpose? Any tips on how to improve the model’s performance with real-world images would be greatly appreciated.
Interesting project! Synthetic data is great, but the most important thing to remember in computer vision, and in machine learning generally, is that your training data should represent what the model will see when it is actually used (in the wild).
The synthetic data you generated doesn’t look representative of the images you’re trying to run predictions on. I would alter your synthetic generation strategy: remove the stock images and use more realistic backgrounds, increase the number of objects per image, and change the positioning so objects sit in rows (like on a real shelf) instead of at random locations.
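To make the "rows instead of random positioning" idea concrete, here is a minimal sketch of how placement could be computed. It is not Roboflow's API or your existing generator: the function name `shelf_row_boxes` and all of its parameters are hypothetical, and it assumes every bottle cutout has the same size in pixels. It only computes where bottles would go; pasting the cutouts onto a shelf background is left to whatever image library you already use.

```python
import random


def shelf_row_boxes(bg_w, bg_h, bottle_w, bottle_h, rows=3, gap=4, jitter=2, seed=0):
    """Return (x, y, w, h) boxes laid out in shelf-like rows on a background.

    Hypothetical helper: all sizes are in pixels, and every bottle is
    assumed to have the same width and height.
    """
    rng = random.Random(seed)
    shelf_h = bg_h // rows  # vertical space given to each shelf
    if shelf_h < bottle_h:
        raise ValueError("bottles are taller than one shelf")

    boxes = []
    for r in range(rows):
        # Bottles stand on the bottom edge of their shelf.
        y = (r + 1) * shelf_h - bottle_h
        x = gap
        # Keep adding bottles while the next one still fits on the background.
        while x + bottle_w + jitter <= bg_w:
            # Small horizontal jitter so the rows are not perfectly regular.
            dx = rng.randint(0, jitter)
            boxes.append((x + dx, y, bottle_w, bottle_h))
            # Reserve room for the jitter so neighbours never overlap.
            x += bottle_w + gap + jitter
    return boxes


if __name__ == "__main__":
    for box in shelf_row_boxes(640, 480, 40, 120)[:3]:
        print(box)
```

The same boxes can serve two purposes: they tell you where to paste each bottle, and they are the bounding-box labels for the generated image, so the annotations stay in sync with the layout. Lowering `gap` packs the shelf more densely, which is closer to what a real store photo looks like.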