Solid Dataset for Food Detection Model

FoodDetection · April 8, 2024, 8:19am

Hello all

I am relatively new to ML and Object Detection and would love to get some insights on what is a good dataset for my desired model.

I am trying to train an object detection model for the following products:

I will be using a camera that films a flat surface from above at a distance of about 0.5-1 metre. It should simulate a checkout, meaning that “customers” can purchase any combination of these products.

Since I am quite unsure about the best approach for a solid dataset, I would appreciate any response to the following questions:

Is it better to only provide pictures where all of the products are visible, or should I also provide images with only one product in them? Or should I mix it up with a specific ratio of images with all products to images with only one product?
If the objects to detect are always placed on the same flat surface, is it necessary or a good idea to add images to the dataset that show the objects on different backgrounds?
How many images will I roughly need? I was told that a good rule of thumb is to provide 1-2k pictures per product. This somewhat interferes with my first question: should I provide pictures per product or pictures with all products in them?
Right now I am using a Roboflow trial account that is locked to 5x duplication for a dataset. It states that more duplications require an upgrade. What upgrade do I need to unlock these? Is it the normal monthly subscription?
Is it better to provide more “raw” images, or does duplicating the uploaded images do a better job of providing useful images for the dataset than uploading more images on my own?
My plan is to train the model on my local machine, since I have read that it is not possible to directly download a trained model from Roboflow. The model needs to run on an offline device.

I would really appreciate it if you could give me a rough estimate of what kind of and how many raw pictures I should provide to the dataset. Maybe there is even a good choice of preprocessing / augmentation steps I should use for this specific use case?

Any help is appreciated
Thanks a lot

Greetings

Project Type: Object Detection

Jacob_Witt · April 8, 2024, 7:49pm

Hi @FoodDetection - thanks for posting!

Responses below:

Is it better to only provide pictures where all of the products are visible, or should I also provide images with only one product in them? Or should I mix it up with a specific ratio of images with all products to images with only one product?

The two golden rules of computer vision: 1) If a human can see it, so can a model. 2) Your training data should look like your production data. To that end, I suggest you use the same set of cameras / camera angles / locations as you expect your model to see in the wild. Generally, more variance is good.
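
One way to check whether a dataset matches the combinations you expect at the checkout is to count labels per product and objects per image. This is a minimal sketch, assuming the annotations are already loaded as a plain dictionary; the `summarize` helper, the file names, and the product names are hypothetical and not part of any Roboflow API.

```python
from collections import Counter


def summarize(annotations):
    """Count labels per class and the number of objects per image.

    annotations: dict mapping an image name to the list of class
    labels present in that image (hypothetical format).
    """
    per_class = Counter()   # how often each product is labeled
    per_image = Counter()   # how many images contain 0, 1, 2, ... objects
    for labels in annotations.values():
        per_class.update(labels)
        per_image[len(labels)] += 1
    return per_class, per_image


# Hypothetical example data: single products, mixed baskets, and an empty surface.
annotations = {
    "img_001.jpg": ["apple"],
    "img_002.jpg": ["apple", "banana", "yogurt"],
    "img_003.jpg": ["banana", "yogurt"],
    "img_004.jpg": [],
}

per_class, per_image = summarize(annotations)
print("labels per product:", dict(per_class))
print("images by object count:", dict(per_image))
```

If the objects-per-image counts look very different from what customers will actually place on the surface, that is a hint to collect more images of the missing combinations.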

If the objects to detect are always placed on the same flat surface, is it necessary or a good idea to add images to the dataset that show the objects on different backgrounds?

If the objects will always be on the same flat surface, you don’t need to vary the backgrounds. However, varying them may help improve the robustness of the model (e.g., if lighting or other contexts change).

How many images will I roughly need? I was told that a good rule of thumb is to provide 1-2k pictures per product. This somewhat interferes with my first question: should I provide pictures per product or pictures with all products in them?

If you are only looking at these objects on the same consistent background, I could see ~2,000 well-varied images total being fine. Make sure to use different lighting and contexts. My advice is always to start small (try training on ~200 images) to get a sense of how your model improves with increased sample size. You can also use your early models to pre-label new images.
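
To see how the model improves with sample size, one option is to train on nested subsets of the same image pool. This is a rough sketch, assuming the images are available as a list of file paths; the `nested_subsets` helper, the subset sizes beyond 200 and 2,000, and the file names are illustrative assumptions.

```python
import random


def nested_subsets(image_paths, sizes, seed=0):
    """Return {size: subset} where each smaller subset is contained in the larger ones.

    Shuffling once with a fixed seed and slicing keeps the subsets nested,
    so differences between runs come from the added images only.
    """
    rng = random.Random(seed)
    shuffled = list(image_paths)
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes) if n <= len(shuffled)}


# Hypothetical pool of 2,000 raw images.
images = [f"img_{i:04d}.jpg" for i in range(2000)]
subsets = nested_subsets(images, sizes=[200, 500, 1000, 2000])

for size, subset in subsets.items():
    print(size, "images, first file:", subset[0])
```

Each subset would be trained separately and evaluated on the same held-out images, so the scores are comparable across sizes.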

Right now I am using a Roboflow trial account that is locked to 5x duplication for a dataset. It states that more duplications require an upgrade. What upgrade do I need to unlock these? Is it the normal monthly subscription?

We can provide more variants per image after you purchase a starter plan; just shoot us an email at starter-plan@roboflow.com once you upgrade and we’ll let you test it out. Generally, going more than 5x will overfit your model. However, if you have an extremely constrained environment you want to run the model in, overfitting might not be horrible.

Is it better to provide more “raw” images, or does duplicating the uploaded images do a better job of providing useful images for the dataset than uploading more images on my own?

Raw images are always going to improve model performance more than duplicates / augments.

My plan is to train the model on my local machine, since I have read that it is not possible to directly download a trained model from Roboflow. The model needs to run on an offline device.

You can run models from Roboflow on your own device! Check out inference.roboflow.com

I would really appreciate it if you could give me a rough estimate of what kind of and how many raw pictures I should provide to the dataset. Maybe there is even a good choice of preprocessing / augmentation steps I should use for this specific use case?

Again, my advice is to start at 200, train a model quickly, and add more images as you need. You may need to get to ~2,000 well-labeled images to get a model that works without common errors. This assumes you are only detecting the objects in question from the angle / background you’re showing.

FoodDetection · April 9, 2024, 9:45am

Hey @Jacob_Witt

Thank you for your response. Your explanations really helped a lot.

Just for clarification:
I should provide roughly 2,000 raw images myself, correct?
And then apply 5x, so that I end up with ~10k images?

Thank you for your help.

Jacob_Witt · April 9, 2024, 2:05pm

I wouldn’t even count the augmented images together with your source images - they do not provide nearly as much incremental value to your model.

Yes, you should aim for 2,000 raw images, but start training on ~200. I suggest experimenting with no augs / some augs / many augs to see what works best.
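
The no augs / some augs / many augs comparison can be laid out as simple arithmetic before training anything. This is only a sketch under stated assumptions: that augmentation multiplies the training split only, that 70% of the raw images go to training, and that "some augs" means 3x. None of these numbers come from this thread except the ~200 starting images and the 5x limit.

```python
def training_set_size(raw_images, train_fraction, multiplier):
    """Number of training images after augmentation.

    Assumes only the training split is multiplied; validation and
    test images stay as raw images.
    """
    train_images = round(raw_images * train_fraction)
    return train_images * multiplier


raw_images = 200        # starting point suggested above
train_fraction = 0.7    # assumed split, adjust to your own

# Hypothetical experiment plan: label -> augmentation multiplier.
plans = {"no augs": 1, "some augs": 3, "many augs": 5}

sizes = {
    name: training_set_size(raw_images, train_fraction, multiplier)
    for name, multiplier in plans.items()
}

for name, size in sizes.items():
    print(f"{name}: {size} training images from {raw_images} raw images")
```

The raw image count is the same in every plan, which is why the augmented totals should not be read as extra source data.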

system · April 30, 2024, 2:05pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.
