Creating a dataset

Hello!
I’m building a neural network for image (sticker) recognition. There are about 500 stickers, and I have high-quality pictures of them. My question is about creating a dataset.
Take one sticker that should be recognized. For good training I need something like 1000 images containing it, and I want to generate them from the sticker’s own image. For example, I overlay the sticker on a background at a different offset each time and get 1000 images that way. Does this make sense? I would annotate them later in Roboflow by drawing a bounding box around the sticker. Will the neural network then learn the sticker itself, regardless of its location in the image? In that case, won’t I effectively have 1000 identical sticker images?
Another way: I can rotate the sticker around its axis so that it appears at different angles. When I draw boxes around them in Roboflow, the contents of the boxes will differ. Is this approach better?
Or maybe you can recommend something else?
Sorry for my dumb questions :smiley:

Hello!

Your questions are not dumb at all; they are very relevant when preparing a dataset for training a neural network!

Firstly, creating variations of your sticker images by overlaying them on different backgrounds can indeed be a good approach. It helps your model learn to recognize the sticker regardless of what surrounds it. Note that changing only the offset on a single background adds little variety, because the pixels inside each bounding box stay the same. It’s also important that the backgrounds you choose are representative of the real-world scenarios where your model will be deployed.
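Because you place the sticker yourself, you already know where it is, so the bounding box can be computed rather than drawn by hand. Below is a minimal sketch of that idea, assuming your annotation tool accepts YOLO-style text labels (class, then center x, center y, width, height, all normalized to the image size). The function names `place_sticker` and `yolo_label` are hypothetical helpers for illustration, not part of any library, and the actual pasting of pixels is left to whatever image library you use.

```python
import random


def place_sticker(bg_w, bg_h, st_w, st_h, rng=random):
    """Pick a random top-left offset so the sticker fits fully inside the background."""
    if st_w > bg_w or st_h > bg_h:
        raise ValueError("sticker is larger than the background")
    x = rng.randint(0, bg_w - st_w)  # inclusive on both ends
    y = rng.randint(0, bg_h - st_h)
    return x, y


def yolo_label(class_id, x, y, st_w, st_h, bg_w, bg_h):
    """Build one YOLO-style label line with coordinates normalized to [0, 1]."""
    x_center = (x + st_w / 2) / bg_w
    y_center = (y + st_h / 2) / bg_h
    width = st_w / bg_w
    height = st_h / bg_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    x, y = place_sticker(640, 480, 200, 100, rng)
    print(yolo_label(0, x, y, 200, 100, 640, 480))
```

Varying the background image and the sticker size between samples, not just the offset, is what makes the generated images differ from each other in a way the model can learn from.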

Secondly, rotating the sticker around its axis to create different inclinations can also be beneficial. This is a form of data augmentation, which is a technique used to increase the diversity of your training set by applying random (but realistic) transformations such as rotation. This can help your model to generalize better and be more robust to variations in the input data.
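If you rotate the sticker yourself before pasting it, the axis-aligned bounding box grows with the angle, so the label has to be recomputed. Here is a small sketch of that calculation, assuming a plain in-plane rotation of a rectangular sticker about its center; `rotated_bbox_size` is a hypothetical helper name, not a library function.

```python
import math


def rotated_bbox_size(w, h, angle_deg):
    """Width and height of the axis-aligned box enclosing a w x h rectangle
    rotated by angle_deg about its center."""
    a = math.radians(angle_deg)
    c, s = abs(math.cos(a)), abs(math.sin(a))
    return w * c + h * s, w * s + h * c


if __name__ == "__main__":
    for angle in (0, 15, 45, 90):
        bw, bh = rotated_bbox_size(200, 100, angle)
        print(f"{angle:>3} deg -> {bw:.1f} x {bh:.1f}")
```

If you let the annotation tool apply rotation as an augmentation instead, it would normally adjust the boxes for you, and this calculation is only needed when you generate the rotated images on your own.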

Roboflow provides built-in tools for both preprocessing (like resizing) and augmentation (like rotation, flipping, and exposure adjustment) of your images. You can use these tools to easily create a larger and more diverse dataset from your original sticker images [1][2][3][4][5][6].

When you annotate your images in Roboflow, you will be creating bounding boxes around your stickers. The neural network will then be trained to recognize the stickers based on these bounding boxes, regardless of their location in the image [1][3][4][5].

Finally, it’s worth noting that while data augmentation can be very beneficial, adding more augmentations does not necessarily boost the performance of your model. It’s recommended to start with one or two augmentations that may work with your dataset, and add more if necessary [7].

I hope this helps, and best of luck with your project!


Hi! I’m Roboflow’s AI powered bot. This is an automated response powered using AI with knowledge from our docs, blog, and forum. If this solves your issue, please mark it as the solution using the checkbox icon. If this doesn’t solve your issue, please follow up with what you’ve tried/why, and the community will continue to chime in as usual.


  1. Source
  2. Source
  3. Source
  4. Source
  5. Source
  6. Source
  7. Source
