How can I improve my dataset for increased mAP

mfa · January 20, 2023, 1:17am

[Attached image: Frame (23)]

Hi all,

I want to use the YOLOv4 object detector to detect LED matrices like the one in the attached picture. The goal of my project is to automatically extract the region of interest (RoI) around these LED matrices, mainly in vehicular scenarios.

Unfortunately, this type of object is not very common, and I could not find a way to produce a good dataset for training. I’ve tried training the YOLOv4 algorithm with different cfg parameters, but one of two things always happens:

Overfitting
The algorithm does not converge and no detection is performed.

Do you have any tips on how I can improve my dataset? This kind of object is not very common.

leo · January 22, 2023, 12:52am

Hi @mfa

Sounds like an interesting project. I’ve also experimented with some uncommon object detection use cases before. The easiest way to reduce overfitting is to augment your dataset via Roboflow’s augmentation settings when generating your dataset.
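
For anyone curious what a bounding-box-aware augmentation does to an image and its labels, here is a minimal sketch. It is not Roboflow's implementation. It assumes YOLO-style labels (class, x_center, y_center, width, height, all normalized to [0, 1]) and uses a tiny grayscale image stored as plain Python lists to stay dependency-free; the function names are made up for illustration.

```python
import random

def hflip(image, boxes):
    """Mirror a grayscale image (list of pixel rows) and its YOLO boxes left-right.

    Each box is (class_id, x_center, y_center, width, height), normalized to [0, 1].
    """
    flipped = [row[::-1] for row in image]
    # Only x_center changes under a horizontal flip.
    mirrored = [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in boxes]
    return flipped, mirrored

def jitter_brightness(image, rng, max_delta=0.3):
    """Scale intensities by a random factor in [1 - max_delta, 1 + max_delta]."""
    factor = rng.uniform(1.0 - max_delta, 1.0 + max_delta)
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(8)] for _ in range(6)]  # stand-in frame
boxes = [(0, 0.30, 0.55, 0.20, 0.10)]  # one hypothetical LED matrix label

aug_image, aug_boxes = hflip(image, boxes)
aug_image = jitter_brightness(aug_image, rng)
print(aug_boxes)
```

The point is that geometric augmentations must transform the labels together with the pixels, while photometric ones (brightness, noise) leave the boxes untouched. With real images you would normally do this with NumPy or OpenCV arrays, or let the dataset tool handle it.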

Also, are you collecting the data yourself?
If so, it might be worth looking into automating your data collection. For my use case, I was able to automate a lot of my data collection and annotation by using the upload API and the annotation API, or using model-assisted labeling.

Mohamed · January 23, 2023, 7:24pm

Great answer @leo!!

@mfa - A few more questions to get to the root of the issue:

1. What are your current mAP/precision/recall scores?
2. How many images do you have labeled?
3. How many classes are in the dataset?
4. How many labels per class?

Questions 2-4 can be answered by viewing the “Health Check” page on your project:

Also be sure that you are creating tight bounding boxes around each object you are trying to label, and label every instance of said object in every image that is being used for the model training.

https://docs.roboflow.com/annotate/best-practices
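
To make the link between tight boxes and the precision/recall numbers concrete, here is a rough sketch of how a detection is usually scored against a label using intersection over union (IoU). It assumes boxes in pixel coordinates as (x_min, y_min, x_max, y_max), an IoU threshold of 0.5, and a simplified greedy matching; the box values are made up, and real evaluators differ in the details.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match predictions to labels, highest confidence first.

    predictions: list of (confidence, box); ground_truth: list of boxes.
    Each label can be matched by at most one prediction.
    """
    matched = set()
    true_positives = 0
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(box, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

tight_label = (100, 100, 200, 150)   # hugs the object
loose_label = (80, 80, 220, 170)     # same object, 20 px of slack on every side
predictions = [(0.9, (100, 100, 200, 150)), (0.4, (300, 300, 350, 340))]

print(iou(predictions[0][1], loose_label))
print(precision_recall(predictions, [tight_label]))
print(precision_recall(predictions, [loose_label]))
```

In this toy case the same correct detection counts as a hit against the tight label but as a miss against the loose one, because the slack pulls the IoU under the threshold. Inconsistent or loose labels can drag the scores down in the same way even when the model is doing fine.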

mfa · January 27, 2023, 12:28am

Hi stellasphere,

Thanks for the answer. Yes, I am collecting the data myself, and I am already using these tools to help with the preprocessing step. However, it has been hard to come up with ways to get samples from different angles and views, and data augmentation does not seem to be enough to eliminate overfitting.

My dataset has 370 images. Is that enough, or should I collect more?

mfa · January 27, 2023, 12:33am

Hi Mohamed,

Thanks for the answer, and sorry for the delay. To answer your questions, I am attaching the Health Check statistics:

There is only one class, the LED Matrix, as I want to detect it and exclude other objects that may be near it. My main goal is to attach it to a car’s backlight and be able to detect and recognize the LED Matrix from a camera mounted on another car while driving.

Mohamed · February 2, 2023, 11:31pm

It appears you’re making one label for the entire set of LED lights, correct?

Your dataset is just going to need more labeled examples, in this case. Once you get to around 300-500 labeled examples and generate a version to train, you’ll see some better results.

Active Learning is another great process to try to improve future model performance on newly trained models: Active Learning - Roboflow

mfa · February 8, 2023, 1:47am

Sorry for the delay again.

Answering your questions:

Yes, I have only one label for the entire set. I will try to increase the diversity of my dataset. The main problem is that I don’t have many samples to work with.

Do you have any tips on how I can make this dataset diverse enough that training does not overfit?

Thanks,

Matheus

Mohamed · February 23, 2023, 6:16pm

You could try synthetic data:
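
As a rough idea of what that can look like, here is a minimal cut-and-paste sketch: a cropped object patch is pasted at a random position on a background, and the matching YOLO label is computed from the paste location. Everything here is an illustrative assumption (the function name, the tiny grayscale images stored as plain lists, class id 0), not a specific Roboflow feature.

```python
import random

def paste_object(background, patch, rng):
    """Paste a grayscale patch at a random position and return (image, yolo_label).

    background and patch are lists of pixel rows; the patch must fit inside.
    The label is (class_id, x_center, y_center, width, height), normalized to [0, 1].
    """
    bg_h, bg_w = len(background), len(background[0])
    p_h, p_w = len(patch), len(patch[0])
    left = rng.randint(0, bg_w - p_w)  # randint is inclusive on both ends
    top = rng.randint(0, bg_h - p_h)
    image = [row[:] for row in background]  # copy so the background stays reusable
    for dy, patch_row in enumerate(patch):
        image[top + dy][left:left + p_w] = patch_row
    label = (0,
             (left + p_w / 2) / bg_w,
             (top + p_h / 2) / bg_h,
             p_w / bg_w,
             p_h / bg_h)
    return image, label

rng = random.Random(42)
background = [[30] * 64 for _ in range(48)]  # stand-in for a road scene
patch = [[255] * 16 for _ in range(8)]       # stand-in for a cropped LED matrix
samples = [paste_object(background, patch, rng) for _ in range(5)]
for _, label in samples:
    print("%d %.4f %.4f %.4f %.4f" % label)
```

Because the paste position is known, the labels come for free and are always tight. To be useful for training, a real pipeline would also vary scale, perspective, blur, and lighting, and use real background photos, ideally mixed with the images you collected yourself.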
