How to output the class label distribution if I export my dataset as YOLO v5 PyTorch format?

MheadHero · May 12, 2022, 4:00pm

I want to output the class label distribution for each of the train/validation/test sets, but with the YOLOv5 PyTorch format I have no idea how to code it.

brad · May 12, 2022, 10:42pm

Hi @mheadhero, you’d need to write a script to do this count. I probably wouldn’t use the YOLOv5 PyTorch format because then you have to deal with mapping numeric class identifiers back to strings via the labelmap.
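If you do stay with the YOLOv5 PyTorch format, the mapping could be sketched roughly as below. This is only a sketch under assumptions: it assumes each label line starts with the numeric class id, that the export has a data.yaml with the class names on a single line of the form names: ['a', 'b', ...], and that class names contain no commas. The function name count_yolo_classes is made up for illustration.

```shell
# Print "count class_name" lines for one YOLOv5 PyTorch label directory.
# $1: label directory (assumed layout, e.g. train/labels)
# $2: path to data.yaml (assumed to contain: names: ['a', 'b', ...])
count_yolo_classes() {
  # Pull the names list out of data.yaml and strip brackets and quotes.
  names=$(grep '^names:' "$2" \
    | sed -e 's/^names:[[:space:]]*\[//' -e 's/\][[:space:]]*$//' \
    | tr -d "'\"")
  find "$1" -maxdepth 1 -name '*.txt' -exec cat {} + \
    | awk -v names="$names" '
        BEGIN { split(names, name, /, */) }    # name[1] is class id 0
        NF    { count[$1]++ }                  # first field is the class id
        END   { for (id in count) print count[id], name[id + 1] }' \
    | sort -nr
}
```

From the dataset root it would be called as count_yolo_classes train/labels data.yaml, and likewise for the valid and test label directories.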

The YOLOv5 Oriented Bounding Boxes format is similar but includes the labels in the annotations directly so it makes this task easier.

Here’s an example of doing it using bash with the playing cards dataset from Roboflow Universe.

From the Playing Cards.v1-v1.yolov5-obb/train/labelTxt directory, running this command:

ls | grep '\.txt$' | xargs cat | cut -d" " -f9 | sort | uniq -c | sort -nr

Does the following:

List all the files
Find all the .txt files
Output their contents
Split by space & grab the 9th column (the class name)
Sort them so all the same classes are next to each other
Count how many times each class appears (uniq -c counts runs of identical adjacent lines)
Sort the counts in descending order, so the class with the most examples comes first

So for the train set this gives me the following output:

Meaning there are 1374 Queen of Spades, 1367 Five of Clubs, etc.

You can do this in the valid and test label directories as well to get counts for those.
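To get all three splits in one go, the same pipeline can be wrapped in a small loop. This is a minimal sketch, assuming it is run from the dataset root and that the valid and test folders each contain a labelTxt directory like train does; the function name count_classes is made up for illustration.

```shell
# Print "count class_name" lines for one label directory, most frequent first.
count_classes() {
  # The class name is the 9th space-separated field, as in the command above.
  find "$1" -maxdepth 1 -name '*.txt' -exec cat {} + \
    | cut -d" " -f9 | sort | uniq -c | sort -nr
}

# Assumed layout: train/labelTxt, valid/labelTxt, test/labelTxt
for split in train valid test; do
  dir="$split/labelTxt"
  [ -d "$dir" ] || continue    # skip splits that are not present
  echo "== $split =="
  count_classes "$dir"
done
```

Using find instead of ls also avoids picking up files whose names merely contain "txt".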
