Exporting dataset in folder structure with one folder per class

Hi community!

I have a question about exporting a dataset so that I can upload it and use it in Google Colab.

My dataset has six classes: “freshripe”, “freshunripe”, “overripe”, “ripe”, “rotten”, “unripe”.

Here is an example of an image with three annotations (one freshripe and two ripe bananas):

When I export this dataset to Google Colab using the “yolov5pytorch” format, I receive the following folder structure:

Banana-Ripening-Process/
|
| -- test/
|       | -- images/
|       |       | -- banana_0001.jpg
|       |       | -- banana_0002.jpg
|       | -- labels/
|       |       | -- banana_0001.txt
|       |       | -- banana_0002.txt
| -- train/
|       | -- images/
|       |       | -- banana_0301.jpg
|       |       | -- banana_0302.jpg
|       | -- labels/
|       |       | -- banana_0301.txt
|       |       | -- banana_0302.txt
| -- valid/
|       | -- images/
|       |       | -- banana_0401.jpg
|       |       | -- banana_0402.jpg
|       | -- labels/
|       |       | -- banana_0401.txt
|       |       | -- banana_0402.txt

What I would like to receive instead is the following folder structure, with each annotation already cropped and saved as its own image:

"freshripe", "freshunripe", "overripe", "ripe", "rotten", "unripe"
Banana-Ripening-Process/
|
| -- test/
|       | -- freshripe/
|       |       | -- banana_0001.jpg
|       | -- freshunripe/
|       |       | -- banana_0002.jpg
|       |       | -- banana_0003.jpg
|       | -- overripe/
|       |       | -- banana_0004.jpg
|       | -- ripe/
|       |       | -- banana_0005.jpg
|       | -- rotten/
|       |       | -- banana_0006.jpg
|       | -- unripe/
|       |       | -- banana_0007.jpg
| -- train/
|       | -- freshripe/
|       |       | -- banana_0101.jpg
|       |       | -- banana_0102.jpg
|       | -- freshunripe/
|       |       | -- banana_0103.jpg
|       | -- overripe/
|       |       | -- banana_0104.jpg
|       | -- ripe/
|       |       | -- banana_0105.jpg
|       | -- rotten/
|       |       | -- banana_0106.jpg
|       |       | -- banana_0107.jpg
|       | -- unripe/
|       |       | -- banana_0108.jpg
| -- valid/
|       | -- freshripe/
|       |       | -- banana_0301.jpg
|       |       | -- banana_0302.jpg
|       | -- freshunripe/
|       |       | -- banana_0303.jpg
|       | -- overripe/
|       |       | -- banana_0304.jpg
|       |       | -- banana_0305.jpg
|       | -- ripe/
|       |       | -- banana_0306.jpg
|       | -- rotten/
|       |       | -- banana_0307.jpg
|       |       | -- banana_0308.jpg
|       | -- unripe/
|       |       | -- banana_0309.jpg

Is it possible to do this automatically, so that I do not have to look up the bounding boxes of every class in Colab and crop them there myself? Is there an export format that arranges the dataset this way?
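
For reference, the manual Colab workaround I would like to avoid looks roughly like the sketch below. It is only a sketch under some assumptions: the label files use the usual YOLO text layout (one `class_id x_center y_center width height` line per box, all normalized to 0-1), every image is a `.jpg` with the same stem as its label file, and the order of `CLASS_NAMES` matches the `names` list in the exported `data.yaml` (I have not verified that order). The function names are made up for illustration, and the cropping step needs Pillow installed.

```python
from pathlib import Path

# Assumed class order; check it against the `names` list in data.yaml.
CLASS_NAMES = ["freshripe", "freshunripe", "overripe", "ripe", "rotten", "unripe"]


def yolo_to_pixel_box(xc, yc, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x, center y, width, height)
    into a pixel box (left, top, right, bottom), clamped to the image."""
    left = int(round((xc - w / 2) * img_w))
    top = int(round((yc - h / 2) * img_h))
    right = int(round((xc + w / 2) * img_w))
    bottom = int(round((yc + h / 2) * img_h))
    return max(left, 0), max(top, 0), min(right, img_w), min(bottom, img_h)


def parse_label_file(label_path):
    """Yield (class_id, xc, yc, w, h) for each bounding-box line in a label file."""
    for line in Path(label_path).read_text().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank lines and anything that is not a plain box
        cls, xc, yc, w, h = parts
        yield int(cls), float(xc), float(yc), float(w), float(h)


def crop_split(dataset_root, out_root, split, class_names=CLASS_NAMES):
    """Crop every box of one split into out_root/<split>/<class_name>/.
    Returns the number of crops written."""
    from PIL import Image  # Pillow; imported here so the helpers above work without it

    images_dir = Path(dataset_root) / split / "images"
    labels_dir = Path(dataset_root) / split / "labels"
    count = 0
    for label_path in sorted(labels_dir.glob("*.txt")):
        image_path = images_dir / (label_path.stem + ".jpg")  # assumes .jpg images
        if not image_path.exists():
            continue
        with Image.open(image_path) as img:
            for i, (cls, xc, yc, w, h) in enumerate(parse_label_file(label_path)):
                box = yolo_to_pixel_box(xc, yc, w, h, img.width, img.height)
                if box[2] <= box[0] or box[3] <= box[1]:
                    continue  # skip degenerate boxes
                class_dir = Path(out_root) / split / class_names[cls]
                class_dir.mkdir(parents=True, exist_ok=True)
                # One file per annotation, so an image with three boxes gives three crops.
                img.crop(box).convert("RGB").save(class_dir / f"{label_path.stem}_{i}.jpg")
                count += 1
    return count
```

Calling `crop_split("Banana-Ripening-Process", "Banana-Ripening-Process-cropped", split)` once for each of `"train"`, `"valid"` and `"test"` should produce one folder per class inside each split, which is the layout shown above. The output folder name here is just a placeholder.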

Thanks in advance!!
Andrés.
