Hi community!
I have a question regarding exporting a dataset to upload it and use it afterwards in Google Colab.
My dataset has six classes: “freshripe”, “freshunripe”, “overripe”, “ripe”, “rotten”, “unripe”.
Here is an example of an image with three annotations (one freshripe and two ripe bananas):
When I export this dataset to Google Colab using the “yolov5pytorch” format, I receive a dataset with the following folder structure:
Banana-Ripening-Process/
|
| -- test/
| | -- images/
| | | -- banana_0001.jpg
| | | -- banana_0002.jpg
| | -- labels/
| | | -- banana_0001.txt
| | | -- banana_0002.txt
| -- train/
| | -- images/
| | | -- banana_0301.jpg
| | | -- banana_0302.jpg
| | -- labels/
| | | -- banana_0301.txt
| | | -- banana_0302.txt
| -- valid/
| | -- images/
| | | -- banana_0401.jpg
| | | -- banana_0402.jpg
| | -- labels/
| | | -- banana_0401.txt
| | | -- banana_0402.txt
But what I would like to receive is the following folder structure, with each annotation already cropped and saved as its own image inside a folder named after its class (“freshripe”, “freshunripe”, “overripe”, “ripe”, “rotten”, “unripe”):
Banana-Ripening-Process/
|
| -- test/
| | -- freshripe/
| | | -- banana_0001.jpg
| | -- freshunripe/
| | | -- banana_0002.jpg
| | | -- banana_0003.jpg
| | -- overripe/
| | | -- banana_0004.jpg
| | -- ripe/
| | | -- banana_0005.jpg
| | -- rotten/
| | | -- banana_0006.jpg
| | -- unripe/
| | | -- banana_0007.jpg
| -- train/
| | -- freshripe/
| | | -- banana_0101.jpg
| | | -- banana_0102.jpg
| | -- freshunripe/
| | | -- banana_0103.jpg
| | -- overripe/
| | | -- banana_0104.jpg
| | -- ripe/
| | | -- banana_0105.jpg
| | -- rotten/
| | | -- banana_0106.jpg
| | | -- banana_0107.jpg
| | -- unripe/
| | | -- banana_0108.jpg
| -- valid/
| | -- freshripe/
| | | -- banana_0301.jpg
| | | -- banana_0302.jpg
| | -- freshunripe/
| | | -- banana_0303.jpg
| | -- overripe/
| | | -- banana_0304.jpg
| | | -- banana_0305.jpg
| | -- ripe/
| | | -- banana_0306.jpg
| | -- rotten/
| | | -- banana_0307.jpg
| | | -- banana_0308.jpg
| | -- unripe/
| | | -- banana_0309.jpg
Is it possible to do this automatically, without having to look up every class's bounding boxes in Colab and crop them there myself? Is there an export format that arranges the dataset this way?
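For reference, this is roughly the manual conversion I would like to avoid. It is only a minimal sketch under several assumptions: the label files use the standard YOLO text layout (`class_id x_center y_center width height`, all normalized to the image size), the class ids follow the alphabetical order of my class names (this should be checked against the `names` list in the export's `data.yaml`, if there is one), the images are `.jpg`, and Pillow is installed. The function names and the output naming scheme are my own choices for illustration.

```python
from pathlib import Path

# Assumption: class ids in the label files follow this order.
# Verify against the class list shipped with the export before relying on it.
CLASS_NAMES = ["freshripe", "freshunripe", "overripe", "ripe", "rotten", "unripe"]


def yolo_to_pixel_box(xc, yc, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x/y, width, height) to
    pixel corners (left, top, right, bottom), clamped to the image."""
    left = max(0, round((xc - w / 2) * img_w))
    top = max(0, round((yc - h / 2) * img_h))
    right = min(img_w, round((xc + w / 2) * img_w))
    bottom = min(img_h, round((yc + h / 2) * img_h))
    return left, top, right, bottom


def crop_split(src_split, dst_split):
    """Crop every annotated box of one split into dst_split/<class_name>/.

    Returns the number of crops written.
    """
    from PIL import Image  # imported here so the box math works without Pillow

    src_split, dst_split = Path(src_split), Path(dst_split)
    count = 0
    for label_file in sorted((src_split / "labels").glob("*.txt")):
        # Assumption: each label file has a .jpg image with the same stem.
        image_file = src_split / "images" / f"{label_file.stem}.jpg"
        if not image_file.exists():
            continue
        with Image.open(image_file) as img:
            rgb = img.convert("RGB")
            for i, line in enumerate(label_file.read_text().splitlines()):
                parts = line.split()
                if len(parts) != 5:
                    continue  # skip blank lines and non-box annotations
                class_id = int(parts[0])
                box = yolo_to_pixel_box(*map(float, parts[1:]), *rgb.size)
                if box[2] <= box[0] or box[3] <= box[1]:
                    continue  # skip degenerate boxes
                out_dir = dst_split / CLASS_NAMES[class_id]
                out_dir.mkdir(parents=True, exist_ok=True)
                # One output file per annotation: <image stem>_<annotation index>.jpg
                rgb.crop(box).save(out_dir / f"{label_file.stem}_{i}.jpg")
                count += 1
    return count


def crop_dataset(src_root, dst_root):
    """Apply crop_split to the train, valid and test folders."""
    for split in ("train", "valid", "test"):
        crop_split(Path(src_root) / split, Path(dst_root) / split)
```

In Colab I would then call something like `crop_dataset("Banana-Ripening-Process", "Banana-Ripening-Process-cropped")`, where the second folder name is just a placeholder.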
Thanks in advance!!
Andrés.