Logic of Image Compressing inside Datasets for Object Detection

Hello everyone,

My name is Mikhail Korotkin, and I work as a Solution Architect at Customertimes. We are launching a project utilizing object detection technology and are currently selecting a service for photo annotation. However, I am having difficulty understanding the image compression logic within the service.

I uploaded a photo for manual annotation that was originally 1.5 MB. After annotating it and exporting it to a YOLOv8 dataset, the same photo weighed 1 MB (a reduction of roughly a third). Another example: in a different project, I annotated photos and used the auto-generated photos feature; there, the photo size in the dataset dropped from 1.5 MB to 200-300 KB.

Could you please explain the logic behind this compression? I believe the resolution and quality of the photos the neural network is trained on will critically affect the model’s performance in the field. Your expertise on this matter would be greatly appreciated.

Thank you in advance.

Hi @Mikhail_Korotkin - what is the resolution of your images?

Images will get downsized if you use the resize preprocessing step, or if your images are over 2,000 x 2,000 pixels.

Generally, we recommend downsizing closer to 640x640 (at least to start). The models need far less resolution than you’d think, and smaller resolutions enable significantly faster inference and training.
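To make the downsizing concrete: a typical "fit within" resize scales an image so its longer side matches the target while preserving aspect ratio. This is a minimal sketch of that calculation, not the service's actual implementation (the exact resize mode, e.g. stretch vs. letterbox, depends on your preprocessing settings):

```python
def fit_within(width, height, max_side=640):
    """Compute new dimensions that fit inside a max_side x max_side box
    while preserving aspect ratio. Never upscales smaller images."""
    scale = min(max_side / width, max_side / height, 1.0)
    return round(width * scale), round(height * scale)

# Example: a 3000x2000 photo scaled to fit within 640x640
print(fit_within(3000, 2000))  # -> (640, 427)
```

Cutting the pixel count this much is the main reason exported file sizes drop so sharply; re-encoding (e.g. JPEG quality settings) accounts for the rest.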