Resizing large ~16:9 (2688x1520) images for use with yolov5?

Hi, I’m new to Roboflow. I’m labeling images of animals we caught on our cameras for better detection purposes. The images we have are all large 16:9-ish images, 2688x1520 pixels.

The Ultralytics tutorial suggests resizing them to YOLOv5's default 640x640 size, which takes the nice, identifiable images of the animals and squishes them. I read through the two articles on resizing (Preprocess Images - Roboflow Docs and You Might Be Resizing Your Images Incorrectly), so I'm guessing the crop option will be better than the stretch/resize option. Padding is also an option, but I'm wondering if the animals might end up too small.
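To make the trade-off concrete, here's a rough sketch of the two strategies using Pillow (the blank `Image.new` is a stand-in for a real 2688x1520 capture; the function names are mine, not Roboflow's or Ultralytics'):

```python
# Sketch of the two resize strategies, using Pillow.
# Input size matches the camera captures (2688x1520); target is YOLOv5's 640x640.
from PIL import Image

TARGET = 640

def stretch_resize(img: Image.Image) -> Image.Image:
    """Stretch to 640x640, distorting the aspect ratio (~1.77:1 -> 1:1)."""
    return img.resize((TARGET, TARGET))

def letterbox_resize(img: Image.Image, fill=(114, 114, 114)) -> Image.Image:
    """Scale the long side to 640, then pad the short side: no distortion,
    but the animals end up smaller within the frame."""
    scale = TARGET / max(img.size)
    w, h = round(img.width * scale), round(img.height * scale)
    canvas = Image.new("RGB", (TARGET, TARGET), fill)
    canvas.paste(img.resize((w, h)), ((TARGET - w) // 2, (TARGET - h) // 2))
    return canvas

img = Image.new("RGB", (2688, 1520))  # stand-in for a real capture
print(stretch_resize(img).size, letterbox_resize(img).size)  # (640, 640) (640, 640)
```

Stretching distorts every object; letterboxing keeps shapes intact but shrinks the image content to 640x362 inside the padded square, which is why small animals may become tiny.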

My question is: would it be better if I did some editing on them first, cropping or resizing them in Photoshop? Depending on where the bounding box is, it would be unfortunate if Roboflow just cropped it right out lol
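For what it's worth, the "will my box get cropped out?" worry can be checked with a few lines of plain Python (box format and function name are my own, just for illustration):

```python
# Quick check (plain Python, no libraries): would a centered square crop
# clip a given bounding box? Box format here is (x_min, y_min, x_max, y_max)
# in pixels; the function name is my own, not part of Roboflow.

def box_survives_center_crop(box, img_w=2688, img_h=1520):
    side = min(img_w, img_h)                # square crop = 1520x1520 here
    left = (img_w - side) // 2              # crop window is centered: 584..2104
    right = left + side
    x_min, _, x_max, _ = box
    return x_min >= left and x_max <= right  # True if the box is untouched

# A deer near the center survives; one at the left edge gets cropped out.
print(box_survives_center_crop((1200, 700, 1600, 1100)))  # True
print(box_survives_center_crop((50, 700, 450, 1100)))     # False
```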


Hey there!

Have you taken a look at the small object resources as well?

Specifically, tiling has yielded good results for use cases like this:

Take a look and let me know if that solves it!
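As a rough illustration of the idea (not Roboflow's implementation), tiling splits each large capture into a grid of near-640px windows, so small animals keep far more pixels than a single downscaled frame would give them. The tile size and overlap below are assumptions:

```python
# Rough tiling sketch (not Roboflow's implementation): cover a 2688x1520
# capture with overlapping 640x640 crop windows.

def tile_grid(img_w=2688, img_h=1520, tile=640, overlap=64):
    """Yield (left, top, right, bottom) crop windows covering the image."""
    step = tile - overlap
    xs = list(range(0, max(img_w - tile, 0) + 1, step))
    ys = list(range(0, max(img_h - tile, 0) + 1, step))
    # Make sure the last row/column reaches the image edge.
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]

tiles = tile_grid()
print(len(tiles))           # tiles per capture at these settings
print(tiles[0], tiles[-1])  # first and last crop windows
```

Each tile then goes to the model at (or near) native resolution; labels just need to be remapped into tile coordinates.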

Hi, I’m not sure my original question was clear, so I’m going to add some images here to illustrate my point. And it’s not easy because of new user restrictions…

This is the original capture from camera:

If I upload it as-is to Roboflow and add bounding boxes, this is the result:

Then when I go to generate a dataset with the “Resize: Stretch to 640x640” option (as recommended by the Ultralytics tutorial), the model is going to be learning what very squished-looking deer look like:

Am I correct so far?

But if I manually crop the edges off the longest dimension (the width), keeping the shorter dimension untouched, i.e.:

Adding bounding boxes, I get:

Finally, using the “Resize: Stretch to 640x640” option again results in:
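The crop-then-resize pipeline I'm describing, sketched with Pillow (the blank image is a stand-in for the real capture):

```python
# Center-crop the width down to the height (2688x1520 -> 1520x1520), then
# resize to 640x640 — both axes now shrink by the same factor, so nothing
# gets squished.
from PIL import Image

img = Image.new("RGB", (2688, 1520))   # stand-in for the real capture
side = img.height                      # 1520
left = (img.width - side) // 2         # 584 px trimmed from each edge
square = img.crop((left, 0, left + side, side))
final = square.resize((640, 640))
print(square.size, final.size)  # (1520, 1520) (640, 640)
```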

Now, just for argument's sake, going back and testing my newly generated model (of 1 image lol) against my original image: won’t I get a near-100% label/match detection with the pre-cropped, resized-to-640x640 version, while a model trained on the squished 640x640 version may not recognize that the original image contains any deer? Which is just a long way of saying, I would think keeping the correct aspect ratio is important?
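The distortion the screenshots show can be put in numbers: stretching the full frame to 640x640 scales the two axes by different factors, while the square crop scales them equally.

```python
# Stretching 2688x1520 to 640x640 compresses width and height unevenly.
w, h, target = 2688, 1520, 640

stretch_x, stretch_y = target / w, target / h
print(round(stretch_y / stretch_x, 2))  # ~1.77 -> deer squished ~1.77x horizontally

crop_x = crop_y = target / h            # after the 1520x1520 center crop
print(crop_y / crop_x)                  # 1.0 -> aspect ratio preserved
```

That ~1.77x horizontal compression is exactly the image's original aspect ratio (2688/1520), which is why the stretched deer look so distorted.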