Inference Mismatch for Fine-Tuned RF-DETR Segmentation

Hi Roboflow Support Team,

I’m encountering a confusing issue with inference on my fine-tuned RF-DETR segmentation model (ID: saem-segment-lxryl/1). My inference results are only good when I use a preprocessing pipeline that seems to contradict the one shown in my dataset version.

Here are the details:

1. Training Pipeline (As Understood from Roboflow UI)

My dataset consists of 9356 x 13245 images. The preprocessing steps listed in my Roboflow version are:

  1. Tile: 4 rows x 4 columns

  2. Resize: Fit within 2048x2048

This correctly results in a dataset of 1446 x 2048 images, which I’ve confirmed in the UI.

Step 1: 9356x13245 tiled 4x4 → 2339x3312 tiles.

Step 2: 2339x3312 tiles + “Fit within 2048x2048” → 1446x2048 tiles
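To make sure I’m replicating it correctly, here is a minimal sketch of how I understand that pipeline (a hypothetical helper using Pillow, not Roboflow’s actual implementation; the exact rounding Roboflow uses may differ slightly):

```python
# Sketch of the version's preprocessing: tile the full image 4x4,
# then "Fit within 2048x2048" each tile (scale so the longer side
# becomes 2048, preserving aspect ratio; no padding or cropping).
from PIL import Image

def tile_then_fit(image, rows=4, cols=4, max_size=2048):
    w, h = image.size
    tile_w, tile_h = w // cols, h // rows
    tiles = []
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            tile = image.crop(box)
            scale = min(max_size / tile_w, max_size / tile_h)
            tile = tile.resize((round(tile_w * scale), round(tile_h * scale)))
            tiles.append(tile)
    return tiles

# A 9356x13245 source yields ~2339x3311 tiles, fitted to ~1446x2048
full = Image.new("RGB", (9356, 13245))
tiles = tile_then_fit(full)
print(tiles[0].size)  # height 2048, width ~1446 (rounding may vary)
```

These are the ~1446x2048 tiles I feed the model in my "correct" inference pipeline.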

2. The Problem: “Correct” Pipeline Fails

When I replicate this exact pipeline for inference (i.e., feeding the model a 1446 x 2048 tile), the results are extremely poor.

3. The “Accidental” Pipeline That Works

While debugging, I discovered a different pipeline that gives exceptionally good results:

  1. Take the entire 9356 x 13245 image.

  2. First, resize the entire image to a 2048x2048 square.

  3. Then, tile that 2048x2048 image into a 4x4 grid (resulting in 512x512 tiles).

  4. Run inference on these 512x512 tiles.
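For clarity, this is a sketch of that accidental pipeline (again a hypothetical helper with Pillow, not code from Roboflow):

```python
# Sketch of the "accidental" pipeline: squash the whole image into a
# 2048x2048 square first (this distorts the aspect ratio, since the
# source is 9356x13245), then tile that square 4x4 into 512x512 crops.
from PIL import Image

def square_then_tile(image, size=2048, rows=4, cols=4):
    square = image.resize((size, size))  # forced square, aspect ratio not preserved
    tile_w, tile_h = size // cols, size // rows
    tiles = []
    for r in range(rows):
        for c in range(cols):
            box = (c * tile_w, r * tile_h, (c + 1) * tile_w, (r + 1) * tile_h)
            tiles.append(square.crop(box))
    return tiles

full = Image.new("RGB", (9356, 13245))
tiles = square_then_tile(full)
print(tiles[0].size)  # (512, 512)
```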

I think this worked by lucky coincidence, as I can see the model’s img_size_h and img_size_w are both 432 px.

My Question

I would expect the best results to come from the 1446x2048 tiles that match the dataset version, but that is not the case. Why does the “accidental” pipeline work so well? What am I missing?

I’m happy to share project details with the support team, as well as the actual code I use to run inference.

Thank you very much!

Andreas
