RF-DETR Training Failed due to missing "wandb" module

This is a little confusing: we’re trying to train the new RF-DETR model in the cloud, but every attempt fails with an error saying that Roboflow’s Python process is missing the “wandb” module.

The image dataset is already correctly resized to 640x640, so the suggested fix does not seem to apply here.

  • Project Type: Object Detection
  • Operating System & Browser: macOS + Firefox
  • Project Universe Link or Workspace/Project ID: stationa/we-rooftops-det (private workspace)

Hi, we’ve deployed a new version. Can you retry and let me know how it goes?

Thanks @peter_roboflow. That seems to have worked to train now!

I’m a little confused though as:

  1. Training took only a little over an hour on a dataset of 2700 images, even though the estimate claimed it would take closer to 8 hours. It also stopped after only 18 epochs (this might just mean the model wasn’t seeing much progress and stopped, but I’m not sure).

  2. The confusion matrix and vector analysis charts are completely empty at all confidence thresholds.

  3. Both precision and recall show up as 100% which is highly suspect.

Any thoughts on what happened here?

(Sorry I tried to upload more screenshots, but the forum won’t let me since my account is new?)

@jerluc sorry about the forum not allowing image uploads, not sure how to fix that :confused:

  1. Training took only a little over an hour on a dataset of 2700 images, even though the estimate claimed it would take closer to 8 hours. It also stopped after only 18 epochs (this might just mean the model wasn’t seeing much progress and stopped, but I’m not sure).

yeah, we do early stopping so that we don’t waste your money while the model isn’t improving. rf-detr converges really fast so it should generally take a lot less time than estimated – one of the aspects of the model we are proud of since it should save users credits
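For readers unfamiliar with early stopping, here is a minimal sketch of the general idea in Python. It is not Roboflow's actual implementation: the function name, the `patience` and `min_delta` parameters, and the sample scores are all hypothetical, chosen only to illustrate why a run can end well before its planned number of epochs.

```python
def early_stop_epoch(val_scores, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop, or None.

    val_scores: one validation score per epoch (higher is better).
    patience:   how many consecutive non-improving epochs to tolerate
                (hypothetical default, not a Roboflow setting).
    min_delta:  minimum gain over the best score that counts as progress.
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, score in enumerate(val_scores, start=1):
        if score > best + min_delta:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None  # never ran out of patience


# Made-up scores: the metric plateaus after epoch 3.
scores = [0.10, 0.20, 0.30, 0.30, 0.30, 0.30]
stopped_at = early_stop_epoch(scores, patience=3)
```

With a model that converges quickly, the plateau arrives early, so the stop triggers long before a time estimate based on the full epoch budget would suggest.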

  2. The confusion matrix and vector analysis charts are completely empty at all confidence thresholds.

hmmm, that should not be happening, we will look into that!

  3. Both precision and recall show up as 100% which is highly suspect.

you are right to be suspicious – those numbers aren’t getting calculated yet for rf-detr other than in model eval. That was a frontend bug I thought we had fixed – do you mind sharing where exactly in the app you are seeing that? If you can’t upload images here, you can email me at my name at roboflow dot com

Thanks @peter_roboflow! It makes sense then that it didn’t take very long. I was just surprised to see it so much faster than was estimated!

Please let me know if you’ve found out anything about the confusion matrix. I figured this was related to the precision and recall showing up incorrectly, but maybe not.

For context, this is a screenshot from the top of the “Visualize” page that shows the 100% numbers:

Hello @jerluc, I’m very sorry you experienced this issue. We found what was causing this problem and are currently working on a fix. We will let you know as soon as it is ready!