Could Someone Give Me Advice on Fine-Tuning a Model Using Roboflow?

Hello there, :wave:

I am new to Roboflow, and I am currently working on a project where I need to fine-tune a pre-trained model for my specific use case. I have uploaded my dataset, which consists of a mix of images with varied labels, and I have already gone through the process of creating and augmenting the dataset. Now I am at the stage where I need to fine-tune a model, but I am encountering a few challenges that I hope some of you might have insights on.

I am unsure about which pre-trained model I should select for fine-tuning. The task I am working on involves object detection, but I am not sure whether I should go with a general-purpose model (like YOLOv5) or a specialized model that might perform better with the type of images I am using.

I am running into some issues with training time. My dataset is not huge, but training seems to be taking a while. Are there any tips for optimizing the training process or reducing the time it takes without sacrificing too much accuracy? :thinking:

Also, I have gone through this post, which definitely helped me out a lot: https://discuss.roboflow.com/t/how-large-of-a-dataset-is-necessary-to-fine-tune-something-like-sam-devops

When it comes to evaluating my model's performance, I am trying to figure out which metrics I should focus on. Accuracy is clear, but for object detection, are there any other metrics that I should be tracking to better gauge my model's effectiveness? :thinking:

Thanks in advance for your help and assistance. :innocent:

Hi @roberrttt - thanks for posting!

My advice when getting started is to start small. The fewer images you have, the easier it is to make changes to your labeling strategy.

Your whole goal should be getting a v1 model ready and running on production data. Once it’s live, you can use workflows to sample problem images and upload them to Roboflow for re-labeling. This is the best way to improve a model quickly.
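To make the idea of sampling problem images concrete, here is a minimal sketch in plain Python. It is not a Roboflow workflow or API call. It assumes that a "problem image" is one where at least one detection has a confidence in an uncertain middle band; the function names, the band limits (0.3 to 0.6), and the input format are all hypothetical choices for illustration.

```python
def needs_review(confidences, low=0.3, high=0.6):
    """Return True if any detection confidence falls in the uncertain band.

    The band [low, high] is an assumed heuristic, not a recommended setting.
    """
    return any(low <= c <= high for c in confidences)


def select_for_relabeling(predictions_by_image, low=0.3, high=0.6):
    """Pick image names whose predictions look uncertain.

    predictions_by_image: dict mapping an image name to a list of
    detection confidences (floats between 0 and 1) for that image.
    """
    return [
        name
        for name, confidences in predictions_by_image.items()
        if needs_review(confidences, low, high)
    ]


if __name__ == "__main__":
    sample = {
        "a.jpg": [0.95],        # confident detection, skipped
        "b.jpg": [0.45, 0.90],  # one uncertain detection, selected
        "c.jpg": [],            # no detections, skipped by this heuristic
    }
    print(select_for_relabeling(sample))
```

Images with no detections at all are skipped by this heuristic, so if missed objects are a concern, a separate rule for empty predictions would be needed.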

In most cases you don’t need thousands and thousands of labeled images to get a v1 model ready. I would focus instead on image selection (you want your training data to mirror what your production data will look like, and you want varied images in lots of settings as opposed to 10k frames from the same video) as well as label quality (you want every image labeled exactly how you want the model to output it back to you).
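One simple way to avoid piling up near-identical frames from a single video is to cap how many frames you take from each one and spread them evenly across its length. The sketch below only computes which frame indices to keep; the function name and the cap are hypothetical, and extracting the frames themselves is left to whatever video tool you already use.

```python
def pick_frame_indices(total_frames, max_frames):
    """Return up to max_frames evenly spaced frame indices from one video."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    if total_frames <= max_frames:
        # Short video: keep every frame.
        return list(range(total_frames))
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]


if __name__ == "__main__":
    # A 10,000-frame video contributes only 5 frames, spread across its length.
    print(pick_frame_indices(10000, 5))
```

Applying the same small cap to many different videos gives a dataset with more variety per labeled image than taking every frame from one source.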

In the spirit of starting easy, I would almost always start with a general model and move to something more specialized only after you confirm you need it. Almost all of our customers fine-tune general object detectors.

For evaluating the model, you should use our evaluation toolkit to easily see which images are giving your model problems.
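On the question of which metrics to track: object detection is usually scored with precision, recall, and mean average precision (mAP), all of which rest on intersection over union (IoU) between predicted and labeled boxes. The sketch below shows IoU and a precision/recall calculation for a single image and a single class. It is an illustration in plain Python, not how Roboflow's evaluation tooling is implemented; the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Precision and recall for one image and one class.

    predictions: list of (box, confidence) pairs.
    ground_truth: list of labeled boxes.
    Predictions are matched greedily, highest confidence first, and each
    labeled box can be matched at most once.
    """
    matched = set()
    true_positives = 0
    for box, _conf in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(box, gt_box)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
    labels = [(0, 0, 10, 10), (100, 100, 110, 110)]
    print(precision_recall(preds, labels))
```

Low precision means the model is producing false detections, while low recall means it is missing labeled objects. Looking at them separately tells you more than a single accuracy number.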