I trained a model with a thousand images to recognize carrots. I trained another model with a thousand images (some overlapping) to recognize knives.
Is there an easy way to merge the two into a new model so I have a “Carrots & Knives” model that merges datasets and labels?
Great question! I have a few thoughts. You wouldn’t directly merge them in the way you might paste two Word docs together. But there are ways to make use of both.
Your end goal matters too. If latency is critical, a single model trained on both classes will be faster, since you avoid running a second model in the workflow and get the efficiency of a combined one. But we're talking roughly 20 ms faster, not 4 seconds or anything dramatic.
Here are a couple of ideas, from easiest to most complicated:
- build a workflow in Roboflow that runs each model separately on the same image
- further fine tune the carrot model by adding in the knife images/annotations and then training it on that additional data
- start from scratch with about 2000 images and train a single model that finds both
Does it make any difference if all the source imagery is of cutting boards that generally have both carrots and knives (in most situations), rather than pictures of carrots alone and stock imagery of knives?
The composition of the source imagery leans toward both being present, so merging the labels would theoretically work (assuming that's possible).
The ideal scenario is that you are training on images that closely reflect what you will use the model on. So if you plan to have it find carrots and knives on cutting boards in the future, then the fact that they are both in those training images is ideal. But if you trained only with carrots on cutting boards and then try to find a carrot in a salad, you will get poorer results.
I’m not sure you are implying this, but the fact that both classes (carrots and knives) are in most of the images is actually better than if they were in separate images. It helps the model avoid mistaking a long narrow carrot for a knife or an orange knife handle for a carrot. And just a reminder, you also want to try to have about 5-10% of your images be “null” images - so a cutting board with no carrot and no knife. It could have a fork or spoon on there, maybe some onions, or just be empty. Whatever will be close to what could happen when you are using the model later. That also helps the model perform better.
Essentially what happened is that I had many images of cutting board shots with carrots, vegetables, and sometimes knives. All the same kind of shot from different angles.
I didn’t realize I would need the knives at first so I only classified the carrots. Then I made another model on mostly the same data and classified the knives.
So they could have been together had I known better up front. They're basically the same data and the same imagery, just two separate models/label sets.
Ah. Well, the straightforward option is to go into the dataset with more annotations (sounds like carrots) and just annotate the knives and train a new model.
If you want to try a more painful but possibly rewarding method, you could download both datasets with annotations. If the images were the same for both sets (same name, dimensions, etc.), you could in theory move the knife annotations over into the carrot annotation files and then re-upload the images and merged annotations to Roboflow. I would try giving all the files to an LLM to have it attempt the join.
Essentially the annotations “before”:
```json
{
  "boxes": [
    {
      "label": "Carrot",
      "x": 310,
      "y": 398.5,
      "width": 90,
      "height": 207,
      "id": "1"
    }
  ],
  "height": 720,
  "key": "foodoncuttingboard128.jpg",
  "width": 1280
}
```
And “after”:
```json
{
  "boxes": [
    {
      "label": "Carrot",
      "x": 310,
      "y": 398.5,
      "width": 90,
      "height": 207,
      "id": "1"
    },
    {
      "label": "Knife",
      "x": 825,
      "y": 443,
      "width": 96,
      "height": 202,
      "id": "2"
    }
  ],
  "height": 720,
  "key": "foodoncuttingboard128.jpg",
  "width": 1280
}
```