Could you please help me understand a fundamental design question? Let’s say that I train two separate
detection models, “stamps” and “coins”, and that both models are working great.
Now say that I’d like to detect both stamps and coins in a single video feed. How should I approach this problem? Can I somehow merge the two models? Or should I merge the two annotation sets and then retrain?
One problem with the datasets is that the original “stamps” training images contain unannotated coins, and the “coins” training images contain unannotated stamps.
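
For concreteness, merging the two annotation sets could look roughly like the sketch below. It assumes a hypothetical in-memory format (image name mapped to a list of `(class_id, x_min, y_min, x_max, y_max)` boxes) in which each single-class set uses class id 0; the function name and the `coin_offset` parameter are made up for illustration and do not come from any particular framework.

```python
from typing import Dict, List, Tuple

# Hypothetical box format: (class_id, x_min, y_min, x_max, y_max).
Box = Tuple[int, float, float, float, float]
# Hypothetical annotation set: image name -> list of boxes.
Annotations = Dict[str, List[Box]]


def merge_annotation_sets(
    stamps: Annotations, coins: Annotations, coin_offset: int = 1
) -> Annotations:
    """Combine two single-class sets into one set with stamp=0 and coin=1."""
    merged: Annotations = {}
    # Stamp boxes keep their class id (assumed to be 0).
    for image, boxes in stamps.items():
        merged.setdefault(image, []).extend(boxes)
    # Coin boxes are shifted so their class id does not collide with stamps.
    for image, boxes in coins.items():
        shifted = [(cls + coin_offset, x0, y0, x1, y1) for cls, x0, y0, x1, y1 in boxes]
        merged.setdefault(image, []).extend(shifted)
    return merged


stamps = {"a.jpg": [(0, 1.0, 2.0, 3.0, 4.0)]}
coins = {"a.jpg": [(0, 0.0, 0.0, 1.0, 1.0)], "b.jpg": [(0, 5.0, 6.0, 7.0, 8.0)]}
merged = merge_annotation_sets(stamps, coins)
```

Note that this only relabels and concatenates what is already annotated. The coins in the “stamps” images and the stamps in the “coins” images would still be missing from the merged set, which is exactly the problem described above.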