How can I create a dataset filtering for specific classes? I realize I can filter by tag, but that’s not the same.
Not sure if there is a formal way to do this but here’s an idea.
You seem to have found the tag filter. So this method utilizes that.
I went to my Dataset and filtered for only the classes I wanted - in this case “buck”. Then I checked the box to select all (two images) and opened the “Action” dropdown. In that dropdown is “Apply Tags”, so I just put the tag “keep” on all of the selected images (two of them).
Now I go to the Dataset and click “+ New Dataset Version” at the top right. I just have to add a preprocessing step to only allow images with the “keep” tag (the tag filtering I think you were referring to).
Now I have just those two images (plus augmentations) in this version. I hope that helps but if not just pass along some more details. And I’m sure others will chime in as well if there is a better solution.
[Side note: there is a “Modify Classes” in the Preprocessing step when creating a new data version - but that appears to keep ALL images and just erases any annotations you don’t want. So you have the same size dataset but only the classes you want and then a bunch of “null” images. Not sure that’s what you wanted.]
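To make the difference between the two outcomes concrete, here is a minimal sketch on toy data. The record layout and the function names are made up for illustration; this is not Roboflow's implementation or export format, just a model of the behavior described above.

```python
# Toy records: each image has a file name and a list of class labels.
# This layout is hypothetical and only used for illustration.
images = [
    {"file": "a.jpg", "labels": ["buck", "doe"]},
    {"file": "b.jpg", "labels": ["doe"]},
    {"file": "c.jpg", "labels": ["buck"]},
]

def drop_other_annotations(records, wanted):
    """Keep every image, but remove annotations outside `wanted`
    (the behavior described for "Modify Classes")."""
    return [
        {"file": r["file"], "labels": [c for c in r["labels"] if c in wanted]}
        for r in records
    ]

def keep_images_with_class(records, wanted):
    """Keep only images with at least one annotation in `wanted`
    (what the "keep" tag workaround achieves)."""
    return [r for r in records if any(c in wanted for c in r["labels"])]

modified = drop_other_annotations(images, {"buck"})
filtered = keep_images_with_class(images, {"buck"})

# Images left with no annotations are the "null" images.
null_images = [r["file"] for r in modified if not r["labels"]]
print(len(modified), null_images)      # 3 ['b.jpg']
print([r["file"] for r in filtered])   # ['a.jpg', 'c.jpg']
```

The first function returns the same number of images with one of them now empty, while the second returns a smaller dataset with no empty images.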
@Automatez Great answer!! That is correct, when you use “Modify Classes” as a preprocessing step, it keeps all the images and removes the annotations you don’t want. Thanks again for contributing to the Roboflow community!!
@sp88011 This is an interesting feature request! To help us better understand, could you share why this feature is important to you, what exactly you’re trying to accomplish, and how it would improve your experience with Roboflow?
Thank you again for your feedback and contributing to the Roboflow community!
Thanks @Automatez that’s a nice workaround and achieves what I wanted to do.
@Ford - I’m new to Roboflow and CV in general so take this with a grain of salt.
Since I’m working with “versions of datasets” inside Roboflow, I think that being able to slice any existing dataset by classes/annotations is the main way to build high quality “versions” tailored to specific use cases.
I think this would also make your data marketplace (Universe) more powerful…allow me to prune those datasets to the images I need. Allow me to “mix & match” images from different datasets, e.g. “All images containing bicycles”.
Right now, this work takes place outside Roboflow for no good reason.
Possible I’m missing something fundamental here for why this isn’t a good practice…
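For anyone doing this outside Roboflow in the meantime, here is a rough sketch of the kind of script involved. It assumes the dataset has been exported as a COCO-style annotation dictionary (with "images", "annotations" and "categories" lists); the function name and the `keep_other_annotations` flag are hypothetical, and the sample data is invented.

```python
def filter_coco_by_class(coco, wanted_names, keep_other_annotations=False):
    """Return a COCO-style dict containing only images that have at
    least one annotation whose category name is in `wanted_names`."""
    wanted_ids = {c["id"] for c in coco["categories"] if c["name"] in wanted_names}
    # Image ids that contain at least one wanted annotation.
    keep_image_ids = {
        a["image_id"] for a in coco["annotations"] if a["category_id"] in wanted_ids
    }
    images = [img for img in coco["images"] if img["id"] in keep_image_ids]
    annotations = [
        a for a in coco["annotations"]
        if a["image_id"] in keep_image_ids
        and (keep_other_annotations or a["category_id"] in wanted_ids)
    ]
    categories = [
        c for c in coco["categories"]
        if keep_other_annotations or c["id"] in wanted_ids
    ]
    return {"images": images, "annotations": annotations, "categories": categories}

# Invented sample data in COCO layout (bounding boxes omitted for brevity).
coco = {
    "images": [
        {"id": 1, "file_name": "one.jpg"},
        {"id": 2, "file_name": "two.jpg"},
        {"id": 3, "file_name": "three.jpg"},
    ],
    "categories": [{"id": 1, "name": "bicycle"}, {"id": 2, "name": "car"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1},
        {"id": 2, "image_id": 1, "category_id": 2},
        {"id": 3, "image_id": 2, "category_id": 2},
        {"id": 4, "image_id": 3, "category_id": 1},
    ],
}

subset = filter_coco_by_class(coco, {"bicycle"})
print([img["file_name"] for img in subset["images"]])  # ['one.jpg', 'three.jpg']
print([a["id"] for a in subset["annotations"]])        # [1, 4]
```

Passing `keep_other_annotations=True` would keep the same images but retain the car annotation on one.jpg as well.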
@sp88011 First off, welcome to the Roboflow community! Please keep great ideas like this flowing, I love it!
@Automatez What do you think? Would you benefit from a feature like this?
@Ford that’s a feature I could see making use of someday. I could see project directions changing or side projects being developed that only utilize some of the classes. Some possible benefits:
- your annotation people went a little overboard and now you want to test a model with fewer classes in case it could accomplish the end goal more simply
- side projects that pop up which only require some of the classes to execute an effective model
- allow the user to quickly create the limited class version without flooding it with all the “null” images that “Modify Classes” would create in removing annotations
- like @sp88011 mentioned, if a user had multiple datasets and then realized they wanted to grab a class from each and combine, that would be made much easier
- the workaround I suggested could be layered (keep_classDog, keep_classDogCat, keep_classHorseCat, etc) but that could get messy so a formal process would be nice
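As a rough illustration of the "grab a class from each dataset and combine" idea, here is a small sketch on toy data. The dataset names, record layout, and `collect_class` helper are all hypothetical; a real merge would also need to copy image files and reconcile class names between datasets.

```python
def collect_class(datasets, wanted):
    """Gather every image containing `wanted` from several datasets.
    File names are prefixed with the dataset name to avoid collisions."""
    merged = []
    for name, records in datasets.items():
        for r in records:
            if wanted in r["labels"]:
                merged.append(
                    {"file": f"{name}/{r['file']}", "labels": list(r["labels"])}
                )
    return merged

# Invented datasets: name -> list of image records.
datasets = {
    "street": [
        {"file": "001.jpg", "labels": ["bicycle", "car"]},
        {"file": "002.jpg", "labels": ["car"]},
    ],
    "park": [
        {"file": "001.jpg", "labels": ["bicycle"]},
        {"file": "002.jpg", "labels": ["dog"]},
    ],
}

bikes = collect_class(datasets, "bicycle")
print([r["file"] for r in bikes])  # ['street/001.jpg', 'park/001.jpg']
```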
If Roboflow did add this feature, it might be nice to have some stats built in from the start. A basic one would be how many images are being brought over for the chosen class. But someone needing this might also find it useful to know how many annotations were wiped out. E.g. “You have kept 57 images with dogs, there are 82 annotations with class=dog. Removed from the 57 images were 23 annotations of class=cats and 2 annotations of class=horses.” That would also communicate clearly what they have created for a dataset, in case they anticipated something else. Or maybe you even build in the option to keep any image with a dog AND retain all other annotations as well.
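The stats described above could be computed along these lines. This is only a sketch on invented toy records; the `class_filter_stats` name and the record layout are assumptions, not anything Roboflow provides.

```python
from collections import Counter

def class_filter_stats(records, wanted):
    """Summarize what filtering a dataset down to one class would do."""
    # Images that contain at least one annotation of the wanted class.
    kept = [r for r in records if wanted in r["labels"]]
    # Annotations of the wanted class on those images.
    kept_annotations = sum(r["labels"].count(wanted) for r in kept)
    # Annotations of other classes that would be wiped from the kept images.
    removed = Counter(c for r in kept for c in r["labels"] if c != wanted)
    return {
        "images_kept": len(kept),
        "annotations_kept": kept_annotations,
        "annotations_removed": dict(removed),
    }

# Invented records: one label per annotation, so repeats are allowed.
records = [
    {"file": "1.jpg", "labels": ["dog", "dog", "cat"]},
    {"file": "2.jpg", "labels": ["dog", "horse"]},
    {"file": "3.jpg", "labels": ["cat"]},
]

stats = class_filter_stats(records, "dog")
print(stats)
# {'images_kept': 2, 'annotations_kept': 3, 'annotations_removed': {'cat': 1, 'horse': 1}}
```

Note that the cat annotation on 3.jpg is not counted as removed, since that image was never kept in the first place.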
So much to consider and play with in this platform! Thanks, Ford!
@Automatez Fantastic insights!! Love the idea to provide stats from the start. Would you be interested in picking and choosing images from universe that contained certain classes? For example, if you only wanted images with dog and cat annotations?
@Ford that’s a really good idea. I could see that being especially useful for quick proof-of-concept situations - to see if it’s worth annotating a certain combination of items/classes. Or even to demo to a stakeholder (a client, your manager, your professor) without having to get a bunch of their image data first.