Train/Test Rebalancing Doesn't Randomly Select from All Classes

The train/test rebalancing feature still does not appear to allocate images randomly across classes when assigning them to each set. This bug was described years ago in this thread: Need Ability to randomly shuffle "merged dataset" when allocating Train/Val/Test Sets - #4 by Fredrik_T

Is there any update on this behavior? This is a serious limitation for working with a dataset of any decent size.

Edit: Another thread detailing this same bug: After Merging Datasets, Re-balancing (Train/Val/Test) excludes multiple classes in VAL split - #7 by 25benjaminli

@brad tagging you directly since you’ve been involved on previous threads.

What’s the underlying problem you’re trying to solve?

I am trying to get my train/valid/test splits to have equal representation of each class in my dataset. If I generate a new version and change the splits, the images in the new splits are not drawn randomly across classes.

The only way to get a new random distribution at that point is to download the imagery and re-upload it using the web interface. This is time-consuming when dealing with nearly 10k images.
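
For reference, here is a rough sketch of the split behavior I'm after. It is plain Python and not how the rebalancing feature is actually implemented. The `stratified_split` function and the `labels` dict (image id mapped to a single class name) are hypothetical, and the sketch assumes one class per image, which won't hold for images containing several classes.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split image ids into train/valid/test, shuffling within each class.

    labels: dict mapping image id -> class name (hypothetical input format,
            assumes exactly one class per image).
    fractions: (train, valid, test) shares; test receives the remainder.
    """
    rng = random.Random(seed)

    # Group image ids by class so each class is split independently.
    by_class = defaultdict(list)
    for image_id, cls in labels.items():
        by_class[cls].append(image_id)

    splits = {"train": [], "valid": [], "test": []}
    for cls in sorted(by_class):
        ids = sorted(by_class[cls])  # fixed order before shuffling, for reproducibility
        rng.shuffle(ids)
        n_train = round(len(ids) * fractions[0])
        n_valid = round(len(ids) * fractions[1])
        splits["train"] += ids[:n_train]
        splits["valid"] += ids[n_train:n_train + n_valid]
        splits["test"] += ids[n_train + n_valid:]
    return splits
```

Changing `seed` gives a different random assignment while keeping each class's share of every split roughly the same, which is what I would expect re-running the rebalance to do.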
