Train/Test Rebalancing Doesn't Randomly Select from All Classes

The train/test rebalancing feature still does not seem to allocate images randomly across classes when filling each set. This bug was described years ago here: Need Ability to randomly shuffle "merged dataset" when allocating Train/Val/Test Sets - #4 by Fredrik_T

Is there any update on this behavior? This is a serious limitation when working with a dataset of any decent size.

Edit: Another thread detailing this same bug: After Merging Datasets, Re-balancing (Train/Val/Test) excludes multiple classes in VAL split - #7 by 25benjaminli

@brad tagging you directly since you’ve been involved on previous threads.

What’s the underlying problem you’re trying to solve?

I am trying to get my train/valid/test splits to represent each class in my dataset equally. If I generate a new version and change the splits, the new distribution is not a random selection across classes.

The only way to get a new random distribution at that point is to download the imagery and re-upload it using the web interface. This is time-consuming when dealing with nearly 10k images.
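
To make the expected behavior concrete, here is a minimal sketch of the kind of per-class (stratified) split I am after. It is plain Python run locally and does not use any Roboflow API. The function name `stratified_split`, the `labels` mapping, and the 70/20/10 fractions are my own assumptions for illustration, and it assumes each image has exactly one class, which will not hold for every object detection dataset.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.7, 0.2, 0.1), seed=0):
    """Split image names into train/valid/test, class by class.

    labels: dict mapping image filename -> class name
            (hypothetical input; assumes one class per image).
    fractions: (train, valid, test) shares; test takes whatever remains.
    """
    rng = random.Random(seed)

    # Group filenames by class; sorting keeps the result reproducible.
    by_class = defaultdict(list)
    for name, cls in sorted(labels.items()):
        by_class[cls].append(name)

    splits = {"train": [], "valid": [], "test": []}
    for cls, names in by_class.items():
        # Shuffle within the class so selection is random per class.
        rng.shuffle(names)
        n_train = round(len(names) * fractions[0])
        n_valid = round(len(names) * fractions[1])
        splits["train"] += names[:n_train]
        splits["valid"] += names[n_train:n_train + n_valid]
        splits["test"] += names[n_train + n_valid:]
    return splits
```

Because the shuffle happens inside each class, every class contributes to each split in roughly the requested proportion, as long as the class has enough images. Changing `seed` gives a fresh random allocation without re-uploading anything.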