Need Ability to randomly shuffle "merged dataset" when allocating Train/Val/Test Sets

First off… Roboflow is an awesome platform!!

An issue I discovered concerns the distribution of images when creating or re-balancing the Train/Val/Test split. It appears that there is no random shuffle, and in most cases we end up with a Validation Set consisting of only one object class. For example, take a dataset of 5 object classes (car, bus, van, truck, boat) that were uploaded to Roboflow as independent datasets, subsequently “merged”, and then re-balanced (80/15/5) when creating a new version. The resulting Validation Set consists solely of, e.g., “bus” images and contains nothing from any other object class. The same goes for the Test Set, which isn’t really that big of a deal, but clearly we would want a good balance of all classes when performing validation.

I don’t think I have missed some function inside of Roboflow, but if so, please advise. Having to export the 5 classes after they are merged, apply a random shuffle before splitting into Train/Val/Test, and then re-upload the properly balanced dataset before generating a formatted export seems counterproductive.
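For anyone stuck doing this by hand in the meantime, the shuffle-then-split step can be sketched roughly as below. This is only a minimal illustration and not anything Roboflow provides: the `stratified_split` name, the `(filename, class_label)` input format, and the fixed seed are all assumptions, and it assumes each image belongs to exactly one class (as with the five separately uploaded datasets above).

```python
import random
from collections import defaultdict


def stratified_split(items, ratios=(0.80, 0.15, 0.05), seed=42):
    """Shuffle and split per class so every split sees every class.

    items:  list of (filename, class_label) pairs (hypothetical format).
    ratios: train/valid fractions; the test split takes whatever remains.
    """
    rng = random.Random(seed)  # local RNG, so the split is reproducible

    # Group filenames by class label.
    by_class = defaultdict(list)
    for filename, label in items:
        by_class[label].append(filename)

    splits = {"train": [], "valid": [], "test": []}
    for label, files in by_class.items():
        rng.shuffle(files)  # random order within each class
        n_train = int(len(files) * ratios[0])
        n_valid = int(len(files) * ratios[1])
        splits["train"] += [(f, label) for f in files[:n_train]]
        splits["valid"] += [(f, label) for f in files[n_train:n_train + n_valid]]
        splits["test"] += [(f, label) for f in files[n_train + n_valid:]]

    # Mix the classes together inside each split as well.
    for part in splits.values():
        rng.shuffle(part)
    return splits


if __name__ == "__main__":
    classes = ["car", "bus", "van", "truck", "boat"]
    items = [(f"{c}_{i:03d}.jpg", c) for c in classes for i in range(100)]
    result = stratified_split(items)
    for name, part in result.items():
        print(name, len(part), sorted({label} for _, label in part) and sorted({l for _, l in part}))
```

Splitting per class rather than shuffling the whole merged set at once keeps the class proportions the same in every split, which matters most for the small Validation and Test sets.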

Please advise and as always, keep up the great work.

I also discovered the same issue yesterday evening. Have you found a solution for it? Thanks in advance.

Thanks for reporting.

This sounds like a recent regression stemming from us keeping uploaded images in sequential order to make it easier for annotators to label sequences of video frames.

Let me look at the code and see what I can do to restore the randomness for the model while preserving the ordering for humans in the tool.
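As a rough illustration of the idea (not Roboflow's actual code), one way to get both behaviours is to leave the stored upload order untouched and randomize only a separate list of indices when assigning splits. The `assign_splits` function, its parameters, and the seed below are hypothetical names chosen for this sketch.

```python
import random


def assign_splits(image_ids, ratios=(0.80, 0.15, 0.05), seed=0):
    """Map each image id to a split without reordering image_ids.

    image_ids stays in upload order (useful for labeling video frames);
    only a separate list of indices is shuffled.
    """
    order = list(range(len(image_ids)))
    random.Random(seed).shuffle(order)  # shuffles the index list only

    n_train = int(len(order) * ratios[0])
    n_valid = int(len(order) * ratios[1])

    assignment = {}
    for rank, idx in enumerate(order):
        if rank < n_train:
            split = "train"
        elif rank < n_train + n_valid:
            split = "valid"
        else:
            split = "test"  # test takes whatever remains
        assignment[image_ids[idx]] = split
    return assignment


if __name__ == "__main__":
    frames = [f"frame_{i:04d}.jpg" for i in range(100)]
    splits = assign_splits(frames)
    print(frames[:3])  # still in upload order
    print([splits[f] for f in frames[:10]])
```

The annotation tool can keep iterating over `image_ids` in sequence, while training reads the split from the returned mapping.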