Test, Train, Valid - Split?

I know how to do a normal train/test/valid split. However… if we are using data augmentation to 4x our number of training images, should the test and valid datasets be larger to compensate?

No: your test and validation sets do not need to grow to compensate for the extra training images created by data augmentation. The purpose of augmentation is to build a more robust model by giving it more varied data to learn from during training. The test and validation sets exist to evaluate the model's performance on unseen data, and their size is typically based on the original dataset size, not the augmented size.

A common practice is to split the original dataset 80% for training and 10% each for validation and testing, and to do this split before any augmentation is applied. Augmented images are then added only to the training set, never to the validation or test sets, because you want to evaluate the model on real-world, unaltered data to get a true measure of its performance. The sketch below shows this order of operations.
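Here is a minimal sketch of that workflow in plain Python, assuming a generic setup rather than any particular framework: the 80/10/10 split happens first, and only the training split is augmented. The `augment_image` function is a hypothetical placeholder for whatever augmentation pipeline you actually use.

```python
from sklearn.model_selection import train_test_split

images = list(range(1000))  # stand-ins for your original, unaugmented images

# Split the originals first: 80% train, then halve the 20% holdout
# into 10% validation and 10% test.
train, holdout = train_test_split(images, test_size=0.20, random_state=42)
valid, test = train_test_split(holdout, test_size=0.50, random_state=42)

def augment_image(img):
    # Hypothetical placeholder: return 3 augmented variants per image
    # (flips, rotations, color jitter, etc.).
    return [img, img, img]

# 4x the training set: each original image plus its 3 augmented copies.
# The validation and test splits are left untouched.
augmented_train = [a for img in train for a in [img] + augment_image(img)]

print(len(train), len(valid), len(test))  # 800 100 100
print(len(augmented_train))               # 3200
```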

That said, the exact split can vary depending on your project and how much data you have. With a very large dataset, you might use a smaller percentage for testing and validation. Conversely, with a small dataset, you might need a larger percentage so that you have enough examples to get a reliable measure of your model's performance; the quick calculation below makes "reliable" concrete.
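To see why small validation sets give noisy estimates, here is a back-of-the-envelope calculation using the standard binomial standard-error approximation. The 90% accuracy figure is an assumed value for illustration, not anything from this thread:

```python
import math

p = 0.90  # assumed true accuracy, for illustration only
for n in [50, 100, 500, 2000]:
    se = math.sqrt(p * (1 - p) / n)  # binomial standard error of measured accuracy
    print(f"n={n:4d} validation images: accuracy {p:.0%} +/- {1.96 * se:.1%} (95% CI)")
```

With only 50 validation images, the 95% confidence interval around a 90%-accurate model is roughly +/- 8 percentage points, usually too wide to compare model versions meaningfully.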

Remember, the goal is to train a model that generalizes well to new data, so it’s important to keep your test and validation sets separate and unaltered. You want these sets to be representative of the data the model will encounter in the real world.

For more information, you can refer to these sources: "Train, Validation, Test Split for Machine Learning" and "Can I edit the 'Train, Validation, Test' split after upload?".


Hi! I’m Roboflow’s AI-powered bot. This is an automated response generated using AI with knowledge from our docs, blog, knowledge base, and previous forum responses. If you find this to be a helpful answer, please mark it as the solution using the checkbox icon below. If this doesn’t solve your issue, please follow up with what you’ve tried and why, and the community will continue to chime in as usual.

Hi @JD_Parker,

I think @Lenny answered the question the same way I would’ve. Did that answer your question? If you have any other questions or concerns, please let us know!

The good news is that this is easy to test: kick off some training jobs on Roboflow with different splits and see how accuracy and generalizability differ across those versions.