Data Augmentation Split Issue

Florian · January 1, 2025, 11:16pm

Hi,
I’m currently building a dataset and I’ve run into something odd:
I split my data (train/valid/test) with the default 70%/20%/10% split, but after the augmentation step the proportions are no longer the same.
Augmentation seems to add images only to the training set.
I don’t know if that’s how it’s supposed to work. I hope someone can help or explain.

Thanks in advance for your help!

Florian · January 1, 2025, 11:19pm

The split after the augmentation step:

Jacob_Witt · January 2, 2025, 5:01pm

Hi there! This is because the split works on your original set of source images. After you create that split, we augment only images in the train set. If valid or test images were augmented, this would hurt model performance.
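To make the arithmetic concrete, here is a rough sketch of the order of operations described above: split the source images first, then augment only the train set. It is an illustration only, not the platform's actual implementation. The function names, the 100-image dataset, and the 3x augmentation multiplier are all assumptions made for the example.

```python
import random


def split_then_augment(n_images, ratios=(0.7, 0.2, 0.1), multiplier=3, seed=0):
    """Split source images first, then augment the train set only.

    `multiplier` is a hypothetical setting: the number of output images
    produced for each train source image.
    """
    ids = list(range(n_images))
    random.Random(seed).shuffle(ids)

    # The split is computed on the original source images.
    n_train = round(n_images * ratios[0])
    n_valid = round(n_images * ratios[1])
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:]

    # Only train images are expanded; valid and test stay untouched.
    train_out = [(image_id, k) for image_id in train for k in range(multiplier)]

    return {"train": len(train_out), "valid": len(valid), "test": len(test)}


def percentages(counts):
    """Convert per-split image counts to percentages of the total."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


counts = split_then_augment(100)
print(counts)               # {'train': 210, 'valid': 20, 'test': 10}
print(percentages(counts))  # {'train': 87.5, 'valid': 8.3, 'test': 4.2}
```

Under these assumed numbers, the source split is still 70/20/10, but the generated version reports roughly 88/8/4 because only the train count was multiplied.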
