How to merge two datasets

I would like to merge my own object detection dataset with a public one that has many more annotated images in the same class. Is this possible? How is it done?
I am using the OAK-D camera (RGB mode) and a Raspberry Pi running Raspbian.

Hi,

Here is how to merge datasets: Merging Projects or Datasets - Roboflow

Additionally, you may want to use Modify Classes to remap/relabel classes if they don’t have the same name.

Example: you have a “car” class in your project and a “Car” class in the public project. Use Modify Classes after merging the datasets to change both classes to “car” or both to “Car.”
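If you prefer to fix the class names offline before uploading, the same remapping can be sketched on a COCO JSON annotation file. This is only an illustration, not how Modify Classes works internally: `unify_class_names` and the `rename` mapping are hypothetical names, and the sketch assumes the standard COCO keys `categories` and `annotations`.

```python
import copy


def unify_class_names(coco, rename):
    """Return a copy of a COCO dict with categories renamed and duplicates merged.

    `rename` maps old class names to new ones, e.g. {"Car": "car"}.
    (Hypothetical helper for illustration only.)
    """
    coco = copy.deepcopy(coco)  # leave the caller's data untouched
    new_categories = []
    name_to_id = {}   # final class name -> category id that is kept
    old_to_new = {}   # every original category id -> kept category id

    for cat in coco["categories"]:
        name = rename.get(cat["name"], cat["name"])
        if name not in name_to_id:
            # First category with this final name: keep it.
            name_to_id[name] = cat["id"]
            new_categories.append({**cat, "name": name})
        old_to_new[cat["id"]] = name_to_id[name]

    # Point every annotation at the surviving category id.
    for ann in coco["annotations"]:
        ann["category_id"] = old_to_new[ann["category_id"]]

    coco["categories"] = new_categories
    return coco
```

Calling `unify_class_names(data, {"Car": "car"})` leaves a single “car” category and moves all “Car” annotations onto it.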

How do we copy a public dataset into our workspace?

Select the dataset version of the project in question. From there, export it in COCO JSON format, unzip the folder and upload to your project.
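Before uploading, it can help to check that the unzipped export contains the number of images you expect. A minimal sketch, assuming the export folder holds one COCO annotation file per split (any `.json` file with an `images` list is counted); `count_coco_images` is a hypothetical helper name.

```python
import json
from pathlib import Path


def count_coco_images(export_dir):
    """Map each COCO annotation file under export_dir to its image count."""
    export_dir = Path(export_dir)
    counts = {}
    for path in sorted(export_dir.rglob("*.json")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        # Only count files that look like COCO annotation files.
        if isinstance(data, dict) and "images" in data:
            counts[path.relative_to(export_dir).as_posix()] = len(data["images"])
    return counts
```

For example, `sum(count_coco_images("path/to/unzipped/export").values())` gives the total across all splits; the path is a placeholder.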

I followed your instructions but it only downloads one image, not 4203 images!

Which dataset are you referring to? Can you send me a link so I can take a look?

And which version of the dataset did you download?

It’s the public “weeds” dataset, under agriculture. I think I figured it out: the untrained version has 4203 images, the trained version has ONE. haha

I got them downloaded, but can’t upload here because of a firewall. I’ll have to wait until I’m at another location.

Good day Mohamed, is it possible for different individuals to collaborate on a project after creating a dataset?

Hi @Andrey-blaze - if the aim is to work on the same project with teammates, that only requires inviting them to the workspace with the associated project(s): Adding Team Members to a workspace - Roboflow

To merge projects in a workspace: Merge Projects/Datasets - Roboflow

To duplicate projects in a workspace: Duplicating Projects or Datasets - Roboflow

Cloning Images: Dataset Upload: Cloning Images from Roboflow Universe - Roboflow

  • this feature is currently available for any public dataset (public projects in Community/Public workspaces): images from public projects on Roboflow Universe can be cloned to any project, public or private, in a workspace that you are in.
  • there are pre-filled citations available on the Overview/main page for Universe datasets, for you to leverage if you clone a project

Good day, Mr. Mohammad. I want to clone 1000 images from two datasets on Universe that have 8000 images, and have each member of the team annotate 4 different classes in each image. Please guide me. Thank you

Hi @Navid_Mashoofi - we have a guide here for Cloning Images:

If you select the box on the Dataset page to reveal 200 images per page, then you can click “Select All” afterward on each page until you’ve selected 1,000 images.

From there, go ahead and clone the images.

You can invite your teammates, and Assign them labelling jobs: https://docs.roboflow.com/annotate/collaborative-workflow

There is an option to add labelling instructions in written format for each labelling job. The formatting is Markdown, meaning you can add example images of how you want things labelled, too.


Yes, it is possible to merge your own object detection dataset with a public one that has more annotated images in the same class. The process of merging the two datasets involves combining the images and annotations from both datasets into a single dataset.

Here are the steps to merge two object detection datasets:

  1. Format the datasets: Make sure both datasets are in the same format. There are several formats for object detection datasets, such as COCO, Pascal VOC, and YOLO, among others. You should format your dataset to match the format of the public dataset you want to merge with.
  2. Check the labels: Make sure that the labels in your dataset match the labels in the public dataset. If the labels are different, you will need to re-label your dataset to match the public dataset.
  3. Combine the datasets: Combine the images and annotations from both datasets into a single dataset. You can do this by copying the images and annotations from one dataset into the other, or by creating a new dataset that includes both sets of images and annotations.
  4. Balance the dataset: If the public dataset has significantly more annotated images than your dataset, it may be necessary to balance the dataset by randomly selecting a subset of the images from the public dataset.
  5. Train the model: Once the datasets are merged, you can use the combined dataset to train your object detection model.
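Steps 2 to 4 above can be sketched for two datasets that are already in COCO JSON format. This is a minimal illustration under stated assumptions, not a definitive implementation: `merge_coco` and its parameters are hypothetical names, class names are matched case-insensitively and lowercased, and the image files themselves still have to be copied into one folder separately (file name collisions are not handled here).

```python
import random


def merge_coco(own, public, max_public_images=None, seed=0):
    """Merge two COCO-style dicts into one (hypothetical helper).

    Categories are matched by lowercased name. If max_public_images is set,
    a random subset of the public images is kept to balance the result.
    """
    rng = random.Random(seed)
    merged = {"images": [], "annotations": [], "categories": []}
    cat_id_by_name = {}
    next_image_id = 1
    next_ann_id = 1

    for dataset, limit in ((own, None), (public, max_public_images)):
        # Step 2: map this dataset's category ids onto shared ids.
        cat_map = {}
        for cat in dataset["categories"]:
            key = cat["name"].lower()
            if key not in cat_id_by_name:
                cat_id_by_name[key] = len(cat_id_by_name) + 1
                merged["categories"].append({"id": cat_id_by_name[key], "name": key})
            cat_map[cat["id"]] = cat_id_by_name[key]

        # Step 4: optionally subsample the larger dataset.
        images = list(dataset["images"])
        if limit is not None and len(images) > limit:
            images = rng.sample(images, limit)

        # Step 3: renumber images so ids from both datasets cannot clash.
        img_map = {}
        for img in images:
            img_map[img["id"]] = next_image_id
            merged["images"].append({**img, "id": next_image_id})
            next_image_id += 1

        # Keep only annotations whose image survived, with remapped ids.
        for ann in dataset["annotations"]:
            if ann["image_id"] not in img_map:
                continue
            merged["annotations"].append({
                **ann,
                "id": next_ann_id,
                "image_id": img_map[ann["image_id"]],
                "category_id": cat_map[ann["category_id"]],
            })
            next_ann_id += 1

    return merged
```

The result can be written back with `json.dump` and used as the annotation file for the combined image folder.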