Dataset License Issue

Hi, I was just made aware of this website as a repo for datasets. I came across a specific dataset: Pet Waste Detect Object Detection Dataset by Test which happens to be a dataset that I created, although I’m not the one who uploaded it here.

Now, I don’t mind someone uploading my dataset here; in fact I’m glad to see it. The entire reason I made it was to share, but my dataset license is CC BY 4.0, which requires attribution to my work. A link to the GitHub repository Erotemic/shitspotter (“An open source algorithm and dataset for finding poop in pictures”) would be nice.

I don’t see an option to report a license violation, nor do I see a way to contact the uploader, and I was wondering if there was something I could do so I could be properly attributed.

Hi @erotemic!
I’m sorry you ran into this issue. We see that you’ve forked the dataset under your own account, perfect!

To close the loop, the improperly attributed version has been taken down, so the dataset is now correctly attributed to your name and account.

Thanks again for bringing this to our attention!

Hi @Ford, thanks for the fast response.

It looks like Roboflow has quite a few poop datasets that are disjoint from mine. My survey of existing datasets completely missed these!

I’d like to fix that by running my dataset analysis scripts on these datasets. To start sorting out which ones look high quality, I’ve begun forking them into my workspace. However, when I do that, I don’t see any link back to the original dataset on my fork, which makes it hard to correctly attribute the original authors.

Am I missing a UI element, or is there a way to query for the original repo with an API call? I do see that forking a dataset appends a hash onto the end of the project-id, but I don’t see any way to figure out what the original workspace was.

As a follow-up, is there a way to contact or message users? I’d like to verify that the publishers of these datasets are the original authors in as many cases as I can. It would also be useful to ask them for additional dataset information.

Lastly, I’m noticing that many of these datasets are augmented when I download them, even though I selected a pipeline without any augmentations. Is that because that’s how the authors uploaded them? Or are downloadable artifacts always post-augmentation?

Hi @erotemic,
My pleasure! I will address your questions in order below:

  1. Unfortunately there is no built-in backlink to the original project after forking a dataset from Universe. My suggestion is to store the original Universe URL in the project description.
  2. There is currently no native functionality to message other users within the Roboflow app.
  3. By default, Universe displays the base datasets. However, if you’re seeing augmentations, these must have been applied by the authors.
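
For point 1, here is a minimal sketch of one way to keep that backlink on your side while forking. It only writes a local JSON manifest and does not call any Roboflow API. The names `ForkRecord`, `record_fork`, and `attribution_line`, the manifest file, and the example IDs and URL are all hypothetical placeholders, not part of any Roboflow tooling.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ForkRecord:
    """One forked project and where it came from (all fields entered by hand)."""
    fork_project_id: str      # project id of the fork in your own workspace
    original_url: str         # Universe URL of the dataset that was forked
    author: str = "unknown"   # publisher shown on the original page
    license: str = "unknown"  # license stated on the original page


def load_manifest(path):
    """Return the manifest as a dict, or an empty dict if the file is missing."""
    path = Path(path)
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def record_fork(path, record):
    """Add or overwrite the entry for one fork and save the manifest."""
    manifest = load_manifest(path)
    manifest[record.fork_project_id] = asdict(record)
    Path(path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


def attribution_line(record):
    """Build a line of text that could be pasted into the fork's description."""
    return (
        f"Forked from {record.original_url} "
        f"(author: {record.author}, license: {record.license})"
    )


if __name__ == "__main__":
    # Placeholder values; replace with the real fork id and original URL.
    rec = ForkRecord(
        fork_project_id="example-dataset-abc12",
        original_url="https://universe.roboflow.com/example-workspace/example-dataset",
        author="example-workspace",
        license="CC BY 4.0",
    )
    record_fork("forks_manifest.json", rec)
    print(attribution_line(rec))
```

The output of `attribution_line` can be pasted into the fork's project description, and the manifest gives you a single file to consult when crediting original authors later.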

Please let me know if I can assist any further!