How to upload a multimodal (vision-language) dataset

I have a dataset of invoices: images plus the data extracted from each one in JSON format, similar to OCR output. I created a folder containing a subfolder with all the JPG images and a file annotations.jsonl, where each line contains:

{
'image': "path/to/image.jpg",
'prefix': "",
'suffix': "extracted data"
}

I’m trying to upload this to Roboflow, but it’s not interpreting the annotations. Either only the images are kept and I have to annotate them manually, or it gets stuck on this screen.

Should I upload these files somewhere and replace the ‘image’ values with their URLs? Should I use another format?

Hello @Samuel_Lima_Braz, welcome to the forum, and thank you very much for contacting us.

I would like to ask two things so that we can validate this. Could you please provide the workspace and the project so that we can get more details?

But looking at your JSON format, it seems that the problem is there. You used single quotes instead of double quotes, which is invalid JSON. If that is the case, we can easily solve the problem.
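One quick way to check this locally is to run every line of the file through a JSON parser. The sketch below is only an illustration: the function name is made up, and the required keys are simply the three from your example, not an official Roboflow schema.

```python
import json

# Keys taken from the example record in the question (an assumption, not an official schema).
REQUIRED_KEYS = {"image", "prefix", "suffix"}

def find_invalid_lines(lines):
    """Return (line_number, reason) pairs for lines that are not valid annotation records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Single-quoted keys or values end up here.
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_KEYS - set(record)
        if missing:
            problems.append((number, f"missing keys: {sorted(missing)}"))
    return problems
```

Passing it an open file, for example find_invalid_lines(open("annotations.jsonl", encoding="utf-8")), returns an empty list when every line parses and carries all three keys.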

But if it still doesn’t work, please contact me again and I will be happy to help you.

Best regards,
Leandro Rosemberg

Now I was able to upload using the CLI. Actually, I was using double quotes, but I mistyped them in the question.

The format I used was:

{
  "image": "20210322_172436.jpg",
  "prefix": "<JSON>",
  "suffix": "{\"company\": \"intermarche\", \"date\": \"2019/06/20\", \"address\": \"armação de pera\", \"total\": \"24.99\", \"invoice_number\": \"0EAA061219/134\", \"buyer_nif\": \"514407395\", \"vat_value\": \"4.67\", \"seller_nif\": \"508162294\"}"
}
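For anyone building the same layout, here is a minimal sketch of generating one such record in Python. The helper name and the use of ensure_ascii=False are my own choices for illustration. The point is that the suffix is itself a JSON string, so the extracted fields are serialized once for the suffix and again as part of the record, which produces the escaped quotes shown above. In the .jsonl file each record sits on a single line.

```python
import json

def make_annotation_line(image_name, fields, prefix="<JSON>"):
    """Build one JSONL line: the extracted fields are serialized to a string
    stored under "suffix", then the whole record is serialized again."""
    record = {
        "image": image_name,
        "prefix": prefix,
        # ensure_ascii=False keeps accented characters readable in the file.
        "suffix": json.dumps(fields, ensure_ascii=False),
    }
    return json.dumps(record, ensure_ascii=False)

# Example with two of the fields from the record above.
line = make_annotation_line(
    "20210322_172436.jpg",
    {"company": "intermarche", "total": "24.99"},
)
```

Writing one such line per image to annotations.jsonl, opened with encoding="utf-8", gives the file used in the import.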

However, it didn’t work through the web interface, or at least there was no indication that any upload was happening. With 814 files, the CLI command roboflow import -w tech-ysdkk -p brazilian-documents data/invoices/ worked perfectly.
