Issues with Exporting Florence-2 Batch Workflow Annotations for Training an Object Detection Model

aarnavshah12 · March 10, 2026, 5:33am

Summary:
I am building a workflow that runs Florence-2 locally to automatically generate annotations for images. I used the batch processing feature to run the workflow on a dataset of images. I would like to export the resulting annotations together with the corresponding images so they can be uploaded back and used to train an object detection model for a new project.

The Workflow: I have successfully configured a Florence-2 model block within Roboflow Workflows to detect humans. Using the Batch Utility, I am able to generate high-volume inference results.

The Problem: I am unable to “bridge” these VLM results back into a Roboflow project as valid ground-truth annotations. My current outputs are either:

Visualized Images: The bounding boxes are “burnt-in” to the image pixels, making them uneditable and unusable for training.
JSON Metadata: I have the coordinate data, but I cannot find a way to re-upload this JSON alongside my raw images so that Roboflow recognizes them as actual labels.

The Goal: I need a way to ensure the “reasoning” from the VLM (Florence-2) persists as structured data that can be used to train an RF-DETR model. What should I do to turn these batch outputs into editable labels in the Annotate tab?

Workflow Link

Project Type: Object Detection

Operating System & Browser: Windows / Google Chrome

Do you grant Roboflow Support permission to access your Workspace for troubleshooting?: Yes

erik_roboflow · March 10, 2026, 2:29pm

Hi @aarnavshah12 ,

To use Florence-2 results as dataset annotations, you need to convert them into a dataset-compatible annotation format (see here) and then upload them to Roboflow (CLI docs here).

If you want this automated, the process would look like this:

1. Start the batch job (see the batch CLI docs) and wait for it to finish.
2. Download the batch results (the Florence-2 outputs) with the same CLI tool.
3. Convert the Florence-2 outputs to a dataset-compatible format.
4. Upload the images and annotations to Roboflow.

Thoughts?

aarnavshah12 · March 10, 2026, 2:49pm

Hi Erik, any idea how I can convert the Florence-2 outputs into a dataset-compatible format? That’s the main issue I’m facing.

erik_roboflow · March 10, 2026, 3:01pm

Sure, could you please share an output from Florence2?

aarnavshah12 · March 10, 2026, 3:24pm

Yeah sure. The image I’ve uploaded is one of them. Also, I’m getting CSV files with information like so:

{"output_3": "<deducted_image>", "output_1": "<deducted_image>", "output_2": {"image": "<deducted_image>"}, "output_4": {"image": "<deducted_image>"}, "output_5": false, "output_6": {"image": {"width": 500, "height": 413}, "predictions": [{"width": 228.0, "height": 356.0, "x": 219.0, "y": 234.0, "confidence": 1.0, "class_id": 0, "class": "humans", "detection_id": "44ee815b-a447-4866-ace3-00ca640b8610", "parent_id": "image"}]}, "output_7": "ecd8c9c7-df2b-4e22-8fc9-e088920123f6", "output_8": {"error_status": false, "predictions": {"image": {"width": 500, "height": 413}, "predictions": [{"width": 228.0, "height": 356.0, "x": 219.0, "y": 234.0, "confidence": 1.0, "class_id": 0, "class": "humans", "detection_id": "44ee815b-a447-4866-ace3-00ca640b8610", "parent_id": "image"}]}, "inference_id": "ecd8c9c7-df2b-4e22-8fc9-e088920123f6"}, "output_9": "{\"bboxes\": [[105, 56, 333, 412]], \"bboxes_labels\": [\"humans\"], \"polygons\": [], \"polygons_labels\": []}", "output_10": {"raw_output": "{\"bboxes\": [[105, 56, 333, 412]], \"bboxes_labels\": [\"humans\"], \"polygons\": [], \"polygons_labels\": []}", "parsed_output": {"bboxes": [[105, 56, 333, 412]], "bboxes_labels": ["humans"], "polygons": [], "polygons_labels": []}, "classes": ["humans"]}, "output_11": {"bboxes": [[105, 56, 333, 412]], "bboxes_labels": ["humans"], "polygons": [], "polygons_labels": []}, "output_12": ["humans"]}

erik_roboflow · March 10, 2026, 4:12pm

Hi @aarnavshah12, you can ask an LLM or coding agent to convert the output you sent into, e.g., COCO JSON dataset format. Two things to note:

Map bboxes_labels (“humans”, the Florence-2 output) correctly to the class ID in your dataset, e.g. “humans” → class_id=0.
Map each batch output to its input image, so that annotations are assigned to the correct image IDs.

Once you have the COCO JSON annotations and the images, you can use the CLI to upload the dataset to the Roboflow platform.
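
For illustration, here is a minimal Python sketch of that conversion, not an official Roboflow tool. It assumes each batch result has already been parsed into a dict shaped like the sample posted above, with the image size under output_6 and the Florence-2 boxes under output_11 as [x_min, y_min, x_max, y_max]. Those output_N names come from this particular workflow and will differ in others. The file name person_001.jpg and the CATEGORY_IDS table are hypothetical placeholders; how you recover the real file name for each result row depends on your batch export.

```python
import json

# Assumed label -> category id table; adjust to match the classes in your project.
CATEGORY_IDS = {"humans": 0}


def xyxy_to_coco(box):
    """Convert [x_min, y_min, x_max, y_max] to COCO [x_min, y_min, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def build_coco(records):
    """Build a COCO-style dict from (file_name, workflow_output) pairs."""
    coco = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": cid, "name": name} for name, cid in CATEGORY_IDS.items()
        ],
    }
    annotation_id = 1
    for image_id, (file_name, output) in enumerate(records, start=1):
        size = output["output_6"]["image"]  # key name specific to this workflow
        coco["images"].append({
            "id": image_id,
            "file_name": file_name,
            "width": size["width"],
            "height": size["height"],
        })
        parsed = output["output_11"]  # key name specific to this workflow
        for box, label in zip(parsed["bboxes"], parsed["bboxes_labels"]):
            if label not in CATEGORY_IDS:
                continue  # skip labels that have no class in the dataset
            bbox = xyxy_to_coco(box)
            coco["annotations"].append({
                "id": annotation_id,
                "image_id": image_id,
                "category_id": CATEGORY_IDS[label],
                "bbox": bbox,
                "area": bbox[2] * bbox[3],
                "iscrowd": 0,
            })
            annotation_id += 1
    return coco


# Trimmed copy of the sample output from this thread.
sample = {
    "output_6": {"image": {"width": 500, "height": 413}},
    "output_11": {
        "bboxes": [[105, 56, 333, 412]],
        "bboxes_labels": ["humans"],
        "polygons": [],
        "polygons_labels": [],
    },
}

# "person_001.jpg" is a hypothetical file name for the image behind this result.
coco = build_coco([("person_001.jpg", sample)])
print(json.dumps(coco, indent=2))  # bbox comes out as [105, 56, 228, 356]
```

The same box is also available in output_6 in center format (x=219, y=234, width=228, height=356). Subtracting half the width and height from the center gives the same top-left corner (105, 56), so either output can be used as the source.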

aarnavshah12 · March 10, 2026, 4:29pm

Mhm. Thanks! I understand!
