Hi! I’m working on automatic dataset annotation using Grounded SAM.
Now I want to upload the annotations to Roboflow, like in the notebook, to do some manual alterations.
My question is this: I built the dataset using the predict_with_caption method instead of predict_with_classes, since I want to use a more complex text prompt (e.g. "chair with a man") as opposed to a single class (e.g. "chair"), and as I understand it, a caption is the best way to do that.
I altered the cell “Extract labels from images” in the notebook as such:
for image_path in tqdm(image_paths):
    image_name = image_path.name
    image_path = str(image_path)
    image = cv2.imread(image_path)

    # detect objects
    detections, labels = grounding_dino_model.predict_with_caption(
        image=image,
        caption=caption,
        box_threshold=BOX_TRESHOLD,
        text_threshold=TEXT_TRESHOLD
    )
    detections.mask = segment(
        sam_predictor=sam_predictor,
        image=cv2.cvtColor(image, cv2.COLOR_BGR2RGB),
        xyxy=detections.xyxy
    )
    images[image_name] = image
    annotations[image_name] = detections
Uploading the annotations using the standard cells in the notebook raises errors about the class ID and similar fields. I assume this is because the caption-based detections are structured slightly differently: predict_with_caption returns the labels as a separate list of strings rather than setting detections.class_id. How could I adapt the following code to export my annotations in Pascal VOC format?
sv.Dataset(
    classes=CLASSES,
    images=images,
    annotations=annotations
).as_pascal_voc(
    annotations_directory_path=ANNOTATIONS_DIRECTORY,
    min_image_area_percentage=MIN_IMAGE_AREA_PERCENTAGE,
    max_image_area_percentage=MAX_IMAGE_AREA_PERCENTAGE,
    approximation_percentage=APPROXIMATION_PERCENTAGE
)
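For context, here is a minimal sketch of what I imagine the fix might look like: collect the label strings returned by predict_with_caption, derive a class list from them, and build the integer class_id arrays that sv.Dataset seems to expect. The data and names (labels_per_image, class_to_id) are my own toy stand-ins, not from the notebook:

```python
import numpy as np

# toy stand-in for the per-image label lists returned by predict_with_caption
labels_per_image = {
    "img1.jpg": ["chair with a man", "chair with a man"],
    "img2.jpg": ["chair with a man"],
}

# one class name per unique phrase Grounding DINO produced
CLASSES = sorted({label for labels in labels_per_image.values() for label in labels})
class_to_id = {name: i for i, name in enumerate(CLASSES)}

# per-image integer arrays that could be assigned to detections.class_id
class_ids = {
    name: np.array([class_to_id[label] for label in labels], dtype=int)
    for name, labels in labels_per_image.items()
}
```

Is something along these lines the right approach, or is there a supported way to do this?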
Thanks in advance!
- Project Type: Automatic Annotation for Object Detection
- Operating System & Browser: Windows 11/ Chrome
- Project Universe Link or Workspace/Project ID: Google Colab