Masks become crooked and noisy after passing through the supervision library

I am working with YOLOv8 and SAM, and I am trying to use the same annotations for both models.

We wanted to convert the annotations we created on your platform into masks to train SAM. However, when I run the code below, which uses the supervision library, my masks come out crooked and no longer match how they were annotated.

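## treat all labels as masks/polygons (vs how ultralytics converts some to bboxes)

# import supervision (open-source set of CV utilities)
import supervision as sv
# the roboflow client provides `rf`; the API key below is a placeholder
from roboflow import Roboflow
rf = Roboflow(api_key="YOUR_API_KEY")
# grab our data
project = rf.workspace("").project("")
dataset = project.version(3).download("yolov8")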

# for each image, load YOLO annotations and require mask format for each
for subset in ["train", "test", "valid"]:
    ds = sv.DetectionDataset.from_yolo(
        images_directory_path=f"{dataset.location}/{subset}/images",
        annotations_directory_path=f"{dataset.location}/{subset}/labels",
        data_yaml_path=f"{dataset.location}/data.yaml",
        force_masks=True
    )
    ds.as_yolo(annotations_directory_path=f"{dataset.location}/{subset}/labels")

After doing this, I took a look at the txt file before and after.
This is before:
0 0.24367703857421874 0.5947080708007813 0.32577574609375 0.5946408608398438 0.3256521215820313 0.584001533203125 0.38616494580078126 0.5838778208007812 0.38616494580078126 0.44245848974609375 0.24366789990234375 0.4424227666015625 0.24367703857421874 0.5947080708007813
0 0.4681246420898437 0.576236357421875 0.5384933212890625 0.5761529790039063 0.5382433774414063 0.4423616875 0.3912994501953125 0.4423616875 0.39114250390625 0.594635390625 0.4681246420898437 0.5945798715820313 0.4681246420898437 0.576236357421875
0 0.19015809912109374 0.532453453125 0.179838498046875 0.5323361000976562 0.17960396533203124 0.49474487890625 0.049611380859375 0.49474487890625 0.04761782568359375 0.4960357807617187 0.02815130517578125 0.49615313818359374 0.028172552734375 0.6419591596679688 0.19027537451171875 0.6423729399414062 0.19015809912109374 0.532453453125
0 0.20617587744140625 0.6494069682617187 0.11866647216796875 0.649729455078125 0.11880823681640625 0.6924327993164062 0.09655086279296875 0.6925746689453125 0.0968343916015625 0.65015507421875 0.0251399677734375 0.6496627353515625 0.024852654296875 0.8023033012695312 0.24352058984375 0.8023752939453125 0.24407509423828125 0.6602119057617187 0.20595053466796875 0.6604566748046875 0.20617587744140625 0.6494069682617187
0 0.11568 0.6509393481445312 0.100116763671875 0.6509393481445312 0.10020867333984375 0.6883252021484375 0.1157992568359375 0.6883736821289063 0.11568 0.6509393481445312
This is after:
0 0.24365 0.44238 0.24365 0.59424 0.32568 0.59424 0.32568 0.58936 0.32520 0.58887 0.32520 0.58447 0.32568 0.58398 0.35547 0.58398 0.35596 0.58350 0.38574 0.58350 0.38574 0.44238
0 0.39111 0.44189 0.39111 0.59424 0.46777 0.59424 0.46777 0.57666 0.46826 0.57617 0.50293 0.57617 0.50342 0.57568 0.53809 0.57568 0.53809 0.44189
0 0.04932 0.49463 0.04883 0.49512 0.04834 0.49512 0.04785 0.49561 0.03809 0.49561 0.03760 0.49609 0.02783 0.49609 0.02783 0.64160 0.10889 0.64160 0.10938 0.64209 0.18994 0.64209 0.18994 0.53223 0.18018 0.53223 0.17969 0.53174 0.17969 0.51367 0.17920 0.51318 0.17920 0.49463
0 0.16260 0.64893 0.16211 0.64941 0.11865 0.64941 0.11865 0.69189 0.11816 0.69238 0.09668 0.69238 0.09619 0.69189 0.09619 0.67139 0.09668 0.67090 0.09668 0.64990 0.06104 0.64990 0.06055 0.64941 0.02490 0.64941 0.02490 0.72559 0.02441 0.72607 0.02441 0.80225 0.24316 0.80225 0.24316 0.73145 0.24365 0.73096 0.24365 0.66016 0.20605 0.66016 0.20557 0.65967 0.20557 0.65479 0.20605 0.65430 0.20605 0.64893
0 0.10010 0.65088 0.10010 0.68799 0.11572 0.68799 0.11572 0.66992 0.11523 0.66943 0.11523 0.65088
0 0.24854 0.64941 0.24854 0.80176 0.39014 0.80176 0.39014 0.73145 0.39062 0.73096 0.39062 0.66016 0.33008 0.66016 0.32959 0.65967 0.32959 0.64941

You can see that the values change quite a lot. The real problem, however, appears when I convert these into masks.

The right side is the mask visualized before passing through the sv library; the left side is after. You can see that the mask on the left is crooked, not straight. I guess this is due to int rounding, but I am not sure.
I would like some clarity on why the annotation values in the txt file change so much after using supervision. I can generate the masks without going through sv, which covers the SAM use case. However, we are also going to feed these annotations to YOLOv8 (where I need sv because of an ultralytics mismatch problem), and the model could suffer from the 1-2 pixel error that is being introduced.

Please advise.
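
As a rough check on the size of the shift described above, the sketch below compares the fifth polygon before and after the round trip and reports, for each original vertex, the distance in pixels to the closest vertex in the rewritten polygon. The 2048 x 2048 image size is an assumption inferred from the roughly 0.00049 step between neighbouring "after" values (about 1/2048); it is not stated in the post, and the helper names are illustrative only.

```python
import math

# Assumption: the image is 2048 x 2048 px (inferred from the label values,
# not stated in the post). Change this if the real size differs.
IMAGE_SIZE = 2048

# Fifth polygon from the label file, before and after the round trip.
before = [0.11568, 0.6509393481445312, 0.100116763671875, 0.6509393481445312,
          0.10020867333984375, 0.6883252021484375, 0.1157992568359375,
          0.6883736821289063, 0.11568, 0.6509393481445312]
after = [0.10010, 0.65088, 0.10010, 0.68799, 0.11572, 0.68799,
         0.11572, 0.66992, 0.11523, 0.66943, 0.11523, 0.65088]


def to_pixels(flat, size=IMAGE_SIZE):
    """Turn a flat [x0, y0, x1, y1, ...] list into (x, y) pixel tuples."""
    return [(x * size, y * size) for x, y in zip(flat[0::2], flat[1::2])]


def nearest_vertex_shift(src, dst):
    """For each source vertex, distance in px to the closest destination vertex."""
    return [min(math.dist(p, q) for q in dst) for p in src]


shifts = nearest_vertex_shift(to_pixels(before), to_pixels(after))
max_shift = max(shifts)
print([round(s, 2) for s in shifts], "max:", round(max_shift, 2))
```

Under that size assumption, the largest shift for this polygon stays below about one pixel, which is consistent with the 1-2 pixel error mentioned above. It only measures vertex displacement and says nothing about the extra vertices that appear along the edges.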

The supervision library can be used to convert mask data into bounding box data and vice versa. It seems like you are experiencing some issues with the conversion process, specifically with the mask annotations appearing crooked after conversion.

The conversion process involves some mathematical operations that could potentially lead to rounding errors, especially when dealing with pixel-level data. This could explain the slight discrepancies you are observing in the mask annotations.

Here are two methods to convert mask data into bounding box data using the supervision library:

Method 1: Use the supervision.mask_to_xyxy utility

# Import Supervision
import supervision as sv

# Convert each mask in the array of masks to a bounding box
bounding_boxes = sv.mask_to_xyxy(masks)

Method 2: Import detections into supervision and export from the mask property

# Import Supervision
import supervision as sv

# Import mask data
detections = sv.Detections.from_roboflow(prediction, classes)

# Export from detections as bounding box data
bounding_boxes = detections.xyxy

(Source: Roboflow Blog)

If you are still experiencing issues, it might be beneficial to review the parameters you are using when saving the annotations with as_yolo. For instance, the approximation_percentage parameter controls how much the mask polygons are simplified. Adjusting this value can help balance the trade-off between the accuracy and the complexity of the mask shapes.

If the issue persists, it might be a good idea to reach out to the Roboflow team directly for more specific guidance.


Hi! I’m Roboflow’s AI powered bot. This is an automated response powered using AI with knowledge from our docs, blog, knowledge base, and previous forum responses. If you find this to be a helpful answer, please mark it as the solution using the checkbox icon below. If this doesn’t solve your issue, please follow up with what you’ve tried/why, and the community will continue to chime in as usual.

Thanks much for the report.

I suspect you’re correct that int rounding is the culprit. I also suspect the minor change in values from converting annotations has minimal impact on model learning, though that requires investigation.
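
One way to probe the rounding hypothesis is to check whether the rewritten coordinates sit on an integer pixel grid. The sketch below does this for the first polygon from the label file. It assumes a 2048 x 2048 image, which is only inferred from the spacing of the "after" values and not confirmed anywhere in the thread, and `max_offgrid` is a hypothetical helper written for this check.

```python
# Assumption: image side length in pixels, inferred from the data rather
# than confirmed by the poster.
IMAGE_SIZE = 2048

# First polygon from the label file, before and after the round trip.
before = [0.24367703857421874, 0.5947080708007813, 0.32577574609375,
          0.5946408608398438, 0.3256521215820313, 0.584001533203125,
          0.38616494580078126, 0.5838778208007812, 0.38616494580078126,
          0.44245848974609375, 0.24366789990234375, 0.4424227666015625,
          0.24367703857421874, 0.5947080708007813]
after = [0.24365, 0.44238, 0.24365, 0.59424, 0.32568, 0.59424,
         0.32568, 0.58936, 0.32520, 0.58887, 0.32520, 0.58447,
         0.32568, 0.58398, 0.35547, 0.58398, 0.35596, 0.58350,
         0.38574, 0.58350, 0.38574, 0.44238]


def max_offgrid(values, size=IMAGE_SIZE):
    """Largest distance (in px) of any coordinate from the nearest whole pixel."""
    return max(abs(v * size - round(v * size)) for v in values)


off_before = max_offgrid(before)
off_after = max_offgrid(after)
print("before:", round(off_before, 3), "px off-grid")
print("after: ", round(off_after, 3), "px off-grid")
```

If the size assumption holds, the "after" values land on whole pixels to within the 5-decimal print precision, while the "before" values carry sub-pixel offsets. That pattern is what one would expect if the polygons pass through an integer pixel mask on the way, though it does not by itself show where in the library the quantization happens.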

I cross-posted this as an open issue for supervision: Possible issue in precision converting annotations with "force_mask=True" · Issue #369 · roboflow/supervision · GitHub
