Convert 9k image dataset from YOLOv8 bounding boxes to instance segmentation

I have a dataset of 9k images with human and ball classes that I trained using YOLOv8. The model currently does object detection with bounding boxes, so when I run inference on an image I get boxes around the detections.

My project now requires me to run inference on images and produce segmentation results of the actual humans and the ball.

Rather than use the smart polygon tool to go through each image manually, is there any AI tooling I can use to automatically convert the dataset to polygons, instead of doing it by hand?

As an example, I need this kind of image to be uploaded:

and the results to be like this:

a transparent PNG image containing only the detections.

Currently I use OpenCV to remove the greens from the bounding boxes, but in multi-sport settings like basketball this doesn't work well, since the humans and ball may be on differently coloured floors.
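For context, the colour-keying approach described above can be sketched as follows (pure NumPy, with hypothetical thresholds; real code would typically use OpenCV's HSV `inRange` instead):

```python
import numpy as np

def remove_green(crop_rgb: np.ndarray, g_min: int = 120, dominance: int = 40) -> np.ndarray:
    """Chroma-key a bounding-box crop: pixels whose green channel is bright
    and clearly dominant over red/blue become fully transparent.
    Returns an RGBA image (uint8). Thresholds here are illustrative only."""
    r = crop_rgb[..., 0].astype(int)
    g = crop_rgb[..., 1].astype(int)
    b = crop_rgb[..., 2].astype(int)
    is_green = (g > g_min) & (g - r > dominance) & (g - b > dominance)
    alpha = np.where(is_green, 0, 255).astype(np.uint8)
    return np.dstack([crop_rgb, alpha])

# tiny 1x2 crop: one grass-green pixel, one orange (ball-coloured) pixel
crop = np.array([[[30, 180, 40], [230, 120, 30]]], dtype=np.uint8)
out = remove_green(crop)
# the green pixel is made transparent, the orange one stays opaque
```

This is exactly why the approach breaks on a basketball court: the floor isn't green, so no fixed colour threshold separates players from background.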

Hey Carl,

You could check out Roboflow's Autodistill Python library for this task. It uses large foundation models such as Detic or Grounded SAM, along with CLIP embeddings, to annotate images and videos with the classes you provide. The Roboflow team has used it to create large, high-quality datasets for specific domains.
They have a Google Colab notebook demonstrating it here: Google Colab
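As a rough sketch of how that looks with Autodistill's Grounded SAM base model (folder names and prompt strings below are placeholders; this downloads model weights and needs a GPU to run at reasonable speed):

```python
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM

# Map caption prompts (keys) to your dataset's class names (values).
# The prompts are free text fed to the grounding model.
ontology = CaptionOntology({
    "person": "human",
    "sports ball": "ball",
})

base_model = GroundedSAM(ontology=ontology)

# Auto-label every image in ./images and write an annotated segmentation
# dataset (images + polygon labels) to ./dataset, ready for training.
base_model.label(input_folder="./images", output_folder="./dataset")
```

You can then train a YOLOv8 segmentation model on the resulting dataset instead of re-annotating your 9k images by hand.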

Otherwise, I would recommend checking out Universe for player segmentation datasets. Many people have worked on similar projects before.

Hey all! We have a blog post with a Colab notebook that converts bounding boxes to polygons using SAM here: Google Colab

Full blog post on the topic here: How to Use the Segment Anything Model (SAM)
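The core idea is to feed each existing YOLO box to SAM as a box prompt and take the returned mask. A minimal sketch using the `segment-anything` package (checkpoint path, image path, and box coordinates below are placeholders; the checkpoint must be downloaded separately):

```python
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (variant and path are placeholders)
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One YOLO detection as an xyxy pixel box, used as the SAM prompt
box = np.array([100, 50, 220, 300])
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
mask = masks[0]  # boolean HxW mask for this detection

# Write a transparent PNG keeping only the masked pixels
rgba = np.dstack([image, (mask * 255).astype(np.uint8)])
cv2.imwrite("player.png", cv2.cvtColor(rgba, cv2.COLOR_RGBA2BGRA))
```

Looping this over all boxes in your dataset gives you per-instance masks (and hence polygons) without any manual clicking, which also answers the original transparent-PNG requirement regardless of floor colour.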
