Importing multiple Labelbox annotation formats to a deep learning model

Hi all,

I’m a beginner with Roboflow and deep learning, and I am a bit stuck on how to get started with my project. I would really appreciate any help or advice!

I’m unsure of the project type as yet. My operating system is MS Windows 10 Pro x64 and my browser is Opera.

I am trying to use a deep learning model to scan dental radiographs/X-rays, chart which teeth are present or absent on each radiograph, and also identify and highlight any areas of dental decay. I am using the “Tufts dental image database”, which consists of 1000 dental panoramic radiograph images. They are all a consistent size (1615 x 840 px) at a resolution of 72 dpi. There are four folders, each containing various versions of the images as well as a series of JSON files containing annotations. There is a “bounding box” JSON and a “teeth polygon” JSON: the bounding boxes outline the teeth, and the polygon information relates to dental decay on the teeth. These were labelled in Labelbox by dental experts.

I am trying to get started with Roboflow so that I can hopefully use the Detectron2 model, because the Roboflow page for that model states that it does object detection and semantic segmentation, which seem to be the two tasks I need: object detection (using the bounding box information to outline the teeth) and semantic segmentation (using the teeth polygon information to highlight areas of decay).

I would greatly appreciate any help regarding which Project Type I should create for this. The options that seem relevant to my task are “Object Detection (Bounding Box)” and “Semantic Segmentation”, but the problem is that I can only pick one or the other. Ideally I need my model to be able to do both of these things with unseen radiograph images once it is trained.

I would really, really appreciate any help or guidance on this! Even just a quick summary of the steps I need to take would be so helpful! My apologies if this is a long post - I have just tried to include as much information as I can! A quick summary of the Tufts dental database is below. Many thanks in advance to anybody who replies.

T

A quick summary of the Tufts Dental Database is:

  • Four folders, each containing images and/or annotation files:

      • The Expert folder contains “gaze maps”, which were created by the eye-tracking equipment used while the dentists were looking at the radiographs. It also contains the annotations made by the dentists, including the bounding boxes of the teeth and the polygons of the caries.

      • The Radiographs folder contains the panoramic dental radiographs.

      • The Student folder contains the annotations made by the student dentists.

      • The Segmentation folder contains segmentation masks of the teeth, as well as the “teeth polygon” and “bounding box” JSON files.

The dataset and a longer summary can be found here: http://tdd.ece.tufts.edu

Hello,

Based on the information you’ve provided and the extracts I have, here’s a general guide on how to get started with your project on Roboflow:

  1. Start your first project on Roboflow: You can start your first Object Detection project on Roboflow. This involves dataset upload, dataset generation, and model training with Roboflow Train. You’ll also gain insight into the Dataset Health Check, selecting preprocessing and augmentation steps, exporting in different annotation formats, and deploying your model to run inference.

  2. Upload your dataset: You can upload your images or videos into Roboflow (a small upload sketch follows this list). Data can currently be added to Object Detection, Single-Label Classification, Multi-Label Classification, Instance Segmentation, and Semantic Segmentation projects.

  3. Label your images: After uploading your dataset, you’ll need to label your images. You mentioned that your dataset already contains bounding box and polygon annotations, which is great. Roboflow supports labeling with bounding boxes, polygons, and Smart Polygons.

  4. Train your model: Once your images are labeled and you’ve generated a dataset version, you’re ready to train a model. You mentioned wanting to use the Detectron2 model, which is a great choice for object detection and segmentation tasks.
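
If you find it easier to script the upload rather than use the web UI, a minimal sketch using the Roboflow Python package might look like the following. The API key, project ID, and file paths are placeholders, and it assumes the annotation file is in a format Roboflow can parse (for example COCO JSON); treat it as an outline rather than a drop-in script.

# Minimal sketch: uploading an image together with its annotation file via the
# Roboflow Python package. The API key, project ID, and paths are placeholders.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace().project("dental-radiographs")  # placeholder project ID

# Upload one radiograph plus its annotation file; loop over the folder for the
# full dataset. The annotation file must be in a format Roboflow understands
# (e.g. COCO JSON).
project.upload(
    image_path="Radiographs/1.jpg",     # placeholder image path
    annotation_path="teeth_bbox.json",  # placeholder annotation path
)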

As for the project type, it seems like you’ll need to perform both Object Detection and Semantic Segmentation. However, as you’ve noted, you can only pick one project type. I would recommend starting with Object Detection to identify and outline the teeth, as this seems to be the primary task. Once you’ve trained a model for Object Detection, you could then consider training a separate model for Semantic Segmentation to highlight areas of decay.

Please note that this is a general guide and the exact steps may vary depending on your specific needs and the nature of your dataset. I hope this helps, and best of luck with your project!


Hi! I’m Roboflow’s AI powered bot. This is an automated response powered using AI with knowledge from our docs, blog, knowledge base, and previous forum responses. If you find this to be a helpful answer, please mark it as the solution using the checkbox icon below. If this doesn’t solve your issue, please follow up with what you’ve tried/why, and the community will continue to chime in as usual.

I have followed the instructions as best I can but I am hopelessly stuck and nothing appears to be working as it should.

On the page I was following, it says to upload the images and JSON file, then click “Generate New Version” of the dataset, but that button is not there on my screen. I upload the JSON file along with the images, and it just says the images are “unassigned” and that they need to be annotated. Why is it telling me this? I have uploaded the annotation information in the JSON file, so why would I need to annotate them again?

Any help would be very gratefully received.

T

Please could somebody help me…? I’d be willing to pay for help if I can get this working.

Anybody…???

Hi @TheHydra

Sorry to hear about the problems you’re experiencing. It looks like the annotations aren’t being imported into Roboflow properly from your Labelbox file.

Would you be comfortable sharing your LabelBox JSON file so that we could help find and solve your issues faster?

Thank you, stellasphere, I really appreciate your help! 🙂

If you would like to have a look at the actual files/images, the following links may be of use.

The image dataset was compiled and annotated by Tufts University, and can be downloaded either from my Google drive at:

https://drive.google.com/drive/folders/1sjFF_d32BHD5oqnK3IHPwUtraS763jmN?usp=sharing

The JSON files are in a zip file at:

Or, the original image set can also be found at http://tdd.ece.tufts.edu/

In case you are unable to look at the files, I have created the above screenshot to give you an idea of the layout of the files/folders. There are four folders, “Expert”, “Radiographs”, “Segmentation” and “Student”, each containing subfolders and JSON files. The dataset comes with no documentation or notes, but as far as I can tell the “Student” and “Expert” folders relate to eye-tracking data gathered as both the students and the experts were looking at and annotating the images. My assumption is that this eye-tracking data may not be relevant to me, only the bounding box/polygon information.

If you look at the images in the “Radiographs” folder in something like Photoshop, you can overlay them with the images from the other folders and they fit perfectly. This leads me to believe that they are all of an identical size, etc.

The annotation data in the “teeth_bbox” and “teeth_polygon” JSON files in the “Segmentation” folder should therefore correspond to the actual X-ray images in the “Radiographs” folder, since the image filenames are the same. What I did when logged into Roboflow was upload the “Radiographs” images alongside the “teeth_bbox” JSON file. Roboflow took in all the images, but it does not seem to know how to use the JSON information. I have checked the JSON files in several online JSON validators and they all said the files are valid.

Ideally I would like to use the annotation data that came with the dataset to train a model (Detectron2), both to chart the tooth positions (i.e. the bounding box information) and to identify areas of dental decay (presumably using the polygon information). So far, though, I have been unable to even get started because of this issue.
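
For reference, the eventual training step I am hoping to reach (adapted from the Detectron2 tutorials, and assuming I can first get the annotations into COCO format; the file and dataset names below are placeholders I made up) would look roughly like this:

# Rough outline of a Detectron2 training run, assuming the Tufts annotations
# have already been converted to COCO format. File names and the dataset name
# are placeholders.
import os

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer

# Register the converted annotations against the radiograph folder
# ("teeth_bbox_coco.json" is a hypothetical converted file).
register_coco_instances("tufts_teeth_train", {},
                        "teeth_bbox_coco.json", "Radiographs/")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("tufts_teeth_train",)
cfg.DATASETS.TEST = ()
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 3000
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 32  # one class per tooth position (assumption)

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()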

Many thanks for your time in helping me, I do appreciate it!

Best wishes,

T

Hi @TheHydra

I took a look at the dataset and at an example of the Labelbox annotation format, and it looks like there are some distinct differences in how they are structured.

Here’s a portion of the JSON for the Tufts dental dataset:

[
    {
        "Label": {
            "objects": [
                {
                    "title": "Periapical",
                    "value": "periapical",
                    "classifications": [
                        {
                            "featureId": "ck1s6uayb1f1p0811bh7lww38",
                            "schemaId": "cjy3co34rkid80738qor6ekch",
                            "title": "Level one",
                            "value": "level_one",
                            "answer": {
                                "featureId": "ck1s6ub181f250811dmqwa04z",
                                "schemaId": "cjy3co2xpj1ap07215qaxs8at",
                                "title": "Well Defined",
                                "value": "well_defined"
                            }
                        },

And here is an example of the Labelbox JSON format:

[{
    "ID": "a9b7c5d3e1f",
    "DataRow ID": "xy10z8a6b4c",
    "Labeled Data": "https://storage.labelbox.com/IMG_001.JPG",
    "Label": {
        "helmet": [{
            "geometry": [{
                "x": 690,
                "y": 1497
            }, {

Can you confirm whether the dataset labels are in a Labelbox format, or whether I’m looking at the wrong dataset?
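
In the meantime, if you want to check the structure on your end, a quick sketch like this (with a placeholder file name) will print the top-level fields so you can compare them against the Labelbox example above:

# Quick structural check of the annotation file (file name is a placeholder).
import json

with open("teeth_bbox.json") as f:
    records = json.load(f)

print(len(records), "records in the file")
print("Top-level keys of the first record:", list(records[0].keys()))
print("Keys under 'Label':", list(records[0]["Label"].keys()))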

Hi stellasphere,

Thank you for helping with this, I really appreciate it! There was a paper written by the developers of the Tufts dataset that states that the annotations were made using Labelbox. The paper can be found online if you’d like to take a look (Section III is the relevant part, and it contains a figure with some information about the JSON).

Might it be possible that they are Labelbox annotations but have somehow been customised?

Hi @TheHydra

It is possible. I see that they do mention it was made with Labelbox. I don’t have much experience importing Labelbox annotations, but when I tried importing the file back into Labelbox itself, it gave me an error. It might be worth playing around with the format of the dataset.
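
One approach worth trying is converting the annotations into a format Roboflow definitely understands, such as COCO JSON, and uploading that instead. The sketch below is only an outline, not a tested converter: it assumes each record carries an “External ID” field with the image filename, and that each object under “Label” → “objects” has a “bbox” with “top”, “left”, “height” and “width” keys (the usual Labelbox bounding-box layout). Since your file doesn’t seem to match the standard export exactly, you would need to check those field names against the actual JSON and adjust them.

# Sketch of a Labelbox-style-to-COCO conversion. The "External ID" and "bbox"
# field names follow the standard Labelbox export and are assumptions -- check
# them against the actual Tufts JSON and adjust as needed.
import json

with open("teeth_bbox.json") as f:              # placeholder input file
    records = json.load(f)

images, annotations, categories = [], [], {}
ann_id = 1

for img_id, record in enumerate(records, start=1):
    file_name = record.get("External ID", f"{img_id}.jpg")  # assumed field
    images.append({"id": img_id, "file_name": file_name,
                   "width": 1615, "height": 840})  # image size from the dataset

    for obj in record.get("Label", {}).get("objects", []):
        label = obj.get("title", "tooth")
        if label not in categories:
            categories[label] = len(categories) + 1
        bbox = obj.get("bbox")                   # assumed field: top/left/height/width
        if not bbox:
            continue
        annotations.append({
            "id": ann_id,
            "image_id": img_id,
            "category_id": categories[label],
            "bbox": [bbox["left"], bbox["top"], bbox["width"], bbox["height"]],
            "area": bbox["width"] * bbox["height"],
            "iscrowd": 0,
        })
        ann_id += 1

coco = {
    "images": images,
    "annotations": annotations,
    "categories": [{"id": cid, "name": name} for name, cid in categories.items()],
}

with open("teeth_bbox_coco.json", "w") as f:    # placeholder output file
    json.dump(coco, f)

If a converted file like that imports into Roboflow cleanly, the same general approach can then be repeated for the polygon annotations.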