Images/Annotations being added via API aren't immediately added to dataset

This category is for questions related to accounts on https://app.roboflow.com/

Please share the following so we may better assist you:

  1. Project type: Object detection
  2. OS: Win 10 Pro v10, Browser: Firefox v112.0.2

When I upload image/annotation pairs via the API, they’re added as ‘unassigned’ annotation jobs, instead of being added directly to the dataset as I’d expect them to be, since they’re already annotated.

This only happens intermittently; about half the time it seems to work perfectly. Here’s the code I’m using for this:

const path = require("path");
const fs = require("fs");
const axios = require("axios");
const FormData = require("form-data");

// Uploads one image to a Roboflow project and returns the API's JSON response.
const uploadImage = async (filepath, projectUrl, apiKey, extraOptions) => {
  const filename = path.basename(filepath);

  const formData = new FormData();
  formData.append("name", filename);
  formData.append("file", fs.createReadStream(filepath));
  formData.append("split", extraOptions.split);

  try {
    const response = await axios({
      method: "POST",
      url: `https://api.roboflow.com/dataset/${projectUrl}/upload?overwrite=true`,
      params: {
        // prefer the environment variable, fall back to the apiKey argument
        api_key: process.env.ROBOFLOW_API_KEY || apiKey,
      },
      data: formData,
      headers: formData.getHeaders(),
    });

    console.log(response.data);
    return response.data;
  } catch (e) {
    // on an HTTP error, return the error body instead of throwing
    if (e.response) {
      return e.response.data;
    }
    throw e;
  }
};

// Uploads an annotation file for an already-uploaded image; rejects on any error.
const uploadAnnotation = async (imageID, annotationFile, projectUrl, apiKey) => {
  // gets the actual file name, given a full path
  const filename = path.basename(annotationFile);

  // reading the contents of the XML file
  const annotationData = fs.readFileSync(annotationFile, "utf-8");

  try {
    const res = await axios({
      method: "POST",
      url: `https://api.roboflow.com/dataset/${projectUrl}/annotate/${imageID}?overwrite=true`,
      params: {
        api_key: process.env.ROBOFLOW_API_KEY || apiKey,
        name: filename,
      },
      data: annotationData,
      headers: {
        "Content-Type": "text/plain",
      },
    });
    return res.data;
  } catch (e) {
    console.log("Error uploading annotation: ", e);
    throw e;
  }
};

const uploadWithAnnotation = async (
  fileName,
  annotationFilename,
  projectUrl,
  apiKey,
  extraOptions
) => {
  try {
    // uploadAnnotation requires the image ID from uploadImage, so wait for that to complete first
    const uploadResult = await uploadImage(fileName, projectUrl, apiKey, extraOptions);
    const imageId = uploadResult.id;

    // substitute the image's base name into the annotation path template, if present
    const annotationPath = annotationFilename.replace(
      "[filename]",
      path.parse(fileName).name
    );

    // annotationResult stays undefined when no annotation file exists
    let annotationResult;
    if (fs.existsSync(annotationPath)) {
      annotationResult = await uploadAnnotation(
        imageId,
        annotationPath,
        projectUrl,
        apiKey
      );
    }

    // resolves if both uploadImage and uploadAnnotation succeed, rejects if either fails
    return { uploadResult, annotationResult };
  } catch (error) {
    console.error(`  Error uploading ${fileName}: `, error);
    throw error;
  }
};

Images and annotations are saved locally first, then read back into these functions. Each image is uploaded first; once that request completes, its annotation is uploaded using the image ID returned by the image upload.
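One thing that may be worth ruling out: `uploadImage` above returns the error body (rather than throwing) when the image upload fails with an HTTP error, so the annotation step could end up being called without a usable ID. A minimal sketch of a guard follows; `requireImageId` is a hypothetical helper name, and it assumes a successful upload response carries a non-empty `id` field, as the code above already relies on.

```javascript
// Hypothetical guard: fail fast if the image upload response has no usable id.
// Assumption: a successful upload response contains a non-empty `id` field.
const requireImageId = (uploadResult) => {
  if (!uploadResult || !uploadResult.id) {
    throw new Error(
      `Image upload did not return an id: ${JSON.stringify(uploadResult)}`
    );
  }
  return uploadResult.id;
};
```

It could be used as `const imageId = requireImageId(uploadResult);` before calling `uploadAnnotation`, so a failed image upload surfaces as its own error instead of as a confusing failure on the annotate request.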

Any idea what’s going wrong here? Or, if not, might it be possible instead to manually assign/approve all these images using the API, to add them to the dataset immediately?

Thank you,
Jake

I’m trying to replicate this so I can show a screenshot of it happening, but it’s just not happening every time. I tried it again just now with the same images and annotations, and they were added straight to the dataset as expected. Hard to debug :woman_shrugging:

Think I got it. If I remove ?overwrite=true from the image upload, but leave it on the annotation upload, it seems okay. Not sure why that would be?

Actually, it looks like the annotation upload request will sometimes 404, despite having an accurate project name and image ID. It seems it can’t find the image I’m trying to annotate, despite it having been uploaded.

E.g. in this case it 404s, even though that image has been uploaded:

The uploaded image is weirdly sitting in ‘uploaded via api’ (not annotated via api) and has already been annotated?

image

Maybe my annotation code is running twice

Bump for support if anyone has a suggestion. Still haven’t solved this. Same odd issue. The first line of this screenshot shows that the image was uploaded successfully, along with its ID. I then try to annotate that image, and the API says it doesn’t exist.

image

And yet if I go to the project and have a look, I’ve got two annotated images (one of them being the one for which the error was thrown) - but they’re just in ‘uploaded via API’ instead of ‘annotated via API’

image

Another example. I’ve got this paused in the debugger just after it tried to upload an annotation. 404. And yet here’s the same image already annotated?.. :thinking:

Could this just be a caching problem? I’ve been testing this over and over with the same images, but different projects. I just tried it again with completely new images, and it’s fine

image

So I assume this was just a caching issue: after trying it again with some completely new data, it seems fine. However, I am still having an issue with models trained via the API performing terribly compared to models trained from the UI. I mentioned this a week or so ago and would still like some advice on it if possible.

For example, the attached model was trained on over 200 images and performs very badly, while another model with a similar dataset size performs very well. Seems weird to me.

Here’s something I think relates to the problem of the model performing very poorly. I’ve uploaded and annotated these 217 images using the API:

But when I try to generate a new version, it tells me there are 0 classes and 217 unannotated images?

But they have all been annotated:

:thinking:

And yet, in the health check:

image

As it should be, 217 annotations

The workaround for this was just to include null values; I had been filtering them out in preprocessing.

Problem is back, despite being randomly fine 5 days ago. Nothing changed in my code since then.

The image with ID k8WVKlyr9pM82svVqNqf uploaded successfully, but when I then try to annotate that image through the API, the response says the image doesn’t exist.

For the model performance issue, RE: the UI vs. API training, are you saying it was resolved with the addition of Null Images?

Hmm, are there any other errors being logged?

And this is the same code that was previously being used (linked above in the thread, and merged to our API Snippets repo)?

  • If so, I’ll use that code to attempt to reproduce the issue, the only added thing I’d need from you is a shared Google Drive folder that contains the 217 images, and the label files for them

Hey Mohamed. Yeah, that’s right - including the null values did the trick, whereas previously I was filtering out 100% of null values in preprocessing

Yeah it’s the same code which was merged to the API snippets repo, at least to an extent. I don’t think that’s where the issue lies, but could be wrong.

The /uploadWithAnnotation endpoint is where the image upload and annotation upload functions are called, which is here:

And the image upload/annotation upload functions themselves are here:

Finally, here are the files and annotations for those 217 images:

For the record, I tried this again today with 3 images and got a similar result to yesterday. Only one worked, and the others are unannotated:

image

Sorry for late response also. Busy day, final presentation :slight_smile:

I tried this as a test and it seems to work fine, though the image upload runs twice. Not sure why that is.

const path = require("path");
const FormData = require("form-data");
const axios = require("axios");
const fs = require("fs");

const uploadImage = async (filepath, projectUrl, apiKey, extraOptions) => {
    const filename = path.basename(filepath);

    const formData = new FormData();
    formData.append("name", filename);
    formData.append("file", fs.createReadStream(filepath));
    formData.append("split", extraOptions.split);

    try {
        const response = await axios({
            method: "POST",
            url: `https://api.roboflow.com/dataset/${projectUrl}/upload`,
            params: {
                api_key: process.env.ROBOFLOW_API_KEY || apiKey,
            },
            data: formData,
            headers: formData.getHeaders(),
        });

        console.log(response.data);
        return response.data;
    } catch (e) {
        if (e.response) {
            return e.response.data;
        }
        throw e;
    }
};

const uploadAnnotation = async (imageID, annotationFile, projectUrl, api_key) => {
    const filename = path.basename(annotationFile);
    const annotationData = fs.readFileSync(annotationFile, "utf-8");

    try {
        const response = await axios({
            method: "POST",
            url: `https://api.roboflow.com/dataset/${projectUrl}/annotate/${imageID}?overwrite=true`,
            params: {
                api_key: process.env.ROBOFLOW_API_KEY || api_key,
                name: filename,
            },
            data: annotationData,
            headers: {
                "Content-Type": "text/plain",
            },
        });

        return response.data;
    } catch (e) {
        if (e.response) {
            return e.response.data;
        }
        // rethrow network-level errors instead of silently returning undefined
        throw e;
    }
};

uploadImage(
    'C:/Users/Jake/WebstormProjects/y4-project-jakewarrenblack/raid-middleman-server/uploads/rosie.JPEG',
    'insomnia-test',
    '**************',
    {split: 'train'}
).then(r => {
    console.log(r)

    uploadAnnotation(
        r.id,
        'C:/Users/Jake/WebstormProjects/y4-project-jakewarrenblack/raid-middleman-server/uploads/test.xml',
        'insomnia-test',
        '**************'
    ).then(r => console.log(r))
})

Might be on to the source of this. I notice it works just fine if I use a project created a while ago (maybe an hour ago), but fails if I create a project and then immediately start uploading/annotating. I wait for a 200 status from the request to create the project before allowing the user to start uploading images. I’m wondering whether that endpoint is similar to the dataset version generation one, where 200 just means ‘started’, but not necessarily finished?

I get around that in the case of version generation by calling it recursively until the response has a dataset version, but when creating a project, it’s just 200 OK and nothing else. I’ll try setting a timeout for a while after creating the project and see what happens.
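As an alternative to a fixed timeout, here is a rough sketch of the "wait and retry" idea. It assumes (unconfirmed) that the 404 on the annotate request really does mean a newly created project isn't ready yet. `retryOn404` and `sleep` are hypothetical helpers, not part of any Roboflow SDK, and the attempt count and delay are arbitrary. The only library detail it relies on is that axios attaches the HTTP response to the error object as `e.response`.

```javascript
// Hypothetical helper: pause for the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retry an async call while it fails with HTTP 404.
// Assumption: a 404 here means "not ready yet", which is unconfirmed.
const retryOn404 = async (fn, { attempts = 5, delayMs = 2000 } = {}) => {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (e) {
      // axios sets e.response when the server replied with an error status
      const status = e.response && e.response.status;
      if (status !== 404) throw e; // anything other than a 404 is not retried
      lastError = e;
      // wait a little longer after each failed attempt (linear backoff)
      if (attempt < attempts) await sleep(delayMs * attempt);
    }
  }
  // every attempt returned a 404, so surface the last error
  throw lastError;
};
```

Usage would look something like `await retryOn404(() => uploadAnnotation(imageId, annotationFilename, projectUrl))`. Note this only works with a version of `uploadAnnotation` that rejects on HTTP errors; the test version that returns `e.response.data` from its catch block would never trigger a retry.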