How to call Python model.predict with images from OpenCV

Hello everyone, I am building an object detection project that works great on app.roboflow.com, and now I am trying to move it out to some clients. I am developing in Python on macOS and will be deploying to a Raspberry Pi.

The problem is this - how can I read image frames from a video and get a prediction like the JSON below from the model? There are two common examples on the 'net; I've tried both and can't make either work. It seems to be something with the format of the image data I am passing in.

I am opening the video file using

import cv2 as cv

f = cv.VideoCapture('myfile.mp4')
while(f.isOpened()):
    ret, frame = f.read()
    rf = Roboflow(api_key="my_api_key")
    project = rf.workspace().project("my-project-name")
    model = project.version(3).model

    # Encode image to base64 string
    retval, buffer = cv.imencode('.jpg', frame)
    project.upload(buffer)
    model.predict()

But each time I am getting

Exception has occurred: AttributeError
'numpy.ndarray' object has no attribute 'startswith'

The problem seems to be that project.upload() requires an image path, but I have an image buffer. Any ideas? My Google-fu has let me down.

The output I need is something like this - just the class and confidence.

{
  "predictions": [
    {
      "x": 469,
      "y": 214,
      "width": 182,
      "height": 304,
      "class": "my-class",
      "confidence": 0.748
    }
  ]
}

Thank you in advance for any help.

With the Python package, you need to temporarily save the frame as a file for now, and then pass that file path to model.predict(). project.upload() is for uploading images to a project, not for running inference/making predictions.

Here is an example of running inference with the Python package:

from roboflow import Roboflow
import cv2 as cv


rf = Roboflow(api_key="my_api_key")
project = rf.workspace().project("my-project-name")
model = project.version(3).model

f = cv.VideoCapture('my_file.mp4')

while f.isOpened():
    # f.read() returns a tuple: a success flag and the frame itself
    ret, frame = f.read()
    if not ret:
        break

    # save the frame as a "temporary" jpeg file
    cv.imwrite('temp.jpg', frame)
    # run inference on the "temporary" jpeg file (the frame)
    predictions = model.predict('temp.jpg')
    predictions_json = predictions.json()
    # print all detection results for the frame
    print(predictions_json)

    # access each predicted box on the frame
    for bounding_box in predictions:
        # x0 = bounding_box['x'] - bounding_box['width'] / 2   # start column
        # x1 = bounding_box['x'] + bounding_box['width'] / 2   # end column
        # y0 = bounding_box['y'] - bounding_box['height'] / 2  # start row
        # y1 = bounding_box['y'] + bounding_box['height'] / 2  # end row
        class_name = bounding_box['class']
        confidence_score = bounding_box['confidence']

        detection_results = bounding_box  # the full detection dict
        class_and_confidence = (class_name, confidence_score)
        print(class_and_confidence, '\n')

    # press 'q' to quit early
    if cv.waitKey(1) == ord('q'):
        break

f.release()
cv.destroyAllWindows()

  • detection_results holds the full dict for each detected bounding box; class_and_confidence gives you the class and confidence for each detected bounding box in the image, printed successively as the for loop runs

Here is a full example snippet for webcam object detection (with the Hosted API): roboflow-computer-vision-utilities/webcam-od.py at main · roboflow-ai/roboflow-computer-vision-utilities · GitHub, along with the config file schema.

I also saw you said you were moving to Raspberry Pi - here’s how to accomplish this: Launch: Deploy to Raspberry Pi

For a general guide on local deployment: Launch: Test Computer Vision Models Locally