Hello everyone, I am building an object detection project that works great at app.roboflow.com, and now I am trying to move it to some clients. I am using Python on macOS, moving to a Raspberry Pi.
The problem is this: how can I read image frames from a video and get a prediction like the JSON below from the model? There are two common examples on the 'net; I've tried both and can't make either work. It seems to be something to do with the format of the image data I am using.
I am opening the video file like this:
import cv2 as cv

f = cv.VideoCapture('myfile.mp4')
while f.isOpened():
    ret, frame = f.read()

    rf = Roboflow(api_key="my_api_key")
    project = rf.workspace().project("my-project-name")
    model = project.version(3).model

    # Encode image to base64 string
    retval, buffer = cv.imencode('.jpg', frame)
    project.upload(buffer)
    model.predict()
But each time I am getting:
Exception has occurred: AttributeError
'numpy.ndarray' object has no attribute 'startswith'
The problem seems to be that project.upload() requires an image path, but I have an in-memory image buffer. Any ideas? My Google-fu has let me down.
The output I need is something like this - just the class and confidence.
{
    "predictions": [
        {
            "x": 469,
            "y": 214,
            "width": 182,
            "height": 304,
            "class": "my-class",
            "confidence": 0.748
        }
    ]
}
Thank you in advance for any help.
With the Python package, you currently need to save the frame to a temporary file, then pass that file path to model.predict(). project.upload() is for uploading images to a project (e.g. for annotation and training), not for running inference/making predictions.
Example inference code with the Python package:
from roboflow import Roboflow
import cv2 as cv

rf = Roboflow(api_key="my_api_key")
project = rf.workspace().project("my-project-name")
model = project.version(3).model

f = cv.VideoCapture('my_file.mp4')
while f.isOpened():
    # f.read() returns a tuple: the first element is a bool
    # (False when no frame could be read), the second is the frame
    ret, frame = f.read()
    if ret:
        # save the frame as a "temporary" jpeg file
        cv.imwrite('temp.jpg', frame)
        # run inference on the "temporary" jpeg file (the frame)
        predictions = model.predict('temp.jpg')
        predictions_json = predictions.json()
        # print all detection results for the frame
        print(predictions_json)
        # access individual predicted boxes in the frame
        for bounding_box in predictions:
            # x0 = bounding_box['x'] - bounding_box['width'] / 2   # start column
            # x1 = bounding_box['x'] + bounding_box['width'] / 2   # end column
            # y0 = bounding_box['y'] - bounding_box['height'] / 2  # start row
            # y1 = bounding_box['y'] + bounding_box['height'] / 2  # end row
            class_name = bounding_box['class']
            confidence_score = bounding_box['confidence']
            detection_results = bounding_box
            class_and_confidence = (class_name, confidence_score)
            print(class_and_confidence, '\n')
        # press 'q' to stop early
        if cv.waitKey(1) == ord('q'):
            break
    else:
        # no more frames to read
        break

f.release()
cv.destroyAllWindows()
detection_results holds the full bounding box dict for the current detection, and class_and_confidence gives you the class and confidence for each detected bounding box in the image, successively (as the for loop runs).
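Since the original question only needs the class and confidence, you can also reduce the predictions_json dict directly. A minimal sketch, assuming a predictions dict in the format shown in the question (the sample values below are copied from that JSON for illustration):

```python
# A Roboflow-style predictions dict, as returned by predictions.json()
predictions_json = {
    "predictions": [
        {"x": 469, "y": 214, "width": 182, "height": 304,
         "class": "my-class", "confidence": 0.748}
    ]
}

# Reduce each prediction to just (class, confidence)
class_and_confidence = [
    (p["class"], p["confidence"]) for p in predictions_json["predictions"]
]
print(class_and_confidence)  # [('my-class', 0.748)]

# If you later need pixel corners, x/y are the box center, so:
box = predictions_json["predictions"][0]
x0 = box["x"] - box["width"] / 2   # left column
y0 = box["y"] - box["height"] / 2  # top row
x1 = box["x"] + box["width"] / 2   # right column
y1 = box["y"] + box["height"] / 2  # bottom row
print((x0, y0, x1, y1))  # (378.0, 62.0, 560.0, 366.0)
```

The same list comprehension works on the dict you get back from predictions.json() in the loop above.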
Here is a full example snippet for webcam object detection (with the Hosted API): roboflow-computer-vision-utilities/webcam-od.py at main · roboflow-ai/roboflow-computer-vision-utilities · GitHub, along with the config file schema.
I also saw you said you were moving to a Raspberry Pi - here's how to accomplish that: Launch: Deploy to Raspberry Pi
And a general guide on local deployment: Launch: Test Computer Vision Models Locally