- Project Type: Object Detection
- Operating System & Browser: JetPack 6.2 Edge Deployment
- Project Universe Link or Workspace/Project ID: simplot -workflows - yolov8
- Do you grant Roboflow Support permission to access your Workspace for troubleshooting? (Yes/No): Yes
I did a fresh installation of the complete system, including flashing the OS and JetPack 6.2, on my NVIDIA Orin AGX. I want to run inference on my edge device in offline mode. For that I followed the instructions in Install on Jetson - Roboflow Inference.
Upon running `inference server start`, it says:
```
GPU detected. Using a GPU image.
Pulling image: roboflow/roboflow-inference-server-gpu:latest
404 Client Error for http+docker://localhost/v1.53/images/create?tag=latest&fromImage=roboflow%2Froboflow-inference-server-gpu: Not Found ("no matching manifest for linux/arm64/v8 in the manifest list entries: no match for platform in manifest: not found")
```
I then continued my container setup by manually starting the container for JetPack 6.2, using:
```shell
sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    roboflow/roboflow-inference-server-jetson-6.2.0:latest
```
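Once the container is up, two quick checks can help confirm it is serving locally and see how its runtime initialized (the container name and port are taken from the `docker run` command above; these are plain Docker/curl commands, not Roboflow-specific):

```shell
# Confirm the server answers on the local port:
curl http://localhost:9001

# Inspect the startup logs to see how the model runtime initialized,
# e.g. whether a CUDA/TensorRT execution provider was loaded or it
# fell back to CPU:
sudo docker logs inference-server
```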
To test the installation, I forked the YOLOv8 object detection example: Deploy YOLOv8 Object Detection Models to the NVIDIA Jetson. When I run inference with the internet connected, it runs pretty fast (even though `api_url=localhost:9001`). So my first question: isn't it supposed to run locally?
When I disconnect the internet, it takes 10 to 12 seconds to return a result, and I do not see any GPU usage in either jtop or nvidia-smi. Why is it not using the GPU, and why does it take so long to produce results?
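One way I could narrow down the local-vs-remote question is to time the call itself and compare connected vs. offline runs. A minimal sketch with a stand-in workload (the real measurement would wrap `client.run_workflow(...)` from the script below, which needs the running server):

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds)."""
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    return out, time.perf_counter() - t0

# Stand-in workload for illustration; on the device, replace with e.g.:
#   timed(client.run_workflow, workspace_name="simplot", workflow_id="yolov8",
#         images={"image": img_path}, use_cache=False)
result, dt = timed(sum, range(1_000_000))
print(f"call took {dt:.4f} s")
```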
My code is as follows:
```python
import cv2
import os
import json
from dotenv import load_dotenv
from inference_sdk import InferenceHTTPClient

# Load env vars
load_dotenv("../.env")


def draw_boxes(image, predictions):
    for pred in predictions:
        x = int(pred["x"])
        y = int(pred["y"])
        w = int(pred["width"])
        h = int(pred["height"])
        label = pred["class"]
        conf = pred["confidence"]

        # Convert center → top-left
        x1 = int(x - w / 2)
        y1 = int(y - h / 2)
        x2 = int(x + w / 2)
        y2 = int(y + h / 2)

        # Draw box
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

        # Label text
        text = f"{label} {conf:.2f}"
        cv2.putText(
            image,
            text,
            (x1, y1 - 10),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.6,
            (0, 255, 0),
            2,
        )
    return image


# Connect to inference server
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key=os.getenv("ROBOFLOW_API_KEY"),
)

# Open webcam (0 = default camera)
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open camera")

print("Press 'c' to capture & run inference")
print("Press 'q' to quit")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Show live feed
    cv2.imshow("Camera", frame)
    key = cv2.waitKey(1) & 0xFF

    # Capture on 'c'
    if key == ord("c"):
        img_path = "../data/captured_frame.jpg"
        cv2.imwrite(img_path, frame)

        result = client.run_workflow(
            workspace_name="simplot",
            workflow_id="yolov8",
            images={"image": img_path},
            use_cache=False,
        )

        # Extract predictions (this path is typical for workflows)
        predictions = result[0]["model_predictions"]["predictions"]["predictions"]

        # Draw boxes on a copy
        output_frame = frame.copy()
        output_frame = draw_boxes(output_frame, predictions)

        # Show result window
        cv2.imshow("Detections", output_frame)

        # Optional: save result image
        cv2.imwrite("../output/detection_result.jpg", output_frame)
        # print(json.dumps(result, indent=2))
        print("Detections shown")

    # Quit on 'q'
    elif key == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```
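Since the double `["predictions"]["predictions"]` lookup in my script is fragile, here is a small defensive extractor I sketched against a mocked response shaped like the one my workflow returns (the exact key names depend on how the workflow's outputs are configured in Roboflow; `model_predictions` is an assumption carried over from the script):

```python
def extract_predictions(result):
    """Walk result[0]['model_predictions'] and return the list of box dicts,
    tolerating one or two levels of 'predictions' nesting."""
    node = result[0]["model_predictions"]
    while isinstance(node, dict) and "predictions" in node:
        node = node["predictions"]
    return node if isinstance(node, list) else []


# Mocked single-image response, for illustration only:
mock_result = [{
    "model_predictions": {
        "predictions": {
            "predictions": [
                {"x": 50, "y": 40, "width": 20, "height": 10,
                 "class": "object", "confidence": 0.91},
            ]
        }
    }
}]

boxes = extract_predictions(mock_result)
print(boxes[0]["class"], boxes[0]["confidence"])
```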