My team and I are building an object tracking project using the Jetson Nano and a global shutter, monochrome, 120fps 720p camera (running at 640x480). We only need the center point of each detected object; segmentation masks and bounding boxes are not necessary.
We are exploring the different YOLO architectures and wanted to get a recommendation on which one to use. We are building a custom, single-class model from scratch, trained on a dataset captured with the Nano’s camera described above.
We want to make sure we’re selecting the right YOLO architecture to get the highest inference FPS for camera tracking, given the hardware stack we’re using.
Anyone have any recommendations on the best place to start for a small team new to ML?
We have a blog post coming soon on the right model selection, but here’s a little of my personal experience.
For the most part, accuracy is a tradeoff with speed, and newer YOLO versions are generally better. Here’s a chart with a benchmark on the Microsoft COCO dataset:
I imagine what you’d want is a good mix of high FPS and high accuracy. I’d try starting with YOLOv8n.
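Since only the center point is needed, one option is to keep a standard detection model and reduce each predicted box to its midpoint after inference. Below is a minimal sketch of that step. It assumes detections arrive as `(x1, y1, x2, y2, confidence)` tuples in pixel coordinates. The names `box_center`, `best_center`, and `frame_budget_ms` are hypothetical helpers for illustration, not part of any YOLO library. The last helper just shows how much time each frame allows at a given camera rate.

```python
from typing import Iterable, Optional, Tuple

# Assumed detection layout: (x1, y1, x2, y2, confidence), in pixels.
Detection = Tuple[float, float, float, float, float]


def box_center(x1: float, y1: float, x2: float, y2: float) -> Tuple[float, float]:
    """Return the midpoint of an axis-aligned box given its corner coordinates."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def best_center(
    detections: Iterable[Detection], min_conf: float = 0.25
) -> Optional[Tuple[float, float]]:
    """Return the center of the highest-confidence detection, or None.

    Detections below min_conf are ignored. With a single-class model and one
    tracked object, the top-scoring box is a reasonable default choice.
    """
    best: Optional[Detection] = None
    for det in detections:
        if det[4] >= min_conf and (best is None or det[4] > best[4]):
            best = det
    if best is None:
        return None
    return box_center(*best[:4])


def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, to keep up with the camera."""
    return 1000.0 / fps


if __name__ == "__main__":
    frame_detections = [
        (100.0, 200.0, 140.0, 260.0, 0.91),
        (300.0, 50.0, 320.0, 90.0, 0.40),
    ]
    print(best_center(frame_detections))
    print(round(frame_budget_ms(120), 2), "ms per frame at 120 fps")
```

At 120 fps the whole pipeline (capture, preprocessing, inference, post-processing) has roughly 8.3 ms per frame, which is a useful number to compare against measured inference latency on the Nano.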
You can train a custom YOLOv8n model by using our YOLOv8 notebook and changing the line for custom training to: