Object Detection and Classification Over Time

I’m evaluating Roboflow and had a few questions. I have video of hockey pucks being shot against a wall using a static camera position and I ultimately want to:

  1. Identify when a puck makes contact with the wall (I’m assuming I’ll have to manually annotate when the contacts occur in the training set)
  2. Identify the puck coordinates over time

Does Roboflow support this? I ask as it seems a blend of object detection and classification as opposed to one or the other for the Project Type.

As a (former) hockey player, I love where this app is headed.

This question is less “is this the right problem for Roboflow” and more “how do I structure the computer vision problem such that I can use the right methods to solve it.” It sounds tractable with object detection and a bit of post-processing logic, depending on what your goals are.

Is your objective to count when a puck hits the wall? Identify that specific moment? Measure the speed of the puck?

To your specific questions:

  1. Identify when a puck makes contact with the wall (I’m assuming I’ll have to manually annotate when the contacts occur in the training set)

This will likely depend on the camera position. For example, is your camera in a fixed position ‘horizontal’ to the flight of the puck, so that it sees the puck travel side-on toward the wall? If so, you could identify the frame in which the puck (labeled as object detection) reaches a fixed point (the wall) in your images/video.
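
As a rough sketch of that idea in Python: assume you already have one puck center x-coordinate per frame from your detection model (with None for frames where nothing was detected), and that you have picked the wall's x position in pixels by hand. The function name, the input format, and the assumption that the puck travels toward increasing x are all mine for illustration, not anything Roboflow-specific.

```python
def first_wall_crossing(centers_x, wall_x):
    """Return the index of the first frame whose puck center reaches wall_x.

    centers_x: per-frame x-coordinate of the puck center in pixels,
               or None for frames with no detection (assumed input format).
    wall_x:    x-coordinate of the wall in pixels (chosen by hand).
    Assumes the puck moves toward increasing x. Returns None if it never
    reaches the wall.
    """
    for frame_index, x in enumerate(centers_x):
        if x is not None and x >= wall_x:
            return frame_index
    return None


# Hypothetical track: the puck is missed in frame 2 and reaches the wall in frame 3.
contact_frame = first_wall_crossing([100, 200, None, 410, 390], wall_x=400)
```

Dividing the returned frame index by the video's frame rate gives the contact time in seconds.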

If you don’t have a ‘horizontal’ view, you could infer that the puck (labeled as object detection) has hit the wall once its flight path abruptly slows down or changes direction. This would entail writing code to compare the position of the puck bounding box from frame to frame and flag notable changes in direction or speed.
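
Here is one way that comparison could look, as a minimal sketch rather than a tuned solution. It assumes you have a list of (x, y) bounding-box centers, one per consecutive frame, with no missing detections. The function name and all three thresholds are hypothetical values you would need to adjust for your footage.

```python
import math


def find_impact_frames(centers, min_speed=5.0, angle_threshold_deg=60.0,
                       speed_drop_ratio=0.5):
    """Return frame indices where the puck's motion changes abruptly.

    centers: list of (x, y) puck centers in pixels, one per consecutive frame.
    A frame is flagged when the puck was moving at least min_speed pixels per
    frame and then either turns by more than angle_threshold_deg or slows to
    less than speed_drop_ratio of its incoming speed.
    """
    impacts = []
    for i in range(1, len(centers) - 1):
        (x0, y0), (x1, y1), (x2, y2) = centers[i - 1], centers[i], centers[i + 1]
        v_in = (x1 - x0, y1 - y0)    # displacement arriving at frame i
        v_out = (x2 - x1, y2 - y1)   # displacement leaving frame i
        speed_in = math.hypot(*v_in)
        speed_out = math.hypot(*v_out)
        if speed_in < min_speed:
            continue  # too slow to be a shot; skip jitter
        if speed_out == 0:
            impacts.append(i)  # puck stopped dead
            continue
        cos_angle = (v_in[0] * v_out[0] + v_in[1] * v_out[1]) / (speed_in * speed_out)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        angle = math.degrees(math.acos(cos_angle))
        if angle > angle_threshold_deg or speed_out < speed_drop_ratio * speed_in:
            impacts.append(i)
    return impacts


# Hypothetical track: the puck moves right, then bounces back at frame 3.
track = [(0, 0), (20, 0), (40, 0), (60, 0), (50, 0), (40, 0)]
impact_frames = find_impact_frames(track)
```

In practice detections are noisy, so you would probably want to smooth the track or require the change to persist for a couple of frames before counting it as a hit.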

  2. Identify the puck coordinates over time

This I would certainly do with object detection. In fact, if your camera is in a fixed position relative to the moving puck, you could do a bit of math to approximate the puck’s speed. If you know the camera is X feet from the puck and the puck’s position has moved from Y to Z in ABC seconds of video, you can estimate its speed.
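
The arithmetic could be sketched like this. Instead of working from the camera distance directly, this version assumes you have measured a feet-per-pixel scale in the plane the puck travels in (for example, from an object of known size at that distance), and that the puck moves roughly parallel to the image plane. The function name, parameters, and sample numbers are illustrative assumptions.

```python
import math


def estimate_speed(p_start, p_end, frames_elapsed, fps, feet_per_pixel):
    """Estimate puck speed in feet per second between two detections.

    p_start, p_end:  (x, y) puck centers in pixels.
    frames_elapsed:  number of frames between the two detections.
    fps:             video frame rate.
    feet_per_pixel:  scale in the puck's plane of motion (assumed known).
    """
    pixel_distance = math.hypot(p_end[0] - p_start[0], p_end[1] - p_start[1])
    seconds = frames_elapsed / fps
    return pixel_distance * feet_per_pixel / seconds


# Hypothetical numbers: 500 px traveled over 10 frames of 60 fps video,
# at 0.02 ft per pixel, is 10 ft in 1/6 s, or about 60 ft/s.
speed = estimate_speed((100, 200), (400, 600), frames_elapsed=10, fps=60,
                       feet_per_pixel=0.02)
```

If the puck moves toward or away from the camera, the scale changes along the flight and this estimate will be off, so treat it as an approximation.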

Ultimately, I would structure your problem as object detection of the puck’s location and use those coordinates to drive the rest of your logic: position over time, speed, and wall contact.

Awesome, thank you for the detailed reply! The “how do I structure the computer vision problem such that I can use the right methods to solve it,” is a very useful framing for me to adopt so thank you. Conceptually I know much of what I want and need, I just don’t know the best way to get there. I’ll dig in a bit more and post additional questions if they arise. Thanks again!