Hi everyone,
My name is Scott and I’m from Power House AFC, a local community Australian Rules Football club in Melbourne, Australia.
I’m currently building an in-house club management platform for our football club (players, volunteers, communications, fixtures, stats, etc.) and recently started experimenting with computer vision/video analysis using Roboflow.
I’m not formally a developer — I’ve primarily been building the platform using ChatGPT + Cursor — but I’ve managed to get a reasonably advanced prototype pipeline working and I’m now looking for guidance on best practices and architecture.
My current stack/prototype includes:
- RF-DETR
- ByteTrack / custom tracking
- OCR for jersey numbers
- custom match_profiles.py for team jersey/short colour classification
- roster CSVs mapping player numbers to names
The goal is to analyse full AFL match footage from a single broadcast-style camera.
A typical AFL game includes:
- 18 players per side on the field
- 4 umpires
- large oval ground dimensions
- frequent occlusion/contact contests
- players rotating body orientation constantly
- long periods where jersey numbers are not visible
Current goals:
- detect all players on field
- detect umpires separately
- classify players by team colours
- identify jersey numbers reliably
- map jersey numbers to player names
- maintain persistent identity when:
- follow a specific player by number/name
- track the football itself
What currently works reasonably well:
Where I’m struggling:
- reliable PH vs opposition classification
- avoiding incorrect identity persistence
- OCR instability/frame-to-frame switching
- re-identification after occlusion
- deciding how much should be solved by:
- whether my architecture approach is fundamentally correct
One thing I’ve discovered is that trying to solve:
- team classification
- OCR
- identity persistence
- roster mapping
all simultaneously creates cascading errors.
I’m now experimenting with staged analysis modes:
- player detection
- team classification
- jersey number OCR
- persistent identity
I’d love guidance from anyone who has built similar sports-analysis pipelines using Roboflow or RF-DETR.
Specific questions:
- Is this overall approach realistic for community-level sports footage?
- Would you recommend custom training for team classification?
- Is jersey OCR usually solved separately from player detection?
- Are there recommended approaches for persistent player identity/re-identification?
- Would segmentation help significantly here?
- Has anyone solved similar sports-analysis workflows using Roboflow tooling?
I’m happy to share screenshots/videos/examples if useful.
Thanks so much,
Scott Gallagher
Power House AFC
Whoa! You’re doing great for being new to computer vision. First thing I’ll say as a general comment - even when computer vision experts build a pipeline like this and post about it everyone is like “Wow! I’m amazed how well you got that to work!”. So it’s definitely not an easy problem.
Some users out here might have targeted advice on issues they might guess you are facing, but in the meantime it might help if you look over a couple of these resources and see if you are able to be even more specific about where it’s breaking down for you.
This is a nice, recent one. It is about basketball, but it has most of the parts you are talking about: How to Detect, Track, and Identify Basketball Players with Computer Vision
This one is a year old (a LOT has changed) but might get you over some hurdles: https://www.youtube.com/watch?v=aBVGKoNZQUw
And then just as a general statement, I think even experts would struggle with occlusion and leave/re-enter the scene. But big steps have been taken with some of the bigger models lately. So if you do not need real-time data, you might be able to solve for a lot of these. (If you need real-time, those big models would not run fast enough against a live feed.)
Best of luck!
Hi Scott,
This is an incredibly ambitious project and you’ve already made impressive progress, especially considering you’re tackling one of the most challenging computer vision problems in sports analytics. AFL tracking presents unique difficulties with the large field, constant player rotation, and frequent occlusions that make this particularly tough even for experienced CV engineers.
Your staged analysis approach is smart. Trying to solve detection, classification, OCR, and identity persistence simultaneously does create cascading errors, and breaking it into discrete stages allows you to debug each component independently. The stack you’ve chosen (RF-DETR + ByteTrack) is solid for this type of work.
For team classification, you’ll likely need custom training rather than relying on color-based approaches alone. AFL jersey colors can be similar between teams, lighting conditions vary, and players often wear different colored shorts vs. jerseys. Training a classification model on cropped player detections would be more robust than pure color analysis. You could create a dataset by extracting player crops from your RF-DETR detections and labeling them by team.
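To make the crop-extraction step a bit more concrete, here is a minimal sketch. It assumes you already have each frame as a NumPy image array and the detector's boxes as pixel `(x1, y1, x2, y2)` tuples; the function name `crop_players` and the `min_size` threshold are made up for illustration and are not part of RF-DETR or any Roboflow API.

```python
import numpy as np


def crop_players(frame, boxes_xyxy, min_size=8):
    """Return one image crop per detection box, clipped to the frame bounds.

    frame: H x W x C image array.
    boxes_xyxy: iterable of (x1, y1, x2, y2) pixel boxes (assumed format).
    min_size: hypothetical threshold; smaller crops are skipped.
    """
    height, width = frame.shape[:2]
    crops = []
    for x1, y1, x2, y2 in boxes_xyxy:
        # Clip to the frame so boxes that spill over the edge still yield a crop.
        x1, x2 = max(0, int(x1)), min(width, int(x2))
        y1, y2 = max(0, int(y1)), min(height, int(y2))
        # Skip degenerate or tiny boxes: too little jersey detail to label.
        if x2 - x1 < min_size or y2 - y1 < min_size:
            continue
        # Copy so the crop does not keep the full frame alive in memory.
        crops.append(frame[y1:y2, x1:x2].copy())
    return crops
```

You would then save these crops to disk and sort them into one folder per class (for example your team, the opposition, and umpires) before training a classifier on them.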
Jersey OCR is typically handled as a separate pipeline from player detection. The approach is usually: detect player → extract jersey region → run OCR → apply temporal smoothing to reduce frame-to-frame jitter. For the instability you’re seeing, try implementing a confidence-based voting system across multiple frames rather than taking single-frame OCR results. If a player’s number switches between 12 and 15 across frames, use the most frequent detection over a sliding window.
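A rough sketch of that voting idea, assuming your tracker gives each player a stable `track_id` and your OCR step returns a number string plus a confidence between 0 and 1. The class name `JerseyNumberVoter`, the window length, and the confidence cut-off are all illustrative assumptions, not settings from any particular library.

```python
from collections import Counter, defaultdict, deque


class JerseyNumberVoter:
    """Keep recent OCR reads per track and report the best-supported number."""

    def __init__(self, window=30, min_confidence=0.5):
        self.min_confidence = min_confidence
        # One fixed-length queue per track: old reads fall off automatically.
        self._reads = defaultdict(lambda: deque(maxlen=window))

    def update(self, track_id, number, confidence):
        # Ignore frames with no read or a low-confidence read.
        if number is not None and confidence >= self.min_confidence:
            self._reads[track_id].append((number, confidence))

    def best(self, track_id):
        reads = self._reads.get(track_id)
        if not reads:
            return None  # no usable reads for this track yet
        # Sum confidence per candidate number across the window.
        scores = Counter()
        for number, confidence in reads:
            scores[number] += confidence
        return scores.most_common(1)[0][0]
```

Call `update` once per frame for each tracked player and read `best(track_id)` whenever you need a label. With equal confidences this reduces to "most frequent number in the window"; weighting by confidence just lets a few clear reads outvote many blurry ones.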
For persistent identity, you’ll want to combine multiple signals: spatial tracking (ByteTrack), jersey number consistency, team classification, and potentially player appearance embeddings. When jersey numbers aren’t visible, rely more heavily on spatial continuity and team membership. Re-identification after occlusion is genuinely difficult; even commercial sports analytics systems struggle here.
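One simple way to combine those signals is a weighted score between a lost track and a new detection, sketched below. Everything here is an assumption for illustration: the dictionary keys (`position`, `team`, `number`, `embedding`), the weights, and the `max_distance` value in pixels would all need tuning on your own footage, and this is not how ByteTrack itself associates tracks.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_score(track, detection, max_distance=150.0, weights=(0.4, 0.2, 0.3, 0.1)):
    """Score how plausibly a detection continues a track (higher is better)."""
    w_pos, w_team, w_num, w_app = weights

    # Spatial continuity: 1.0 at the same spot, falling to 0.0 at max_distance.
    distance = math.dist(track["position"], detection["position"])
    spatial = max(0.0, 1.0 - distance / max_distance)

    # Team membership: a plain match / no-match signal.
    team = 1.0 if track["team"] == detection["team"] else 0.0

    # Jersey number: neutral 0.5 when either side has no readable number.
    if track["number"] is None or detection["number"] is None:
        number = 0.5
    else:
        number = 1.0 if track["number"] == detection["number"] else 0.0

    # Appearance: cosine similarity of embeddings, negative values floored at 0.
    appearance = max(0.0, cosine_similarity(track["embedding"], detection["embedding"]))

    return w_pos * spatial + w_team * team + w_num * number + w_app * appearance
```

After an occlusion you would score each unmatched detection against each recently lost track and only re-attach an identity when the best score clears a threshold you are comfortable with. Leaving a player unidentified for a while is usually less harmful than attaching the wrong name.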
Relevant Resources:
The architecture you’re building is definitely realistic for community-level footage, though expect it to be an iterative process. Focus on getting each stage working well independently before chaining them together.
Best,
Patrick
thanks Automatez for your response and the links
I’m not looking for real-time yet, just a post match review
very much appreciated for your help here
thanks for your reply **Patrick_Nihranz**
and a big thanks for the tips and links 