Best tracker to pair with RF-DETR Seg for dense scenes?

pedestrianman · January 20, 2026, 12:48pm

Hi all!

I’m using RF-DETR Segmentation to track a dense group of sheep entering a barn (lots of occlusions + very similar-looking individuals) and I need stable per-animal trajectories with minimal ID switches.

Right now I’m running ByteTrack via Supervision, but after checking SkalskiP’s latest projects I noticed he often uses SAM/SAM2 tracking, and in practice it seems to work really well! (e.g. in his recent basketball project he uses RF-DETR for the initial detections and then SAM2 to handle the tracking). I’ve also been reading that SAM/SAM2-based trackers can outperform more classic trackers in very dense scenes.

In your experience, which approach works best with RF-DETR Seg when you care about identity-stable tracking in crowded scenes: ByteTrack or a SAM/SAM2-based tracker? And are there any key settings or reference pipelines you’d recommend?

Also, how does a SAM2-style tracker handle the case where not all objects are present in the first frame (e.g., some sheep enter later): do you periodically re-initialize with new detections, or is there a standard way to add new tracks on the fly?

Bar_Shimshon · January 30, 2026, 2:32pm

Great question! This is a nuanced tradeoff, and as it usually goes in such cases, the “right” answer depends on your latency requirements and how severe the occlusions are.

Here is a breakdown of the main considerations for ByteTrack vs. SAM2-based tracking:

Factor	ByteTrack	SAM2 Video Predictor
ID stability through occlusion	Moderate: relies on Kalman prediction + IoU matching, struggles when sheep overlap heavily	Strong: memory bank maintains appearance features across occlusions
Visually similar objects	Weak: no appearance model by default	Better: conditions each mask on a per-object memory of earlier frames
Speed	Fast (~real-time)	Slower: memory propagation has overhead
New object handling	Native: unmatched detections start new tracks	Requires explicit re-initialization
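To make the "struggles when sheep overlap heavily" row concrete, here is a small sketch of IoU-only association. It is not ByteTrack's actual code; the box coordinates are made up and the helper names (`iou`, `assignment_margin`) are hypothetical. It only shows how the gap between the best and second-best IoU match shrinks once animals stand side by side.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assignment_margin(track_box, detections):
    """Best IoU minus second-best IoU for one track (small margin = ambiguous match)."""
    scores = sorted((iou(track_box, d) for d in detections), reverse=True)
    if len(scores) < 2:
        return scores[0] if scores else 0.0
    return scores[0] - scores[1]


# Illustrative boxes only: two sheep far apart vs. two sheep side by side.
tracks_apart = [(0, 0, 100, 60), (300, 0, 400, 60)]
dets_apart = [(5, 0, 105, 60), (305, 0, 405, 60)]

tracks_crowded = [(0, 0, 100, 60), (20, 0, 120, 60)]
dets_crowded = [(10, 0, 110, 60), (12, 0, 112, 60)]

margin_apart = min(assignment_margin(t, dets_apart) for t in tracks_apart)
margin_crowded = min(assignment_margin(t, dets_crowded) for t in tracks_crowded)
```

With these numbers the separated case has a wide margin while the crowded case leaves only a few hundredths of IoU between the two candidate matches, which is where ID switches tend to come from.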

For your scenario of sheep entering a barn (continuous entry, heavy occlusion at the doorway), I would suggest:

Use SAM2 as primary tracker for ID stability
Run RF-DETR detection frequently near the entry zone (maybe every 5-10 frames) to catch new sheep
Run detection less frequently once sheep are inside and tracked
Consider a spatial prior — if you know where the door is, only look for new tracks in that region
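The scheduling and spatial-prior suggestions above could be sketched roughly like this. All names (`ENTRY_ZONE`, `should_run_detection`, `new_track_candidates`) and the interval values are illustrative assumptions; the 5-frame busy interval comes from the range mentioned above, while the 30-frame quiet interval is just a placeholder to tune.

```python
# Hypothetical doorway region in pixel coordinates (x1, y1, x2, y2).
ENTRY_ZONE = (0, 200, 150, 400)


def box_center(box):
    """Center point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def in_zone(box, zone):
    """True if the box center lies inside the zone rectangle."""
    cx, cy = box_center(box)
    zx1, zy1, zx2, zy2 = zone
    return zx1 <= cx <= zx2 and zy1 <= cy <= zy2


def should_run_detection(frame_idx, entry_active, busy_interval=5, quiet_interval=30):
    """Run the detector often while the doorway is busy, rarely otherwise."""
    interval = busy_interval if entry_active else quiet_interval
    return frame_idx % interval == 0


def new_track_candidates(detections, zone=ENTRY_ZONE):
    """Keep only detections whose center is in the entry zone (the spatial prior)."""
    return [d for d in detections if in_zone(d, zone)]
```

How `entry_active` is decided is left open here; one option is to set it whenever any currently tracked animal, or any recent detection, has its center inside the zone.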

If latency is critical and you need real-time, you could also look at BoT-SORT as a middle ground: it builds on ByteTrack and adds ReID appearance embeddings (plus camera-motion compensation) without the full SAM2 overhead.
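As a rough idea of what "adding appearance features" means, the sketch below fuses an IoU distance with a cosine distance between ReID embeddings, in the spirit of the gated fusion described for BoT-SORT. It is a simplified illustration rather than the library's implementation: the function names are hypothetical, and the gate values (0.5 and 0.25) are assumptions you would tune for your data.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def fused_cost(iou_dist, cos_dist, iou_gate=0.5, emb_gate=0.25):
    """Combine motion and appearance cues into one matching cost.

    The appearance term is trusted only when the boxes are already
    reasonably close AND the embeddings agree; otherwise it is set to
    the maximum cost so the IoU distance decides.
    """
    if cos_dist < emb_gate and iou_dist < iou_gate:
        appearance = 0.5 * cos_dist
    else:
        appearance = 1.0
    return min(iou_dist, appearance)
```

One caveat for your use case: with very similar-looking sheep, embeddings of different animals may sit close together, so the appearance term may help less than it does for people or players in distinct kit.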

Note that SAM2’s video predictor does not auto-discover new objects. It only propagates masks for objects you explicitly initialize. So yes, you need to periodically re-run detection and add new tracks manually.
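A minimal sketch of the "re-run detection and add new tracks" step: compare each fresh detection mask against the masks currently being propagated, and register only the ones that overlap nothing. Masks are represented as plain Python sets of pixel coordinates to keep the example dependency-free (in practice they would be boolean arrays), and the function names, the `rect` helper, and the 0.5 threshold are all hypothetical. Handing a new mask to the tracker itself is not shown; as far as I know, the reference SAM2 video predictor exposes calls such as `add_new_mask` and `add_new_points_or_box` for that, but check the version you have installed.

```python
def rect(x1, y1, x2, y2):
    """Toy mask: the set of pixel coordinates inside a rectangle."""
    return {(x, y) for x in range(x1, x2) for y in range(y1, y2)}


def mask_iou(a, b):
    """IoU of two masks given as sets of pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def split_new_detections(det_masks, tracked_masks, iou_thresh=0.5):
    """Indices of detections that overlap no tracked mask above the threshold."""
    new = []
    for i, det in enumerate(det_masks):
        best = max((mask_iou(det, m) for m in tracked_masks.values()), default=0.0)
        if best < iou_thresh:
            new.append(i)
    return new


def register_new_tracks(det_masks, tracked_masks, next_id, iou_thresh=0.5):
    """Add unseen detections to tracked_masks under fresh ids.

    Returns the updated next_id and the list of ids that were created.
    """
    new_ids = []
    for i in split_new_detections(det_masks, tracked_masks, iou_thresh):
        tracked_masks[next_id] = det_masks[i]
        new_ids.append(next_id)
        next_id += 1
    return next_id, new_ids
```

Each id returned by `register_new_tracks` would then be used as the object id when prompting the tracker with the corresponding mask on the current frame.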

We have some nifty models in Roboflow Universe you may want to check out as well: https://universe.roboflow.com/riis/aerial-sheep

Let me know if this was helpful!

Bar Shimshon
