Need Guidance on Finding a Person of Interest from Live CCTV/Webcam Feed

Osama_Altaf · March 17, 2025, 3:42pm

Hi everyone, I’m new here!

I’m working on a university project for my graduation and research. My goal is to identify a person of interest from a live CCTV or webcam feed by uploading a reference image. However, I’m unsure how to get started or which approach to take.

I’ve explored several models and workflows but can’t decide on the best one. I also have some key questions:

Do I need to train a model from scratch, or can I use an existing image embedding model for direct inference?
If I use embeddings, how should I compare them efficiently with frames from the live feed?
Would this approach be computationally expensive, and are there any optimizations to reduce the cost?

I’m feeling a bit stuck and would really appreciate any guidance or suggestions from experienced members. Thanks in advance for your help!

trevorhlynn · March 18, 2025, 11:21am

Hey there! It all depends on what you’re trying to accomplish in terms of accuracy, speed, cost, compute, etc. There is really no right or wrong answer here.

You can train from scratch if needed, but first try a pre-trained model such as this one to see if it works for you: People Detection Object Detection Dataset and Pre-Trained Model by Leo Ueno

Comparing embeddings “efficiently” depends on what you mean by efficient. If you’re thinking about embeddings rather than object detection, see this post: Launch: Embeddings in Workflows
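To make the comparison step concrete, here is a minimal sketch, assuming you already have one embedding for the reference image and one embedding per person crop in the current frame (from whichever model you choose). The function names and the 0.8 threshold are illustrative assumptions, not values from any particular library, and the threshold would need tuning on your own data.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "no similarity"
    return dot / (norm_a * norm_b)

def best_match(reference, candidates, threshold=0.8):
    """Find the candidate embedding most similar to the reference.

    Returns (index, score) if the best score reaches the threshold,
    otherwise (None, best_score). The threshold is a hypothetical
    starting point, not a recommended value.
    """
    best_idx, best_score = None, -1.0
    for i, emb in enumerate(candidates):
        score = cosine_similarity(reference, emb)
        if score > best_score:
            best_idx, best_score = i, score
    if best_score >= threshold:
        return best_idx, best_score
    return None, best_score
```

With only a handful of people per frame, this loop is cheap compared with computing the embeddings themselves, so the model forward pass is usually where the time goes.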

I do not know what computationally intensive or expensive means to you, but what I’ve shared so far is fairly lightweight for both training and inference.

Osama_Altaf · March 20, 2025, 10:03pm

I want to detect and track a person of interest in a real-time webcam or CCTV feed.
Which approach would be good for me?

I want good results. For now, I have devised a project in which I create embeddings using ViT and DINOv2 models, store them, and then compare them against the live webcam feed. It gets moderate results, but it frequently flags other people as the person of interest.
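One possible way to cut down on this kind of false positive, sketched here as an assumption rather than something suggested in this thread, is to require a match over several recent frames before flagging anyone. The class below is hypothetical and assumes you keep one instance per tracked person, feeding it the per-frame decision (for example, "similarity above threshold").

```python
from collections import deque

class MatchSmoother:
    """Flags a person only if enough recent frames agree.

    `window` and `min_hits` are illustrative defaults to tune,
    not recommended values.
    """

    def __init__(self, window=5, min_hits=4):
        self.min_hits = min_hits
        # Keep only the most recent `window` per-frame decisions.
        self.history = deque(maxlen=window)

    def update(self, is_match):
        """Record this frame's decision and return the smoothed one."""
        self.history.append(bool(is_match))
        return sum(self.history) >= self.min_hits

# Usage: one smoother per tracked person.
smoother = MatchSmoother(window=5, min_hits=4)
frame_decisions = [True, False, True, True, True, True]
smoothed = [smoother.update(d) for d in frame_decisions]
```

A single noisy frame then no longer triggers an alert on its own, at the cost of a few frames of delay before a true match is reported.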

Can you give me more guidance on this path?

trevorhlynn · March 21, 2025, 11:48am

This is a good approach. Our team often uses CLIP for embeddings to do things like this, but I’ll see if anyone has other ideas.

Osama_Altaf · March 21, 2025, 7:24pm

Great @trevorhlynn, thanks for your response. I will add CLIP to the pipeline. What do you think about adding Cascade- or DeepFace-like algorithms for separate face comparison, with CLIP or DINOv2 handling body features?

trevorhlynn · March 24, 2025, 6:21pm

Yes, I think if you have visibility of a face then something like DeepFace would be better.
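As a rough sketch of how the two signals could be combined, assuming you already have a face similarity score (when a face is visible) and a body similarity score, both scaled to the same range: the function name and the 0.7 weight are hypothetical, and no actual DeepFace or CLIP calls are shown.

```python
def combined_score(face_score, body_score, face_weight=0.7):
    """Blend face and body similarity, favoring the face when visible.

    `face_score` is None when no face was detected in the crop, in
    which case only the body score is used. The weight is an
    illustrative assumption, not a tuned value.
    """
    if face_score is None:
        return body_score
    return face_weight * face_score + (1.0 - face_weight) * body_score
```

For example, a crop with no visible face falls back to the body embedding alone, while a clear face dominates the final score.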

system · April 14, 2025, 6:22pm

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.