How to save the time at which an object is detected in a video using YOLOv5 or YOLOv8

Hello, I hope you are doing well. YOLOv5 and YOLOv8 save an output text file with the class label and bounding box values. I want to also save the time when the object is detected, and the class name (the label file only contains the numeric class index). Can anybody help me with that?

@Hamad_Younis, you can use the time and/or datetime modules from the standard Python library.

Import the module and log the current time when a detection is made.
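
For example, a minimal sketch of logging a timestamp at detection time (the format string here is just one option):

from datetime import datetime

# Record the wall-clock time at the moment a detection is made
detection_time = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print(f"Object detected at {detection_time}")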

Here are my suggestions after looking at the YOLOv5 repo’s code:


For YOLOv5, use the arguments in detect.py to create the txt file: yolov5/detect.py at master · ultralytics/yolov5 · GitHub

Utilize the --save-txt flag to create a txt file of your detections, and include the --save-conf flag to include the confidence level for the detections.

To include the time, modify detect.py to add logic that extracts the current time and creates a record for it in string format:

  • Short example:
import time

# Initialize timer
t1 = time.time()

# Run inference
results = model(img)

# Calculate elapsed time
t2 = time.time()
dt = t2 - t1

# Print detection time
print(f"Detection time: {dt:.4f} seconds")

Press enter at the end of line 81 to create a new line, and then enter this in the new line:
save_time=False

  • Example:
def run(
  weights=ROOT / 'yolov5s.pt',  # model path or triton URL
  source=ROOT / 'data/images',  # file/dir/URL/glob/screen/0(webcam)
  data=ROOT / 'data/coco128.yaml',  # dataset.yaml path
  imgsz=(640, 640),  # inference size (height, width)
  conf_thres=0.25,  # confidence threshold
  iou_thres=0.45,  # NMS IOU threshold
  max_det=1000,  # maximum detections per image
  device='',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
  view_img=False,  # show results
  save_txt=False,  # save results to *.txt
  save_conf=False,  # save confidences in --save-txt labels
  save_crop=False,  # save cropped prediction boxes
  nosave=False,  # do not save images/videos
  classes=None,  # filter by class: --class 0, or --class 0 2 3
  agnostic_nms=False,  # class-agnostic NMS
  augment=False,  # augmented inference
  visualize=False,  # visualize features
  update=False,  # update all models
  project=ROOT / 'runs/detect',  # save results to project/name
  name='exp',  # save results to project/name
  exist_ok=False,  # existing project/name ok, do not increment
  line_thickness=3,  # bounding box thickness (pixels)
  hide_labels=False,  # hide labels
  hide_conf=False,  # hide confidences
  half=False,  # use FP16 half-precision inference
  dnn=False,  # use OpenCV DNN for ONNX inference
  vid_stride=1,  # video frame-rate stride
  save_time=False # choose to save timestamp for the detection
):

Then, add a new line between line 247 and line 248 that reads:

parser.add_argument('--save_time', action='store_true', help='Save the timestamp of each detection')
  • This will create the argument to pass when running detect.py, to store the timestamps
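
With that argument in place, you would then run detect.py with the new flag alongside the existing ones, for example (the weights and video paths are placeholders):

python detect.py --weights yolov5s.pt --source path/to/video.mp4 --save-txt --save-conf --save_time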

Line 63 is where you’ll want to start making the updates for the timestamp logic:

Add these new lines after line 128 to create the logic for saving the detection timestamp records:

detection_data = []  # initialize the list first, otherwise it is undefined when appending below
for detected_object in pred:
  if save_time:
    timestamp = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
    detection_data.append({'object': detected_object, 'timestamp': timestamp})
  else:
    detection_data.append({'object': detected_object})

Finally, modify the code, beginning with line 168 to include the detection records, if the --save_time flag is True:

time_index = 0
if save_time:  # save timestamp to file
  f.write(('%g ' * len(line)).rstrip() % line + f", {detection_data[time_index]}" + "\n")
  time_index += 1
else:
  f.write(('%g ' * len(line)).rstrip() % line + '\n')

For YOLOv8, you’ll want to make the modifications in the predict.py file corresponding to your model type.

In that case, try adding your new argument to the model.py file, within the YOLO class.
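
If modifying the ultralytics source feels too involved, another option (untested, and the model path and output filename below are just placeholders) is to skip the source edits and log the timestamps yourself through the public ultralytics API:

import time
from ultralytics import YOLO

model = YOLO('yolov8n.pt')  # or the path to your own trained weights

# Stream predictions over a video, one frame at a time
for result in model.predict(source='path/to/video.mp4', stream=True):
    timestamp = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())
    with open('detections_with_time.txt', 'a') as f:
        for box in result.boxes:
            cls_id = int(box.cls)            # numeric class index
            cls_name = result.names[cls_id]  # human-readable class name
            conf = float(box.conf)
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            f.write(f"{cls_id} {cls_name} {conf:.4f} {x1:.1f} {y1:.1f} {x2:.1f} {y2:.1f} {timestamp}\n")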

After you’ve given the implementation a try, please reply to confirm whether it works, or share any errors you run into while running your code.

@Mohamed Thanks for your great reply, but I am facing an issue with this.
Can you share the final versions in which you have edited the detect.py file for YOLOv5 and predict.py for YOLOv8?

@Hamad_Younis can you provide more specifics on your error?

Such as the exact error, and a screenshot of it?

I provided the code/suggestions based on my reading of the repo, so I don’t have the direct files handy.

I’ve also only looked at the YOLOv5 code and provided suggested code for that. I haven’t yet had a chance to write full code for how to do this for YOLOv8.

I’ll try to do so for YOLOv8 when I have time, but if you’re pressed for time, it may be better to get the YOLOv5 version working first and then apply the same approach to the YOLOv8 code.

@Mohamed I am unable to do that for YOLOv5. Can you please modify the code only for YOLOv5, not for YOLOv8?
If you can share detect.py for YOLOv5 with the edits applied, I will be very thankful to you.

@Mohamed Can you please help me? I have been stuck on this for the past few days.
If you can modify the detect.py code for YOLOv5, I will be very thankful to you.

@Mohamed I am getting the error: detection_data is not defined

@Mohamed Error

Traceback (most recent call last):
  File "/content/yolov5/detect.py", line 277, in <module>
    main(opt)
  File "/content/yolov5/detect.py", line 272, in main
    run(**vars(opt))
  File "/usr/local/lib/python3.9/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/content/yolov5/detect.py", line 183, in run
    f.write(('%g ' * len(line)).rstrip() % line + '\n')
UnboundLocalError: local variable 'f' referenced before assignment
terminate called without an active exception

@Mohamed It is saving the timestamp like below:

33 0.808984 0.968843 0.0539063 0.0623145, {'timestamp': '2023-03-25 07:02:08'}

But I want the time in the video, not the current time.

Adding this here so everything remains on the same thread (this one), since it’s the same inquiry, rather than a new one:

@Mohamed Okay, thanks. I will be waiting.

@Hamad_Younis I tested it, and it looks like I got it to work with my face detection model on Roboflow:

To run it with YOLOv5 or YOLOv8, upload your model weights to Roboflow, and use the code below:

import time
import datetime
import cv2
from roboflow import Roboflow


rf = Roboflow(api_key="PRIVATE_API_KEY")
project = rf.workspace("WORKSPACE_ID").project("PROJECT_ID")
model = project.version(VERSION_NUMBER).model # the version number is an integer value

# create our dictionary to save the class names and index/numeric value for the class
model_classes = project.classes
label_dict = {}
for i, label_name in enumerate(model_classes):
    label_dict[label_name] = f"{i}"

cap = cv2.VideoCapture("REPLACE_WITH_PATH_TO_VIDEO")

# Get the total number of frames in the video
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

# Get the frame rate of the video
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Calculate the duration of the video
duration = total_frames / fps

# Convert the video length to minutes and seconds
vid_minutes = duration // 60
vid_seconds = duration % 60
vid_length = datetime.time(minute=int(vid_minutes), second=int(vid_seconds))

print(f"The video is {duration} seconds long", '\n')

# process the video
while True:
    ret, frame = cap.read()
    t0 = time.time()
    
    if not ret:
        break

    # Convert the frame from BGR to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Save the RGB frame as a JPEG file
    img = cv2.imwrite('temp.jpg', rgb_frame)
    
    # Detect objects in the frame
    detected_objects = model.predict('temp.jpg', confidence=40, overlap=30).json()
    
    # main bounding box coordinates from JSON response object
    # https://docs.roboflow.com/inference/hosted-api#response-object-format
    for bounding_box in detected_objects['predictions']:
        print(detected_objects)
        print(bounding_box)
        xmin = bounding_box['x'] - bounding_box['width'] / 2
        xmax = bounding_box['x'] + bounding_box['width'] / 2
        ymin = bounding_box['y'] - bounding_box['height'] / 2
        ymax = bounding_box['y'] + bounding_box['height'] / 2
        confidence = bounding_box['confidence']
        name = bounding_box['class']

        # Find the class index value corresponding to the label name
        for key, value in label_dict.items():
            if key == name:
                object_class = value

        # position coordinates: start = (x0, y0), end = (x1, y1)
        # color = RGB-value for bounding box color, (0,0,0) is "black"
        # thickness = stroke width/thickness of bounding box
        # draw and place bounding boxes
        start_point = (int(xmin), int(ymin))
        end_point = (int(xmax), int(ymax))
        cv2.rectangle(frame, start_point, end_point, color=(0,0,0), thickness=2)

        # calculate seconds per frame we are running inference
        t = time.time()-t0

        # Calculate the remaining duration of the video
        remaining_duration = total_frames / fps
        # Convert the remaining video length to minutes and seconds
        remaining_minutes = remaining_duration // 60
        remaining_seconds = remaining_duration % 60
        # print the remaining video duration
        remaining_duration = datetime.time(minute=int(remaining_minutes), second=int(remaining_seconds))

        elapsed_time = datetime.datetime.combine(
            datetime.date.min, vid_length) - datetime.datetime.combine(datetime.date.min, remaining_duration)
        
        time_remaining = datetime.datetime.combine(
            datetime.date.min, remaining_duration) - datetime.datetime.combine(datetime.date.min, vid_length)

        if remaining_duration < vid_length:      
            reported_elapsed_dt = datetime.datetime(2000, 1, 1) + elapsed_time
            timestamp = f"{reported_elapsed_dt.strftime('%M:%S')}"

            # Draw the timestamp on the frame
            cv2.putText(frame, f"{timestamp}", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

            # If object(s) detected in the frame, create a variable to store their labels with the timestamp
            if xmin:
                label_with_timestamp = f"{xmin} {ymin} {xmax} {ymax} {confidence:.04f} {object_class} {name} {timestamp}"
        else:
            reported_elapsed_dt = datetime.datetime(2000, 1, 1) + elapsed_time
            timestamp = f"{reported_elapsed_dt.strftime('%M:%S')}"

            # Draw the timestamp on the frame
            cv2.putText(frame, f"{timestamp}", (10, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

            # If object(s) detected in the frame, create a variable to store their labels with the timestamp
            if xmin:
                label_with_timestamp = f"{xmin} {ymin} {xmax} {ymax} {confidence:.04f} {object_class} {name} {timestamp}"

        with open(f'inference_results.txt', 'a') as f:
            f.write(label_with_timestamp + '\n')

        # Logging purposes
        print('\n', f'Processing {1/t} Frames Per Second (FPS): ', '\n')
        print('\n', 'Written to txt file: ', label_with_timestamp, '\n')
        
    
    cv2.imshow("frame", frame)
    total_frames-=1
    
    if cv2.waitKey(1) == ord("q"):
        break
        
cap.release()
cv2.destroyAllWindows()

@Mohamed Thanks for your work. I will check it in a while. Is the detected video also saved to a folder, the way detect.py saves the annotated video?

You can save the video with detections if you also use the cv2.VideoWriter() method from the OpenCV library:

You’ll just be saving the frames from the frame variable. Include the method for writing to the actual output stream just before the cv2.imshow() line in the code I provided above.
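
Something like the sketch below should work (the output filename and codec are just examples):

# Before the while loop: set up the writer with the input video's properties
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('annotated_output.mp4', fourcc, fps, (frame_width, frame_height))

# Inside the while loop, just before cv2.imshow("frame", frame):
out.write(frame)

# After the loop, next to cap.release():
out.release()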