I have uploaded labelled image data using Roboflow and exported it in COCO format.
I’m using detectron2 to do object detection on a video after training on a custom dataset. The output video shows the detections, but it labels them with category IDs of either 1 or 2. I want the detections to show the class names, either ‘weed’ or ‘crop’, instead.
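As far as I can tell, the raw IDs show up because the visualizer falls back to category IDs whenever the dataset metadata carries no class names. Registering the COCO export attaches the names; a minimal sketch, assuming my export lives under /content/train (both paths below are placeholders):

from detectron2.data.datasets import register_coco_instances
from detectron2.data import MetadataCatalog

# Register the Roboflow COCO export; detectron2 reads the category names
# ('weed', 'crop') from the annotations JSON into the dataset metadata.
register_coco_instances("my_dataset_train", {},
                        "/content/train/_annotations.coco.json",  # placeholder path
                        "/content/train")                         # placeholder path

print(MetadataCatalog.get("my_dataset_train").thing_classes)

Once this registration has run in the same session, MetadataCatalog.get("my_dataset_train") returns metadata with thing_classes set, and the visualizer can print the names.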
I have downloaded the config file and the model weights from my custom training and used them as shown below:
Thanks @leo for getting back. I was following the Getting Started notebook from the detectron2 documentation. It worked when I ran their notebook, but when I ran similar code on my custom dataset, it failed to work as expected. The video inference code for my custom dataset is below:
However, with some deeper research I was able to find a solution using the code below:
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# import some common libraries
import numpy as np
import tqdm
import cv2
# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.video_visualizer import VideoVisualizer
from detectron2.utils.visualizer import ColorMode, Visualizer
from detectron2.data import MetadataCatalog
import time
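# NOTE: `predictor` is used further down; it comes from my training setup.
# A minimal sketch of that setup (the config choice and paths are placeholders):
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = "/content/output/model_final.pth"  # placeholder: my trained weights
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2                    # 'weed' and 'crop'
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
predictor = DefaultPredictor(cfg)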
# Extract video properties
video = cv2.VideoCapture('/content/crop-weed.mp4')
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
frames_per_second = video.get(cv2.CAP_PROP_FPS)
num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
# Initialize video writer
video_writer = cv2.VideoWriter('inst-seg.mp4', fourcc=cv2.VideoWriter_fourcc(*"mp4v"),
                               fps=float(frames_per_second), frameSize=(width, height), isColor=True)
# Initialize visualizer
v = VideoVisualizer(MetadataCatalog.get("my_dataset_train"), ColorMode.IMAGE)
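# If "my_dataset_train" was not registered in this session, the names can also be
# attached to the metadata directly (a sketch; the order must match the category
# IDs in my COCO export):
# MetadataCatalog.get("my_dataset_train").thing_classes = ["weed", "crop"]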
def runOnVideo(video, maxFrames):
    """ Runs the predictor on every frame in the video (unless maxFrames is given),
    and returns the frame with the predictions drawn.
    """
    readFrames = 0
    while True:
        hasFrame, frame = video.read()
        if not hasFrame:
            break

        # Get prediction results for this frame
        outputs = predictor(frame)

        # Convert OpenCV's BGR frame to the RGB format the visualizer expects
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

        # Draw a visualization of the predictions using the video visualizer
        visualization = v.draw_instance_predictions(frame, outputs["instances"].to("cpu"))

        # Convert the visualizer's RGB output back to OpenCV's BGR format
        visualization = cv2.cvtColor(visualization.get_image(), cv2.COLOR_RGB2BGR)

        yield visualization

        readFrames += 1
        if readFrames > maxFrames:
            break
# Enumerate the frames of the video
for visualization in tqdm.tqdm(runOnVideo(video, num_frames), total=num_frames):
    # Write test image
    # cv2.imwrite('/content/33514.jpg', visualization)

    # Write to video file
    video_writer.write(visualization)
# Release resources
video.release()
video_writer.release()
cv2.destroyAllWindows()
The above code works as expected, but it’s very long and not the cleanest approach.
It would be really nice to make the first code work, and that’s what I’m still searching for. I would love your help with that.