Error: Roboflow-Custom-Scaled-YOLOv4

I am getting this error:
Epoch gpu_mem GIoU obj cls total targets img_size
0% 0/9 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/content/gdrive/MyDrive/Marlene_PhD/obj_det/ScaledYOLOv4/train.py", line 443, in <module>
    train(hyp, opt, device, tb_writer)
  File "/content/gdrive/MyDrive/Marlene_PhD/obj_det/ScaledYOLOv4/train.py", line 260, in train
    loss, loss_items = compute_loss(pred, targets.to(device), model) # scaled by batch_size
  File "/content/gdrive/MyDrive/Marlene_PhD/obj_det/ScaledYOLOv4/utils/general.py", line 446, in compute_loss
    tcls, tbox, indices, anchors = build_targets(p, targets, model) # targets
  File "/content/gdrive/MyDrive/Marlene_PhD/obj_det/ScaledYOLOv4/utils/general.py", line 556, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3]), gi.clamp_(0, gain[2]))) # image, anchor, grid indices
RuntimeError: result type Float can't be cast to the desired output type long int
CPU times: user 161 ms, sys: 36.7 ms, total: 198 ms
Wall time: 20.1 s

I tried downgrading from torch 2.0.0 to torch==1.8.0+cu101 and torchvision==0.9.0+cu101, but then got this error:
ImportError: /usr/local/lib/python3.9/dist-packages/mish_cuda-0.0.3-py3.9-linux-x86_64.egg/mish_cuda/_C.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIdEEPT_v
CPU times: user 9.91 ms, sys: 10.3 ms, total: 20.2 ms
Wall time: 931 ms


Does anyone know what I might have got wrong, and what can be done to resolve this?

Hi @Marlene, this is an error I’ve had some trouble debugging, because the original repo also notes the Issue and has not been updated since June 2021.

I also want to add that notebook errors are reported and tracked here: Issues · roboflow/notebooks · GitHub

Regarding the error you’re experiencing, from an initial look:

  File "/content/gdrive/MyDrive/Marlene_PhD/obj_det/ScaledYOLOv4/utils/general.py", line 556, in build_targets
    indices.append((b, a, gj.clamp_(0, gain[3]), gi.clamp_(0, gain[2]))) # image, anchor, grid indices
**RuntimeError: result type Float can't be cast to the desired output type long int**

It appears the error is raised here, at indices.append(): ScaledYOLOv4/general.py at aea215d495056132f91391f8e618682bef376338 · WongKinYiu/ScaledYOLOv4 · GitHub

        # Define
        b, c = t[:, :2].long().T  # image, class
        gxy = t[:, 2:4]  # grid xy
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()
        gi, gj = gij.T  # grid xy indices

        # Append
        a = t[:, 6].long()  # anchor indices
        indices.append((b, a, gj.clamp_(0, gain[3]), gi.clamp_(0, gain[2])))  # image, anchor, grid indices

For context, the full build_targets function:

def build_targets(p, targets, model):
    # Build targets for compute_loss(), input targets(image,class,x,y,w,h)
    det = model.module.model[-1] if is_parallel(model) else model.model[-1]  # Detect() module
    na, nt = det.na, targets.shape[0]  # number of anchors, targets
    tcls, tbox, indices, anch = [], [], [], []
    gain = torch.ones(7, device=targets.device)  # normalized to gridspace gain
    ai = torch.arange(na, device=targets.device).float().view(na, 1).repeat(1, nt)  # same as .repeat_interleave(nt)
    targets = torch.cat((targets.repeat(na, 1, 1), ai[:, :, None]), 2)  # append anchor indices

    g = 0.5  # bias
    off = torch.tensor([[0, 0],
                        [1, 0], [0, 1], [-1, 0], [0, -1],  # j,k,l,m
                        # [1, 1], [1, -1], [-1, 1], [-1, -1],  # jk,jm,lk,lm
                        ], device=targets.device).float() * g  # offsets

    for i in range(det.nl):
        anchors = det.anchors[i]
        gain[2:6] = torch.tensor(p[i].shape)[[3, 2, 3, 2]]  # xyxy gain

        # Match targets to anchors
        t = targets * gain
        if nt:
            # Matches
            r = t[:, :, 4:6] / anchors[:, None]  # wh ratio
            j = torch.max(r, 1. / r).max(2)[0] < model.hyp['anchor_t']  # compare
            # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n)=wh_iou(anchors(3,2), gwh(n,2))
            t = t[j]  # filter

            # Offsets
            gxy = t[:, 2:4]  # grid xy
            gxi = gain[[2, 3]] - gxy  # inverse
            j, k = ((gxy % 1. < g) & (gxy > 1.)).T
            l, m = ((gxi % 1. < g) & (gxi > 1.)).T
            j = torch.stack((torch.ones_like(j), j, k, l, m))
            t = t.repeat((5, 1, 1))[j]
            offsets = (torch.zeros_like(gxy)[None] + off[:, None])[j]
        else:
            t = targets[0]
            offsets = 0

        # Define
        b, c = t[:, :2].long().T  # image, class
        gxy = t[:, 2:4]  # grid xy
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()
        gi, gj = gij.T  # grid xy indices

        # Append
        a = t[:, 6].long()  # anchor indices
        indices.append((b, a, gj.clamp_(0, gain[3]), gi.clamp_(0, gain[2])))  # image, anchor, grid indices
        tbox.append(torch.cat((gxy - gij, gwh), 1))  # box
        anch.append(anchors[a])  # anchors
        tcls.append(c)  # class

    return tcls, tbox, indices, anch

I found the Issue also reported in the YOLOv7 repo (creator of Scaled-YOLOv4 also made YOLOv7).

It seems they introduced the same error in YOLOv7. This comment notes how to fix it:

The fix is to convert to long the "gain" of the gridspace on declaration.

In all lines with
gain = torch.ones(7, device=targets.device)

change to
gain = torch.ones(7, device=targets.device).long()

For some reason newest versions of cuda/torch do not do this cast automatically when needed
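
To see why the cast matters, here is a minimal sketch that repeats the clamp call outside the training loop. The index values and the gain of 80 are made up for illustration; only torch.ones, .long() and clamp_ come from the code quoted above. On recent PyTorch versions the float gain should raise the RuntimeError from the traceback, while older versions may accept it silently, so the sketch catches the error rather than assuming it.

```python
import torch


def clamp_indices(gain):
    # Hypothetical grid indices standing in for gj in build_targets (int64).
    gj = torch.tensor([-1, 5, 100])
    # Same call shape as the failing line: in-place clamp with a 0-dim tensor as max.
    return gj.clamp_(0, gain[3])


float_gain = torch.ones(7) * 80          # float32, like the original declaration
long_gain = (torch.ones(7) * 80).long()  # int64, like the patched declaration

try:
    clamp_indices(float_gain)
    float_gain_failed = False  # some older PyTorch versions cast silently
except RuntimeError as err:
    float_gain_failed = True
    print("float gain:", err)

# With an integer gain, the result type stays int64, so the in-place clamp is allowed.
fixed = clamp_indices(long_gain)
print("long gain:", fixed.tolist(), fixed.dtype)
```

Since gj is an integer tensor and clamp_ writes its result back into it, the bounds have to be integers as well, which is what declaring gain with .long() achieves.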


gain = torch.ones(7, device=targets.device) is located at utils/general.py#L512.

Try updating utils/general.py#L512 to read:
gain = torch.ones(7, device=targets.device).long()

and utils/general.py#L1101 to read:

targets.append([i, cls, float(x.cpu()),   
              float(y.cpu()),   
              float(w.cpu()),   
              float(h.cpu()),   
              float(conf.cpu())])

To do this, open the notebook in Colab, then choose “Save a Copy in Drive”.

  1. Ensure your Runtime is set to GPU.

  2. Run the first cell:

# clone Scaled_YOLOv4
!git clone https://github.com/roboflow-ai/ScaledYOLOv4.git  # clone repo
%cd /content/ScaledYOLOv4/
# checkout the yolov4-large branch
!git checkout yolov4-large

  3. Open ScaledYOLOv4/utils/general.py by navigating to it in your Colab directory and double-clicking on general.py.

  4. Edit line 512 to read:
    gain = torch.ones(7, device=targets.device).long() # normalized to gridspace gain

     and edit line 1101 to read:

targets.append([i, cls, float(x.cpu()),
              float(y.cpu()),
              float(w.cpu()),
              float(h.cpu()),
              float(conf.cpu())])

  5. Press CTRL+S to save your changes.

  6. Run the rest of the cells in the notebook, starting with this cell:

import torch
print('Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

  7. Generate your dataset version with Resize set to 416x416 (the default input size for the model, and the one offering the fastest inference speeds). Then export your dataset in YOLOv5 PyTorch TXT format, and copy/paste your code snippet into the cell highlighted in my screenshot, below (you can skip the cell just above it).

  8. Run the training cell, and it works! (At least it did for me, just now.)
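
If you would rather not edit line 512 by hand, the same change could be applied with a small script. This is only a sketch: patch_gain and patch_file are hypothetical helper names, and it assumes the declaration appears in general.py exactly as quoted earlier in this thread. It does not touch line 1101, which still needs the manual edit.

```python
from pathlib import Path

OLD = "gain = torch.ones(7, device=targets.device)"
NEW = "gain = torch.ones(7, device=targets.device).long()"


def patch_gain(source: str) -> str:
    """Append .long() to every gain declaration that does not already have it."""
    # Reverting NEW to OLD first keeps this idempotent: running it twice is harmless.
    return source.replace(NEW, OLD).replace(OLD, NEW)


def patch_file(path: str) -> bool:
    """Patch the file in place and return True if anything changed."""
    file = Path(path)
    original = file.read_text()
    patched = patch_gain(original)
    if patched != original:
        file.write_text(patched)
    return patched != original


# Demonstration on an in-memory line. In Colab you might instead call
# patch_file("/content/ScaledYOLOv4/utils/general.py"); that path is an
# assumption based on the clone step, so check it before running.
sample = "    gain = torch.ones(7, device=targets.device)  # normalized to gridspace gain\n"
print(patch_gain(sample))
```

Run it after cloning the repo and before the training cell, then open general.py to confirm the line reads as expected.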

Updated ScaledYOLOv4/utils/general.py (Google Drive download)

Once this PR is merged into the yolov4-large branch of the Scaled-YOLOv4 repo fork, you’ll be able to run the notebook without making any of the manual changes noted above: update line 512 and line 1101 of utils/general.py by mo-traor3-ai · Pull Request #2 · roboflow/ScaledYOLOv4 · GitHub

Hi @Mohamed, thanks for your detailed reply. I will try it out and let you know how it goes.