ARKit + Roboflow integration

Not sure who this will be useful for, but I’ve been working on integrating a Roboflow model into ARKit to detect objects in the AR frames (RT-DETR doesn’t seem to convert to Core ML because of its architecture, as far as I can tell).

I have run into issues with a handful of things, but the biggest one was the orientation of the CVPixelBuffer from an ARFrame. It defaults to (or possibly always is?) right-oriented, so that orientation needs to be passed to the VNImageRequestHandler:

(let handler = VNImageRequestHandler(cvPixelBuffer: buffer, orientation: .right))

In turn, the coordinates of the detection need to be transposed back:

(let flippedBox = CGRect(x: 1 - detectResult.boundingBox.maxY, y: 1 - detectResult.boundingBox.maxX, width: detectResult.boundingBox.height, height: detectResult.boundingBox.width))
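As a small self-contained sketch, the transposition above can be wrapped in a helper (the function name is mine, not part of the SDK). It takes the normalized bounding box Vision returns for a .right-oriented request and maps it back into portrait coordinates:

```swift
import CoreGraphics

/// Transpose a normalized Vision bounding box (produced from a
/// .right-oriented request) back into portrait coordinates.
/// Swaps width/height and reflects the origin, matching the
/// CGRect construction shown above.
func transposeToPortrait(_ box: CGRect) -> CGRect {
    CGRect(x: 1 - box.maxY,
           y: 1 - box.maxX,
           width: box.height,
           height: box.width)
}
```

Since Vision boxes are normalized to [0, 1], the same helper works regardless of the buffer's pixel dimensions.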

I’ve just made those changes locally to RFObjectDetectionModel and I get much higher confidence scores, so I’m guessing other people might find this valuable.


Hi @Nicholas_Clark!
This is absolutely fantastic, thank you for sharing!!

Would love to hear more about your project.

@Nicholas_Clark I maintain the Roboflow iOS SDK and would love to make these changes natively for similar use cases! Do you mind sharing your complete RFObjectDetectionModel or opening a PR on GitHub - roboflow/roboflow-swift?


Will send you full details in a couple of days once I work through how to make everything work best. I am doing a few things:

  1. Cropping the pixel buffer before passing to the model (this is more for my use case, but if you tile the images in training, you probably want to crop)
  2. Setting the orientation correctly to .right
  3. Changing the imageCropAndScaleOption so it doesn’t stretch the input (scaleFill stretches the image, so it’s not great for small objects)
  4. Transposing the coordinates back to portrait orientation (which only makes sense if you are doing things in portrait)
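The four steps above can be sketched roughly as follows. This is a minimal outline, not the actual RFObjectDetectionModel changes: the function and variable names are mine, the crop step is shown as a pass-through placeholder, and I’ve used .scaleFit for step 3 since Vision has no literal “do nothing” option (its VNImageCropAndScaleOption cases are centerCrop, scaleFit, and scaleFill, plus rotated variants):

```swift
import ARKit
import Vision

/// Illustrative sketch of the four steps, assuming `model` wraps a
/// converted Core ML detection model.
func detectObjects(in frame: ARFrame, using model: VNCoreMLModel) {
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
        for result in results {
            // Step 4: transpose the normalized box back to portrait
            // (only meaningful if the app runs in portrait).
            let box = result.boundingBox
            let portraitBox = CGRect(x: 1 - box.maxY,
                                     y: 1 - box.maxX,
                                     width: box.height,
                                     height: box.width)
            // ... use portraitBox for overlays / hit tests ...
            _ = portraitBox
        }
    }

    // Step 3: avoid scaleFill's stretching; .scaleFit letterboxes the
    // image instead of distorting it.
    request.imageCropAndScaleOption = .scaleFit

    // Step 1 (optional, use-case dependent): crop the pixel buffer here
    // before handing it to Vision, e.g. if the model was trained on
    // tiled images. Shown as a pass-through in this sketch.
    let buffer = frame.capturedImage

    // Step 2: tell Vision the ARFrame buffer is right-oriented.
    let handler = VNImageRequestHandler(cvPixelBuffer: buffer, orientation: .right)
    try? handler.perform([request])
}
```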