I am confused about the representation of data in the "Data in" view when testing my blocks

While developing our first Custom Python Block, it took us some time to figure out how to reference attributes in the “face predictions” output from the “Gaze Detection” model, because the output shown in the Data in / Data out panels of the block did not match the structure we saw in code.

Specifically, we eventually found out, through some error messages and Google searches, that the “face predictions” data is an sv.Detections object (Core - Supervision). However, in the Data in section (represented as the parameter “predictions”), it appears to be an array of dictionaries:

So, for example, based on the info above, getting the x & y of the bounding box of a prediction would appear to be done like this:

    # Treats "predictions" as a list of dictionaries, as the Data in panel
    # suggests; this fails because predictions is actually sv.Detections
    for p in predictions:
        x1 = p["x"]
        y1 = p["y"]
        x2 = x1 + p["width"]
        y2 = y1 + p["height"]
        center_x = (x1 + x2) // 2
        center_y = (y1 + y2) // 2

This fails, though, because sv.Detections stores the bounding boxes in the xyxy array, and height & width in the data dictionary (as the arrays “height” and “width”), meaning we have to do this instead:

    # Correct approach: index into the column arrays on sv.Detections;
    # each row of xyxy is [x1, y1, x2, y2]
    for box in predictions.xyxy:
        x1, y1, x2, y2 = map(int, box)
        center_x = (x1 + x2) // 2
        center_y = (y1 + y2) // 2
        width = x2 - x1
        height = y2 - y1

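For anyone landing here later, here is a minimal sketch of the column-oriented layout of sv.Detections, assuming standard Supervision attributes. The "landmarks" key in data is made up for illustration; the actual keys in data depend on the upstream model:

    import numpy as np
    import supervision as sv

    # sv.Detections is column-oriented: one array per attribute,
    # one row per detection
    detections = sv.Detections(
        xyxy=np.array([[10.0, 20.0, 110.0, 220.0]]),  # boxes as [x1, y1, x2, y2]
        confidence=np.array([0.92]),
        class_id=np.array([0]),
        data={"landmarks": np.array([[[30, 40], [90, 40]]])},  # hypothetical key
    )

    # Per-detection access indexes into each column
    for i in range(len(detections)):
        x1, y1, x2, y2 = detections.xyxy[i]
        conf = detections.confidence[i]
        marks = detections.data["landmarks"][i]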
Is there somewhere else we should be looking for the real structure of the data produced by upstream blocks? I see from this gaze detection example that if you invoke the model from your own Python, it returns the results as JSON. Was it incorrect to configure our block with a parameter called “predictions” of type “any” linked to the “face predictions” output of the gaze detection model? (I cannot include a second screenshot due to account restrictions.)

EDIT: please ignore the bits about width & height above; those are from our own model and don’t exist in the gaze detection example I’m using.

Hi @sbroberg!
Fantastic question! I am going to loop in some team members to get the best possible answer.

Thank you for contributing to the Roboflow community!

Yes! I was just fighting this same issue over the past few days. I thought I could get to the pieces of data I wanted with some dictionary and list calls, but it never worked.

Finally I figured out that if I put a Property Definition block after the model and set the Operation inside that block to “Convert to Dictionary”, I was then able to access the data as I expected (based on my review of what the Property Definition block output).
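For anyone following along, the dictionary output is row-oriented, roughly like Roboflow’s prediction JSON. This is only an illustrative sketch of the shape, not the exact schema, and the keys will vary by model:

    # Approximate shape of the "Convert to Dictionary" output; illustrative
    # only. In Roboflow's JSON convention, x/y are the box center.
    [
        {
            "x": 320.0,
            "y": 240.0,
            "width": 100.0,
            "height": 120.0,
            "confidence": 0.92,
            "class": "face",
        },
        # ...one dictionary per detection
    ]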

Hopefully the screenshots below help clarify my process for you and maybe this solution will work in your case as well.

Here’s the final version of my simple workflow (but hold off on adding in your Custom Python Block):

This is how I set up the Property Definition block:

Then, BEFORE adding the Custom Python Block, I ran the workflow to see what the Property Definition gave me as output:

And then I added the Custom Python Block with this code, and it pulled out the pieces of data! (Again, remember to add this block as an output to see its results):
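In case the screenshot doesn’t come through, the block was along these lines (a sketch, not my exact code; the parameter name and the "centers" output key are just the choices I made, and the dictionary keys assume the shape shown above):

    # Sketch of a Custom Python Block body; assumes "predictions" is wired
    # to the Property Definition's dictionary output
    def run(self, predictions) -> BlockResult:
        centers = []
        for p in predictions:
            # x/y are box centers in the dictionary form
            centers.append((p["x"], p["y"]))
        return {"centers": centers}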

with output…

Really hope this helps!

@Automatez, thanks, that certainly does help. It’s a bit mysterious to me exactly how and why it does, though. How does the “Convert to Dictionary” step know how to perform the translation? From the definition of sv.Detections, we see that there are a number of predefined members (xyxy, mask, class_id, etc.), which are arrays of specific types (a 4-element array per detection for xyxy, scalars for things like class_id, masks in mask, etc.), and then there’s “data”, which presumably is where attributes unique to a particular model land: a dictionary of arrays of whatever the model produced.

Obviously part of the “convert” logic is inverting the nesting; after that, I’m unclear how things like xyxy get turned into dictionary keys. Presumably the keys in “data” map directly to keys in the resulting dictionaries, but are the rest remapped based on rules hardcoded in Roboflow?
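I don’t know what Roboflow actually does under the hood, but mechanically the inversion I’d expect looks something like this hypothetical sketch: the predefined members are remapped by fixed rules, while data keys pass straight through:

    import numpy as np
    import supervision as sv

    def detections_to_dicts(d: sv.Detections) -> list:
        """Hypothetical column-to-row inversion; NOT Roboflow's actual
        "Convert to Dictionary" implementation."""
        rows = []
        for i in range(len(d)):
            # Predefined members become keys by fixed rule
            row = {"xyxy": d.xyxy[i].tolist()}
            if d.confidence is not None:
                row["confidence"] = float(d.confidence[i])
            if d.class_id is not None:
                row["class_id"] = int(d.class_id[i])
            # Keys in "data" map through directly, one element per detection
            for key, values in d.data.items():
                row[key] = values[i]
            rows.append(row)
        return rows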

Also: in the image above, you can see that under face prediction, it says:

Data Type: keypoint detection prediction

What are these words referring to? If they are types, where are they documented?

From a learner’s perspective, it would be good to see a comment or other label on this panel indicating that you need to use the “Convert to Dictionary” operation in order to access the data as it is presented there.