Localized Objects

In addition to face bounding boxes, Valossa AI provides spatial coordinates for certain types of visual detections through the visual.object.localized detection type. This feature is currently available for logo detection and could also be supported by custom detection models.

How It Works

Localized objects appear in two places:

Core metadata: The visual.object.localized detection (with label, occurrences, etc.) appears in the detections structure and in by_second and by_detection_type groupings.
seconds_objects metadata (or frames_objects metadata): Contains the per-second (or per-frame) bounding box coordinates for these detections.

The split design keeps the Core metadata file manageable in size while providing precise spatial data in a separate download.
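As a rough illustration of the Core metadata side, the sketch below lists the localized detections found in a Core metadata dictionary. It assumes that the by_detection_type grouping sits under a top-level detection_groupings key and maps each detection type to a list of detection IDs; that key name, the helper name localized_detections, and the sample label are assumptions made for this example, so check them against your own Core metadata file.

```python
def localized_detections(core):
    """Return {detection_id: label} for visual.object.localized detections.

    Assumes core["detection_groupings"]["by_detection_type"] maps a detection
    type to a list of detection IDs (an assumption, not confirmed here).
    """
    groupings = core.get("detection_groupings", {})
    ids = groupings.get("by_detection_type", {}).get("visual.object.localized", [])
    detections = core.get("detections", {})
    return {
        det_id: detections.get(det_id, {}).get("label", "Unknown")
        for det_id in ids
    }


# Hand-written sample data for illustration only (not real API output).
sample_core = {
    "detections": {
        "137": {"label": "Example Logo"},
        "5": {"label": "outdoor"},
    },
    "detection_groupings": {
        "by_detection_type": {"visual.object.localized": ["137"]},
    },
}

print(localized_detections(sample_core))
```

The detection IDs returned here are the same IDs that the d field of seconds_objects refers to.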

Downloading Localized Object Coordinates

Seconds-Based (seconds_objects)

curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=seconds_objects"

Frames-Based (frames_objects)

curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=frames_objects"
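The same requests can be made from Python. The following is a minimal sketch using only the standard library; the helper names build_results_url and download_objects are hypothetical, and the code assumes the endpoint returns the metadata as a JSON body, as the curl commands above suggest.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api-eu.valossa.com/core/1.0/job_results"


def build_results_url(api_key, job_id, metadata_type="seconds_objects"):
    """Build the job_results URL for seconds_objects or frames_objects."""
    query = urllib.parse.urlencode(
        {"api_key": api_key, "job_id": job_id, "type": metadata_type}
    )
    return f"{BASE_URL}?{query}"


def download_objects(api_key, job_id, metadata_type="seconds_objects", timeout=60):
    """Download and parse the metadata (assumes a JSON response body)."""
    url = build_results_url(api_key, job_id, metadata_type)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Only builds the URL here; call download_objects() with real credentials.
print(build_results_url("YOUR_API_KEY", "JOB_ID", "frames_objects"))
```

Building the query string with urlencode avoids manual escaping if a key or job ID contains special characters.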

seconds_objects Structure

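The objects_by_second array is indexed by second number (0-based). Each element is an array of detection items for that second, and a second with no localized detections is an empty array. In the example below, seconds 0 and 1 are empty and second 2 contains one detection item.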

{
  "version_info": { "metadata_type": "seconds_objects", ... },
  "objects_by_second": [
    [],
    [],
    [
      {
        "d": "137",
        "o": ["314"],
        "b": [
          {
            "x": 0.267,
            "y": 0.747,
            "w": 0.081,
            "h": 0.066,
            "c": 0.995
          },
          {
            "x": 0.498,
            "y": 0.669,
            "w": 0.081,
            "h": 0.069,
            "c": 0.984
          }
        ]
      }
    ]
  ]
}

Detection Item Fields

Field	Description
`d`	Detection ID (references the `visual.object.localized` detection in Core metadata)
`o`	Array of occurrence IDs overlapping with this second
`b`	Array of bounding boxes for this detection in this second
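One use of the o field is to find out which seconds each occurrence covers. The sketch below groups second indices by occurrence ID; the helper name seconds_by_occurrence and the sample data are made up for illustration.

```python
from collections import defaultdict


def seconds_by_occurrence(objects_by_second):
    """Map each occurrence ID to the sorted list of seconds it appears in."""
    seconds = defaultdict(list)
    for second_idx, items in enumerate(objects_by_second):
        for item in items:
            for occ_id in item.get("o", []):
                seconds[occ_id].append(second_idx)
    return dict(seconds)


# Hand-written sample: occurrence "314" spans seconds 2 and 3.
sample = [
    [],
    [],
    [{"d": "137", "o": ["314"], "b": []}],
    [{"d": "137", "o": ["314"], "b": []}],
]

print(seconds_by_occurrence(sample))
```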

Bounding Box Fields

Field	Description
`x`	X offset of the upper-left corner (fraction of frame width)
`y`	Y offset of the upper-left corner (fraction of frame height)
`w`	Width (fraction of frame width)
`h`	Height (fraction of frame height)
`c`	Confidence of the detection at this bounding box location

All coordinate values are relative to the frame size (0.0 to 1.0). Values may be slightly outside this range when an object is partially off-screen.
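To draw a box on a frame, the relative values need to be scaled to pixels. The sketch below converts one bounding box to integer pixel corners and clamps them to the frame, which handles values slightly outside the 0.0 to 1.0 range. The helper name to_pixel_box is hypothetical, and the frame dimensions are something you supply from your own video.

```python
def to_pixel_box(bbox, frame_width, frame_height):
    """Convert a relative bounding box to (left, top, right, bottom) pixels.

    Corners are clamped to the frame so partially off-screen boxes
    stay within the drawable area.
    """
    left = bbox["x"] * frame_width
    top = bbox["y"] * frame_height
    right = (bbox["x"] + bbox["w"]) * frame_width
    bottom = (bbox["y"] + bbox["h"]) * frame_height

    def clamp(value, upper):
        return min(max(value, 0), upper)

    return (
        round(clamp(left, frame_width)),
        round(clamp(top, frame_height)),
        round(clamp(right, frame_width)),
        round(clamp(bottom, frame_height)),
    )


# First bounding box from the example above, on a 1920x1080 frame.
box = {"x": 0.267, "y": 0.747, "w": 0.081, "h": 0.066, "c": 0.995}
print(to_pixel_box(box, 1920, 1080))
```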

Multiple Bounding Boxes

A single detection can have multiple bounding boxes in the same second. For example, if two instances of the same logo appear simultaneously in the frame, the b array will contain two bounding box objects.

In the Core metadata's by_second, the c confidence for a visual.object.localized detection is the highest confidence among all simultaneously observed bounding boxes. To see individual bounding box confidences, read the seconds_objects metadata.

Confidence in Core vs. seconds_objects

Source	Confidence Meaning
Core metadata `by_second`	Maximum confidence across all bounding boxes in that second
`seconds_objects` bounding box `c`	Confidence for that specific bounding box instance
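The relationship between the two confidence values can be reproduced from seconds_objects alone. The sketch below computes, for each second and detection, the maximum c across its bounding boxes, which should correspond to the value reported in by_second according to the description above. The helper name max_confidence_by_second is hypothetical.

```python
def max_confidence_by_second(objects_by_second):
    """Return {(second_index, detection_id): highest bounding box confidence}."""
    result = {}
    for second_idx, items in enumerate(objects_by_second):
        for item in items:
            confidences = [bbox["c"] for bbox in item.get("b", [])]
            if confidences:  # skip items without bounding boxes
                result[(second_idx, item["d"])] = max(confidences)
    return result


# Data from the seconds_objects example above (coordinates omitted).
sample_objects = [
    [],
    [],
    [{"d": "137", "o": ["314"], "b": [{"c": 0.995}, {"c": 0.984}]}],
]

print(max_confidence_by_second(sample_objects))
```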

Code Example

Python: Extract Logo Positions

import json

# Core metadata holds the detection labels; seconds_objects holds the boxes.
with open("core_metadata.json", "r") as f:
    core = json.load(f)

with open("seconds_objects.json", "r") as f:
    objects = json.load(f)

# The list index is the 0-based second number.
for second_idx, second_data in enumerate(objects["objects_by_second"]):
    for item in second_data:
        # "d" is the detection ID used as the key in Core metadata.
        det_id = item["d"]
        detection = core["detections"].get(det_id, {})
        label = detection.get("label", "Unknown")

        # One entry per simultaneously visible instance of the detection.
        for bbox in item["b"]:
            print(
                f"Second {second_idx}: {label} "
                f"at ({bbox['x']:.3f}, {bbox['y']:.3f}), "
                f"size {bbox['w']:.3f}x{bbox['h']:.3f}, "
                f"confidence {bbox['c']:.3f}"
            )

Related Resources

Faces & Identity -- Face bounding boxes (frames_faces format)
Detection Types -- Full list of detection types
