Faces and Identity
Face detection in Valossa AI produces human.face detections in the Core metadata. This page covers face identity matching, face grouping, bounding box coordinates, and the specialized frames_faces metadata.
Face Detection Basics
Each detected face in the video is represented as a human.face detection with:
| Field | Description |
|---|---|
| label | Always "face" |
| occs | Time segments when the face appears |
| a.gender | Detected gender (value: "male" or "female", c: confidence) |
| a.s_visible | Total screen time in seconds (actual frame-by-frame visibility; usually less than the combined duration of the occurrences) |
| a.quality | "normal" or "low" ("low" means the face is not frontal enough, or is otherwise less reliable) |
| a.under_18_years | Minor-detection data: c_max (peak confidence), c_median (median confidence) and intervals (time segments where the detection was triggered). Present only when the face is estimated to be under 18 years old; also flagged via categ.tags with "under_18_years". |
| a.similar_to | Array of gallery face matches, if any |
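The fields above can be read with a short filter over the detections object. A minimal sketch, assuming Core metadata has been loaded into a dict named core (the inline sample data is illustrative, not real output):

```python
# Illustrative sample of Core metadata; in practice, load it with json.load().
core = {
    "detections": {
        "1": {
            "t": "human.face",
            "label": "face",
            "a": {"gender": {"value": "female", "c": 0.93},
                  "s_visible": 4.4,
                  "quality": "normal"},
        },
        "2": {"t": "visual.context", "label": "nature"},
    }
}

# Keep only face detections (type "human.face").
faces = {det_id: det for det_id, det in core["detections"].items()
         if det["t"] == "human.face"}

for det_id, det in faces.items():
    a = det.get("a", {})
    gender = a.get("gender", {}).get("value", "unknown")
    print(f"Face {det_id}: gender={gender}, "
          f"visible {a.get('s_visible', 0)}s, quality={a.get('quality', 'normal')}")
```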
Gallery Matching (similar_to)
When a detected face matches one or more faces in a face gallery (Valossa Celebrities Gallery or your custom gallery), the similar_to array contains match details:
{
"t": "human.face",
"label": "face",
"a": {
"gender": { "c": 0.929, "value": "female" },
"s_visible": 4.4,
"similar_to": [
{
"c": 0.928,
"name": "Jane Doe",
"gallery": { "id": "a3ead7b4-8e84-43ac-9e6b-d1727b05f189" },
"gallery_face": {
"id": "f6a728c6-5991-47da-9c17-b5302bfd0aff",
"name": "Jane Doe"
}
}
]
}
}
| Field | Description |
|---|---|
| c | Confidence that this face is the named person (0.0 to 1.0) |
| name | Person name |
| gallery.id | UUID of the face gallery |
| gallery_face.id | UUID of the face identity within the gallery |
| gallery_face.name | Name of the person in the gallery |
Matches are sorted by confidence (highest first). A face may match multiple gallery faces with varying confidence levels.
Note that face occurrences carry no c_max confidence of their own; for faces, confidence values appear only in the similar_to items.
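Because matches are sorted highest-confidence first, the best gallery match is simply the first similar_to item. A minimal sketch (the helper name best_match and the confidence threshold are our own, not part of the API):

```python
def best_match(face_det, min_confidence=0.5):
    """Return (name, confidence) of the most confident gallery match, or None.

    similar_to is sorted by confidence, so the first item is the best match.
    """
    matches = face_det.get("a", {}).get("similar_to", [])
    if matches and matches[0]["c"] >= min_confidence:
        return matches[0]["name"], matches[0]["c"]
    return None

# Sample detection shaped like the JSON above.
sample = {
    "t": "human.face",
    "a": {
        "similar_to": [
            {"c": 0.928, "name": "Jane Doe",
             "gallery_face": {"id": "f6a728c6-5991-47da-9c17-b5302bfd0aff",
                              "name": "Jane Doe"}}
        ]
    },
}
print(best_match(sample))  # -> ('Jane Doe', 0.928)
```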
Minor Detection (under_18_years)
When a face is estimated to be under 18 years old, the under_18_years attribute provides confidence scores and the time intervals where the detection was triggered:
{
"t": "human.face",
"label": "face",
"a": {
"gender": { "value": "male", "c": 0.5 },
"s_visible": 8.267,
"quality": "low",
"under_18_years": {
"c_max": 0.516,
"c_median": 0.475,
"intervals": [
{ "ss": 1834.333, "se": 1840.333, "c_max": 0.516 }
]
}
},
"categ": {
"tags": ["content_compliance", "under_18_years"]
}
}
| Field | Description |
|---|---|
| c_max | Peak confidence that the face is under 18, across all intervals |
| c_median | Median confidence across all intervals |
| intervals | Array of time segments with ss (start), se (end) and c_max (peak confidence within that interval) |
The detection is also flagged in categ.tags with both "content_compliance" and "under_18_years" tags.
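For compliance screening, you typically want only the intervals whose peak confidence clears a threshold of your choosing. A minimal sketch (the helper name minor_intervals and the default threshold are our own):

```python
def minor_intervals(det, threshold=0.5):
    """Return under-18 intervals whose peak confidence meets the threshold."""
    u18 = det.get("a", {}).get("under_18_years")
    if not u18:
        return []
    return [iv for iv in u18["intervals"] if iv["c_max"] >= threshold]

# Sample detection shaped like the JSON above.
sample = {
    "t": "human.face",
    "a": {
        "under_18_years": {
            "c_max": 0.516,
            "c_median": 0.475,
            "intervals": [{"ss": 1834.333, "se": 1840.333, "c_max": 0.516}],
        }
    },
}
print(minor_intervals(sample))
```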
Multiple Detections of the Same Person
The AI may create multiple human.face detections for the same person if the face appears sufficiently different across the video (different angles, lighting, etc.). Each of these detections can independently match the same gallery face via similar_to.
Merged Occurrences (similar_to_face_id)
To make it easy to find all appearances of a recognized person, the by_detection_property grouping merges occurrences across all face detections that match the same gallery face.
{
"detection_groupings": {
"by_detection_property": {
"human.face": {
"similar_to_face_id": {
"cb6f580b-fa3f-4ed4-94b6-ec88c6267143": {
"moccs": [
{ "ss": 5.0, "se": 10.0 },
{ "ss": 21.0, "se": 35.0 },
{ "ss": 64.0, "se": 88.0 }
],
"det_ids": ["3", "4"]
}
}
}
}
}
}
| Field | Description |
|---|---|
| Key (UUID) | Gallery face ID |
| moccs | Merged occurrences from all matching face detections |
| det_ids | Array of detection IDs that matched this gallery face |
Use det_ids to look up the original detections and read the person's name from similar_to.
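Putting those two steps together, you can map each recognized person to their total merged screen time. A minimal sketch, assuming Core metadata loaded as a dict named core (the sample data mirrors the JSON above and is illustrative):

```python
def person_appearances(core):
    """Resolve similar_to_face_id groups to person names and total merged time."""
    groups = (core["detection_groupings"]["by_detection_property"]
              ["human.face"]["similar_to_face_id"])
    out = {}
    for face_id, group in groups.items():
        # Any detection listed in det_ids carries the person's name in similar_to.
        det = core["detections"][group["det_ids"][0]]
        name = det["a"]["similar_to"][0]["name"]
        out[name] = sum(m["se"] - m["ss"] for m in group["moccs"])
    return out

core = {
    "detections": {
        "3": {"t": "human.face",
              "a": {"similar_to": [{"c": 0.9, "name": "Jane Doe"}]}},
        "4": {"t": "human.face",
              "a": {"similar_to": [{"c": 0.8, "name": "Jane Doe"}]}},
    },
    "detection_groupings": {
        "by_detection_property": {
            "human.face": {
                "similar_to_face_id": {
                    "cb6f580b-fa3f-4ed4-94b6-ec88c6267143": {
                        "moccs": [{"ss": 5.0, "se": 10.0},
                                  {"ss": 21.0, "se": 35.0},
                                  {"ss": 64.0, "se": 88.0}],
                        "det_ids": ["3", "4"],
                    }
                }
            }
        }
    },
}
print(person_appearances(core))  # {'Jane Doe': 43.0}
```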
Minor Grouping (refined_from_multiple_detection_types)
The by_detection_property grouping also includes a refined_from_multiple_detection_types section that aggregates minor detections across detection types:
{
"by_detection_property": {
"human.face": {
"similar_to_face_id": { ... }
},
"refined_from_multiple_detection_types": {
"people_under_18_years": {
"human.face": [
{
"det_id": "6",
"intervals": [
{ "ss": 3151.667, "se": 3152.667, "c_max": 0.961 },
{ "ss": 3155.667, "se": 3211.133, "c_max": 0.979 }
]
}
],
"visual.context": []
}
}
}
}
| Field | Description |
|---|---|
| people_under_18_years | Groups detections in which a person under 18 years old was identified |
| human.face / visual.context | Per-type arrays of detections that contributed to the minor detection |
| det_id | Detection ID |
| intervals | Time segments with ss (start), se (end) and c_max (peak confidence) |
This grouping allows you to find all under-18 detections in a single lookup, regardless of which detection type produced them.
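That single lookup can be flattened into a simple list of rows for reporting. A minimal sketch (the helper name all_minor_intervals is our own; the sample mirrors the JSON above):

```python
def all_minor_intervals(by_detection_property):
    """Flatten people_under_18_years into (type, det_id, ss, se, c_max) rows."""
    refined = by_detection_property["refined_from_multiple_detection_types"]
    rows = []
    for det_type, entries in refined["people_under_18_years"].items():
        for entry in entries:
            for iv in entry["intervals"]:
                rows.append((det_type, entry["det_id"],
                             iv["ss"], iv["se"], iv["c_max"]))
    return rows

sample = {
    "refined_from_multiple_detection_types": {
        "people_under_18_years": {
            "human.face": [
                {"det_id": "6",
                 "intervals": [
                     {"ss": 3151.667, "se": 3152.667, "c_max": 0.961},
                     {"ss": 3155.667, "se": 3211.133, "c_max": 0.979},
                 ]}
            ],
            "visual.context": [],
        }
    }
}
print(all_minor_intervals(sample))
```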
Face Groups
human.face_group detections group faces that have high temporal correlation, meaning they likely appear together and may be interacting.
Per-Second Face Data
In the by_second structure, face detections include additional per-second attributes:
Face Size
{
"d": "1",
"o": ["1"],
"a": {
"sz": { "h": 0.188 }
}
}
The h field is the face height as a fraction of the video frame height (1.0 = full frame height). The value is measured at the first frame within that second where the face is detected.
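To turn the fraction into pixels, multiply by the frame height. A minimal worked example (the 1080 px frame height is an assumption for illustration):

```python
frame_height_px = 1080               # assumed frame height of the source video
face_sz_h = 0.188                    # sz.h from the by_second face entry above
face_height_px = round(face_sz_h * frame_height_px)
print(face_height_px)  # 203
```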
Face Emotions
When face emotion analysis is enabled, the by_second data includes sentiment and emotion data. See Sentiment & Emotion for details.
Face Bounding Boxes (frames_faces Metadata)
For per-frame face coordinates, download the frames_faces metadata:
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=frames_faces"
Structure
The frames_faces metadata contains a faces_by_frame array indexed by frame number (0-based):
{
"version_info": { "metadata_type": "frames_faces", ... },
"faces_by_frame": [
[],
[],
[
{
"id": "1",
"x": 0.445,
"y": 0.194,
"w": 0.120,
"h": 0.214
}
],
[
{
"id": "1",
"x": 0.435,
"y": 0.196,
"w": 0.120,
"h": 0.215
},
{
"id": "5",
"x": 0.338,
"y": 0.239,
"w": 0.205,
"h": 0.399
}
]
]
}
Bounding Box Fields
| Field | Description |
|---|---|
| id | Detection ID matching the human.face detection in the Core metadata |
| x | X offset of the upper-left corner (fraction of frame width, 0.0 to 1.0) |
| y | Y offset of the upper-left corner (fraction of frame height, 0.0 to 1.0) |
| w | Width of the bounding box (fraction of frame width) |
| h | Height of the bounding box (fraction of frame height) |
All coordinate values are relative to the frame dimensions. Values may be slightly less than 0.0 or greater than 1.0 when a face is partially outside the frame.
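When drawing overlays, convert the normalized box to pixel coordinates and clamp to the frame, since values can spill slightly outside it. A minimal sketch (the helper name to_pixels and the 1920x1080 frame size are our own assumptions):

```python
def to_pixels(box, frame_w, frame_h):
    """Convert a normalized frames_faces box to clamped pixel corner coordinates."""
    x1 = max(0.0, box["x"]) * frame_w
    y1 = max(0.0, box["y"]) * frame_h
    x2 = min(1.0, box["x"] + box["w"]) * frame_w
    y2 = min(1.0, box["y"] + box["h"]) * frame_h
    return (round(x1), round(y1), round(x2), round(y2))

# Box taken from the frames_faces sample above, on an assumed 1920x1080 frame.
print(to_pixels({"x": 0.445, "y": 0.194, "w": 0.120, "h": 0.214}, 1920, 1080))
```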
Code Example: Reading Face Bounding Boxes
import json

with open("frames_faces.json", "r") as f:
    faces_metadata = json.load(f)
with open("core_metadata.json", "r") as f:
    core = json.load(f)

fps = core["media_info"]["technical"]["fps"]

for frame_idx, faces in enumerate(faces_metadata["faces_by_frame"]):
    if not faces:
        continue
    time_s = frame_idx / fps
    for face in faces:
        det = core["detections"].get(face["id"], {})
        name = "Unknown"
        # Guard against faces with no gallery match (similar_to absent or empty).
        if det.get("a", {}).get("similar_to"):
            name = det["a"]["similar_to"][0]["name"]
        print(f"Frame {frame_idx} ({time_s:.2f}s): {name} "
              f"at ({face['x']:.3f}, {face['y']:.3f}), "
              f"size {face['w']:.3f}x{face['h']:.3f}")
Related Resources
- Face Training API -- Create custom face galleries
- Face Recognition Guide -- End-to-end face recognition workflow
- Sentiment & Emotion -- Face emotion and valence data