
Metadata Overview

Valossa Metadata is the structured output of a video analysis job. It contains all detected concepts, time-coded occurrences, detection groupings, shot segmentations, and other information extracted from your video.

Metadata Types

Valossa Metadata is an umbrella term covering several distinct metadata types:

| Type | Description | Format |
| --- | --- | --- |
| Core | The primary and most comprehensive metadata. Contains all detections, detection groupings, segmentations, and temporal data. | JSON |
| frames_faces | Per-frame face bounding box coordinates for all detected faces. | JSON |
| seconds_objects | Per-second object bounding box coordinates (currently for logos). | JSON |
| frames_objects | Per-frame object bounding box coordinates. | JSON |
| visual_captions | Visual scene description metadata. | JSON |
| speech_to_text_srt | Speech-to-text transcript in SRT subtitle format. | SRT text |

The Core metadata is always the starting point. It contains the detection IDs that link to the spatial data in the specialized metadata files (frames_faces, seconds_objects, etc.).

How to Download Metadata

Use the job_results API endpoint and select the metadata type with the type parameter. Omitting type returns the Core metadata:

# Core metadata (default)
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID"

# Face bounding boxes
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=frames_faces"

# Object bounding boxes (seconds-based)
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=seconds_objects"

# Speech-to-text SRT
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=speech_to_text_srt"

Metadata is also downloadable from the Valossa Portal results page.
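
The same requests can be issued from a script. The following is a minimal Python sketch, not an official client: the endpoint URL and the api_key, job_id, and type parameters come from the curl examples above, while the function names build_results_url and download_results are hypothetical, and error handling is left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl examples above.
BASE_URL = "https://api-eu.valossa.com/core/1.0/job_results"


def build_results_url(api_key, job_id, metadata_type=None):
    """Build a job_results URL; leaving metadata_type unset requests Core metadata."""
    params = {"api_key": api_key, "job_id": job_id}
    if metadata_type is not None:
        params["type"] = metadata_type
    return f"{BASE_URL}?{urlencode(params)}"


def download_results(api_key, job_id, metadata_type=None, timeout=60):
    """Fetch the raw response body (JSON or SRT text) as bytes."""
    url = build_results_url(api_key, job_id, metadata_type)
    with urlopen(url, timeout=timeout) as response:
        return response.read()
```

For example, download_results("YOUR_API_KEY", "JOB_ID", "frames_faces") would request the face bounding box metadata. The response is returned as bytes so that the same helper works for both the JSON types and speech_to_text_srt.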

Versioning

Each metadata type has its own version number in x.y.z format:

  • x (major) -- Breaking changes to the structure
  • y (minor) -- New fields or features that may require parser updates
  • z (patch) -- Purely additive changes that will not break existing parsers

The version number is found in the version_info.metadata_format field of each metadata JSON file. Core metadata versioning is independent of frames_faces or seconds_objects versioning.
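
A parser can use this field to decide whether a file needs attention before processing. The sketch below assumes the field holds a plain "x.y.z" string; the helper names and the sample version value are hypothetical. Following the scheme above, it treats a difference in the major or minor number as a reason to review the parser and ignores patch differences.

```python
def parse_format_version(metadata):
    """Return (major, minor, patch) from version_info.metadata_format."""
    raw = metadata["version_info"]["metadata_format"]
    major, minor, patch = (int(part) for part in raw.split("."))
    return major, minor, patch


def needs_parser_review(metadata, tested_version):
    """True if the major or minor number differs from the version the parser was tested with."""
    found = parse_format_version(metadata)
    return found[:2] != tuple(tested_version)[:2]


# Hypothetical sample; a real file carries many more fields.
sample = {"version_info": {"metadata_format": "1.3.2"}}
print(parse_format_version(sample))
print(needs_parser_review(sample, (1, 3, 0)))
```

Because each metadata type is versioned independently, a check like this would be run separately for Core, frames_faces, and the other JSON types.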

For a complete version history, see the Changelog.

Storage Recommendations

  • Save metadata locally. Download and store the metadata JSON in your database or file system after each analysis job completes. Metadata may not be stored permanently on Valossa servers, and download count limits may be imposed in the future.
  • File sizes vary from a few kilobytes to several megabytes depending on video length and the number of detections.
  • JSON is compact. The API returns JSON without line breaks or indentation. For human-readable viewing, use a pretty-printing tool:
    • Browser: JSONView plugin
    • Command line: python3 -m json.tool metadata.json > pretty.json
    • Command line: jq . metadata.json > pretty.json
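
The storage advice above can also be followed from code. This is a small sketch using only the Python standard library; the function name save_metadata and the one-file-per-job naming are illustrative assumptions, not a Valossa convention. It checks that the download parses as JSON, then writes either the compact bytes unchanged or an indented copy.

```python
import json
from pathlib import Path


def save_metadata(raw_bytes, directory, job_id, pretty=False):
    """Validate downloaded JSON metadata and store it as <directory>/<job_id>.json."""
    metadata = json.loads(raw_bytes)  # raises ValueError if the body is not valid JSON
    target_dir = Path(directory)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{job_id}.json"
    if pretty:
        # Indented copy for human reading; larger than the compact form.
        text = json.dumps(metadata, indent=2, ensure_ascii=False)
        target.write_text(text, encoding="utf-8")
    else:
        # Keep the compact bytes exactly as the API returned them.
        target.write_bytes(raw_bytes)
    return target
```

Storing the compact form keeps files small. The pretty option is only meant for files that people will read directly.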

What Questions Does Metadata Answer?

Valossa Core metadata addresses four fundamental questions:

  1. What does the video contain? -- Browse detections by type (visual objects, faces, audio events, speech, topics).
  2. When does X appear in the video? -- Read the occs (occurrences) of a specific detection.
  3. What is happening at time X? -- Read the by_second array at the desired time index.
  4. What is the video about overall? -- Read topic.iab, topic.general, and topic.genre detections.
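
The four questions map onto four small lookups. The sketch below runs against a tiny hand-made sample, not real output. Only occs, by_second, and the topic.* type names come from this page. The other field names (detections, detection_groupings, by_detection_type, t, label, ss, se, d) are assumptions about the Core layout and should be checked against the Core metadata reference before use.

```python
TOPIC_TYPES = ("topic.iab", "topic.general", "topic.genre")

# Hand-made sample; field names other than occs and by_second are assumed.
SAMPLE = {
    "detections": {
        "1": {"t": "visual.context", "label": "car",
              "occs": [{"ss": 0.0, "se": 2.0}]},
        "2": {"t": "topic.iab", "label": "Automotive"},
    },
    "detection_groupings": {
        "by_detection_type": {"visual.context": ["1"], "topic.iab": ["2"]},
        "by_second": [[{"d": "1"}], [{"d": "1"}], []],
    },
}


def labels_by_type(metadata, detection_type):
    """Question 1: what does the video contain, for one detection type?"""
    groups = metadata["detection_groupings"]["by_detection_type"]
    return [metadata["detections"][det_id]["label"]
            for det_id in groups.get(detection_type, [])]


def occurrences(metadata, detection_id):
    """Question 2: when does a detection appear? Returns (start, end) pairs in seconds."""
    occs = metadata["detections"][detection_id].get("occs", [])
    return [(occ["ss"], occ["se"]) for occ in occs]


def labels_at_second(metadata, second):
    """Question 3: what is happening at a given second index?"""
    entries = metadata["detection_groupings"]["by_second"][second]
    return [metadata["detections"][entry["d"]]["label"] for entry in entries]


def topics(metadata):
    """Question 4: what is the video about overall?"""
    return {topic_type: labels_by_type(metadata, topic_type)
            for topic_type in TOPIC_TYPES}


print(labels_by_type(SAMPLE, "visual.context"))
print(occurrences(SAMPLE, "1"))
print(labels_at_second(SAMPLE, 1))
print(topics(SAMPLE))
```

Each lookup resolves detection IDs through the detections map, so the same IDs could later be matched against the specialized metadata files.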

Next Steps