Metadata Overview
Valossa Metadata is the structured output of a video analysis job. It contains all detected concepts, time-coded occurrences, detection groupings, shot segmentations, and other information extracted from your video.
Metadata Types
Valossa Metadata is an umbrella term covering several distinct metadata types:
| Type | Description | Format |
|---|---|---|
| Core | The primary and most comprehensive metadata. Contains all detections, detection groupings, segmentations, and temporal data. | JSON |
| frames_faces | Per-frame face bounding box coordinates for all detected faces. | JSON |
| seconds_objects | Per-second object bounding box coordinates (currently for logos). | JSON |
| frames_objects | Per-frame object bounding box coordinates. | JSON |
| visual_captions | Visual scene description metadata. | JSON |
| speech_to_text_srt | Speech-to-text transcript in SRT subtitle format. | SRT text |
The Core metadata is always the starting point. It contains the detection IDs that link to the spatial data in the specialized metadata files (frames_faces, seconds_objects, etc.).
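As an illustration, the sketch below collects the face detection IDs from a locally saved Core metadata file; those same IDs identify the per-frame bounding boxes in the frames_faces file. The file name is a placeholder, and the `detections` map and `t` (detection type) key are assumptions based on the Core JSON structure, so verify them against the JSON Structure page.

```python
import json

# Placeholder file name for a locally saved Core metadata file.
with open("core_metadata.json") as f:
    core = json.load(f)

# Collect the IDs of all face detections from the Core metadata.
# "detections" and "t" are assumed key names -- see the JSON Structure page.
face_ids = [
    det_id
    for det_id, det in core.get("detections", {}).items()
    if det.get("t") == "human.face"
]

print(f"Core metadata lists {len(face_ids)} face detections")
# These same detection IDs identify the corresponding per-frame bounding
# boxes in the frames_faces metadata file.
```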
How to Download Metadata
Use the `job_results` API endpoint with the `type` parameter:
```bash
# Core metadata (default)
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID"

# Face bounding boxes
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=frames_faces"

# Object bounding boxes (seconds-based)
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=seconds_objects"

# Speech-to-text SRT
curl "https://api-eu.valossa.com/core/1.0/job_results?api_key=YOUR_API_KEY&job_id=JOB_ID&type=speech_to_text_srt"
```
Metadata is also downloadable from the Valossa Portal results page.
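The same endpoint can also be called from a script. The following Python sketch uses the `requests` library to download a chosen metadata type and save it to a local file; the endpoint URL and the `api_key`, `job_id`, and `type` parameters come from the examples above, while the function name and file names are illustrative.

```python
import requests

API_URL = "https://api-eu.valossa.com/core/1.0/job_results"
API_KEY = "YOUR_API_KEY"   # replace with your API key
JOB_ID = "JOB_ID"          # replace with the ID of your analysis job

def download_metadata(metadata_type=None, out_path="metadata.json"):
    """Fetch one metadata type for a finished job and save it locally."""
    params = {"api_key": API_KEY, "job_id": JOB_ID}
    if metadata_type is not None:          # omit "type" to get Core metadata
        params["type"] = metadata_type
    resp = requests.get(API_URL, params=params)
    resp.raise_for_status()
    with open(out_path, "wb") as f:        # store as-is (JSON or SRT text)
        f.write(resp.content)
    return out_path

download_metadata()                                       # Core metadata
download_metadata("frames_faces", "frames_faces.json")    # face bounding boxes
download_metadata("speech_to_text_srt", "speech.srt")     # SRT transcript
```

Saving the response straight to disk also covers the storage recommendation below.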
Versioning
Each metadata type has its own version number in x.y.z format:
- x (major) -- Breaking changes to the structure
- y (minor) -- New fields or features that may require parser updates
- z (patch) -- Purely additive changes that will not break existing parsers
The version number is found in the `version_info.metadata_format` field of each metadata JSON file. Core metadata versioning is independent of frames_faces or seconds_objects versioning.
For a complete version history, see the Changelog.
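A parser can read this field to guard against format changes it does not support. The sketch below assumes `version_info.metadata_format` is a dotted x.y.z string in the Core JSON and treats only a changed major number as breaking, per the scheme above.

```python
import json

SUPPORTED_MAJOR = 1   # example: the Core major version this parser targets

with open("core_metadata.json") as f:
    core = json.load(f)

# version_info.metadata_format holds the x.y.z version of this file's format.
version = core["version_info"]["metadata_format"]
major = int(version.split(".")[0])

if major != SUPPORTED_MAJOR:
    raise RuntimeError(
        f"Unsupported Core metadata format {version}; "
        f"this parser targets {SUPPORTED_MAJOR}.y.z"
    )
# Minor bumps may need parser updates; patch bumps are purely additive.
print(f"Core metadata format {version} accepted")
```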
Storage Recommendations
- Save metadata locally. Download and store the metadata JSON in your database or file system after each analysis job completes. Metadata may not be stored permanently on Valossa servers, and download count limits may be imposed in the future.
- File sizes vary from a few kilobytes to several megabytes depending on video length and the number of detections.
- JSON is compact. The API returns JSON without line breaks or indentation. For human-readable viewing, use a pretty-printing tool:
  - Browser: the JSONView plugin
  - Command line: `cat metadata.json | python -m json.tool > pretty.json`
  - Command line: `jq . metadata.json > pretty.json`
What Questions Does Metadata Answer?
Valossa Core metadata addresses four fundamental questions:
- What does the video contain? -- Browse detections by type (visual objects, faces, audio events, speech, topics).
- When does X appear in the video? -- Read the `occs` (occurrences) of a specific detection.
- What is happening at time X? -- Read the `by_second` array at the desired time index.
- What is the video about overall? -- Read the `topic.iab`, `topic.general`, and `topic.genre` detections.
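In practice these questions translate into a few lookups in the Core JSON. The sketch below assumes the structure referenced above: a `detections` map keyed by detection ID, `by_detection_type` and `by_second` groupings under `detection_groupings`, and `ss`/`se` start and end seconds inside each occurrence. Treat the exact keys as assumptions and confirm them against the JSON Structure page.

```python
import json

with open("core_metadata.json") as f:
    core = json.load(f)

detections = core["detections"]
groupings = core["detection_groupings"]

# What does the video contain? List detection counts per type.
for det_type, det_ids in groupings["by_detection_type"].items():
    print(det_type, "->", len(det_ids), "detections")

# When does X appear? Read the occurrences of one detection.
some_id = next(iter(detections))
for occ in detections[some_id].get("occs", []):
    print(f"detection {some_id}: {occ['ss']}s to {occ['se']}s")

# What is happening at time X? Index by_second at that second.
second = 42
if second < len(groupings["by_second"]):
    for entry in groupings["by_second"][second]:
        det = detections[entry["d"]]       # "d" = detection ID (assumed key)
        print(f"at {second}s:", det.get("label"))

# What is the video about overall? Read the topic detections.
for det_id in groupings["by_detection_type"].get("topic.iab", []):
    print("IAB topic:", detections[det_id].get("label"))
```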
Next Steps
- JSON Structure -- Detailed breakdown of the Core metadata format
- Detection Types -- All 36+ detection types explained
- Occurrences -- How temporal data is represented