Content Moderation Guide

Available on Transcribe Pro Vision MAX

Content moderation (visual, audio, and speech compliance detection) is included in the Transcribe Pro Vision MAX free trial.

This guide walks you through using Valossa AI to automatically detect inappropriate or sensitive content in videos. Content moderation covers visual, speech-based, and audio-based detections for use in brand safety, platform compliance, and age-rating workflows.

Quick Exploration with Metadata Reader

Before writing custom code, you can quickly inspect compliance detections using the Metadata Reader CLI tool:

# List all content compliance category tags
python -m metareader list-categories core_metadata.json

# List all detections and filter by compliance type
python -m metareader list-detections --type "audio.keyword.compliance" core_metadata.json

Overview

Valossa AI detects sensitive content through three channels:

Channel | What It Detects | Key Detection Types
Visual | Nudity, violence, weapons, substance use, etc. | visual.context with content_compliance category tag
Speech | Profanity, references to violence/substances | audio.keyword.compliance
Audio sounds | Gunshots, explosions, violent sounds | audio.context with content_compliance category tag
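For orientation, a single detection entry in the results JSON looks roughly like the sketch below. The field names ("t", "label", "categ", "occs", "ss", "se") are the ones used by the code in this guide; the values are illustrative, not real analysis output:

```python
# Illustrative shape of one compliance detection (sample values, not API output).
# "t" is the detection type, "categ"/"tags" carry category tags, and "occs"
# lists occurrences with start ("ss") and end ("se") times in seconds.
detection = {
    "t": "visual.context",
    "label": "gun",
    "categ": {"tags": ["content_compliance", "gun_weapon"]},
    "occs": [{"ss": 12.4, "se": 15.0}],
}

# The content_compliance tag is what marks a detection as sensitive:
is_sensitive = "content_compliance" in detection["categ"]["tags"]
print(is_sensitive)  # → True
```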

Step 1: Analyze Your Video

Submit the video for analysis:

import requests

response = requests.post(
    "https://api-eu.valossa.com/core/1.0/new_job",
    json={
        "api_key": "YOUR_API_KEY",
        "media": {
            "video": {"url": "https://example.com/video.mp4"},
            "language": "en-US"
        }
    }
)
job_id = response.json()["job_id"]

Wait for completion, then download the results:

import time

while True:
    status = requests.get(
        "https://api-eu.valossa.com/core/1.0/job_status",
        params={"api_key": "YOUR_API_KEY", "job_id": job_id}
    ).json()
    if status["status"] == "finished":
        break
    time.sleep(10)

metadata = requests.get(
    "https://api-eu.valossa.com/core/1.0/job_results",
    params={"api_key": "YOUR_API_KEY", "job_id": job_id}
).json()
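The bare loop above polls forever and assumes the job always succeeds. A minimal sketch of a more defensive version is shown below; the status-fetching call is injected as a callable so the logic is testable without HTTP, and the "error" failure value is an assumption to verify against the API reference:

```python
import time

def wait_for_job(fetch_status, poll_s=10, timeout_s=3600):
    """Poll until the job finishes, fails, or the timeout expires.

    fetch_status is any callable returning the job-status dict, e.g. a
    lambda wrapping the requests.get call from the loop above.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()["status"]
        if status == "finished":
            return True
        if status == "error":  # assumed failure value; check the API docs
            raise RuntimeError("analysis job failed")
        time.sleep(poll_s)
    raise TimeoutError("job did not finish in time")
```

With the real API you would call `wait_for_job(lambda: requests.get(...).json())`.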

Step 2: Find All Content Compliance Detections

The content_compliance category tag is present on all sensitive detections. Filter for it across all detection types:

compliance_issues = []

for det_id, detection in metadata["detections"].items():
    if "categ" in detection:
        tags = detection["categ"]["tags"]
        if "content_compliance" in tags:
            compliance_issues.append({
                "id": det_id,
                "type": detection["t"],
                "label": detection["label"],
                "tags": tags,
                "occurrences": detection.get("occs", [])
            })

print(f"Found {len(compliance_issues)} content compliance detections")

Step 3: Filter by Specific Sensitive Categories

For more granular control, filter by specific sensitive tags:

# Define which categories to flag
flagged_categories = {
    "sexual",
    "violence",
    "act_of_violence",
    "substance_use",
    "gun_weapon",
    "explicit_content",
    "bad_language"
}

severe_issues = []
for issue in compliance_issues:
    matched_tags = set(issue["tags"]) & flagged_categories
    if matched_tags:
        severe_issues.append({
            **issue,
            "matched_categories": list(matched_tags)
        })

for issue in severe_issues:
    print(f"[{issue['type']}] {issue['label']}")
    print(f"  Categories: {', '.join(issue['matched_categories'])}")
    for occ in issue["occurrences"]:
        print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")
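To get a quick sense of which sensitive categories dominate the video, you can tally the matched categories across all flagged issues. This is a small local helper, not part of the Valossa API:

```python
from collections import Counter

def category_summary(issues):
    """Count how many flagged issues matched each sensitive category."""
    counts = Counter()
    for issue in issues:
        counts.update(issue["matched_categories"])
    return counts

# Example with two hypothetical issues (real input: severe_issues from above):
summary = category_summary([
    {"matched_categories": ["violence", "gun_weapon"]},
    {"matched_categories": ["violence"]},
])
print(summary.most_common())  # → [('violence', 2), ('gun_weapon', 1)]
```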

Step 4: Check Speech-Based Compliance

Speech compliance detections use a separate detection type:

speech_compliance_ids = metadata["detection_groupings"]["by_detection_type"].get(
    "audio.keyword.compliance", []
)

for det_id in speech_compliance_ids:
    detection = metadata["detections"][det_id]
    print(f"Speech compliance: '{detection['label']}'")
    for occ in detection.get("occs", []):
        print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")

Step 5: Check Audio Context for Sensitive Sounds

Some audio events (explosions, gunshots) may be flagged:

audio_ids = metadata["detection_groupings"]["by_detection_type"].get("audio.context", [])

for det_id in audio_ids:
    detection = metadata["detections"][det_id]
    if "categ" in detection and "content_compliance" in detection["categ"]["tags"]:
        print(f"Sensitive audio: '{detection['label']}'")
        for occ in detection.get("occs", []):
            print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")

Step 6: Generate a Compliance Report

Combine all findings into a structured report:

import json

report = {
    "job_id": metadata["job_info"]["job_id"],
    "duration_s": metadata["media_info"]["technical"]["duration_s"],
    "visual_flags": [],
    "speech_flags": [],
    "audio_flags": [],
    "is_safe": True
}

for issue in compliance_issues:
    entry = {
        "label": issue["label"],
        "categories": issue["tags"],
        "time_segments": [
            {"start": o["ss"], "end": o["se"]}
            for o in issue["occurrences"]
        ]
    }

    if issue["type"] == "visual.context":
        report["visual_flags"].append(entry)
    elif issue["type"].startswith("audio.keyword"):
        report["speech_flags"].append(entry)
    elif issue["type"] == "audio.context":
        report["audio_flags"].append(entry)

    # Any compliance detection at all means the video needs review
    report["is_safe"] = False

print(json.dumps(report, indent=2))
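A useful extra metric for such a report is how much of the runtime is actually flagged. Since occurrences from different detections can overlap, merge the time segments before summing; this is a generic interval-merge helper, not a Valossa API call:

```python
def flagged_seconds(segments):
    """Total seconds covered by {"start", "end"} segments, overlaps counted once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted((s["start"], s["end"]) for s in segments):
        if current_end is None or start > current_end:
            # New disjoint interval: close out the previous one
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping/adjacent interval: extend the current one
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total

# With the report above you would gather every flag["time_segments"] entry
# from visual_flags, speech_flags, and audio_flags; sample data here:
sample = [{"start": 0.0, "end": 5.0}, {"start": 3.0, "end": 8.0},
          {"start": 10.0, "end": 12.0}]
print(flagged_seconds(sample))  # → 10.0
```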

Using Shot Boundaries for Clipping

When you need to extract the full shot containing a flagged detection (for human review or automated removal):

shots = metadata["segmentations"]["detected_shots"]

for issue in compliance_issues:
    for occ in issue["occurrences"]:
        if "shs" not in occ:
            continue  # speech/audio occurrences may not carry shot indices
        shot_start = shots[occ["shs"]]
        shot_end = shots[occ.get("she", occ["shs"])]
        print(
            f"Flagged: '{issue['label']}' -> "
            f"Full shot clip: {shot_start['ss']:.2f}s - {shot_end['se']:.2f}s"
        )
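If the flagged shots should then be extracted for review, each shot span can be turned into an ffmpeg trim command. A sketch, with placeholder filenames; `-ss`/`-to` placed after `-i` give frame-accurate output seeking, and `-c copy` avoids re-encoding at the cost of cutting on keyframes:

```python
def clip_command(src, start_s, end_s, out):
    """Build an ffmpeg command that copies one shot span without re-encoding."""
    return f"ffmpeg -i {src} -ss {start_s:.2f} -to {end_s:.2f} -c copy {out}"

# One command per flagged shot; times come from shot_start["ss"]/shot_end["se"]
cmd = clip_command("video.mp4", 12.4, 15.0, "flag_001.mp4")
print(cmd)  # → ffmpeg -i video.mp4 -ss 12.40 -to 15.00 -c copy flag_001.mp4
```

For frame-accurate cuts you would drop `-c copy` and accept a re-encode.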