Content Moderation Guide
This guide walks you through using Valossa AI to automatically detect inappropriate or sensitive content in videos. Content moderation covers visual, speech-based, and audio-based detections for use in brand safety, platform compliance, and age-rating workflows.
Before writing custom code, you can quickly inspect compliance detections using the Metadata Reader CLI tool:
```shell
# List all content compliance category tags
python -m metareader list-categories core_metadata.json

# List all detections and filter by compliance type
python -m metareader list-detections --type "audio.keyword.compliance" core_metadata.json
```
Overview
Valossa AI detects sensitive content through three channels:
| Channel | What It Detects | Key Detection Types |
|---|---|---|
| Visual | Nudity, violence, weapons, substance use, etc. | `visual.context` with `content_compliance` category tag |
| Speech | Profanity, references to violence/substances | `audio.keyword.compliance` |
| Audio sounds | Gunshots, explosions, violent sounds | `audio.context` with `content_compliance` category tag |
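To make the field names used in the steps below concrete, here is what a single compliance detection entry might look like in the core metadata. The values are invented for illustration; only the field names (`t`, `label`, `categ`, `occs`, `ss`, `se`) match the real schema used in this guide:

```python
# Illustrative detection entry (invented values; field names match the guide).
detection = {
    "t": "visual.context",      # detection type
    "label": "handgun",         # human-readable label
    "categ": {
        "tags": ["content_compliance", "gun_weapon"]
    },
    "occs": [                   # time-coded occurrences
        {"ss": 12.4, "se": 15.9}  # start / end, in seconds
    ],
}

# The common filter used throughout this guide:
is_sensitive = "content_compliance" in detection["categ"]["tags"]
```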
Step 1: Analyze Your Video
Submit the video for analysis:
```python
import requests

response = requests.post(
    "https://api-eu.valossa.com/core/1.0/new_job",
    json={
        "api_key": "YOUR_API_KEY",
        "media": {
            "video": {"url": "https://example.com/video.mp4"},
            "language": "en-US"
        }
    }
)
job_id = response.json()["job_id"]
```
Wait for completion, then download the results:
```python
import time

while True:
    status = requests.get(
        "https://api-eu.valossa.com/core/1.0/job_status",
        params={"api_key": "YOUR_API_KEY", "job_id": job_id}
    ).json()
    if status["status"] == "finished":
        break
    time.sleep(10)

metadata = requests.get(
    "https://api-eu.valossa.com/core/1.0/job_results",
    params={"api_key": "YOUR_API_KEY", "job_id": job_id}
).json()
```
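The polling loop above runs forever if the job never reaches `finished`. One defensive pattern is to wrap the status check with a deadline. The sketch below is my own addition (not part of the API): it takes the status-fetching function as a parameter so the loop can be exercised without network access.

```python
import time

def wait_for_job(fetch_status, timeout_s=600, poll_interval_s=10):
    """Poll fetch_status() until it reports 'finished', or raise on timeout.

    fetch_status is any zero-argument callable returning a dict with a
    'status' key -- e.g. a lambda wrapping the job_status request above.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status["status"] == "finished":
            return status
        time.sleep(poll_interval_s)
    raise TimeoutError(f"job not finished after {timeout_s}s")

# Stubbed demo: the job "finishes" on the third poll.
responses = iter([{"status": "queued"}, {"status": "processing"}, {"status": "finished"}])
result = wait_for_job(lambda: next(responses), timeout_s=5, poll_interval_s=0)
```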
Step 2: Find All Content Compliance Detections
The `content_compliance` category tag is present on all sensitive detections. Filter for it across all detection types:
```python
compliance_issues = []

for det_id, detection in metadata["detections"].items():
    if "categ" in detection:
        tags = detection["categ"]["tags"]
        if "content_compliance" in tags:
            compliance_issues.append({
                "id": det_id,
                "type": detection["t"],
                "label": detection["label"],
                "tags": tags,
                "occurrences": detection.get("occs", [])
            })

print(f"Found {len(compliance_issues)} content compliance detections")
```
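To get a quick sense of where the flags come from, you can tally the filtered list by detection type. The sample data here is invented for illustration, shaped like the `compliance_issues` entries built above:

```python
from collections import Counter

# Invented sample shaped like the compliance_issues list built above.
compliance_issues = [
    {"id": "1", "type": "visual.context", "label": "handgun",
     "tags": ["content_compliance"], "occurrences": []},
    {"id": "2", "type": "audio.keyword.compliance", "label": "profanity",
     "tags": ["content_compliance"], "occurrences": []},
    {"id": "3", "type": "visual.context", "label": "knife",
     "tags": ["content_compliance"], "occurrences": []},
]

# Count flags per channel (detection type)
counts = Counter(issue["type"] for issue in compliance_issues)
for det_type, n in counts.most_common():
    print(f"{det_type}: {n}")
```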
Step 3: Filter by Specific Sensitive Categories
For more granular control, filter by specific sensitive tags:
```python
# Define which categories to flag
flagged_categories = {
    "sexual",
    "violence",
    "act_of_violence",
    "substance_use",
    "gun_weapon",
    "explicit_content",
    "bad_language"
}

severe_issues = []
for issue in compliance_issues:
    matched_tags = set(issue["tags"]) & flagged_categories
    if matched_tags:
        severe_issues.append({
            **issue,
            "matched_categories": list(matched_tags)
        })

for issue in severe_issues:
    print(f"[{issue['type']}] {issue['label']}")
    print(f"  Categories: {', '.join(issue['matched_categories'])}")
    for occ in issue["occurrences"]:
        print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")
```
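If different categories carry different weight in your workflow, a simple severity map can rank the matched issues. The weights below are arbitrary examples of such a policy, not values defined by Valossa:

```python
# Hypothetical severity weights -- tune these to your own policy.
SEVERITY = {
    "sexual": 3,
    "act_of_violence": 3,
    "violence": 2,
    "gun_weapon": 2,
    "substance_use": 1,
    "bad_language": 1,
}

def issue_severity(issue):
    """Highest severity among the issue's matched categories (0 if none)."""
    return max((SEVERITY.get(tag, 0) for tag in issue["matched_categories"]),
               default=0)

# Invented sample shaped like a severe_issues entry from above.
issue = {"label": "handgun", "matched_categories": ["gun_weapon", "violence"]}
ranked = issue_severity(issue)
```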
Step 4: Check Speech-Based Compliance
Speech compliance detections use a separate detection type:
```python
speech_compliance_ids = metadata["detection_groupings"]["by_detection_type"].get(
    "audio.keyword.compliance", []
)

for det_id in speech_compliance_ids:
    detection = metadata["detections"][det_id]
    print(f"Speech compliance: '{detection['label']}'")
    for occ in detection.get("occs", []):
        print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")
```
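A common follow-up for speech flags is building a mute list: occurrences that overlap or sit close together are merged into a few audio segments to silence. A minimal sketch, where the padding and merge-gap values are my own choices rather than anything the API prescribes:

```python
def build_mute_segments(occurrences, pad_s=0.2, merge_gap_s=0.5):
    """Turn 'ss'/'se' occurrences into padded, merged (start, end) segments."""
    spans = sorted((max(0.0, o["ss"] - pad_s), o["se"] + pad_s)
                   for o in occurrences)
    merged = []
    for start, end in spans:
        if merged and start - merged[-1][1] <= merge_gap_s:
            # Close enough to the previous segment: extend it
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Two close occurrences collapse into one mute segment.
segments = build_mute_segments([{"ss": 10.0, "se": 10.5},
                                {"ss": 10.8, "se": 11.2}])
```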
Step 5: Check Audio Context for Sensitive Sounds
Some audio events (explosions, gunshots) may be flagged:
```python
audio_ids = metadata["detection_groupings"]["by_detection_type"].get("audio.context", [])

for det_id in audio_ids:
    detection = metadata["detections"][det_id]
    if "categ" in detection and "content_compliance" in detection["categ"]["tags"]:
        print(f"Sensitive audio: '{detection['label']}'")
        for occ in detection.get("occs", []):
            print(f"  Time: {occ['ss']:.1f}s - {occ['se']:.1f}s")
```
Step 6: Generate a Compliance Report
Combine all findings into a structured report:
```python
import json

report = {
    "job_id": metadata["job_info"]["job_id"],
    "duration_s": metadata["media_info"]["technical"]["duration_s"],
    "visual_flags": [],
    "speech_flags": [],
    "audio_flags": [],
    "is_safe": True
}

for issue in compliance_issues:
    entry = {
        "label": issue["label"],
        "categories": issue["tags"],
        "time_segments": [
            {"start": o["ss"], "end": o["se"]}
            for o in issue["occurrences"]
        ]
    }
    if issue["type"] == "visual.context":
        report["visual_flags"].append(entry)
    elif issue["type"].startswith("audio.keyword"):
        report["speech_flags"].append(entry)
    elif issue["type"] == "audio.context":
        report["audio_flags"].append(entry)
    # Any compliance detection at all means the video is not clean
    report["is_safe"] = False

print(json.dumps(report, indent=2))
```
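For dashboards or logs, a one-line summary of the report is often enough. A small helper along these lines (the summary format itself is my own invention):

```python
def summarize_report(report):
    """Compact one-line summary of a compliance report dict."""
    counts = {k: len(report[k])
              for k in ("visual_flags", "speech_flags", "audio_flags")}
    verdict = "SAFE" if report["is_safe"] else "FLAGGED"
    return (f"{verdict}: {counts['visual_flags']} visual, "
            f"{counts['speech_flags']} speech, {counts['audio_flags']} audio")

# Invented example report with one visual and two audio flags.
sample = {"visual_flags": [{}], "speech_flags": [],
          "audio_flags": [{}, {}], "is_safe": False}
line = summarize_report(sample)
print(line)
```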
Using Shot Boundaries for Clipping
When you need to extract the full shot containing a flagged detection (for human review or automated removal):
```python
shots = metadata["segmentations"]["detected_shots"]

for issue in compliance_issues:
    for occ in issue["occurrences"]:
        shot_start_idx = occ.get("shs", 0)
        shot_end_idx = occ.get("she", shot_start_idx)
        shot_start = shots[shot_start_idx]
        shot_end = shots[shot_end_idx]
        print(
            f"Flagged: '{issue['label']}' -> "
            f"Full shot clip: {shot_start['ss']:.2f}s - {shot_end['se']:.2f}s"
        )
```
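Once you have shot-aligned start and end times, you can hand them to any cutting tool. The sketch below only builds an ffmpeg argument list for extracting one clip without re-encoding; the output naming is my own choice, and ffmpeg is assumed to be installed if you actually run the command:

```python
def ffmpeg_clip_args(src, start_s, end_s, out_path):
    """Build an ffmpeg command that stream-copies one clip (no re-encode)."""
    return [
        "ffmpeg",
        "-ss", f"{start_s:.2f}",            # seek to clip start
        "-i", src,
        "-t", f"{end_s - start_s:.2f}",     # clip duration
        "-c", "copy",                       # stream copy, no re-encode
        out_path,
    ]

args = ffmpeg_clip_args("video.mp4", 12.40, 15.90, "flagged_clip_001.mp4")
# e.g. subprocess.run(args, check=True) to actually cut the clip
```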
Related Resources
- Detection Categories -- Full list of sensitive and non-sensitive categories
- Detection Types -- All detection types reference
- Occurrences -- How to work with time-coded data
- Metadata Reader -- CLI tool for quick extraction of compliance detections