Face Recognition Guide
This guide covers detecting faces in videos, identifying known people, and training custom face galleries for your own identities.
Before writing custom code, you can quickly inspect face detections using the Metadata Reader CLI tool:
# List face detections with identity matches
python -m metareader list-detections --type "human.face" core_metadata.json
# See when each face appears in the video
python -m metareader list-occurrences --type "human.face" core_metadata.json
# Generate facial sentiment charts (requires matplotlib)
python -m metareader plot --sentiment core_metadata.json
How Face Recognition Works
- Valossa AI detects all visible faces in the video and creates human.face detections.
- Each detected face is compared against face galleries (Valossa Celebrities Gallery and/or your custom galleries).
- Matches are reported in the similar_to field with confidence scores.
- Face grouping (human.face_group) identifies people who appear together.
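For orientation, a matched human.face detection looks roughly like the sketch below. Only the fields this guide reads are shown, and all values are illustrative; see the Faces & Identity reference for the full metadata format.

# Illustrative shape of a matched face detection (values are made up;
# only the fields used in this guide are shown).
example_detection = {
    "a": {
        "gender": {"value": "male", "c": 0.97},
        "s_visible": 42.5,  # visible screen time in seconds
        "quality": "normal",
        "similar_to": [
            {
                "name": "Example Person",
                "c": 0.91,  # match confidence
                "gallery": {"id": "example-gallery-id"},
                "gallery_face": {"id": "example-face-uuid"},
            }
        ],
    }
}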
Step 1: Analyze a Video
import requests
import time

response = requests.post(
    "https://api-eu.valossa.com/core/1.0/new_job",
    json={
        "api_key": "YOUR_API_KEY",
        "media": {
            "video": {"url": "https://example.com/video.mp4"}
        }
    }
)
job_id = response.json()["job_id"]

# Wait for completion
while True:
    status = requests.get(
        "https://api-eu.valossa.com/core/1.0/job_status",
        params={"api_key": "YOUR_API_KEY", "job_id": job_id}
    ).json()
    if status["status"] == "finished":
        break
    time.sleep(10)

metadata = requests.get(
    "https://api-eu.valossa.com/core/1.0/job_results",
    params={"api_key": "YOUR_API_KEY", "job_id": job_id}
).json()
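Because jobs can fail or stall, polling without a limit can hang your script. Below is a minimal, self-contained sketch of the same polling with a timeout; it only knows about the "finished" status used above, so consult the API documentation for the exact failure status values and handle those as well.

import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api-eu.valossa.com/core/1.0"

def wait_for_job(job_id, poll_interval=10, timeout=3600):
    """Poll job_status until the job finishes or the timeout expires."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        status = requests.get(
            f"{BASE_URL}/job_status",
            params={"api_key": API_KEY, "job_id": job_id},
        ).json()
        if status["status"] == "finished":
            return status
        # Failure statuses are not covered in this guide; check the API
        # documentation and break out on those as well.
        time.sleep(poll_interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")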
Step 2: List All Detected Faces
face_ids = metadata["detection_groupings"]["by_detection_type"].get("human.face", [])
print(f"Total faces detected: {len(face_ids)}")

for det_id in face_ids:
    detection = metadata["detections"][det_id]
    attrs = detection.get("a", {})
    gender = attrs.get("gender", {})
    screen_time = attrs.get("s_visible", 0)
    quality = attrs.get("quality", "normal")

    print(f"\nFace (ID: {det_id}):")
    print(f"  Gender: {gender.get('value', 'N/A')} (confidence: {gender.get('c', 0):.2f})")
    print(f"  Screen time: {screen_time:.1f}s")
    print(f"  Quality: {quality}")

    # Check for identity matches
    if "similar_to" in attrs:
        for match in attrs["similar_to"]:
            print(f"  Identified as: {match['name']} (confidence: {match['c']:.2f})")
            if "gallery" in match:
                print(f"    Gallery: {match['gallery']['id']}")
    else:
        print("  Identity: Unknown (no gallery match)")
Step 3: Find All Appearances of a Recognized Person
Use the similar_to_face_id grouping for merged occurrences:
face_property = (
    metadata["detection_groupings"]
    .get("by_detection_property", {})
    .get("human.face", {})
    .get("similar_to_face_id", {})
)

for face_uuid, data in face_property.items():
    # Look up the name from one of the detections
    det_id = data["det_ids"][0]
    detection = metadata["detections"][det_id]
    name = "Unknown"
    for match in detection.get("a", {}).get("similar_to", []):
        if match.get("gallery_face", {}).get("id") == face_uuid:
            name = match["name"]
            break

    total_time = sum(m["se"] - m["ss"] for m in data["moccs"])
    print(f"\n{name} (gallery face: {face_uuid}):")
    print(f"  Total appearances: {len(data['moccs'])}")
    print(f"  Total time: {total_time:.1f}s")
    print(f"  Detection IDs: {data['det_ids']}")
    for mocc in data["moccs"]:
        print(f"  Segment: {mocc['ss']:.1f}s - {mocc['se']:.1f}s")
Step 4: Get Face Bounding Boxes (Optional)
For spatial face coordinates, download the frames_faces metadata:
faces_metadata = requests.get(
    "https://api-eu.valossa.com/core/1.0/job_results",
    params={
        "api_key": "YOUR_API_KEY",
        "job_id": job_id,
        "type": "frames_faces"
    }
).json()

fps = metadata["media_info"]["technical"]["fps"]

# Get bounding boxes at a specific time
target_second = 30
start_frame = int(target_second * fps)

for frame_offset in range(int(fps)):
    frame_idx = start_frame + frame_offset
    if frame_idx < len(faces_metadata["faces_by_frame"]):
        faces = faces_metadata["faces_by_frame"][frame_idx]
        for face in faces:
            print(f"Frame {frame_idx}: Face {face['id']} at ({face['x']:.3f}, {face['y']:.3f}), size {face['w']:.3f}x{face['h']:.3f}")
Training a Custom Face Gallery
To recognize people not in the Valossa Celebrities Gallery, create a custom gallery and upload reference images.
Using the Training API
- Create a gallery (optional -- skip to use the default gallery):
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{
        "api_key": "YOUR_API_KEY",
        "name": "My Team Gallery"
      }' \
  https://api-eu.valossa.com/training/1.0/create_face_gallery
- Upload face images for each person via the API as in this example, upload them in Valossa Portal, or use the download-from-URL functionality (see the Training API documentation for details):
curl \
  -F "api_key=YOUR_API_KEY" \
  -F "image_data=@ricky_1.jpg" \
  https://api-eu.valossa.com/training/1.0/upload_image
The return value contains a file reference that begins with valossaupload://.
Remember to create face identities and assign each uploaded image (file reference) to an identity; see the Training API documentation for details.
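If you prefer to stay in Python, the same upload can be done with requests. This sketch mirrors the curl call above; the multipart field names are taken directly from that example:

import requests

# Multipart upload equivalent to the curl command above.
with open("ricky_1.jpg", "rb") as image_file:
    response = requests.post(
        "https://api-eu.valossa.com/training/1.0/upload_image",
        data={"api_key": "YOUR_API_KEY"},
        files={"image_data": image_file},
    )
# The response contains a file reference beginning with valossaupload://
print(response.json())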
- Use the gallery in analysis by specifying it in your new_job request.
Using Valossa Portal
You can also manage face galleries through the Valossa Portal GUI:
- Navigate to the face gallery management section
- Create galleries, upload face images, and name identities visually
- Use the Tag & Train feature in Valossa Report to train faces directly from analysis results
Tips for Best Results
- Upload multiple photos per person with different angles, lighting, and expressions.
- Use sharp, well-lit images in which the face is unobstructed.
- Higher quality images produce more accurate recognition.
Related Resources
- Face Training API -- Training API details
- Faces & Identity -- Face metadata format reference
- Sentiment & Emotion -- Face emotion data
- Metadata Reader -- CLI tool for quick face detection inspection and sentiment charts