Athena Protobufs Documentation¶
Welcome to the Athena Protobufs documentation. This repository contains the Protocol Buffer definitions for the Athena API, which provides CSAM (Child Sexual Abuse Material) detection capabilities through gRPC services.
Service Overview¶
Athena is an enterprise-grade platform for detecting both known and unknown CSAM images. It delivers scalable, secure image classification through a gRPC streaming API.
Core Architecture
Athena operates as a bi-directional streaming gRPC service, enabling clients to submit image classification requests and receive results over a single stream. This design supports continuous processing of large image volumes with low response latency.
Security & Access Control
Athena implements comprehensive authentication and authorization controls to ensure secure operations:
Client Authentication: All clients must be properly authenticated before accessing the service
Affiliate Authorization: The system validates that each client is authorized to submit requests for specific affiliates
Deployment Isolation
Each client operates within isolated deployment IDs, ensuring independence and security:
Dedicated Processing: Each client submits requests to its own dedicated deployment IDs
Operational Independence: Each client’s processing pipeline operates independently, preventing cross-contamination
Resource Isolation: Deployment-level separation ensures consistent performance and security boundaries
Quick Start¶
The Athena API consists of two main RPC methods:
Classify: Stream images for classification with deployment-based session management
ListDeployments: Get information about active deployments and their backlogs
Deployments¶
The Classify endpoint uses deployment IDs to group classification requests. Multiple clients can join the same deployment to share responses. Clients can use different deployment IDs to process images separately.
Image Pre-processing¶
Images must be resized to 448x448 pixels before being sent to the API.
Image hashes should be generated for known CSAM detection. Multiple hashes for the same image (SHA1, MD5) can be added to the request, for instance when the image was saved with a different extension.
For example, if an image was saved as both a JPEG and a PNG, both hashes could be included in the request.
The hashes for Image1 might be:
Image1.jpg -> SHA1: 1234567890abcdef1234567890abcdef12345678
Image1.png -> SHA1: 456789abcdef123456789abcdef1234567812312
These can be used in a single request. The response will tell you whether any of the hashes match a known CSAM image.
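The hash generation described above can be sketched with Python's standard `hashlib` module. This is a minimal illustration rather than the client library's actual code: the function names `hash_image_bytes` and `collect_hashes` are hypothetical, and it assumes the hashes are hex-encoded digests of the raw encoded file bytes.

```python
import hashlib


def hash_image_bytes(image_bytes: bytes) -> dict:
    """Return hex-encoded SHA1 and MD5 digests of one encoded image file.

    Assumption: hashes are computed over the raw file bytes.
    """
    return {
        "SHA1": hashlib.sha1(image_bytes).hexdigest(),
        "MD5": hashlib.md5(image_bytes).hexdigest(),
    }


def collect_hashes(encodings: list) -> list:
    """Collect (algorithm, hex digest) pairs for every encoding of one image.

    Duplicate pairs are dropped while preserving first-seen order, so the
    same file passed twice does not add redundant hashes to the request.
    """
    seen = set()
    pairs = []
    for image_bytes in encodings:
        for algorithm, digest in hash_image_bytes(image_bytes).items():
            if (algorithm, digest) not in seen:
                seen.add((algorithm, digest))
                pairs.append((algorithm, digest))
    return pairs
```

Passing the bytes of both `Image1.jpg` and `Image1.png` to `collect_hashes` would yield four pairs (one SHA1 and one MD5 per file) that could then be attached to a single request.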
Requests must have at least one of the following, with both preferred:
Image hashes (SHA1, MD5) - for known CSAM detection
Image data - for unknown CSAM detection
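The rule above (hashes, image data, or preferably both) can be checked on the client before sending. The sketch below uses a hypothetical `ImageInput` dataclass as a stand-in; it is not the generated protobuf message, and the field names are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ImageInput:
    """Hypothetical stand-in for one image entry in a request."""
    correlation_id: str
    data: bytes = b""                           # resized image bytes, if any
    hashes: list = field(default_factory=list)  # hex digests, if any


def detection_modes(item: ImageInput) -> set:
    """Return which kinds of detection the input can support.

    Raises ValueError when neither hashes nor image data are present,
    since such an input satisfies neither requirement.
    """
    modes = set()
    if item.hashes:
        modes.add("known")    # hash matching against known CSAM
    if item.data:
        modes.add("unknown")  # classification of the image content
    if not modes:
        raise ValueError(
            f"input {item.correlation_id!r} has neither hashes nor image data"
        )
    return modes
```

An input carrying both fields returns `{"known", "unknown"}`, which corresponds to the preferred case.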
Data Handling and Behavior¶
Client Library Responsibilities¶
The client library handles image preparation before transmission:
Image Hashing: Generates hashes for CSAM detection
Image Resizing: Sets the image dimensions for processing
Metadata Creation: Prepares correlation ID tracking
Server-Side Processing¶
The Athena service processes images with strict data handling policies:
Data Reception: Only receives image data, hash metadata, and correlation metadata
Ephemeral Processing: Images are processed in memory and immediately discarded after classification
No Storage: Images are never stored on our servers - only used for the classification call
Audit Creation: Creates audit records for each processed image for billing purposes (no image data retained)
Data Retention Policies¶
Response Availability: Classification responses remain available on their deployment for up to 1 hour or until sent to a client, whichever comes sooner.
Response Expiry: After 1 hour, responses that have not been retrieved by a client are automatically purged and are no longer accessible.
Deployment Lifecycle: Deployments are automatically removed after 24 hours of inactivity
Privacy Protection: Source image data is held only ephemerally during classification and is not retained or persisted once classification is complete.
Key Features¶
Multi-Format Support¶
Athena supports various image formats including JPEG, PNG, WebP, TIFF, and many others. Images can be sent uncompressed or compressed with Brotli to reduce bandwidth.
Correlation Tracking¶
Each image in a request includes a client-provided correlation ID that allows clients to match responses with their original requests.
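One way to use correlation IDs on the client side is to keep a map from each ID to the item it was issued for, then look the ID up when a response arrives. The `CorrelationTracker` class below is a hypothetical helper written for illustration; using UUIDs is an assumption, since the text only requires that the ID be provided by the client.

```python
import uuid
from typing import Optional


class CorrelationTracker:
    """Maps client-generated correlation IDs to the images they were sent for."""

    def __init__(self) -> None:
        self._pending = {}

    def register(self, source: str) -> str:
        """Create a new correlation ID for `source` (e.g. a file path)."""
        correlation_id = str(uuid.uuid4())
        self._pending[correlation_id] = source
        return correlation_id

    def resolve(self, correlation_id: str) -> Optional[str]:
        """Return the source for a response's correlation ID and forget it.

        Returns None for IDs this tracker never issued or already resolved.
        """
        return self._pending.pop(correlation_id, None)

    def pending_count(self) -> int:
        """Number of requests still waiting for a response."""
        return len(self._pending)
```

Removing an entry once it is resolved keeps the map bounded on long-lived streams, and `pending_count` gives a simple view of how many requests are still outstanding.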
Getting Started¶
Check the protocol buffer documentation for information on how to generate code for your language of choice. Review the API Reference for detailed message and service definitions.