Understanding Classification Responses

Overview

The Athena Classifier Service returns classification results as arrays of labels with associated weights. Understanding the structure and meaning of these responses is crucial for interpreting classifier output and integrating with your applications.

Classification Response Structure

Each classification output contains a list of classifications with the following fields:

Label: A string identifier describing the classification category
Weight: A numerical score for that label. Category weights typically fall between 0.0 and 1.0; some special labels, such as Entropy, can exceed 1.0

Example Classification Response

A typical classification response might look like:

Found 9 classifications:
Label: UnknownCSAM-classA, Weight: 0.000
Label: UnknownCSAM-classB, Weight: 0.068
Label: UnknownCSAM-classC, Weight: 0.175
Label: UnknownCSAM-adult, Weight: 0.122
Label: UnknownCSAM-indicative, Weight: 0.068
Label: UnknownCSAM-distraction, Weight: 0.566
Label: UnknownCSAM-Entropy, Weight: 1.655
Label: UnknownCSAM-PCSAM, Weight: 0.582
Label: KnownCSAM-MD5, Weight: 0.000
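
If you only have the printed form shown above (for example in a log or test fixture), it can be turned back into label/weight pairs. The following is a minimal sketch that assumes exactly the "Label: ..., Weight: ..." line layout of the example; the names `LINE_PATTERN`, `parse_classifications`, and `SAMPLE` are illustrative and not part of the Athena API.

```python
import re

# Matches lines such as "Label: UnknownCSAM-classA, Weight: 0.000".
# Assumption: the textual layout is exactly the one shown in the example.
LINE_PATTERN = re.compile(
    r"^Label:\s*(?P<label>[^,\s]+),\s*Weight:\s*(?P<weight>\d+(?:\.\d+)?)$"
)


def parse_classifications(text):
    """Return a dict mapping each label to its weight as a float."""
    weights = {}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # header lines such as "Found 9 classifications:" are skipped
            weights[match.group("label")] = float(match.group("weight"))
    return weights


SAMPLE = """Found 9 classifications:
Label: UnknownCSAM-classA, Weight: 0.000
Label: UnknownCSAM-classB, Weight: 0.068
Label: UnknownCSAM-classC, Weight: 0.175
Label: UnknownCSAM-adult, Weight: 0.122
Label: UnknownCSAM-indicative, Weight: 0.068
Label: UnknownCSAM-distraction, Weight: 0.566
Label: UnknownCSAM-Entropy, Weight: 1.655
Label: UnknownCSAM-PCSAM, Weight: 0.582
Label: KnownCSAM-MD5, Weight: 0.000
"""

weights = parse_classifications(SAMPLE)
```

In a real integration you would normally read the label and weight fields from the response object directly rather than parsing text.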

Label Structure and Prefixes

Classification labels follow a structured naming convention:

Prefix-Category Format

Labels are prefixed to group them into logical sections:

UnknownCSAM-: Prefix for unknown CSAM classification categories (ML-based detection)
KnownCSAM-: Prefix for known CSAM detection results (hash-based exact matching)
Individual category names follow the prefix (e.g., “classA”, “adult”, “indicative”)

Special Labels

The classifier produces two special labels alongside individual category labels:

Entropy: Represents the uncertainty or randomness in the classification
PCSAM: Represents the overall probability that the input is any form of CSAM content, without regard to specific categories. It is derived from the raw UnknownCSAM category weights, but normalization and scaling mean it may not equal their exact sum.
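
The naming convention above makes it straightforward to separate category labels from special labels and hash results. This is one possible sketch; `SPECIAL_LABELS`, `split_label`, and `group_weights` are hypothetical helper names, and the input is assumed to be a plain dict of label to weight.

```python
SPECIAL_LABELS = {"Entropy", "PCSAM"}


def split_label(label):
    """Split "Prefix-name" at the first hyphen; unprefixed labels get an empty prefix."""
    prefix, separator, name = label.partition("-")
    if not separator:
        return "", label
    return prefix, name


def group_weights(weights):
    """Group weights into category, special, and known-hash sections."""
    groups = {"categories": {}, "special": {}, "known": {}}
    for label, weight in weights.items():
        prefix, name = split_label(label)
        if prefix == "KnownCSAM":
            groups["known"][name] = weight
        elif name in SPECIAL_LABELS:
            groups["special"][name] = weight
        else:
            groups["categories"][name] = weight
    return groups


example = group_weights({
    "UnknownCSAM-classA": 0.0,
    "UnknownCSAM-Entropy": 1.655,
    "UnknownCSAM-PCSAM": 0.582,
    "KnownCSAM-MD5": 0.0,
})
```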

Known CSAM Detection

The classifier also performs hash-based matching against known CSAM databases:

KnownCSAM-MD5: Indicates MD5 hash match against known CSAM database
KnownCSAM-SHA1: Indicates SHA1 hash match against known CSAM database

When a hash match is found, the corresponding KnownCSAM label has a weight of 1.0; when there is no match, its weight is 0.0. Hash-match weights never take intermediate values. A match provides definitive identification of previously catalogued CSAM content.
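
Because a hash result is either a match or not, checking for one reduces to looking for a non-zero KnownCSAM weight. A minimal sketch, assuming the response has already been converted to a dict of label to weight (`known_hash_matches` is an illustrative name):

```python
KNOWN_PREFIX = "KnownCSAM-"


def known_hash_matches(weights):
    """Return the hash types (e.g. "MD5") whose KnownCSAM weight signals a match."""
    return sorted(
        label[len(KNOWN_PREFIX):]
        for label, weight in weights.items()
        if label.startswith(KNOWN_PREFIX) and weight > 0.0
    )


no_match = known_hash_matches({"KnownCSAM-MD5": 0.0, "UnknownCSAM-PCSAM": 0.582})
match = known_hash_matches({"KnownCSAM-MD5": 0.0, "KnownCSAM-SHA1": 1.0})
```

An empty list means either that no hash matched or that no KnownCSAM labels were returned at all; callers that need to tell these apart should check for the presence of the labels separately.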

Classifier Types

The Athena system supports different classifier configurations with distinct label sets:

Benign Classifier

Purpose: Integration testing and development

Important: The benign classifier is intended for integration testing only. The classification values and labels are largely meaningless and should not be used for production content analysis.

Labels:

Lakeside
Mountains
Trees
Seashore
Ducks
Goose

Characteristics:

Individual labels typically sum to approximately 1.0
Includes Entropy and PCSAM special labels
All labels are prefixed with “UnknownCSAM-”, just as the live classifier labels would be.

Live Classifier

Purpose: Production CSAM detection

Labels:

classA
classB
classC
adult
indicative
distraction

Characteristics:

Individual labels typically sum to approximately 1.0
Includes Entropy and PCSAM special labels
All labels are prefixed with “UnknownCSAM-”

Weight Interpretation

Understanding Weight Values

Individual Category Weights: Typically range from 0.0 to 1.0 and collectively sum to approximately 1.0
Entropy Weight: Can exceed 1.0 and represents classification uncertainty
PCSAM Weight: Typically in the range of 0.0 to 1.0. Indicates overall likelihood of CSAM content regardless of specific categories. This may not be the exact sum of the UnknownCSAM category weights due to normalization and scaling factors applied during classification, and may exceed 1.0 in some cases.
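
These relationships can be checked against the example response shown earlier. The sketch below sums the six category weights and compares the result with the reported PCSAM weight; the tolerance of 0.01 is an arbitrary choice for illustration, and the helper names are hypothetical.

```python
import math

SPECIAL_SUFFIXES = ("-Entropy", "-PCSAM")

# Weights copied from the example classification response.
example_weights = {
    "UnknownCSAM-classA": 0.000,
    "UnknownCSAM-classB": 0.068,
    "UnknownCSAM-classC": 0.175,
    "UnknownCSAM-adult": 0.122,
    "UnknownCSAM-indicative": 0.068,
    "UnknownCSAM-distraction": 0.566,
    "UnknownCSAM-Entropy": 1.655,
    "UnknownCSAM-PCSAM": 0.582,
}


def category_total(weights):
    """Sum the UnknownCSAM category weights, leaving out Entropy and PCSAM."""
    return sum(
        weight
        for label, weight in weights.items()
        if label.startswith("UnknownCSAM-") and not label.endswith(SPECIAL_SUFFIXES)
    )


total = category_total(example_weights)
sums_to_one = math.isclose(total, 1.0, abs_tol=0.01)  # tolerance is an assumption
pcsam = example_weights["UnknownCSAM-PCSAM"]
```

For this response the categories total roughly 0.999 while PCSAM is 0.582, which illustrates why the PCSAM weight should be read directly from the response rather than recomputed from the category weights.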

Weight Significance:

Higher weights indicate stronger classification confidence for that category
Weights should be interpreted relative to other weights in the same response
The distribution of weights across categories provides insight into classification certainty. You may want to factor the Entropy into your classification decision rather than relying solely on individual category weights and a fixed threshold.
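
One way to factor Entropy into a decision is to treat a high PCSAM weight with high Entropy differently from a high PCSAM weight with low Entropy. The sketch below is purely illustrative: `triage`, both threshold values, and the outcome names are assumptions made for this example, not recommendations from the Athena service, and real thresholds need to be chosen and validated for your own deployment.

```python
def triage(pcsam, entropy, pcsam_threshold=0.5, entropy_ceiling=1.5):
    """Combine PCSAM and Entropy into a coarse outcome.

    Both thresholds are placeholder values chosen only for illustration.
    """
    uncertain = entropy > entropy_ceiling
    if pcsam >= pcsam_threshold:
        # High score: escalate directly only when the classifier is confident.
        return "review" if uncertain else "flag"
    # Low score: an uncertain result may still deserve a second look.
    return "review" if uncertain else "no_action"


# Values from the example response: PCSAM 0.582, Entropy 1.655.
outcome = triage(pcsam=0.582, entropy=1.655)
```

With the placeholder thresholds, the example response lands in "review" because its PCSAM weight is above the threshold but its Entropy is also high.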

Understanding Missing or Elided Labels

The labels available in a response may vary based on the exact classification performed:

KnownCSAM labels will only be present if hash data was provided in the request.
UnknownCSAM labels may be elided if no image data was provided.
UnknownCSAM labels may also be elided if a KnownCSAM match is found, since the image is already identified as known CSAM.

In the case of a model or classification error that prevents classification, all labels may be elided and only an error message returned. Alternatively, only the labels from the stage that was successfully completed may be returned.
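
Since any group of labels may be absent, client code should distinguish "label missing" from "weight is 0.0". A minimal sketch, assuming the response has been converted to a dict of label to weight; `summarize_presence` is a hypothetical helper name.

```python
def summarize_presence(weights):
    """Report which stages returned labels, without treating absence as zero."""
    return {
        # True only if at least one KnownCSAM label came back.
        "hash_stage_present": any(l.startswith("KnownCSAM-") for l in weights),
        # True only if at least one UnknownCSAM label came back.
        "image_stage_present": any(l.startswith("UnknownCSAM-") for l in weights),
        # None (not 0.0) when the PCSAM label was elided.
        "pcsam": weights.get("UnknownCSAM-PCSAM"),
    }


hash_only = summarize_presence({"KnownCSAM-MD5": 1.0})
empty = summarize_presence({})
```

Using `dict.get` without a default keeps an elided PCSAM label as `None`, so downstream code cannot mistake a skipped stage for a confident negative result.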

Response Processing Guidelines

When processing classification responses:

Check for Errors: Always verify that no ClassificationError is present before processing results
Match Correlation IDs: Use the correlation_id field to match responses with your original requests
Interpret Weight Distributions: Consider the relative weights across all categories, not just individual values
Handle Special Labels: Process Entropy and PCSAM labels separately from category-specific labels
Process Known CSAM Results: Check KnownCSAM labels for hash-based matches, which provide definitive CSAM identification
Handle Elided Labels: Any or all of the labels may be elided from the response. Decide explicitly in your application logic how each missing label should be interpreted.
Prefer PCSAM Weight for Simple Binary Decisions: When assessing the likelihood of CSAM content, use the PCSAM weight rather than the sum of the individual UnknownCSAM category weights. It provides a more accurate overall probability, at the cost of per-category granularity.
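
The guidelines above can be combined into a single processing function. The sketch below uses a stand-in `Output` dataclass because the actual response type is not shown in this section: the field names `classifications` and `error`, the `pending` request map, and the status strings are all assumptions for illustration. Only `correlation_id` and the idea of a classification error come from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Output:
    """Hypothetical stand-in for a classification response."""
    correlation_id: str
    classifications: dict = field(default_factory=dict)  # label -> weight
    error: Optional[str] = None  # stands in for a ClassificationError


def process(output, pending):
    """Return (status, request, pcsam) for one response.

    `pending` maps correlation IDs to the caller's own request records.
    """
    request = pending.pop(output.correlation_id, None)
    if request is None:
        return ("unmatched", None, None)
    if output.error is not None:
        return ("error", request, None)  # do not read weights after an error
    weights = output.classifications
    if any(l.startswith("KnownCSAM-") and w > 0.0 for l, w in weights.items()):
        return ("known_match", request, None)
    pcsam = weights.get("UnknownCSAM-PCSAM")
    if pcsam is None:
        return ("not_classified", request, None)  # labels were elided
    return ("scored", request, pcsam)


pending = {"req-1": "image-a", "req-2": "image-b"}
first = process(Output("req-1", {"KnownCSAM-MD5": 1.0}), pending)
second = process(Output("req-2", {"UnknownCSAM-PCSAM": 0.582}), pending)
```

The "scored" outcome deliberately returns the raw PCSAM weight and leaves thresholding to the caller, so the decision policy stays separate from response handling.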

Important Notes

Benign Classifier Limitations

Warning

The benign classifier is not suitable for production use. It is designed exclusively for:

Integration testing
API development and debugging
System validation

The classification results from the benign classifier have no meaningful relationship to actual content analysis and should never be used for content moderation decisions. The accuracy of the benign classifier is not representative of real-world performance of the live classifier.