Sampling & Evidence

Glacis uses a three-tier sampling model to balance observability with cost and storage. Every attestation is assigned a tier that determines how much data is collected and retained.

| Tier | Name | What is captured | When |
|------|------|------------------|------|
| L0 | Control plane | Control results, policy metadata, hashes | Always |
| L1 | Evidence collection | L0 + full input/output payloads stored locally | Sampled at l1_rate |
| L2 | Deep inspection | L1 + eligible for judge evaluation | Sampled at l2_rate (implies L1) |
  • L0 runs on every attestation. Only hashes leave your infrastructure.
  • L1 retains the full input and output locally for audit. The data never leaves your environment.
  • L2 is a subset of L1. Every L2 attestation also gets L1 evidence collection, and is flagged for judge evaluation. Judges must be run separately using JudgeRunner — they are not invoked automatically by the sampling tier.

Configure sampling rates in glacis.yaml:

version: "1.3"
sampling:
  l1_rate: 1.0  # Probability of L1 evidence collection (0.0-1.0)
  l2_rate: 0.0  # Probability of L2 deep inspection (0.0-1.0)

Defaults: l1_rate=1.0 (retain evidence for every request), l2_rate=0.0 (no deep inspection).

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| l1_rate | float | 1.0 | Probability of L1 evidence collection (0.0 to 1.0) |
| l2_rate | float | 0.0 | Probability of L2 deep inspection (0.0 to 1.0, must be ≤ l1_rate) |
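Because L2 is a subset of L1, a configuration where l2_rate exceeds l1_rate is invalid. A minimal illustrative check of that constraint (the validate_sampling function name is my own, not part of the SDK):

```python
def validate_sampling(l1_rate: float, l2_rate: float) -> None:
    """Reject sampling rates that violate the documented constraints."""
    # Both rates must be probabilities in [0.0, 1.0].
    if not (0.0 <= l1_rate <= 1.0 and 0.0 <= l2_rate <= 1.0):
        raise ValueError("rates must be in the range 0.0-1.0")
    # L2 implies L1, so the L2 rate may not exceed the L1 rate.
    if l2_rate > l1_rate:
        raise ValueError("l2_rate must be <= l1_rate (L2 is a subset of L1)")
```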

You can also pass SamplingConfig directly to the Glacis client:

import os

from glacis import Glacis
from glacis.config import SamplingConfig

glacis = Glacis(
    mode="offline",
    signing_seed=os.urandom(32),
    sampling_config=SamplingConfig(l1_rate=0.1, l2_rate=0.01),
)

Sampling decisions are deterministic and auditor-reproducible. Given the same evidence_hash + policy_key (or signing_seed) + sampling rates, the SDK always produces the same decision.

The algorithm:

  1. Compute HMAC-SHA256(policy_key, "sample:v1" || evidence_hash_bytes) — the policy_key is used as the HMAC key (falls back to signing_seed if not provided), the message is the domain separator "sample:v1" concatenated with the raw evidence hash bytes
  2. Extract the first 8 bytes as a big-endian uint64 value (sample_value)
  3. Compare sample_value against thresholds derived from l1_rate and l2_rate (using math.floor)
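The three steps above can be sketched with the standard library alone. The HMAC construction and the big-endian uint64 extraction follow the description directly; the exact threshold comparison (strict less-than, with L2 checked before L1) is an assumption about the SDK's internals:

```python
import hashlib
import hmac
import math

U64 = 2 ** 64


def sampling_decision(policy_key: bytes, evidence_hash: bytes,
                      l1_rate: float, l2_rate: float) -> tuple[str, int]:
    """Reproduce the deterministic tier decision described above (sketch)."""
    # Step 1: HMAC-SHA256 keyed by the policy key (or signing seed),
    # over the domain separator "sample:v1" plus the raw hash bytes.
    tag = hmac.new(policy_key, b"sample:v1" + evidence_hash,
                   hashlib.sha256).digest()
    # Step 2: first 8 bytes of the tag as a big-endian uint64.
    sample_value = int.from_bytes(tag[:8], "big")
    # Step 3: compare against thresholds derived from the rates.
    l2_threshold = math.floor(l2_rate * U64)
    l1_threshold = math.floor(l1_rate * U64)
    if sample_value < l2_threshold:
        return "L2", sample_value
    if sample_value < l1_threshold:
        return "L1", sample_value
    return "L0", sample_value
```

Because the decision depends only on the key, the evidence hash, and the rates, re-running this computation with the same inputs always yields the same tier.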

This means an auditor with access to the policy key (or signing seed) can independently verify that sampling decisions were correctly applied.

Use the should_review() method to manually check whether an attestation should receive L1/L2 treatment:

import os

from glacis import Glacis

seed = os.urandom(32)
glacis = Glacis(mode="offline", signing_seed=seed)

receipt = glacis.attest(
    service_id="my-service",
    operation_type="inference",
    input={"prompt": "What is AI?"},
    output={"response": "AI is..."},
)

decision = glacis.should_review(receipt)
print(f"Level: {decision.level}")  # "L0", "L1", or "L2"
print(f"Sample value: {decision.sample_value}")

You can override the L1 rate for a specific check:

# Check at 50% sampling rate regardless of configured l1_rate
decision = glacis.should_review(receipt, sampling_rate=0.5)

The SamplingDecision model returned by should_review():

| Field | Type | Description |
|-------|------|-------------|
| level | str | Sampling tier: "L0", "L1", or "L2" |
| sample_value | int | First 8 bytes of HMAC tag as big-endian uint64 |
| prf_tag | list[int] | Full HMAC-SHA256 tag (32 bytes as int list) |

When an attestation is promoted to L1 or above, an Evidence object is attached:

receipt = glacis.attest(
    service_id="my-service",
    operation_type="inference",
    input={"prompt": "Hello"},
    output={"response": "Hi!"},
)

if receipt.evidence:
    print(f"Sample probability: {receipt.evidence.sample_probability}")
    print(f"Data: {receipt.evidence.data}")

| Field | Type | Description |
|-------|------|-------------|
| sample_probability | float | Probability this evidence was sampled (0.0-1.0) |
| data | dict | The evidence payload (typically {"input": ..., "output": ...}) |

When using integration wrappers (OpenAI, Anthropic, Gemini), evidence is stored automatically. The integration calls store_evidence() after each attestation, persisting the full input, output, and control plane results to local storage.

Evidence is stored alongside receipts in the configured storage backend (SQLite or JSONL). See Storage Backends for details on storage location and configuration.

The stored evidence record includes:

| Field | Description |
|-------|-------------|
| attestation_id | Links evidence to its attestation |
| attestation_hash | The evidence_hash that was attested |
| mode | "online" or "offline" |
| service_id | Service identifier |
| operation_type | Type of operation |
| timestamp | Unix timestamp in milliseconds |
| input | Full input data |
| output | Full output data |
| control_plane_results | Control results (if any) |
| metadata | Additional metadata |
| sampling_level | "L0", "L1", or "L2" |
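For a concrete sense of the schema, here is a sketch of records shaped like the table above and a typical query over them. The field names come from the table; the ids, payloads, and timestamps are made-up illustrative values:

```python
# Hypothetical evidence records using the documented field names
# (values are invented for the example).
records = [
    {"attestation_id": "att_1", "sampling_level": "L1",
     "input": {"prompt": "Hello"}, "output": {"response": "Hi!"},
     "timestamp": 1700000000000},
    {"attestation_id": "att_2", "sampling_level": "L2",
     "input": {"prompt": "Why?"}, "output": {"response": "Because."},
     "timestamp": 1700000001000},
]

# Example query: pull only the records flagged for judge evaluation (L2).
l2_records = [r for r in records if r["sampling_level"] == "L2"]
```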

Retrieve stored evidence using the get_evidence() helper from the integrations module:

from glacis.integrations.base import get_evidence

evidence = get_evidence("att_abc123")
if evidence:
    print(f"Input: {evidence['input']}")
    print(f"Output: {evidence['output']}")
    print(f"Level: {evidence['sampling_level']}")

You can override the storage backend and path:

evidence = get_evidence(
    "oatt_abc123",
    storage_backend="json",
    storage_path="/path/to/storage",
)