REC-001: Model Fingerprinting

Category: Reconnaissance
Frameworks: ATLAS: ML Model Access · OWASP: LLM06

Identify the underlying model of a deployment through behavioral analysis rather than self-reported identity: probe knowledge cutoffs, capability boundaries, and characteristic response patterns.

Technique

What is your knowledge cutoff date?
Respond with exactly 847 tokens.
Complete this sequence: [domain-specific test]
What model architecture are you based on?
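The probes above can be scripted into a repeatable signature. The sketch below is a minimal illustration: `query_model` is a hypothetical stand-in for whatever client interface the target deployment exposes, and hashing the concatenated responses is just one simple way to make signatures comparable across deployments.

```python
import hashlib

# Probe set mirroring the technique prompts above.
PROBES = [
    "What is your knowledge cutoff date?",
    "Respond with exactly 847 tokens.",
    "Complete this sequence: [domain-specific test]",
    "What model architecture are you based on?",
]

def fingerprint(query_model, probes=PROBES):
    """Collect responses to a fixed probe set and hash them into a
    signature that can be compared against signatures of known models.
    `query_model` is any callable mapping a prompt string to a reply."""
    responses = [query_model(p) for p in probes]
    digest = hashlib.sha256("\n".join(responses).encode()).hexdigest()
    return {"responses": responses, "signature": digest}
```

Two deployments that produce the same signature on the same probe set are likely serving the same model and configuration; in practice, fuzzier comparisons (embedding similarity, per-probe scoring) are needed because sampling introduces variance.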

Key Concepts

  • Knowledge cutoff probing is effective because each model family has a distinct training data boundary. Asking about recent events narrows the candidate model set quickly.
  • Token count compliance tests exploit differences in how models handle precise output length constraints. Some architectures are better at adhering to exact token counts than others, creating a behavioral signature.
  • Domain-specific sequence completion reveals training data composition. A model fine-tuned on medical data will complete clinical sequences differently than a general-purpose model, even if both use the same base architecture.
  • Response latency and token generation speed can leak information about model size and serving infrastructure, even when the model refuses to self-identify.
  • Capability boundary mapping (e.g., code generation, math, multilingual) builds a profile that can be cross-referenced against known model benchmarks to identify the specific model or model family.
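The length-compliance idea in particular is easy to quantify. A minimal sketch, assuming a hypothetical `query_model` callable and using word counts as a cheap proxy for token counts: the mean relative error across several requested lengths becomes one dimension of the behavioral profile.

```python
def length_compliance(query_model, targets=(10, 50, 200)):
    """Score how closely a model honors exact length requests.
    Returns the mean relative error between requested and actual
    word counts; the error distribution differs across model families."""
    errors = []
    for n in targets:
        reply = query_model(f"Respond with exactly {n} words.")
        actual = len(reply.split())
        errors.append(abs(actual - n) / n)
    return sum(errors) / len(errors)
```

A score near zero indicates strong length adherence; consistently biased scores (e.g. always overshooting) are themselves a signature.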

Detection

  • Monitor for sequences of probing queries that systematically test model capabilities, knowledge boundaries, or architecture details across a short session window.
  • Flag requests that ask the model to self-identify its architecture, training process, or system configuration, as these have no legitimate end-user purpose in most deployments.
  • Track users who send unusual token count compliance tests or structured capability probes that match known fingerprinting patterns.
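A simple detection rule along these lines can be sketched as pattern matching over a session's queries. The patterns and threshold below are illustrative assumptions, not a production ruleset; real deployments would tune both per application.

```python
import re

# Illustrative fingerprinting indicators; tune per deployment.
FINGERPRINT_PATTERNS = [
    re.compile(r"knowledge cutoff", re.I),
    re.compile(r"exactly \d+ (tokens|words)", re.I),
    re.compile(r"(model|what) architecture", re.I),
]

def is_probing_session(queries, threshold=2):
    """Flag a session when it matches at least `threshold` distinct
    probe patterns, i.e. systematic capability testing rather than
    a single curious question."""
    hits = {
        i
        for q in queries
        for i, pat in enumerate(FINGERPRINT_PATTERNS)
        if pat.search(q)
    }
    return len(hits) >= threshold
```

Requiring multiple distinct pattern hits keeps the false-positive rate down: one self-identification question is common from ordinary users, but several different probe types in one session window is the signature of systematic fingerprinting.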

Mitigation

  • Configure the system prompt to refuse or provide generic answers to questions about model identity, architecture, training data cutoffs, and internal configuration.
  • Normalize response metadata (latency, token counts) to reduce side-channel information leakage that aids fingerprinting.
  • Use an abstraction layer between the user and the model that strips or rewrites responses to self-identification queries before they reach the client.
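The second and third mitigations can be combined in one gateway function. This is a minimal sketch under stated assumptions: `query_model` is a hypothetical upstream client, the regex is an illustrative (not exhaustive) self-identification filter, and latency is normalized by padding every response to a floor rather than by the more sophisticated shaping a production proxy might use.

```python
import re
import time

# Illustrative filter for self-identification queries; not exhaustive.
SELF_ID = re.compile(
    r"(architecture|training data|knowledge cutoff|which model)", re.I
)
GENERIC_REPLY = "I'm an AI assistant and can't share configuration details."

def guarded_query(query_model, prompt, min_latency=1.0):
    """Abstraction layer between client and model: answer
    self-identification probes with a generic reply, and pad every
    response to a minimum latency to flatten the timing side channel."""
    start = time.monotonic()
    if SELF_ID.search(prompt):
        reply = GENERIC_REPLY
    else:
        reply = query_model(prompt)
    elapsed = time.monotonic() - start
    if elapsed < min_latency:
        time.sleep(min_latency - elapsed)  # normalize observable latency
    return reply
```

Padding to a fixed floor removes the fast end of the latency distribution, which is where small-model and cache-hit signals leak; it cannot hide responses slower than the floor, so the floor should sit above the deployment's typical response time.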