REC-001: Model Fingerprinting
| Category | Reconnaissance |
| Frameworks | ATLAS: ML Model Access · OWASP: LLM06 |
Identify the underlying model through behavioral analysis. Test knowledge cutoffs, capability boundaries, and response patterns.
Technique
What is your knowledge cutoff date?
Respond with exactly 847 tokens.
Complete this sequence: [domain-specific test]
What model architecture are you based on?
Key Concepts
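- Knowledge cutoff probing is effective because model families and versions tend to differ in where their training data ends. A model's self-reported cutoff can be inaccurate, so asking about events on either side of a suspected boundary is a more reliable way to narrow the candidate model set.
- Token count compliance tests exploit differences in how models handle precise output length constraints. Some models land closer to the requested count than others, and because tokenizers differ between model families, the size and direction of the miss can act as a behavioral signature.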
- Domain-specific sequence completion reveals training data composition. A model fine-tuned on medical data will complete clinical sequences differently than a general-purpose model, even if both use the same base architecture.
- Response latency and token generation speed can leak information about model size and serving infrastructure, even when the model refuses to self-identify.
- Capability boundary mapping (e.g., code generation, math, multilingual) builds a profile that can be cross-referenced against known model benchmarks to identify the specific model or model family.
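The cross-referencing step in capability boundary mapping can be sketched as a nearest-profile lookup. This is a minimal illustration, not a definitive method: the family names, capability keys, and scores in `REFERENCE_PROFILES` are made-up placeholders rather than real benchmark results, and `rank_candidates` is a hypothetical helper.

```python
import math

# Hypothetical reference profiles: one score in [0, 1] per capability.
# Names and numbers are illustrative placeholders, NOT real benchmark data.
REFERENCE_PROFILES = {
    "family_a": {"code": 0.9, "math": 0.8, "multilingual": 0.4},
    "family_b": {"code": 0.5, "math": 0.6, "multilingual": 0.9},
    "family_c": {"code": 0.3, "math": 0.2, "multilingual": 0.3},
}


def profile_distance(observed, reference):
    """Euclidean distance over the capabilities both profiles share."""
    shared = set(observed) & set(reference)
    return math.sqrt(sum((observed[k] - reference[k]) ** 2 for k in shared))


def rank_candidates(observed, references=REFERENCE_PROFILES):
    """Return (name, distance) pairs, closest reference profile first."""
    scored = [
        (name, profile_distance(observed, profile))
        for name, profile in references.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Scores measured against the target deployment (also illustrative).
observed_profile = {"code": 0.85, "math": 0.75, "multilingual": 0.5}
ranking = rank_candidates(observed_profile)
```

In this toy setup the first entry of `ranking` is the closest candidate. A profile that sits roughly equidistant from several references is a hint that the capability set is too coarse to separate them.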
Detection
- Monitor for sequences of probing queries that systematically test model capabilities, knowledge boundaries, or architecture details across a short session window.
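- Flag requests that ask the model to self-identify its architecture, training process, or system configuration. Most end users have little need for these details, so repeated requests of this kind are a useful signal, although a single curious question is weak evidence on its own.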
- Track users who send unusual token count compliance tests or structured capability probes that match known fingerprinting patterns.
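The session-window monitoring described above could look roughly like the following. It is a sketch under stated assumptions: the regular expressions in `PROBE_PATTERNS`, the `ProbeDetector` class, and the default window and threshold are all hypothetical and would need tuning against real traffic.

```python
import re
from collections import defaultdict, deque

# Hypothetical probe signatures, loosely based on the Technique prompts.
PROBE_PATTERNS = [
    re.compile(r"knowledge cutoff|training (data )?cutoff", re.I),
    re.compile(r"\bexactly \d+ tokens\b", re.I),
    re.compile(r"what (model|architecture)|which model|model architecture", re.I),
]


class ProbeDetector:
    """Flags a session once enough probe-like queries fall inside a time window."""

    def __init__(self, window_seconds=300.0, threshold=3):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # session_id -> timestamps of queries that matched a probe pattern
        self._hits = defaultdict(deque)

    def observe(self, session_id, query, now):
        """Record one query; return True if the session should be flagged."""
        hits = self._hits[session_id]
        if any(pattern.search(query) for pattern in PROBE_PATTERNS):
            hits.append(now)
        # Drop matches that have aged out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) >= self.threshold
```

Usage: call `observe(session_id, query, time.time())` for each incoming request and route flagged sessions to review or rate limiting. Keyword matching of this kind is easy to evade with paraphrasing, so it is better treated as one input to a broader scoring system than as a standalone control.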
Mitigation
- Configure the system prompt to refuse or provide generic answers to questions about model identity, architecture, training data cutoffs, and internal configuration.
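- Normalize response metadata to reduce side-channel information leakage that aids fingerprinting, for example by padding or adding jitter to response latency and by withholding exact token counts from client-visible responses.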
- Use an abstraction layer between the user and the model that strips or rewrites responses to self-identification queries before they reach the client.
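Two of the mitigations above, the abstraction layer and latency normalization, can be sketched together. This is an illustrative outline only: `IDENTITY_QUERY`, `GENERIC_REPLY`, `filter_response`, and `pad_latency` are hypothetical names, the pattern is deliberately simple, and the padding approach assumes non-streaming responses.

```python
import math
import re
import time

# Hypothetical pattern for self-identification queries.
IDENTITY_QUERY = re.compile(
    r"(knowledge|training)( data)? cutoff"
    r"|what (model|architecture)|which model|model architecture",
    re.I,
)
GENERIC_REPLY = (
    "I'm an AI assistant for this service and can't share details "
    "about the underlying model."
)


def filter_response(query, model_response):
    """Swap in a generic reply when the query asks about model identity."""
    if IDENTITY_QUERY.search(query):
        return GENERIC_REPLY
    return model_response


def pad_latency(started_at, bucket_seconds=0.5,
                clock=time.monotonic, sleep=time.sleep):
    """Sleep until total latency reaches the next multiple of bucket_seconds.

    Rounding latency up to coarse buckets hides fine-grained timing
    differences. Returns the padded target latency in seconds.
    """
    elapsed = clock() - started_at
    buckets = math.ceil(elapsed / bucket_seconds) or 1  # at least one bucket
    target = buckets * bucket_seconds
    sleep(max(0.0, target - elapsed))
    return target
```

Usage: record `started_at = time.monotonic()` before calling the model, pass the result through `filter_response`, then call `pad_latency(started_at)` before returning to the client. Bucketing trades some added latency for a coarser timing signal, and a query-side filter will miss paraphrased or indirect identity probes, so it complements rather than replaces system prompt controls.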