Question 1

What is an AI penetration test?

Accepted Answer

An AI penetration test is an authorized security assessment of AI systems conducted by specialist experts. We simulate real attacks against your Large Language Models (LLMs), ML models, RAG systems, and AI agents - from prompt injection and jailbreaking to data exfiltration and model theft. Unlike traditional pentests, we test not just the infrastructure but the AI logic itself: guardrails, alignment, training integrity, and agent behavior.

Question 2

What types of AI systems do you test?

Accepted Answer

We test all mainstream AI architectures: LLM-based chatbots and copilots (GPT, Claude, Llama, Mistral), RAG systems (Retrieval-Augmented Generation), AI agents with tool access, classical ML models (fraud detection, scoring, diagnostics), multimodal systems (image + text), and the underlying infrastructure (APIs, MLOps pipelines, vector databases). Whether self-hosted or cloud API - the testing approach is individually tailored to your architecture.

Question 3

What is the difference between AI pentesting and AI red teaming?

Accepted Answer

An AI pentest systematically tests your system against known vulnerability classes (OWASP Top 10 for LLMs, MITRE ATLAS). You receive a prioritized list of all findings with reproduction steps. AI red teaming goes further: we simulate creative, realistic attack scenarios over several weeks - including ones not yet covered by any taxonomy. The goal is not just a vulnerability list, but the answer: how far can a motivated attacker get against your AI-powered processes?

Question 4

What is prompt injection and why is it dangerous?

Accepted Answer

Prompt injection is currently the most critical vulnerability in LLM applications (OWASP LLM01). An attacker manipulates input so that the model ignores its system instructions and instead executes the attacker's commands. In direct prompt injection, this occurs via user input; in indirect prompt injection, through poisoned documents or data sources processed by the model (particularly critical in RAG systems). Consequences range from data leakage and reputational damage to remote code execution when the LLM is connected to tools or APIs.

Question 5

Do I need an AI pentest for EU AI Act compliance?

Accepted Answer

Article 15 of the EU AI Act requires high-risk AI systems to demonstrate "an appropriate level of accuracy, robustness and cybersecurity" throughout their lifecycle - including resilience against data poisoning, adversarial attacks, and model manipulation. An AI penetration test provides exactly this evidence. For GPAI model providers, governance obligations have applied since August 2025. Our report is designed as an auditable compliance record and maps all findings to the relevant EU AI Act articles.

Question 6

What is the OWASP Top 10 for LLMs?

Accepted Answer

The OWASP Top 10 for Large Language Model Applications is the international community standard for LLM security, developed by a global community of hundreds of experts. The ten categories of the current version (v2025) include: Prompt Injection, Sensitive Information Disclosure, Supply Chain, Data and Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector and Embedding Weaknesses, Misinformation, and Unbounded Consumption. We use this taxonomy as the methodological basis for every LLM pentest, supplemented by the MITRE ATLAS framework for threat modeling.

Question 7

What is MITRE ATLAS?

Accepted Answer

MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) is the AI-specific extension of the well-known MITRE ATT&CK framework. It documents tactics, techniques, and procedures (TTPs) from real attacks against AI systems - from reconnaissance through model evasion to data exfiltration. We use ATLAS for threat modeling your AI system and structure our red team scenarios along this attack matrix.

Question 8

How does an AI penetration test work at AWARE7?

Accepted Answer

Our process covers five phases: 1) Scoping Workshop - identification of all AI components, threat modeling per MITRE ATLAS, definition of rules of engagement. 2) Reconnaissance - analysis of AI architecture, model endpoints, data pipelines, guardrails, and integrations. 3) Vulnerability Testing - automated scans (Garak, Promptfoo) combined with manual expert analysis for prompt injection, jailbreaking, data exfiltration, guardrail bypass, and agent behavior. 4) Exploitation - confirmation of critical findings with proof-of-concept, chaining into realistic attack scenarios. 5) Reporting - technical report with CVSS scoring, compliance mapping (OWASP, EU AI Act, ISO 42001) and prioritized remediation roadmap.

Question 9

How much does an AI penetration test cost?

Accepted Answer

Costs depend on scope and complexity. A focused LLM pentest (single chatbot/copilot, OWASP Top 10 LLM) starts from EUR 8,100 net. A comprehensive AI security assessment covering multiple models, RAG system, and agent testing starts from EUR 14,850 net. Full AI red teaming over 4-6 weeks starts from EUR 25,650 net. You receive a binding fixed-price quote within 48 business hours - no hourly rates, no additional charges.

Question 10

What is ISO 42001 and do I need it?

Accepted Answer

ISO/IEC 42001 is the international standard for AI management systems - comparable to ISO 27001 for information security, but specific to AI. The standard defines 38 controls in 9 objective categories and enables certification. For organizations deploying AI in regulated sectors (finance, healthcare, critical infrastructure), ISO 42001 is increasingly becoming a differentiator with customers and regulators. An AI pentest provides the technical evidence you need for the controls in ISO 42001.

Question 11

Can you also test AI guardrails?

Accepted Answer

Yes. We systematically test all protective layers of your AI application: content filters, jailbreak detectors, PII masking, output validators, and constitutional classifiers. We assess both bypass resistance (false-negative rate under adversarial conditions) and the false-positive rate (does the guardrail block legitimate usage?). You receive a quantitative evaluation of guardrail effectiveness with concrete hardening recommendations.

Question 12

How often should an AI system be tested?

Accepted Answer

AI systems require more frequent testing than traditional software: models are regularly fine-tuned, RAG content changes daily, agents gain new capabilities - every change can introduce new vulnerabilities without a single line of code being modified. We recommend at minimum one full AI pentest annually, semi-annually for critical systems. For organizations with continuous model update cycles, we offer a retainer model with quarterly tests.

How secure is your
Artificial Intelligence?

AI systems are being attacked - differently from traditional software

What we test

LLM Pentest

RAG System Security

AI Agent Testing

Guardrail Assessment

ML Model Security

AI Infrastructure

Our five-phase process

Scoping & Threat Modeling

Reconnaissance

Vulnerability Testing

Exploitation & PoC

Reporting & Remediation

One test - all evidence

OWASP Top 10 LLM

MITRE ATLAS

EU AI Act

ISO/IEC 42001

NIST AI RMF

NIS-2 / DORA

Transparent pricing

LLM Pentest

AI Security Assessment

AI Red Teaming

Was uns von anderen Anbietern unterscheidet

Forschung und Lehre als Fundament

Digitale Souveränität - keine Kompromisse

Festpreis in 24h - planbare Projektzeiträume

Ihr fester Ansprechpartner - jederzeit erreichbar

OWASP Top 10 for Large Language Models

Management von Cyber-Risiken

Referenzen aus der Praxis

Frequently asked questions about AI penetration testing

AI/LLM Security: Prompt Injection, Jailbreaking und Red Teaming für KI-Systeme

Active Directory Red Team: Kerberoasting, Golden Ticket und DCSync im Pentest

Lateral Movement erkennen und stoppen: Angreifer im Netzwerk abfangen

How secure is your AI really?

AI systems are being attacked - differently from traditional software

What we test

LLM Pentest

RAG System Security

AI Agent Testing

Guardrail Assessment

ML Model Security

AI Infrastructure

Our five-phase process

Scoping & Threat Modeling

Reconnaissance

Vulnerability Testing

Exploitation & PoC

Reporting & Remediation

One test - all evidence

OWASP Top 10 LLM

MITRE ATLAS

EU AI Act

ISO/IEC 42001

NIST AI RMF

NIS-2 / DORA

Transparent pricing

LLM Pentest

AI Security Assessment

AI Red Teaming

Was uns von anderen Anbietern unterscheidet

Forschung und Lehre als Fundament

Digitale Souveränität - keine Kompromisse

Festpreis in 24h - planbare Projektzeiträume

Ihr fester Ansprechpartner - jederzeit erreichbar

OWASP Top 10 for Large Language Models

Management von Cyber-Risiken

Referenzen aus der Praxis

Frequently asked questions about AI penetration testing

Weiterführende Artikel

AI/LLM Security: Prompt Injection, Jailbreaking und Red Teaming für KI-Systeme

Active Directory Red Team: Kerberoasting, Golden Ticket und DCSync im Pentest

Lateral Movement erkennen und stoppen: Angreifer im Netzwerk abfangen

How secure is your AI really?