Disclaimer:
All information provided on this page was developed by TPimenta LAB. Please consult the official documentation for the most accurate and up-to-date information. We are not responsible for any issues, damages, or data loss that may occur from using this information.

USE AT YOUR OWN RISK.
Enterprise Add-on
AI Security for Apps AI Gateway WAF

Firewall for AI & LLMs

Complete reference: WAF-integrated LLM threat detection for prompt injection, PII exposure, and unsafe topics — plus AI Gateway security features including Guardrails, DLP, rate limiting, and caching.

🛈 Naming note: "Firewall for AI" was the original product name. It is now officially called AI Security for Apps in Cloudflare documentation, but the underlying capabilities are identical.

What It Is

AI Security for Apps extends the Cloudflare WAF with detections specifically designed for LLM-powered applications. It is model-agnostic — it works regardless of which LLM you use (OpenAI, Anthropic, Google Gemini, Workers AI, self-hosted, etc.).

It is complemented by AI Gateway, a proxy layer (available on all plans) that adds Guardrails, DLP, rate limiting, caching, and full prompt/response logging between your app and the LLM provider.

Prompt Injection: Detect attackers trying to hijack the LLM's behavior by overriding its system instructions or extracting the system prompt.

PII Detection: Catch users inadvertently or maliciously sending sensitive personal data (SSNs, credit cards, emails) in prompts.

Unsafe Topics: Block prompts covering harmful subjects (violent crimes, hate speech, self-harm, CSAM, weapons) across 14 categories.

Custom Topics: Define up to 20 organization-specific topics (competitors, legal advice, internal HR matters) using zero-shot classification.

Plan Availability

Requires WAF enabled on the zone. AI detection fields require Enterprise plan + paid add-on. Contact your account team to enable. AI Gateway is available on all plans at no cost.
Capability | Free | Pro | Business | Enterprise
LLM endpoint discovery (cf-llm auto-label) | Yes | Yes | Yes | Yes
AI Security Log Mode Ruleset (full prompt logging) | No | No | No | Paid add-on
AI detection fields: PII, injection score, unsafe topics, custom topics | No | No | No | Paid add-on
AI Gateway (rate limiting, caching, logging, Guardrails, DLP) | Yes | Yes | Yes | Yes

Architecture & Traffic Flow

Cloudflare's AI security is a layered stack. Requests pass through WAF-level LLM detection first, then optionally through AI Gateway before reaching the LLM provider.

User / Client (Browser or API Consumer)
        |
        v
┌─────────────────────────────────────────────────────────────┐
│ Cloudflare Edge (anycast, nearest data center)              │
│                                                             │
│ 1. LLM Discovery ─→ heuristics ─→ cf-llm label              │
│ 2. AI Detection Engine (Enterprise add-on)                  │
│    ├── PII Detection (fuzzy AI + regex)                     │
│    ├── Prompt Injection (score 1–99)                        │
│    └── Unsafe / Custom Topics                               │
│ 3. WAF Rule Evaluation (cf.llm.* fields populated)          │
│    ├── Log / Monitor ──→ Security Analytics                 │
│    └── Mitigate ──→ Block / Challenge / Rate Limit          │
└─────────────────────────────────────────────────────────────┘
        |
        v
┌─────────────────────────────────────────────────────────────┐
│ AI Gateway (all plans)                                      │
│ ├── Authenticated Gateway (cf-aig-authorization header)     │
│ ├── Rate Limiting (fixed / sliding window)                  │
│ ├── DLP (Beta) (prompt + response scanning)                 │
│ ├── Guardrails (Beta) (content moderation)                  │
│ ├── Caching (identical prompt→response)                     │
│ └── Dynamic Routing (model fallback / retry)                │
└─────────────────────────────────────────────────────────────┘
        |
        v
LLM Provider (OpenAI / Anthropic / Workers AI / Gemini …)
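Steps 2–3 (AI detection and WAF rule evaluation on cf.llm.* fields) require the Enterprise paid add-on; step 1 (LLM endpoint discovery) is available on all plans, as listed under Plan Availability. AI Gateway is available on all plans and can be used independently of the WAF add-on.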

LLM Endpoint Discovery

Cloudflare automatically detects LLM endpoints using traffic heuristics — no manual configuration required. Once detected, endpoints are labeled cf-llm via API Shield, enabling filtering in Security Analytics and scoping of WAF rules.

How Heuristics Work

Signal | Detail
Response time | LLM endpoints typically respond in >1 second
Effective bitrate | 80% of LLM endpoints operate at <4 KB/s (streaming tokens)
False positive filtering | GraphQL endpoints, device heartbeats, and QR/OTP generators are filtered out automatically
You can also manually apply the cf-llm label to specific endpoints via Security > Web Assets > Endpoints > Edit endpoint labels, or bulk apply via API Shield's endpoint management API.
AI Security for Apps currently only scans requests with Content-Type: application/json. Non-JSON LLM requests are not scanned.
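As a rough illustration of how the two traffic signals in the table could combine, the Python sketch below flags an endpoint as a likely LLM endpoint and checks whether a request would be scanned. The thresholds come from the table above; the `EndpointStats` type, the function names, the choice of 4 KB = 4096 bytes, and the rule that both signals must hold are assumptions made for illustration, not Cloudflare's actual classifier.

```python
from dataclasses import dataclass


@dataclass
class EndpointStats:
    """Hypothetical per-endpoint traffic summary (not a Cloudflare type)."""
    avg_response_seconds: float
    avg_bytes_per_second: float


def looks_like_llm_endpoint(stats: EndpointStats) -> bool:
    # Signal 1: LLM endpoints typically respond in more than 1 second.
    slow_response = stats.avg_response_seconds > 1.0
    # Signal 2: streamed tokens give a low effective bitrate (< 4 KB/s).
    low_bitrate = stats.avg_bytes_per_second < 4 * 1024
    # Assumption: require both signals before suggesting the cf-llm label.
    return slow_response and low_bitrate


def is_scanned(content_type: str) -> bool:
    # Only JSON requests are scanned; ignore parameters such as charset.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/json"
```

A chat endpoint averaging 2.5 s and 1.5 KB/s would be flagged, while a fast static-asset endpoint would not; a `multipart/form-data` upload to the same chat endpoint would be labeled but not scanned.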

PII Detection

Two complementary approaches that can be combined for layered protection:

Fuzzy Detection (AI-powered)

Uses Microsoft Presidio to detect PII even in natural language or unexpected formats. Supports 40+ categories:

CREDIT_CARD, US_SSN, US_PASSPORT, US_DRIVER_LICENSE, EMAIL_ADDRESS, PHONE_NUMBER, IP_ADDRESS, IBAN_CODE, PERSON, LOCATION, DATE_TIME, URL, IN_AADHAAR, UK_NHS, AU_TFN, SG_NRIC_FIN, plus 24 more.
🚨
Never block on cf.llm.prompt.pii_detected alone. Broad categories like PERSON, DATE_TIME, and LOCATION appear in normal conversation and will generate large numbers of false positives. Always filter by specific categories using cf.llm.prompt.pii_categories.

Exact Detection (Regex)

Use WAF custom rules with http.request.body.raw matches "PATTERN" for organization-specific formats:

Custom PII type | Example format | Regex pattern
Employee ID | EMP-482910 | EMP-[0-9]{6}
Patient record number | PAT/2024/00391 | PAT/[0-9]{4}/[0-9]{5}
Internal account ID | ACCT-XX-99999 | ACCT-[A-Z]{2}-[0-9]{5}
Custom API key prefix | sk_live_abc123... | sk_live_[a-zA-Z0-9]{20,}
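Before deploying a pattern in a WAF rule, it can help to check it locally against sample request bodies. The sketch below does that with Python's `re` module, using the four patterns from the table. The dictionary keys and the `find_custom_pii` helper are hypothetical names, and Python's regex engine is only an approximation of the one the WAF uses, so treat this as a sanity check rather than proof of identical behavior.

```python
import re

# Patterns copied from the table above; the keys are illustrative names.
CUSTOM_PII_PATTERNS = {
    "employee_id": r"EMP-[0-9]{6}",
    "patient_record": r"PAT/[0-9]{4}/[0-9]{5}",
    "internal_account": r"ACCT-[A-Z]{2}-[0-9]{5}",
    "api_key": r"sk_live_[a-zA-Z0-9]{20,}",
}


def find_custom_pii(body: str) -> list[str]:
    """Return the names of all custom PII patterns found in a raw body."""
    return sorted(
        name
        for name, pattern in CUSTOM_PII_PATTERNS.items()
        if re.search(pattern, body)
    )
```

For example, `find_custom_pii('{"prompt": "my badge is EMP-482910"}')` reports `employee_id`, while a body containing only `EMP-4829` reports nothing because the pattern requires six digits.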

Prompt Injection Detection

Score-based system using the cf.llm.prompt.injection_score field. Range: 1–99. Lower score = higher injection risk.

Scale: 1 (most dangerous) to 99 (safest). Typical actions along the scale, from low to high score: Block, Challenge, Allow.

Score range | Risk level | Recommended action
1–19 | High: strongly resembles known injection patterns | Block
20–49 | Moderate: some injection characteristics, may be ambiguous | Challenge or Log
50–99 | Low: likely safe, normal user input | Allow

A score-based approach is used rather than a binary result because injection exists on a spectrum. A creative writing request may superficially resemble an injection attempt without actually being one.

💡
Start with a Log action at threshold lt 40. Review results in Security Analytics, then tune down to lt 30 or lt 20 based on actual false positive rates before switching to Block.
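The score bands above translate into a simple threshold check. The following Python sketch maps an injection score to the recommended action from the table; the function name and the default thresholds (20 and 50, taken from the band boundaries) are illustrative, and in practice the thresholds would be tuned from log review as described above.

```python
def recommended_action(
    injection_score: int,
    block_below: int = 20,
    challenge_below: int = 50,
) -> str:
    """Map a 1-99 injection score to an action (lower score = higher risk)."""
    if not 1 <= injection_score <= 99:
        raise ValueError("injection score must be in the range 1-99")
    if injection_score < block_below:
        return "block"
    if injection_score < challenge_below:
        return "challenge"
    return "allow"
```

Lowering `block_below` from 20 to 15, for example, narrows the blocked band and is the kind of adjustment a false positive review might suggest.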

Unsafe & Custom Topic Detection

Predefined Unsafe Topics (14 categories)

Category | Description
S1 | Violent crimes
S2 | Non-violent crimes
S3 | Sex-related crimes
S4 | Child sexual exploitation
S5 | Defamation
S6 | Specialized advice
S7 | Privacy
S8 | Intellectual property
S9 | Indiscriminate weapons
S10 | Hate
S11 | Suicide and self-harm
S12 | Sexual content
S13 | Elections
S14 | Code interpreter abuse

Custom Topic Detection

Define up to 20 custom topics using zero-shot classification — no model training required. Each topic has a label (used in rules) and a topic string (used by the AI classifier). Scores follow the 1–99 scale (lower = more relevant to the topic).

Constraints

Parameter | Limit
Maximum number of topics | 20
Topic string length | 2–50 printable ASCII characters
Label length | 2–20 characters
Label format | Lowercase letters, numbers, and hyphens only
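These constraints are easy to check client-side before calling the API. The sketch below is one possible validator written from the limits in the table; `validate_topics` and the error message wording are hypothetical, and it assumes that "printable ASCII" means the characters from space (0x20) through tilde (0x7E).

```python
import re

MAX_TOPICS = 20
# Label: 2-20 characters, lowercase letters, digits, and hyphens only.
LABEL_RE = re.compile(r"[a-z0-9-]{2,20}")
# Topic string: 2-50 printable ASCII characters (assumed to be 0x20-0x7E).
TOPIC_RE = re.compile(r"[\x20-\x7e]{2,50}")


def validate_topics(topics: list[dict]) -> list[str]:
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    if len(topics) > MAX_TOPICS:
        errors.append(f"too many topics: {len(topics)} > {MAX_TOPICS}")
    for entry in topics:
        label = entry.get("label", "")
        topic = entry.get("topic", "")
        if not LABEL_RE.fullmatch(label):
            errors.append(f"invalid label: {label!r}")
        if not TOPIC_RE.fullmatch(topic):
            errors.append(f"invalid topic string for label {label!r}")
    return errors
```

Running the validator over a topic list before each update catches overlong topic strings, which are easy to write by accident when using descriptive verb phrases.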

Define Custom Topics via API

curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/firewall-for-ai/custom_topics" \
  --request PUT \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --json '{
    "topics": [
      { "label": "competitors", "topic": "asking about competitor products and pricing" },
      { "label": "legal-advice", "topic": "asking for legal counsel or regulatory guidance" },
      { "label": "hr-internal", "topic": "internal HR policies and employee matters" }
    ]
  }'
This PUT request replaces your entire topic list. Always include all topics you want to keep, not just new ones.
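Because the PUT replaces the whole list, a safe update merges the topics you already have with the ones you are adding and sends the combined result. The Python sketch below builds such a request body. `build_put_body` is a hypothetical helper, and it assumes you already hold the current topic list (for example from your own configuration file); it does not call the Cloudflare API.

```python
import json

MAX_TOPICS = 20


def build_put_body(existing: list[dict], new: list[dict]) -> str:
    """Merge topic lists by label and return the JSON body for the PUT."""
    merged = {entry["label"]: entry for entry in existing}
    # A new entry with an existing label overwrites the old topic string.
    merged.update({entry["label"]: entry for entry in new})
    if len(merged) > MAX_TOPICS:
        raise ValueError(f"merged list has {len(merged)} topics, limit is {MAX_TOPICS}")
    return json.dumps({"topics": list(merged.values())})
```

The returned string can be passed as the `--json` argument of the curl command above, so topics that were not mentioned in the update are preserved rather than silently dropped.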

Topic String Best Practices

Style | Example | Verdict
Verb phrase (recommended) | asking for investment advice | Best precision
Sentence-like | a user seeking financial guidance | Good
Noun phrase | investment advice | Acceptable
Single keyword | finance | Too broad
Vague phrase | bad things | Ineffective

Example Custom Topics

Label | Topic string | Use case
competitors | asking about Acme Corp products and pricing | Block chatbot discussing rival offerings
legal-advice | asking for legal or regulatory compliance advice | Block prompts soliciting legal advice
student-data | requesting student personal or academic records | EdTech: prevent student data exposure
crypto-advice | asking for crypto trading or investment advice | FinTech: block crypto investment tips
exec-internal | discussing internal executive decisions or changes | Prevent internal matter leakage

Each topic string above stays within the 50-character limit listed under Constraints.

Detection Fields Reference

These fields are populated by AI Security for Apps on requests hitting cf-llm labeled endpoints and can be used in WAF Custom Rules and Rate Limiting Rules.

Field | Type | Description
cf.llm.prompt.detected | Boolean | LLM prompt was detected in the request
cf.llm.prompt.pii_detected | Boolean | Any PII found in the prompt; do not block on this alone
cf.llm.prompt.pii_categories | Array<String> | PII types found (CREDIT_CARD, US_SSN, EMAIL_ADDRESS, etc.)
cf.llm.prompt.injection_score | Number (1–99) | Injection likelihood; lower = more dangerous
cf.llm.prompt.unsafe_topic_detected | Boolean | Any predefined unsafe topic detected in prompt
cf.llm.prompt.unsafe_topic_categories | Array<String> | Which unsafe categories were detected (S1–S14)
cf.llm.prompt.custom_topic_categories | Map<Number> | Custom topic relevance scores by label (1–99, lower = more relevant)
cf.llm.prompt.token_count | Number | Estimated token count of the prompt; useful for cost-based rate limiting

Log Mode vs Production Mode

Feature | Log Mode | Production Mode
How it works | Pre-built managed ruleset | Custom WAF rules using detection fields
Prompt logging | Yes (encrypted payload logging) | No (metadata only)
Response logging | No | No (use AI Gateway)
Policy flexibility | Limited (3 fixed rules) | Full (scores, categories, combined signals)
Blocking behavior | Default WAF block page | Fully customizable responses
Best for | Evaluation and threshold tuning | Production enforcement

Enable Log Mode via API

curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/rulesets/phases/http_request_firewall_managed/entrypoint" \
  --request PUT \
  --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --json '{
    "rules": [{
      "action": "execute",
      "action_parameters": { "id": "b7cd52df92f74c848cec0c2ed385e336" },
      "expression": "true"
    }]
  }'

Enable via Dashboard

Security > Settings > AI Security for Apps > Managed Ruleset > Enable
Action: Log
Configure payload logging to allow decryption of prompts in Security Analytics

Setup Steps — AI Security for Apps (WAF)

  1. Enable AI Security for Apps. Dashboard: Security > Settings > filter by "Detection tools" > toggle AI Security for Apps to On.

    Or via API:
    curl "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/firewall-for-ai/settings" \
      --request PUT \
      --header "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      --json '{ "pii_detection_enabled": true }'
  2. Label LLM endpoints with cf-llm. Auto-discovery handles this in most cases. To apply the label manually: Security > Web Assets > Endpoints > Edit endpoint labels > cf-llm.
  3. Enable the AI Security Log Mode Ruleset. Turn it on with action = Log. Enable payload logging so you can decrypt and read actual prompts in Security Analytics.
  4. Review detections in Security Analytics. Filter by the cf-llm label. Decrypt payloads. Note injection scores, PII categories, and unsafe topic rates across your traffic baseline.
  5. Define custom topics (if needed). Use the dashboard or API to define up to 20 custom topics. Use verb-phrase topic strings for best precision.
  6. Build production custom rules. Create WAF Custom Rules using cf.llm.* fields with tuned thresholds. Start with the Log action to validate before blocking.
  7. Switch to Block and iterate. Once rules are validated, change actions to Block. Monitor continuously and adjust thresholds as traffic patterns evolve.

Example WAF Rules

Block High-Confidence Prompt Injection

(cf.llm.prompt.injection_score lt 20)

Challenge Moderate-Risk Injection

(cf.llm.prompt.injection_score lt 40)

Block Specific PII Categories Only

(any(cf.llm.prompt.pii_categories[*] in {"CREDIT_CARD" "US_SSN"}))

Log Emails — Block Credit Cards and SSNs

Rule 1 — Block:

(any(cf.llm.prompt.pii_categories[*] in {"CREDIT_CARD" "US_SSN"}))

Rule 2 — Log:

(any(cf.llm.prompt.pii_categories[*] in {"EMAIL_ADDRESS"}))
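To reason about how this two-rule policy treats a given prompt, it can be modeled outside the WAF. The Python sketch below mirrors the two expressions above; `pii_action` and the category sets are illustrative names, and the model is simplified in that it returns a single outcome per request, whereas in the WAF a Log rule records the event and evaluation continues to later rules.

```python
# Category sets taken from Rule 1 (Block) and Rule 2 (Log) above.
BLOCK_CATEGORIES = {"CREDIT_CARD", "US_SSN"}
LOG_CATEGORIES = {"EMAIL_ADDRESS"}


def pii_action(pii_categories: list[str]) -> str:
    """Return the strongest outcome for the detected PII categories."""
    found = set(pii_categories)
    if found & BLOCK_CATEGORIES:
        return "block"  # Rule 1: any high-risk category is present
    if found & LOG_CATEGORIES:
        return "log"    # Rule 2: email address only
    return "allow"      # broad categories such as PERSON are ignored
```

A prompt with `["PERSON", "EMAIL_ADDRESS"]` is only logged, and one with `["PERSON", "DATE_TIME"]` passes untouched, which is the false positive reduction that category filtering is meant to achieve.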

Block Specific Unsafe Topics

(any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))

Layered — Injection + Bot Score + Geo (Low False Positives)

(cf.llm.prompt.injection_score lt 25
  and cf.bot_management.score lt 10
  and ip.geoip.country ne "US")

Block Injection from Automated Sources

(cf.llm.prompt.injection_score lt 30 and cf.bot_management.score lt 20)

Combined — Injection + PII (Common Attack Pattern)

(cf.llm.prompt.injection_score lt 40 and cf.llm.prompt.pii_detected)

Block Custom Topic (Competitors)

(cf.llm.prompt.custom_topic_categories["competitors"] lt 30)

Scope Rule to Specific Endpoint + Custom PII Format

(http.request.uri.path eq "/api/chat"
  and http.request.body.raw matches "EMP-[0-9]{6}")

Rate Limit High-Token Prompts (Cost Control)

(cf.llm.prompt.token_count gt 2000
  and http.request.uri.path eq "/api/chat")

Allow Financial PII Only from Internal Network

(any(cf.llm.prompt.pii_categories[*] in {"CREDIT_CARD" "IBAN_CODE"})
  and not ip.src in {10.0.0.0/8 172.16.0.0/12 192.168.0.0/16})

AI Gateway — Complementary Security Layer

AI Gateway is available on all Cloudflare plans at no cost. It sits between your app and LLM providers, adding security, observability, and performance features that the WAF layer cannot provide (e.g., response scanning, caching, model fallback).

Guardrails (Beta): Content moderation on prompts and model responses; block or flag by hazard category.

DLP (Beta): Scan prompts and responses for PII, credentials, source code, jailbreak intent, and financial data.

Rate Limiting: Fixed or sliding window limits per gateway. Returns 429 when exceeded.

Caching: Cache identical prompt→response pairs. The cf-aig-cache-status response header reports HIT or MISS.

Dynamic Routing: Model fallback and request retry; define alternate providers in JSON config.

Authenticated Gateway: Require the cf-aig-authorization header, which prevents direct bypass.

Logging: Full prompt/response logging with conversation_id for audit trail reconstruction.

Cost Analytics: Token counts and estimated costs per provider, visible in AI Gateway Analytics.

AI Gateway — Guardrails

Guardrails evaluate user prompts before they reach the LLM and model responses before they reach the user, checking both for harmful content. The feature works as a proxy between your application and model providers.

Configuration Options

Setting | Description
Evaluation scope | User prompts only, model responses only, or both
Hazard categories | Select which categories to monitor, e.g., violence, hate, self-harm
Action per category | Block (return 400) or Flag (log but allow through)
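The per-category action setting can be pictured as a small policy lookup. The Python sketch below models it under stated assumptions: the category names are the examples from the table, `evaluate_guardrails` and the returned dictionary shape are hypothetical, and `None` stands for "no Guardrails error, the request passes through". It is not the AI Gateway implementation.

```python
def evaluate_guardrails(detected: set[str], policy: dict[str, str]) -> dict:
    """Apply a {category: "block" | "flag"} policy to detected categories."""
    blocked = sorted(c for c in detected if policy.get(c) == "block")
    flagged = sorted(c for c in detected if policy.get(c) == "flag")
    # Block returns HTTP 400; flag only logs and lets the content through.
    status = 400 if blocked else None
    return {"status": status, "blocked": blocked, "flagged": flagged}
```

With a policy of `{"violence": "block", "hate": "flag"}`, content flagged only for `hate` is logged and allowed, while content that also triggers `violence` is rejected with a 400.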

Setup Steps

  1. Dashboard → AI → AI Gateway → select your gateway → Settings
  2. Enable Guardrails
  3. Set evaluation scope: user prompts, model responses, or both
  4. Select hazard categories to monitor and set action per category (block or flag)

AI Gateway — Data Loss Prevention (DLP)

AI Gateway DLP uses the same detection engine as Cloudflare's enterprise DLP product to scan AI traffic in real-time. It scans both incoming prompts and outgoing model responses.

DLP Detection Categories

Type | What It Detects
Content: PII | Names, SSNs, email addresses in prompts
Content: Credentials & Secrets | API keys, passwords, tokens, connection strings
Content: Source Code | Code snippets, algorithms, proprietary logic
Content: Customer Data | Customer names, projects, confidential business context
Content: Financial Information | Financial numbers, confidential business data
Intent: PII | Prompt requesting specific personal information about individuals
Intent: Code Abuse | Prompt requesting malicious code, exploits, or attack tools
Intent: Jailbreak | Prompt attempting to circumvent AI safety policies
Bidirectional scanning: Enable DLP on both prompts and responses. LLM responses can leak PII from training data — scanning only prompts leaves you exposed on the response side.

AI Gateway — Setup

Create a Gateway

  1. Dashboard → AI → AI Gateway → Create Gateway
  2. Name your gateway (64 character limit)
  3. Connect your application by routing AI provider calls through the gateway URL
  4. Configure settings: Authentication, Rate Limiting, Caching, Guardrails, DLP
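Step 3 amounts to swapping the provider's base URL for the gateway URL and, on an authenticated gateway, adding the cf-aig-authorization header. The Python sketch below builds the URL and headers without sending anything. The URL pattern (`gateway.ai.cloudflare.com/v1/<account>/<gateway>/<provider>/...`) and the Bearer token format reflect the AI Gateway documentation as best recalled and should be verified against the URL shown in your dashboard; `gateway_request` and all argument values are hypothetical.

```python
def gateway_request(
    account_id: str,
    gateway_id: str,
    provider: str,
    path: str,
    provider_key: str,
    gateway_token: str,
) -> tuple[str, dict[str, str]]:
    """Build the URL and headers for a provider call routed via AI Gateway."""
    url = (
        "https://gateway.ai.cloudflare.com/v1/"
        f"{account_id}/{gateway_id}/{provider}/{path.lstrip('/')}"
    )
    headers = {
        "Content-Type": "application/json",
        # Provider credential, passed through to the LLM provider.
        "Authorization": f"Bearer {provider_key}",
        # Gateway credential, required when Authenticated Gateway is on.
        "cf-aig-authorization": f"Bearer {gateway_token}",
    }
    return url, headers
```

The resulting pair can be handed to any HTTP client; the application code that builds the JSON body for the provider stays unchanged.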

Enable DLP

  1. Select your gateway → Firewall tab
  2. Toggle Data Loss Prevention (DLP) to On
  3. Add DLP policies — select detection entries and set action (block / log)

Enable Rate Limiting

  1. Select your gateway → Settings
  2. Enable Rate-limiting
  3. Set rate, time period, and strategy (fixed window or sliding window)
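The two strategies differ in how they treat bursts around a window boundary. The Python sketch below implements both in a minimal form so the difference can be seen directly; the class names are illustrative, timestamps are passed in explicitly to keep the example deterministic, and this is a generic model of the two algorithms rather than AI Gateway's implementation. A `False` result corresponds to the gateway returning 429.

```python
from collections import deque


class FixedWindowLimiter:
    """Allow up to `limit` requests per aligned window of `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.current_window = None
        self.count = 0

    def allow(self, now: float) -> bool:
        window_id = int(now // self.window)
        if window_id != self.current_window:
            # A new window started: reset the counter.
            self.current_window = window_id
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


class SlidingWindowLimiter:
    """Allow up to `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self.timestamps = deque()

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the trailing window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)
        return True
```

With a limit of 2 requests per 10 seconds and requests at t = 9, 9.5, 10, and 10.5, the fixed window admits all four because the counter resets at t = 10, while the sliding window rejects the last two. Sliding windows are therefore the stricter choice when bursts at window edges matter.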

Enable Caching

  1. Select your gateway → Settings
  2. Enable Cache Responses
  3. Set default cache TTL. Override per-request by passing cf-aig-cache-ttl header
Check cf-aig-cache-status: HIT or MISS in response headers to verify caching behavior. Currently caching applies only to identical requests with text or image responses.
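A client can set the per-request TTL and verify the cache result with two small helpers. The Python sketch below uses the two header names mentioned above; the helper names are hypothetical, and it assumes the TTL value is expressed in seconds, which should be confirmed against the AI Gateway documentation.

```python
def cache_request_headers(ttl_seconds: int) -> dict[str, str]:
    """Headers that override the gateway's default cache TTL for one request."""
    if ttl_seconds < 0:
        raise ValueError("TTL must not be negative")
    # Assumption: cf-aig-cache-ttl takes a number of seconds.
    return {"cf-aig-cache-ttl": str(ttl_seconds)}


def was_cache_hit(response_headers: dict[str, str]) -> bool:
    """True if the gateway reported that the response came from cache."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    return normalized.get("cf-aig-cache-status", "").strip().upper() == "HIT"
```

Sending the same prompt twice with `cache_request_headers(3600)` and checking `was_cache_hit` on the second response is a quick way to confirm that caching is active for a gateway.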

OWASP Top 10 for LLMs — Coverage Map

OWASP LLM Risk | Cloudflare Feature | WAF Field / Tool
LLM01 Prompt Injection | AI Security for Apps: Injection Detection | cf.llm.prompt.injection_score
LLM02 Sensitive Info Disclosure | AI Security for Apps: PII Detection + AI Gateway DLP | cf.llm.prompt.pii_categories + DLP profiles
LLM06 Excessive Agency / Misuse | WAF Rate Limiting + AI Gateway Rate Limiting | Rate limiting rules + cf.llm.prompt.token_count
LLM08 Vector and Embedding Weaknesses | AI Gateway Guardrails (response scanning) | Guardrails hazard categories
LLM09 Misinformation / Unsafe Output | AI Security for Apps: Unsafe Topic Detection + Guardrails | cf.llm.prompt.unsafe_topic_categories
Jailbreak Policy Bypass | AI Gateway DLP: Intent: Jailbreak + Injection Score | DLP intent detection + injection score < 20

Recommended Deployment Workflow

1. Label endpoints
Apply cf-llm label via API Shield (auto-discovery + manual)
2. Enable Log Mode
Turn on AI Security Log Mode Ruleset — action = Log. Enable payload logging.
3. Review Security Analytics
Decrypt payloads, correlate prompts with scores. Understand baseline traffic patterns.
4. Define custom topics
Add org-specific topics via API using verb-phrase topic strings.
5. Build custom rules (Log action)
Create WAF Custom Rules with tuned thresholds. Keep on Log to validate.
6. Switch to Block
Once validated, change custom rule actions to Block. Disable Log Mode or keep for monitoring.
7. Set up AI Gateway
Add Guardrails, DLP (prompt + response), rate limiting, caching, and authenticated access.
8. Monitor and iterate
Continuously review Security Analytics. Adjust thresholds and topic strings as needed.
Custom rules (evaluated earlier in the pipeline) run before the managed ruleset. Set custom rules to Log during the transition period to run both modes in parallel before committing to Block.

Optimization Best Practices

Security

Practice | Reason
Never block on pii_detected alone | Generates massive false positives: PERSON, DATE_TIME, and LOCATION appear in normal conversation
Start injection threshold at lt 30 | lt 50 is too aggressive; tune based on log review before switching to Block
Use verb-phrase topic strings | "asking for financial advice" is far more precise than "financial advice" and avoids passive-mention false positives
Layer injection score + bot score + geo | Each signal alone may produce false positives; combined they identify high-confidence attack patterns
Enable bidirectional DLP | LLM responses can leak PII from training data; scanning only prompts leaves you exposed on the output side
Use Log Mode + payload logging first | Lets you see actual prompts alongside detection scores before enforcing blocking
Scope rules to specific URI paths | Avoids unnecessary scanning of non-LLM endpoints and reduces false positives
Avoid semantically overlapping topics | "financial advice" and "investment guidance" cover the same thing, which wastes your 20-topic budget
Authenticate your AI Gateway | Require the cf-aig-authorization header to prevent direct bypass of AI Gateway controls

Cost & Performance

Practice | Reason
Enable caching in AI Gateway | Identical prompts return cached responses, a major cost saving for support bots with limited prompt options
Use token_count for rate limiting | High-token prompts are expensive; limit them to control LLM inference costs
Set rate limits at gateway level | Prevents runaway costs from abuse or application bugs
Use dynamic routing / model fallback | Increases resilience without manual intervention on provider downtime
Monitor token usage in analytics | AI Gateway tracks token counts and estimated costs per provider, so expensive patterns can be identified early

Operational Visibility

Practice | Reason
Use conversation_id in logs | Reconstruct full interaction context during incident investigation by filtering on the ID in Gateway logs
Enable encrypted payload logging | Log full prompts securely and decrypt only when needed for forensic review, protecting user data at rest
Review Security Overview alerts | Suspicious AI traffic is automatically surfaced; set up alerts for anomalous spikes
Monitor and iterate continuously | Threat patterns and traffic baselines evolve, so static thresholds degrade in precision over time

Documentation Links