Back to Projects

AI Security Scanner

AI-powered API security scanner with a 5-stage pipeline: OWASP ZAP passive spider → concurrent endpoint discovery (sitemap + 150 common paths, async httpx) → custom rule checks (missing auth, SQLi, open admin, headers, sensitive data, prompt injection) → GPT-4 AI risk analysis → JSON + Markdown report. Redis queue for background deep scanning via ai-sec-worker.

Python 3.11+httpxtyperrichOWASP ZAPGPT-4 / OpenAIRedispython-dotenvruffpytestpytest-asyncioPoetryGitHub ActionsMakefile

Role

Security Engineer

Team

Solo

Company/Organization

Personal Project

The Problem

Security teams scanning APIs and web targets needed a unified workflow combining passive recon, rule-based checks, and AI-driven analysisbut no...

OWASP ZAP alone only covers passive/active scanning patterns but misses application-level logic issues like missing authentication on sensitive paths...

Custom rule-based scanners (manual scripts) are inconsistent, miss nuanced risks, and produce unstructured output that's hard to triage. No...

Discovered endpoints needed deep scanning without blocking the main scan pipelineno background worker queue meant sequential processing and slow...

GPT-4 had no integration path into security tooling for flagging risks that rule-based systems miss (e.g., publicly accessible database schema files,...

The Solution

Built a modular Python CLI security scanner with a 5-stage pipeline and optional background worker.

CLI Entry Point (ai_sec_scan/cli.py)

`ai-sec-scan scan <target>`orchestrates the full 5-stage pipeline with rich terminal output.

`ai-sec-worker`runs the Redis background worker for deep scanning queued endpoints.

typer for CLI definition, rich for colored terminal output with progress indicators.

Stage 1 — OWASP ZAP Passive Spider (scanner/zap_scanner.py)

Connects to ZAP daemon via python-owasp-zap-v2.4 API client.

Runs passive spider on target URL, waits for completion.

Fetches ZAP alerts (Content-Security-Policy missing, directory browsing, anti-CSRF, vulnerable JS libs, insecure cookies, server version leak).

Gracefully skips if ZAP is not runningscanner continues without ZAP findings.

Stage 2 — Endpoint Discovery (scanner/discovery.py)

Fetches and parses sitemap.xml for known URLs.

Concurrently probes 150 common paths (/admin, /api, /login, /config, /debug, etc.) using httpx async with configurable concurrency.

Returns list of live endpoints (HTTP 200/301/302 responses).

Configurable timeout and headers in config.py.

Stage 3 — Custom Rule Checks (checks/)

basic_checks.py — Missing authentication detection (checks if sensitive paths return 200 without auth headers), SQL injection signal detection...

advanced_checks.py — Security headers check (X-Frame-Options, X-Content-Type-Options, Strict-Transport-Security, Content-Security-Policy),...

Each check returns list of findings with severity (HIGH/MEDIUM/LOW) and endpoint URL.

Stage 4 — AI Analysis (llm/analyzer.py)

GPT-4o-mini reads raw HTTP responses from discovered endpoints.

Prompt instructs GPT-4 to identify nuanced risks: publicly accessible files, misconfigured access controls, information disclosure, logic flaws not...

Returns AI findings with severity and detailed reason text.

Skips gracefully if OPENAI_API_KEY not set.

Stage 5 — Report Generation (reports/generator.py)

Aggregates findings from all stages (ZAP + custom + AI) into unified structure.

Writes security_report.json (machine-readable, all findings with severity/endpoint/reason).

Writes security_report.md (human-readable Markdown with tables and finding details).

Reports total finding count and endpoint count queued.

Redis Background Worker (scanner/queue.py + scanner/worker.py)

scanner/queue.py`enqueue(endpoint)` pushes discovered endpoints to Redis list. `dequeue()` pops for processing.

scanner/worker.pyai-sec-worker polls Redis queue, deep-scans each endpoint (full checks pipeline), appends findings to report.

Optional: scanner runs without Redis, endpoints just not queued.

Config (config.py)

Global timeout, default headers (User-Agent, Accept), ZAP API URL, Redis connection settings.

OPENAI_API_KEY, REDIS_HOST, REDIS_PORT loaded from .env via python-dotenv.

Makefile Automation (10 commands)

`make install`Poetry install all deps including dev.

`make lint` / `make lint-fix`ruff check / auto-fix.

`make format`ruff format.

`make test` / `make test-cov`pytest / pytest with coverage.

`make ci`full lint + test pipeline.

`make scan TARGET=<url>`run scanner against target.

`make worker`start background worker.

`make clean`remove build artifacts.

GitHub Actions CI (.github/workflows/ci.yml)

Runs on every push and PR to main/master: 1.

Lintruff check on Python 3.11.

Testpytest across Python 3.11 and 3.12.

Design Decisions

5-stage pipeline with graceful degradationZAP and Redis are optional; if ZAP is not running or Redis is unavailable, the scanner skips those...

Separated checks into basic_checks.py (auth/SQLi/admin) and advanced_checks.py (headers/sensitive data/prompt injection)keeps each file focused...

Redis queue for background deep scanningmain scan pipeline stays fast; discovered endpoints are queued and processed asynchronously by...

GPT-4 reads raw HTTP responses rather than structured dataLLM can identify patterns humans and rules miss (publicly accessible schema files,...

typer + rich for CLItyper provides clean command definition with type hints and --help generation; rich enables colored, structured terminal...

python-owasp-zap-v2.4 API client over ZAP REST directlyofficial Python client handles ZAP API authentication, polling, and response parsing,...

Unified JSON + Markdown report outputJSON for programmatic consumption (CI/CD integration, SIEM ingestion), Markdown for human review in GitHub...

Poetry for dependency managementlockfile ensures reproducible installs across environments; pyproject.toml separates dev dependencies (ruff,...

ruff over flake8 + blacksingle tool for both linting and formatting, 10-100x faster than legacy tools, single configuration in pyproject.toml.

MIT licenseenables security researchers, students, and teams to use, modify, and integrate the scanner without restrictions.

Tradeoffs & Constraints

GPT-4 analysis costeach scan calls OpenAI API for every discovered endpoint. Cost scales with endpoint count. Mitigated by optional OPENAI_API_KEY...

ZAP daemon must be running separatelyrequires user to start ZAP with `zap.sh -daemon` before scanning. Automated ZAP startup would add Docker...

Rule-based checks are heuristicSQL injection detection looks for error message patterns, not actual injection confirmation. False positives...

Concurrent endpoint probing may trigger rate limiting or WAF blocks on targetconfigurable concurrency helps but aggressive scanning can still be...

Redis worker processes endpoints sequentiallyparallel workers would improve throughput but require distributed locking to avoid duplicate...

Would improve: Active injection probing mode (send crafted payloads for confirmed SQLi/XSS), authenticated scanning (session cookie/API key...

Outcome & Impact

Production CLI security scanner running a unified 5-stage pipeline (ZAP → discovery → custom checks → AI analysis → report) with a single command:...

13 security check categories across ZAP passive scanning and custom rule checks: Content-Security-Policy missing, X-Frame-Options/HSTS missing,...

GPT-4 AI risk analysis layer flags issues rule-based systems misspublicly accessible database schema files, directory listings with no access...

Structured output in two formats: security_report.json (machine-readable, CI/CD integration ready) and security_report.md (human-readable Markdown...

Redis background worker (ai-sec-worker) enables async deep scanning of discovered endpoints without blocking the main pipelineendpoints queued via...

Graceful degradationZAP not running: skips ZAP stage, continues with discovery + checks + AI. Redis unavailable: skips queuing, endpoints not...

GitHub Actions CI on every push and PR: ruff lint on Python 3.11, pytest across Python 3.11 and 3.12, ensuring clean code and passing tests across...

Makefile automation with 10 commands covers full development lifecycle: install, lint, format, test, ci, scan, worker, clean.

Tech Stack

Language: Python 3.11+, Poetry (dependency management + packaging)

HTTP: httpx (async HTTP requests for concurrent endpoint discovery and checks)

CLI: typer (command definition with type hints), rich (colored terminal output, progress indicators)

Security: python-owasp-zap-v2.4 (ZAP API client for passive spider + alerts)

AI Analysis: OpenAI GPT-4o-mini (nuanced vulnerability detection from raw HTTP responses)

Queue: Redis (endpoint queue for background deep scanning via ai-sec-worker)

Config: python-dotenv (.env loading for OPENAI_API_KEY, REDIS_HOST, REDIS_PORT)

Linting: ruff (lint + format, replaces flake8 + black)

Testing: pytest, pytest-asyncio (async test support)

CI/CD: GitHub Actions (ruff lint + pytest on Python 3.11 and 3.12)

Automation: Makefile (10 commands: install, lint, lint-fix, format, test, test-cov, ci, scan, worker, clean)

Back to Projects