Semantic codebase understanding for intelligent security analysis.
AppMapper builds deep, queryable knowledge of application structure — routes, authentication patterns, authorization models, data flows, and ownership semantics. It provides the contextual foundation that vulnerability scanners need to perform targeted, meaningful analysis instead of blind pattern matching.
AppMapper indexes a codebase and answers structural questions about it:
Q: "Does this app have user registration?"
A: Yes. Registration is handled in AuthController.java:45 (POST /api/auth/register).
Stores users in UserRepository with bcrypt password hashing.
Email verification required via VerificationService.
Q: "List endpoints that don't require authentication"
A: Found 5 unauthenticated endpoints:
1. POST /api/auth/login
2. POST /api/auth/register
3. GET /api/public/products
4. GET /health
5. GET /metrics (RISK: should be protected)
Q: "Show ownership patterns for the Order resource"
A: Order ownership validation:
- Owner field: Order.userId
- Validation: OrderService.java:78 checks order.getUserId().equals(currentUser.getId())
- Missing in: GET /api/orders/{id} — potential IDOR
AppMapper is not a vulnerability scanner. It does not detect SQL injection, XSS, or command injection. It maps what the application is and does, so that specialized tools can scan with full context.
┌──────────────────────────────────────────────────────────────┐
│                          AppMapper                           │
│                                                              │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌──────────┐   │
│  │  Parser  │──▶│ Enricher │──▶│ Indexer  │──▶│  Query   │   │
│  │  (AST)   │   │ (Rules)  │   │ (Vector) │   │  Agent   │   │
│  └──────────┘   └──────────┘   └──────────┘   └──────────┘   │
│       │              │              │              │         │
│  Tree-sitter     Tag-based       ChromaDB     LLM-powered    │
│   extraction     enrichment     embeddings     synthesis     │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                 Threat Modeling Engine                 │  │
│  │  Language profiles · Domain detection · Architecture   │  │
│  │  Story-driven analysis · Knowledge graph RAG           │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│                 Output: shared_context.json                  │
└──────────────────────────────────────────────────────────────┘
- Parse — Tree-sitter AST extraction of code units (functions, classes, routes)
- Enrich — Rule-based tagging with semantic metadata (auth patterns, route handlers, data access)
- Index — Vector embeddings stored in ChromaDB for semantic search
- Query — Natural language questions resolved via semantic search + LLM synthesis
- Export — Structured shared_context.json for downstream tool consumption
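
To make the Enrich stage more concrete, here is a minimal sketch of rule-based tagging in Python. The `CodeUnit` type, the tag names, and the regex rules are hypothetical illustrations, not AppMapper's actual enricher rules, which may look quite different.

```python
import re
from dataclasses import dataclass, field

# Hypothetical rule set: tag name -> regex suggesting that the tag applies.
# These patterns are illustrative only, not AppMapper's real rules.
TAG_RULES = {
    "route_handler": re.compile(
        r"@(app|router)\.(get|post|put|delete|route)\b"
        r"|@(Get|Post|Put|Delete|Request)Mapping\b"
    ),
    "auth_check": re.compile(r"@PreAuthorize\b|@login_required\b"),
    "data_access": re.compile(
        r"\b(SELECT|INSERT|UPDATE|DELETE)\b|\.save\(|\.findById\("
    ),
}


@dataclass
class CodeUnit:
    """A parsed unit of code (assumed shape of the Parse stage output)."""
    name: str
    source: str
    tags: set = field(default_factory=set)


def enrich(unit: CodeUnit) -> CodeUnit:
    """Attach every tag whose rule matches the unit's source text."""
    for tag, pattern in TAG_RULES.items():
        if pattern.search(unit.source):
            unit.tags.add(tag)
    return unit


unit = enrich(CodeUnit(
    name="delete_user",
    source='@app.post("/users/<id>/delete")\n@login_required\ndef delete_user(id): ...',
))
print(sorted(unit.tags))  # ['auth_check', 'route_handler']
```

Tags of this kind could then be stored as metadata next to each embedding, so that a query about authentication can filter on `auth_check` before any semantic ranking happens.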
Extracts HTTP endpoints across frameworks — Spring Boot, Express, Flask, Django, FastAPI, Go net/http, and more. Each route includes path, method, handler location, auth requirements, and role restrictions.
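
As a rough illustration of route extraction, the sketch below pulls Flask-style routes out of Python source. AppMapper parses with Tree-sitter; this example swaps in Python's standard `ast` module to stay dependency-free, and it handles only the `@app.route(...)` decorator form. The record fields loosely mirror the route entries in shared_context.json, and treating `@login_required` as the signal for `auth_required` is an assumption made for the example.

```python
import ast

SAMPLE = '''
@app.route("/api/orders/<id>", methods=["GET"])
@login_required
def get_order(id):
    ...

@app.route("/health")
def health():
    ...
'''


def decorator_name(node: ast.expr) -> str:
    """Return the trailing name of a decorator, e.g. 'route' for @app.route(...)."""
    target = node.func if isinstance(node, ast.Call) else node
    if isinstance(target, ast.Attribute):
        return target.attr
    if isinstance(target, ast.Name):
        return target.id
    return ""


def extract_routes(source: str) -> list[dict]:
    """Collect one record per (path, method) pair declared via @<obj>.route(...)."""
    routes = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        names = [decorator_name(d) for d in fn.decorator_list]
        for dec in fn.decorator_list:
            if not isinstance(dec, ast.Call) or decorator_name(dec) != "route":
                continue
            if not dec.args or not isinstance(dec.args[0], ast.Constant):
                continue  # skip routes whose path is not a literal
            # Flask defaults to GET when no methods= keyword is given.
            methods = ["GET"]
            for kw in dec.keywords:
                if kw.arg == "methods" and isinstance(kw.value, (ast.List, ast.Tuple)):
                    methods = [e.value for e in kw.value.elts
                               if isinstance(e, ast.Constant)]
            for method in methods:
                routes.append({
                    "path": dec.args[0].value,
                    "method": method,
                    "handler": fn.name,
                    "line": fn.lineno,
                    "auth_required": "login_required" in names,
                })
    return routes


for route in extract_routes(SAMPLE):
    print(route["method"], route["path"], route["handler"], route["auth_required"])
```

A multi-framework scanner would need one such extractor per framework, since Spring annotations, Express router calls, and Django URL configurations all declare routes differently.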
Identifies auth middleware, JWT validation, session management, OAuth integration points, and login/logout flows. Detects whether auth is cookie-based, token-based, or uses framework-specific mechanisms.
Maps role definitions, permission hierarchies, access control decorators (@PreAuthorize, @login_required, etc.), and resource-to-role relationships.
Discovers ownership validation patterns — which field links a resource to its owner, how the application verifies resource.owner == currentUser, and where those checks are missing.
Language-aware, domain-aware, architecture-aware threat generation:
- 16 universal threat categories (not limited to OWASP web risks)
- 10 language profiles with memory safety and dangerous pattern data
- 12 domain profiles (web API, image processing, cryptography, networking, embedded, etc.)
- Story-driven analysis that understands business context ("What would an attacker want?")
- Knowledge graph RAG grounded in OWASP Top 10 and CWE data
Traces how user input flows through the application — from HTTP parameters to database storage, identifying sanitization gaps and taint propagation.
- Python 3.10+
- An Anthropic API key (for LLM-powered queries and threat modeling)
git clone https://github.com/chasingimpact/appmap.git
cd appmap
pip install -r requirements.txt

Create a .env file in the project root:
ANTHROPIC_API_KEY=your-key-here
python run_server.py

The web UI launches at http://localhost:8000.
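
The /api/v2 endpoints that this README calls with curl can also be driven from Python. The following is a minimal sketch using only the standard library. The base URL, endpoint paths, and `repo_path` payload come from this README; the helper names `build_request` and `post_json` are hypothetical, and the response is assumed to be JSON because its format is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from this README


def build_request(endpoint: str, repo_path: str) -> urllib.request.Request:
    """Build a JSON POST request for one of the /api/v2 endpoints."""
    body = json.dumps({"repo_path": repo_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(endpoint: str, repo_path: str, timeout: float = 300.0):
    """Send the request and decode the reply (assumes a JSON response body)."""
    request = build_request(endpoint, repo_path)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `post_json("/api/v2/scan-repo", "/path/to/repo")` sends the same request as the first curl command. The generous default timeout is a guess, on the assumption that scanning a large repository may take a while.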
curl -X POST http://localhost:8000/api/v2/scan-repo \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/path/to/repo"}'

curl -X POST http://localhost:8000/api/v2/generate-shared-context \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/path/to/repo"}'

curl -X POST http://localhost:8000/api/v2/export-shared-context \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/path/to/repo"}'

curl -X POST http://localhost:8000/api/v2/threat-model/generate \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/path/to/repo"}'

curl -X POST http://localhost:8000/api/v2/classify-directories \
  -H "Content-Type: application/json" \
  -d '{"repo_path": "/path/to/repo"}'

AppMapper produces a shared_context.json designed for consumption by downstream security tools:
{
  "scan_id": "...",
  "repo_path": "/path/to/repo",
  "primary_language": "java",
  "frameworks_detected": ["spring-boot"],
  "routes": [
    {
      "path": "/api/users/{id}",
      "method": "GET",
      "handler": "getUser",
      "auth_required": true,
      "roles": ["USER"],
      "ownership_check": "user.id == request.user.id"
    }
  ],
  "ownership_patterns": [
    {
      "resource": "Order",
      "owner_field": "userId",
      "validation_pattern": "order.getUserId().equals(currentUser.getId())"
    }
  ],
  "auth_enforcement": [
    {
      "type": "annotation",
      "name": "@PreAuthorize",
      "location": "controllers/*",
      "protects": ["admin endpoints"]
    }
  ],
  "unprotected_endpoints": ["/api/debug", "/metrics"],
  "is_multi_tenant": true,
  "tenant_isolation": {
    "field": "organizationId",
    "level": "MODERATE"
  }
}

src/
├── appmapper/
│   ├── service.py                 # Core service orchestration
│   ├── route_scanner.py           # Multi-framework route extraction
│   ├── auth_discovery.py          # Authentication pattern detection
│   ├── directory_classifier.py    # Directory purpose classification
│   ├── shared_context.py          # Structured context export
│   ├── query_agent.py             # LLM-powered natural language queries
│   ├── reachability.py            # Data flow reachability analysis
│   ├── dataflow_tracer.py         # Input-to-storage data flow tracing
│   ├── threat_modeling/           # Universal threat model engine
│   │   ├── models.py              # Threat types and data structures
│   │   ├── languages.py           # Language security profiles
│   │   ├── domains.py             # Domain threat profiles
│   │   ├── architectures.py       # Architecture risk profiles
│   │   ├── component_analyzer.py  # Codebase classification
│   │   ├── threat_enumerator.py   # Threat generation
│   │   ├── semantic_analyzer.py   # Story-driven analysis
│   │   ├── llm_validator.py       # LLM-based validation
│   │   └── knowledge_graph/       # OWASP/CWE RAG retrieval
│   └── ui/
│       ├── app.py                 # Flask web application
│       └── templates/
├── parser/                        # Tree-sitter AST parsing
├── enricher/                      # Rule-based semantic enrichment
└── indexer/                       # ChromaDB vector indexing
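
As an illustration of how a downstream tool might consume shared_context.json, the sketch below flags authenticated routes that take a path parameter but record no ownership check, which is the IDOR pattern described earlier in this README. It relies on the field names from the example output (`routes`, `auth_required`, `ownership_check`, `unprotected_endpoints`). Two things are assumptions rather than documented behavior: that `ownership_check` is absent or empty when no check was found, and that a `{...}` segment in the path marks per-resource access. The `getOrder` handler in the sample data is hypothetical.

```python
import json


def idor_candidates(context: dict) -> list[str]:
    """List authenticated routes with a path parameter but no ownership check."""
    findings = []
    for route in context.get("routes", []):
        has_param = "{" in route.get("path", "")
        if route.get("auth_required") and has_param and not route.get("ownership_check"):
            findings.append(f'{route["method"]} {route["path"]}')
    return findings


# Inline sample in the shape of the README example; a real tool would call
# json.load() on the exported shared_context.json file instead.
sample = json.loads("""
{
  "routes": [
    {"path": "/api/users/{id}", "method": "GET", "handler": "getUser",
     "auth_required": true, "roles": ["USER"],
     "ownership_check": "user.id == request.user.id"},
    {"path": "/api/orders/{id}", "method": "GET", "handler": "getOrder",
     "auth_required": true, "roles": ["USER"]}
  ],
  "unprotected_endpoints": ["/api/debug", "/metrics"]
}
""")

print(idor_candidates(sample))          # ['GET /api/orders/{id}']
print(sample["unprotected_endpoints"])  # ['/api/debug', '/metrics']
```

A scanner could use a list like this to decide which endpoints deserve targeted authorization tests first.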
| Language | Frameworks |
|---|---|
| Java | Spring Boot, Spring MVC, JAX-RS |
| JavaScript | Express, Fastify, Koa, Hapi |
| TypeScript | NestJS, Express |
| Python | Flask, Django, FastAPI |
| Go | net/http, Gin, Echo, Chi |
| PHP | Laravel, Symfony |
| Ruby | Rails, Sinatra |
| C# | ASP.NET Core |
Private repository. All rights reserved.