Quality & Governance

Multi-layered quality assurance: 5 pipeline gates, 3-tier verdicts, circuit breaker pattern, and a safety net that prevents runaway autonomous execution.

AI Code Quality is a Coin Flip

AI-generated code is either brilliant or broken, and you rarely know which until production. Without structured review, security vulnerabilities slip through, edge cases get ignored, and code that 'works' today becomes tech debt tomorrow. Manual review doesn't scale when AI writes thousands of lines per session.

How Chati.dev solves this

5 Pipeline Gates

Quality gates are pipeline checkpoints. Each evaluates the previous stage's output and produces a verdict. Failed gates send work back for revision.

Quality gate pipeline: gates 2 and 4 have strict 95% thresholds, failed evaluations loop back.
1. Planning Complete: All DISCOVER + PLAN agents finished. Pass criterion: all agents completed.
2. QA Planning: QA-Planning agent validates plan coherence. Pass criterion: 95%.
3. Implementation: Dev agent completes all assigned tasks. Pass criterion: all tasks done.
4. QA Implementation: Tests pass, SAST clean, coverage adequate. Pass criterion: 95%.
5. Deploy Ready: All gates passed, ready for production. Pass criterion: all gates passed.
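
The gate sequence can be sketched as a simple loop. This is a minimal illustration, not Chati.dev's implementation: the Gate class, the work-item field names (agents_done, plan_score, tasks_done, impl_score), the revise callback, and the max_revisions cap are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Work = Dict[str, Any]


@dataclass
class Gate:
    name: str
    passes: Callable[[Work], bool]  # True when the stage output clears the gate


# Field names below are hypothetical placeholders for real stage outputs.
GATES: List[Gate] = [
    Gate("Planning Complete", lambda w: bool(w["agents_done"])),
    Gate("QA Planning", lambda w: w["plan_score"] >= 95),
    Gate("Implementation", lambda w: bool(w["tasks_done"])),
    Gate("QA Implementation", lambda w: w["impl_score"] >= 95),
    # Reaching this gate already implies every earlier gate passed.
    Gate("Deploy Ready", lambda w: True),
]


def run_pipeline(
    work: Work,
    revise: Callable[[str, Work], Work],
    max_revisions: int = 2,  # assumed cap; the source does not state one
) -> List[str]:
    """Run the gates in order, sending failed work back for revision."""
    passed: List[str] = []
    for gate in GATES:
        attempts = 0
        while not gate.passes(work):
            if attempts == max_revisions:
                raise RuntimeError(f"gate '{gate.name}' still failing after revision")
            work = revise(gate.name, work)  # loop back to the responsible agent
            attempts += 1
        passed.append(gate.name)
    return passed
```

A failing gate never lets later gates run, which is what makes each gate a checkpoint rather than a report.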

3-Tier Verdict System

APPROVED

Score meets or exceeds the threshold. Work proceeds to the next pipeline stage without intervention.

Trigger: score >= threshold

NEEDS_REVISION

Score is within 5 points below the threshold. Work returns to the agent for targeted fixes before re-evaluation.

Trigger: threshold - 5 <= score < threshold

BLOCKED

Score is significantly below threshold or a critical blocker exists. Pipeline halts and requires human intervention.

Trigger: score < threshold - 5 OR critical blocker
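
The three triggers can be expressed as one small function. This is a sketch under stated assumptions: the names Verdict, classify, and REVISION_MARGIN are illustrative, and letting a critical blocker override an otherwise passing score is an assumed precedence that the text does not spell out.

```python
from enum import Enum


class Verdict(Enum):
    APPROVED = "APPROVED"
    NEEDS_REVISION = "NEEDS_REVISION"
    BLOCKED = "BLOCKED"


REVISION_MARGIN = 5  # points below the threshold that still allow revision


def classify(score: float, threshold: float, critical_blocker: bool = False) -> Verdict:
    """Map a gate score to one of the three verdicts."""
    # Assumption: a critical blocker halts the pipeline regardless of score.
    if critical_blocker or score < threshold - REVISION_MARGIN:
        return Verdict.BLOCKED
    if score >= threshold:
        return Verdict.APPROVED
    # Remaining case: threshold - 5 <= score < threshold.
    return Verdict.NEEDS_REVISION
```

For a 95% gate, a score of 92 yields NEEDS_REVISION, while anything below 90 yields BLOCKED.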

Scoring Thresholds

QA gates require 95% to pass; other gates use 90%. These thresholds enforce consistently high quality across the pipeline.

QA Planning: 95%
QA Implementation: 95%
All Other Agents: 90%

SAST: Code Security

Static Application Security Testing scans source code for known vulnerability patterns across 10 OWASP categories, without executing the application. Runs before deployment.

10 Scanned Categories (OWASP)

1. SQL Injection
2. Command Injection
3. XSS (Cross-Site Scripting)
4. Hardcoded Secrets
5. Path Traversal
6. SSRF (Server-Side Request Forgery)
7. Insecure Deserialization
8. Weak Cryptography
9. Prototype Pollution
10. Insecure Configuration

Severity Classification

CRITICAL: Risk of immediate exploitation

Example: SQL injection in a public endpoint without sanitization

HIGH: Exploitable with some effort

Example: XSS in user input rendered without escaping

MEDIUM: Potential risk, lower probability

Example: Dependency with a known low-impact CVE

LOW: Best-practice improvement

Example: console.log left in production code, missing security headers

Threshold: 0 Critical + 0 High for approval. Medium/Low must be documented and acknowledged.
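
The approval rule can be sketched as a predicate over scan findings. The finding shape used here (a dict with "severity" and "acknowledged" keys) is an assumption for illustration; the actual scanner output format is not described in this document.

```python
from collections import Counter
from typing import Dict, List

BLOCKING_SEVERITIES = ("CRITICAL", "HIGH")


def sast_approved(findings: List[Dict[str, object]]) -> bool:
    """Return True when the scan result satisfies the approval threshold."""
    counts = Counter(f["severity"] for f in findings)
    # Any Critical or High finding blocks approval outright.
    if any(counts[severity] > 0 for severity in BLOCKING_SEVERITIES):
        return False
    # Remaining Medium/Low findings must each be acknowledged.
    return all(bool(f.get("acknowledged", False)) for f in findings)
```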

Triple Review Protocol

Before the adversarial structural checks, QA-Implementation runs a three-pass review protocol. Each pass examines the code from a different perspective: patterns emerging during generation, structure after implementation, and compliance with the constitution and specification.

Shadow Review

Silent review during code generation. Tracks patterns, identifies potential issues as they emerge without interrupting the Dev agent.

Sentinel Review

Post-implementation structural analysis. Examines architecture adherence, Design System token usage, error handling patterns, and code organization.

Compliance Review

Constitutional and specification adherence. Verifies acceptance criteria satisfaction, security requirements, and governance rule compliance.

Shell Security (23-Point Defense)

The constitution-guard.js hook applies 23 regex-based security checks to every Bash tool call, blocking shell injection patterns in under 5ms. This includes command chaining, backtick injection, process substitution, environment variable manipulation, and network exfiltration attempts.
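
The general idea of a regex-based command guard can be sketched as follows. The real hook is written in JavaScript and its 23 checks are not listed here, so the five patterns below are illustrative assumptions covering the categories named above, written in Python for brevity.

```python
import re
from typing import List

# Illustrative patterns only; these are NOT the hook's actual 23 checks.
CHECKS = {
    "command_chaining": re.compile(r";|&&|\|\|"),
    "backtick_injection": re.compile(r"`"),
    "process_substitution": re.compile(r"[<>]\("),
    "env_manipulation": re.compile(r"\b(?:export|unset)\b|\bLD_PRELOAD="),
    "network_exfiltration": re.compile(r"\b(?:curl|wget|nc)\b"),
}


def violations(command: str) -> List[str]:
    """Return the names of all checks the command trips, in definition order."""
    return [name for name, pattern in CHECKS.items() if pattern.search(command)]


def is_allowed(command: str) -> bool:
    """A command is allowed only when no check matches."""
    return not violations(command)
```

Patterns this coarse would also reject many legitimate commands; a production guard needs narrower expressions and an allowlist for known-safe invocations.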

Adversarial Review

Five mandatory structural checks, each producing a classified finding (ERROR/WARNING/SUGGESTION/ATTESTATION). The QA agent acts as a red team, actively working to find hidden flaws in every implementation. Zero findings triggers mandatory re-review.

1. Input Boundary: Null/undefined inputs, empty strings, very large inputs, negative numbers, special characters.
2. Dependency Audit: Unused imports, circular dependencies, packages with known vulnerabilities.
3. Error Path Coverage: Try/catch blocks tested, error callbacks exercised, failure modes documented.
4. Concurrency Safety: Shared mutable state, race conditions in async code, missing locks for file I/O.
5. Security Scan: OWASP Top 10 beyond basic SAST: ReDoS, prototype pollution, timing attacks, path traversal.
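
One way to enforce the "zero findings triggers re-review" rule is sketched below. It assumes a stricter reading than the text states: every one of the five checks must yield at least one finding, with ATTESTATION recording a clean result. The Finding class and needs_re_review function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

CHECKS = (
    "Input Boundary",
    "Dependency Audit",
    "Error Path Coverage",
    "Concurrency Safety",
    "Security Scan",
)
LEVELS = ("ERROR", "WARNING", "SUGGESTION", "ATTESTATION")


@dataclass
class Finding:
    check: str
    level: str
    note: str


def needs_re_review(findings: List[Finding]) -> bool:
    """True when the review produced no findings or left a check uncovered."""
    for finding in findings:
        if finding.check not in CHECKS or finding.level not in LEVELS:
            raise ValueError(f"unrecognized finding: {finding}")
    covered = {finding.check for finding in findings}
    # An empty findings list leaves every check uncovered, so it also returns True.
    return any(check not in covered for check in CHECKS)
```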

Devil's Advocate Pass

After the initial review concludes with APPROVED, the QA agent flips its stance and assumes: 'This code has a hidden flaw.' It then makes one focused pass looking for race conditions, memory leaks, bad security assumptions, and performance bottlenecks under 10x load. The result of this pass is always included in the report.

Circuit Breaker

The circuit breaker pattern prevents repeated failures from burning resources. After 3 consecutive failures, the circuit opens and requests are rejected until a recovery probe succeeds.

Circuit breaker state machine: CLOSED (normal) → OPEN (failures) → HALF_OPEN (probe).

CLOSED

Normal operation. All requests flow through. Failure counter tracks consecutive failures, resets on success.

OPEN

Failure threshold reached (3 consecutive failures). All requests are rejected immediately. A 60-second cooldown must elapse before a recovery attempt.

HALF_OPEN

Recovery probe: one test request allowed through. Success resets to CLOSED; failure returns to OPEN.
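
The three states can be sketched as a small class. This is a minimal single-threaded illustration using the numbers given above (3 consecutive failures, 60-second cooldown); the class name, the CircuitOpenError exception, and the injectable clock parameter are assumptions, not Chati.dev's API.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.cooldown_seconds:
                raise CircuitOpenError("circuit is open; request rejected")
            self.state = "HALF_OPEN"  # cooldown elapsed: let one probe through
        try:
            result = fn()
        except Exception:
            self._record_failure()
            raise
        # Any success closes the circuit and resets the consecutive counter.
        self.state = "CLOSED"
        self.failures = 0
        return result

    def _record_failure(self) -> None:
        self.failures += 1
        # A failed probe reopens immediately; otherwise open at the threshold.
        if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()
```

Passing the clock as a parameter lets the cooldown be exercised in tests without waiting 60 seconds.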