Architecture Documenter Examples

Externalized from the agent definition per the few-shot-examples rule (#1587).

Architecture Documenter — Worked Examples

Externalized from the agent definition per the few-shot-examples rule (#1587).

Step 1: Software Architecture Document (SAD) Creation

Read template from aiwg install:

~/.local/share/ai-writing-guide/agentic/code/frameworks/sdlc-complete/templates/analysis-design/software-architecture-doc-template.md

Structure SAD:

---
title: Software Architecture Document
version: 1.0
status: DRAFT | REVIEWED | APPROVED | BASELINED
date: 2025-10-15
project: {project-name}
phase: Elaboration
primary-author: architecture-designer
reviewers: [security-architect, test-architect, requirements-analyst]
---

# Software Architecture Document

## 1. Architectural Drivers

### Quality Attributes

**Performance:**
- Response time: p95 < 500ms for API requests
- Throughput: 1,000 requests/second sustained
- Database queries: p99 < 200ms

**Scalability:**
- Horizontal scaling: 10,000 concurrent users
- Auto-scaling: Triggers at 70% CPU utilization
- Data volume: 100M transactions/month

**Security:**
- Authentication: OAuth 2.0 with JWT tokens
- Authorization: Role-based access control (RBAC)
- Encryption: TLS 1.3 in transit, AES-256 at rest

**Availability:**
- Uptime: 99.9% SLA (43 min/month downtime)
- Multi-region: Active-active deployment
- Failover: Automatic within 30 seconds

### Constraints

**Technical:**
- Must use existing PostgreSQL database
- Must integrate with legacy SOAP API
- Cloud provider: AWS only

**Organizational:**
- Team expertise: Node.js, Python
- Budget: $10K/month infrastructure
- Timeline: 6 months to production

**Compliance:**
- GDPR compliance required
- SOC 2 Type II certification

## 2. Component Decomposition

### Logical View

\```text
┌─────────────────────────────────────────┐
│         Presentation Layer              │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐ │
│  │ Web App │  │ Mobile  │  │ Admin   │ │
│  │ (React) │  │ (Native)│  │ Portal  │ │
│  └─────────┘  └─────────┘  └─────────┘ │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│           API Gateway Layer             │
│        (Authentication, Rate Limiting)  │
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│         Application Services            │
│  ┌──────────┐  ┌──────────┐  ┌────────┐│
│  │ User     │  │ Product  │  │Payment ││
│  │ Service  │  │ Service  │  │Service ││
│  └──────────┘  └──────────┘  └────────┘│
└─────────────────────────────────────────┘
                  ↓
┌─────────────────────────────────────────┐
│            Data Layer                   │
│  ┌──────────┐  ┌──────────┐  ┌────────┐│
│  │PostgreSQL│  │  Redis   │  │  S3    ││
│  │   (RDS)  │  │ (Cache)  │  │(Files) ││
│  └──────────┘  └──────────┘  └────────┘│
└─────────────────────────────────────────┘
\```

### Physical View

**Microservices:**
1. **User Service**
   - Responsibilities: Authentication, user management, profiles
   - Technology: Node.js (Express)
   - Database: PostgreSQL (users table)
   - APIs: REST, Internal gRPC

2. **Product Service**
   - Responsibilities: Product catalog, inventory, search
   - Technology: Python (FastAPI)
   - Database: PostgreSQL (products table), Redis (cache)
   - APIs: REST, GraphQL

3. **Payment Service**
   - Responsibilities: Payment processing, subscriptions
   - Technology: Node.js (Express)
   - Integrations: Stripe API, legacy SOAP billing system
   - APIs: REST (internal only)

**Shared Components:**
- **API Gateway:** Kong (rate limiting, authentication, routing)
- **Message Queue:** RabbitMQ (async events)
- **Cache:** Redis (session, data caching)
- **Storage:** S3 (user uploads, backups)

## 3. Deployment Architecture

### Environments

**Development:**
- Deployed: Local Docker Compose
- Database: Local PostgreSQL
- Purpose: Developer testing

**Test/Staging:**
- Deployed: AWS ECS (Fargate)
- Database: RDS PostgreSQL (t3.medium)
- Purpose: Integration testing, UAT

**Production:**
- Deployed: AWS ECS (Fargate), Multi-AZ
- Database: RDS PostgreSQL (r6g.xlarge), Multi-AZ
- Regions: us-east-1 (primary), us-west-2 (failover)
- Load Balancer: Application Load Balancer (ALB)
- CDN: CloudFront (static assets)

### Deployment Diagram

\```mermaid
graph TB
    Users[Users] --> CF[CloudFront CDN]
    CF --> ALB[Application Load Balancer]
    ALB --> ECS1[ECS Cluster us-east-1]
    ALB --> ECS2[ECS Cluster us-west-2]

    ECS1 --> User1[User Service]
    ECS1 --> Product1[Product Service]
    ECS1 --> Payment1[Payment Service]

    User1 --> RDS1[(RDS Primary)]
    Product1 --> RDS1
    Product1 --> Redis1[Redis Cache]
    Payment1 --> Stripe[Stripe API]

    RDS1 -.Replication.-> RDS2[(RDS Replica us-west-2)]

    ECS2 --> User2[User Service]
    ECS2 --> Product2[Product Service]
    User2 --> RDS2
    Product2 --> RDS2
\```

## 4. Technology Stack

| Layer | Technology | Rationale |
|-------|------------|-----------|
| **Frontend** | React 18 | Team expertise, component reusability |
| **API Gateway** | Kong | Open-source, plugin ecosystem, rate limiting |
| **Backend** | Node.js, Python | Team expertise, async I/O (Node), ML libs (Python) |
| **Database** | PostgreSQL 15 | ACID compliance, existing expertise, JSON support |
| **Cache** | Redis 7 | High performance, pub/sub, session management |
| **Message Queue** | RabbitMQ | Reliable delivery, dead-letter queues |
| **Container** | Docker | Standardization, portability |
| **Orchestration** | AWS ECS Fargate | Managed, no server management, auto-scaling |
| **CI/CD** | GitHub Actions | Integrated with repo, free for open source |
| **Monitoring** | Datadog | APM, logs, metrics, alerts |

## 5. Integration Architecture

### External Systems

| System | Protocol | Purpose | SLA |
|--------|----------|---------|-----|
| Stripe API | REST (HTTPS) | Payment processing | 99.99% |
| Legacy Billing | SOAP | Subscription management | 99.5% |
| Email Service | REST (SendGrid) | Transactional emails | 99.9% |
| Analytics | REST (Segment) | User behavior tracking | 99.0% |

### Integration Patterns

**API Integration:**
- REST for synchronous requests
- gRPC for service-to-service (internal)
- GraphQL for flexible client queries (Product Service)

**Event-Driven:**
- RabbitMQ for async events (order placed, user registered)
- Publish/subscribe pattern
- Dead-letter queue for failed messages

**Legacy Integration:**
- SOAP adapter service for legacy billing system
- Fallback to manual processing if SOAP unavailable

## 6. Security Architecture

### Authentication Flow

\```mermaid
sequenceDiagram
    participant U as User
    participant G as API Gateway
    participant A as Auth Service
    participant DB as User DB

    U->>G: POST /login (username, password)
    G->>A: Forward credentials
    A->>DB: Verify credentials (bcrypt hash)
    DB-->>A: User validated
    A->>A: Generate JWT token
    A-->>G: Return JWT (expires 24h)
    G-->>U: Return JWT
    U->>G: GET /api/profile (JWT in header)
    G->>G: Validate JWT signature
    G->>Product: Forward request (JWT validated)
\```

### Authorization (RBAC)

**Roles:**
- **Admin:** Full access
- **Manager:** Read/write products, read users
- **User:** Read own profile, read products

**Implementation:**
- JWT claims include user roles
- API Gateway validates roles before routing
- Services enforce role-based permissions

### Data Protection

- **At Rest:** AES-256 encryption (RDS, S3)
- **In Transit:** TLS 1.3 (all external APIs), TLS 1.2 (service-to-service)
- **Secrets:** AWS Secrets Manager (API keys, DB credentials)
- **PII:** Masked in logs, encrypted in database

## 7. Data Architecture

### Data Model

\```sql
-- Users table
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    role VARCHAR(20) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Products table
CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2) NOT NULL,
    inventory INT NOT NULL DEFAULT 0,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Orders table
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID REFERENCES users(id),
    total DECIMAL(10, 2) NOT NULL,
    status VARCHAR(20) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);
\```

### Data Flow

1. **Write Path:** Client → API Gateway → Service → PostgreSQL → Invalidate Redis cache
2. **Read Path (Cache Hit):** Client → API Gateway → Service → Redis → Return cached data
3. **Read Path (Cache Miss):** Client → API Gateway → Service → PostgreSQL → Write to Redis → Return data

### Migration Strategy

- **Tools:** Flyway for schema migrations
- **Process:** Blue-green deployment for schema changes
- **Rollback:** Migration down scripts for every up script

## 8. Key Decisions (ADRs)

| ADR | Title | Status | Date |
|-----|-------|--------|------|
| ADR-001 | Use PostgreSQL for primary database | Accepted | 2025-10-01 |
| ADR-002 | Use Kong for API Gateway | Accepted | 2025-10-05 |
| ADR-003 | Microservices vs. Monolith | Accepted | 2025-10-02 |
| ADR-004 | gRPC for service-to-service communication | Accepted | 2025-10-08 |

**See:** `.aiwg/architecture/adr/` for detailed ADR documents

## Sign-Off

**Required Approvals:**
- [ ] Software Architect: {APPROVED | CONDITIONAL | PENDING} - {name, date}
- [ ] Security Architect: {APPROVED | CONDITIONAL | PENDING} - {name, date}
- [ ] Test Architect: {APPROVED | CONDITIONAL | PENDING} - {name, date}
- [ ] Requirements Analyst: {APPROVED | CONDITIONAL | PENDING} - {name, date}

**Conditions (if conditional):**
1. {Condition description} - Owner: {role} - Due: {date}

**Outstanding Concerns:**
1. {Concern description} - Raised by: {role} - Severity: {HIGH | MEDIUM | LOW}

Step 2: Architecture Decision Records (ADRs)

Read template from aiwg install:

~/.local/share/ai-writing-guide/agentic/code/frameworks/sdlc-complete/templates/analysis-design/architecture-decision-record-template.md

Create ADR:

# ADR-001: Use PostgreSQL for Primary Database

## Status
**Accepted** (2025-10-01)

## Context
We need to select a primary database for the application. Key requirements:
- ACID compliance for financial transactions
- Support for complex queries (joins, aggregations)
- JSON data type support for flexible schemas
- Team has existing PostgreSQL expertise
- Budget constraint: $500/month database costs

## Decision
Use PostgreSQL 15 (AWS RDS) as the primary database.

## Rationale
**Considered alternatives:**
1. **MongoDB:** Flexible schema, but lacks ACID for multi-document transactions (required for payments)
2. **MySQL:** ACID compliant, but team has stronger PostgreSQL expertise
3. **DynamoDB:** AWS-native, but limited query flexibility, higher learning curve

**Why PostgreSQL:**
- ACID compliance meets financial transaction requirements
- JSON/JSONB support provides schema flexibility where needed
- Team has 3+ years PostgreSQL experience (reduces risk)
- RDS provides managed service (backups, failover, scaling)
- Cost: $300/month for r6g.xlarge (within budget)

## Consequences

**Positive:**
- Strong consistency for transactions
- Mature ecosystem (ORMs, tools, extensions)
- Vertical scaling path (up to 32 vCPUs, 256GB RAM)
- Multi-AZ replication for high availability

**Negative:**
- Vertical scaling limits (may need sharding for > 10M users)
- Less suited for unstructured data (vs. document databases)
- Requires careful index management for performance

**Mitigations:**
- Use read replicas for scaling reads
- Implement caching layer (Redis) for hot data
- Plan for horizontal sharding if scale exceeds vertical limits

## References
- [PostgreSQL vs. MongoDB](https://example.com/comparison)
- [AWS RDS PostgreSQL Pricing](https://aws.amazon.com/rds/postgresql/pricing/)
- [Team Expertise Assessment](./team-skills.md)

## Related Decisions
- ADR-002: API Gateway selection
- ADR-004: gRPC for service-to-service communication

Step 4: Feedback and Annotations

## 3. Deployment Architecture

<!-- ARCH-DOC: EXCELLENT - Clear multi-region deployment strategy -->

### Production
- Deployed: AWS ECS (Fargate), Multi-AZ
- Database: RDS PostgreSQL (r6g.xlarge), Multi-AZ
- Regions: us-east-1 (primary), us-west-2 (failover)

<!-- ARCH-DOC: QUESTION - How is failover triggered? Automatic or manual? Please specify failover time (RTO/RPO). -->

<!-- ARCH-DOC: SUGGESTION - Add disaster recovery section with backup strategy and restoration procedures. -->

## 4. Technology Stack

| Layer | Technology | Rationale |
|-------|------------|-----------|
| **Backend** | Node.js, Python | Team expertise <!-- ARCH-DOC: APPROVED - Clear rationale -->
| **Database** | PostgreSQL 15 | ACID compliance <!-- ARCH-DOC: GOOD - Should reference ADR-001 -->
| **Cache** | Redis 7 | High performance <!-- ARCH-DOC: NEEDS DETAIL - What specific Redis features used? (pub/sub, sessions, data cache?) -->

<!-- ARCH-DOC: WARNING - No monitoring/observability tools listed. Add section on Datadog, CloudWatch, or equivalent. -->