The Practical Guide to C4 Deployment Diagrams

by Guillermo Quiros

A hands-on, example-driven guide to creating effective Deployment Diagrams using the C4 Model with real-world case studies across industries and cloud platforms.


Introduction: Where Architecture Meets Reality

The Context diagram describes the world around a system. The Container diagram describes the logical structure of the system: its independently deployable units and how they communicate. But neither of those diagrams answers the questions that matter most when something goes wrong at 3 a.m.: Where is this thing actually running? What infrastructure does it depend on? If this availability zone goes down, what breaks? How many replicas of this service are running, and how is traffic distributed between them?

These are the questions the Deployment diagram answers.

The Deployment diagram is the C4 Model's bridge between the logical architecture (the world of containers, services, and communication patterns) and the physical reality of running infrastructure. It shows where each container from the Container diagram actually runs: which cloud region, which cluster, which node, which managed service. It shows how traffic enters the system, how it is routed between components, and how the system is protected against hardware, network, and availability zone failures.

In an era of cloud-native architecture, infrastructure-as-code, and multi-region deployments, the Deployment diagram has become more important and more complex than ever. A modern production system might run across three availability zones in two cloud regions, with containers orchestrated by Kubernetes, traffic managed by a global load balancer, data replicated asynchronously across region boundaries, and a CDN sitting in front of all static assets. Understanding this topology, not in the abstract but as the specific configuration actually running in production, requires exactly the kind of diagram the C4 Model defines at the deployment level.

Despite its importance, the Deployment diagram is the most frequently skipped of the four C4 diagram types. Teams that have invested in Context and Container diagrams often treat the Deployment diagram as an optional addition, something for the infrastructure team to worry about rather than a first-class architecture artifact. This is a mistake. The Deployment diagram is not an infrastructure inventory. It is an architectural diagram that captures the decisions about availability, scalability, security topology, and operational configuration that determine how the system behaves under real-world conditions.

This guide teaches you to draw Deployment diagrams that are accurate, informative, and genuinely useful for operations teams responding to incidents, for architects evaluating reliability risks, for security reviewers analyzing network topology, and for engineers onboarding onto the infrastructure side of a system. Every concept is grounded in real-world examples drawn from recognizable cloud architectures.


What Is a Deployment Diagram?

The Definition

A Deployment diagram in the C4 Model shows how the containers from the Container diagram are mapped onto the physical or virtual infrastructure on which they run. It answers the question: "Where does this system actually live, and how is it configured to run reliably?"

A Deployment diagram has two primary concerns that distinguish it from all other C4 diagrams:

Physical placement. Where does each container run? On which node, in which cluster, in which availability zone, in which region? Physical placement determines failure domains: which components fail together and which fail independently.

Operational configuration. How many replicas are running? How is traffic distributed? What are the scaling policies? How are health checks configured? How is data replicated? These operational details determine the system's availability, performance, and resilience characteristics.

The Relationship to the Container Diagram

The Deployment diagram is not a replacement for the Container diagram; it is a companion to it. The Container diagram shows the logical architecture (what containers exist and how they communicate). The Deployment diagram shows the physical architecture (where those containers run and how they are configured).

Every container in the Container diagram should appear in at least one Deployment diagram, typically the production Deployment diagram. Conversely, every deployable unit in the Deployment diagram should correspond to a container in the Container diagram.

This correspondence is what makes the C4 Model a coherent hierarchy rather than a collection of independent diagrams. A reader who understands the Container diagram can navigate to the Deployment diagram and immediately see how the logical structure maps onto the physical infrastructure.

Multiple Deployment Diagrams for Multiple Environments

A system typically runs in multiple environments: production, staging, development, and possibly others (load testing, disaster recovery, preview environments). Each environment may have a different infrastructure topology: production might be multi-region with full redundancy, staging might be single-region with reduced redundancy, and development might run entirely in Docker Compose on a developer's laptop.

Each distinct environment topology warrants its own Deployment diagram, clearly labeled with the environment name. This is not duplication; the diagrams capture genuinely different configurations with different availability and reliability characteristics.

The production Deployment diagram is the most important. It must be accurate, current, and detailed enough to be useful for incident response. Staging and development Deployment diagrams provide value for onboarding and for understanding the gap between development and production environments.


The Elements of a Deployment Diagram

The Deployment diagram introduces new element types not seen in other C4 diagrams: deployment nodes and infrastructure nodes. Understanding these elements precisely is essential for drawing accurate Deployment diagrams.

Element 1: Deployment Nodes

A deployment node is a computational infrastructure element on which containers or other deployment nodes can be deployed. Deployment nodes are the physical and virtual infrastructure of the system.

Deployment nodes come in several categories:

Physical servers and virtual machines. A bare metal server in a data center. An EC2 instance, a GCP Compute Engine instance, or an Azure Virtual Machine in a cloud environment. These are the fundamental compute units.

Container orchestration clusters. A Kubernetes cluster, an ECS cluster, a Nomad cluster. These are deployment nodes that can contain other deployment nodes (individual nodes in the cluster) and the containers running on them.

Cluster nodes. Individual nodes within an orchestration cluster. A Kubernetes worker node, an ECS container instance. These are deployment nodes within the cluster deployment node.

Managed cloud services. Managed database services such as Amazon RDS, Google Cloud SQL, and Azure Database for PostgreSQL. Managed cache services such as Amazon ElastiCache and Memorystore. Managed messaging services such as Amazon SQS and Google Pub/Sub. These are deployment nodes on which data-storing containers run.

Cloud regions and availability zones. At the highest level of physical abstraction, cloud regions (us-east-1, eu-west-1, ap-southeast-1) and availability zones within those regions (us-east-1a, us-east-1b, us-east-1c) are deployment nodes. Showing containers deployed into specific availability zones makes failure isolation explicit.

CDN edge networks. A CloudFront distribution, a Cloudflare network, an Akamai edge network. These are deployment nodes that host cached content and route traffic.

Deployment nodes are nested. A Kubernetes cluster deployment node contains Kubernetes worker node deployment nodes, which contain Kubernetes pod deployment nodes, which contain the containers. A cloud region deployment node contains availability zone deployment nodes, which contain cluster deployment nodes. This nesting reflects the real-world hierarchy of infrastructure.
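This nesting can be modeled directly as a recursive structure. A minimal Python sketch (the node names and hierarchy below are illustrative, not taken from any specific system):

```python
from dataclasses import dataclass, field

@dataclass
class DeploymentNode:
    """A deployment node that can nest child nodes and host containers."""
    name: str
    technology: str
    children: list["DeploymentNode"] = field(default_factory=list)
    containers: list[str] = field(default_factory=list)

    def all_containers(self) -> list[str]:
        """Collect containers hosted anywhere in this node's subtree."""
        result = list(self.containers)
        for child in self.children:
            result.extend(child.all_containers())
        return result

# Mirror the real-world hierarchy: region > availability zone > cluster > node
region = DeploymentNode("us-east-1", "AWS Region", children=[
    DeploymentNode("us-east-1a", "Availability Zone", children=[
        DeploymentNode("prod-cluster", "EKS Cluster", children=[
            DeploymentNode("worker-1", "EKS Worker Node",
                           containers=["API Gateway", "Order Service"]),
        ]),
    ]),
])

print(region.all_containers())  # ['API Gateway', 'Order Service']
```

The recursion is the point: a container's full placement (region, AZ, cluster, node) is recovered by walking the path from the root to the node that hosts it.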

Element 2: Infrastructure Nodes

An infrastructure node is a supporting infrastructure element that is not a container but is relevant to the deployment architecture. Infrastructure nodes are distinct from deployment nodes in that they do not host containers; they provide supporting services to the deployment.

Examples of infrastructure nodes:

Load balancers. An AWS Application Load Balancer, a Google Cloud Load Balancer, an Azure Application Gateway, an Nginx instance configured as a load balancer. These route traffic to containers but do not host them.

DNS. Route 53, Cloud DNS, Cloudflare DNS. These resolve hostnames to IP addresses but do not host containers.

Firewalls and security groups. AWS Security Groups, GCP Firewall Rules, Azure Network Security Groups. These control network access but do not host containers.

API gateways (as infrastructure). AWS API Gateway (the managed service, distinct from a containerized API gateway like Kong). These route and manage API traffic.

VPNs and private connectivity. AWS Direct Connect, Azure ExpressRoute, a VPN gateway. These provide private network connectivity.

Certificate managers. AWS Certificate Manager, Let's Encrypt. These manage TLS certificates.

Infrastructure nodes appear in the Deployment diagram to show the supporting infrastructure that makes the deployment work: the traffic routing, security controls, and network configuration that are not visible in the Container diagram but are essential to understanding the deployed system.

Element 3: Containers (Deployed Instances)

Containers from the Container diagram appear inside deployment nodes to show where they are running. In the Deployment diagram, a container is shown as an instance running on specific infrastructure: not as a logical concept, but as a concrete deployment.

Container instances in a Deployment diagram should be annotated with:

Instance count or replica configuration. "3 replicas," "Auto-scaling: 2-10 instances," "Single instance." The number of running instances is one of the most important pieces of information in a Deployment diagram; it determines availability, capacity, and blast radius.

Resource configuration. Memory and CPU limits for containerized workloads. Instance type for VM-based workloads. These annotations are optional in the Container diagram but often valuable in the Deployment diagram for capacity planning discussions.

Health check configuration. What health check is configured? What does it check? This is particularly relevant for Kubernetes deployments where liveness and readiness probes determine how the orchestrator handles unhealthy instances.
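These annotations lend themselves to a small structured model. A hedged Python sketch with hypothetical field names and values:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ContainerInstance:
    """Annotations for a deployed container instance, per the checklist above."""
    name: str
    min_replicas: int
    max_replicas: int          # equal to min_replicas when not auto-scaled
    cpu: str                   # e.g. "2 vCPU"
    memory: str                # e.g. "4 GB"
    health_check: Optional[str] = None

    def describe(self) -> str:
        """Render the annotation string a diagram would carry."""
        scaling = (f"{self.min_replicas} replicas"
                   if self.min_replicas == self.max_replicas
                   else f"Auto-scaling: {self.min_replicas}-{self.max_replicas} instances")
        return f"{self.name} [{scaling}, {self.cpu} / {self.memory}]"

api = ContainerInstance("API Server", 2, 10, "2 vCPU", "4 GB", "GET /health every 30s")
print(api.describe())  # API Server [Auto-scaling: 2-10 instances, 2 vCPU / 4 GB]
```

Capturing the annotations as data rather than free text makes it possible to lint diagrams, e.g. to flag any instance with a single replica or no health check.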

Element 4: Relationships

Relationships in the Deployment diagram describe how infrastructure elements connect network connections, traffic routing, data replication. They differ from Container diagram relationships in their focus on physical connectivity rather than logical communication.

Deployment diagram relationships should be labeled with:

Network protocol. HTTPS, TCP, UDP, AMQP, PostgreSQL wire protocol.

Port numbers. Where operationally relevant: port 443 for HTTPS, port 5432 for PostgreSQL, port 6379 for Redis.

Traffic routing rules. "Routes 80% of traffic to v2, 20% to v1" (canary deployment). "Weighted round-robin across 3 AZs." "Health-check-based failover."

Replication configuration. "Synchronous replication to standby." "Asynchronous replication to read replica." "Multi-region replication with 30-second lag."


Building a Deployment Diagram: Step by Step

Step 1: Choose the Environment

Decide which environment the Deployment diagram will document. Start with production: it is the most important and most complex, and it is the one that matters for reliability analysis and incident response.

Label the diagram clearly with the environment name: "Production Deployment," "Staging Deployment," "Development Deployment." This label should be prominent; a reader who picks up a Deployment diagram must immediately know which environment they are looking at.

Step 2: Identify the Infrastructure Topology

For the chosen environment, document the infrastructure topology:

  • Which cloud provider(s) and regions are used?
  • Which availability zones within each region?
  • What orchestration platform (Kubernetes, ECS, bare VM, serverless)?
  • What managed services (RDS, ElastiCache, SQS, CloudFront)?
  • What load balancers and traffic routing infrastructure?
  • What VPCs, subnets, and network segmentation?

This inventory forms the skeleton of the deployment nodes in the diagram.

Step 3: Map Containers to Infrastructure

For each container in the Container diagram, identify:

  • Where it runs (which deployment node)
  • How many replicas
  • How it is scaled (manual, auto-scaling, serverless)
  • What resource configuration it has
  • What health checks are configured

This mapping is the core content of the Deployment diagram. Every container from the Container diagram must be accounted for.
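This accounting can be checked mechanically. A sketch, using hypothetical container and node names, that flags containers missing from the deployment and deployed units missing from the Container diagram:

```python
# Hypothetical container list and deployment mapping for illustration.
container_diagram = {"API Gateway", "Order Service", "Payment Service", "Search Service"}

production_deployment = {
    "eks-node-1a": ["API Gateway", "Order Service"],
    "eks-node-1b": ["API Gateway", "Payment Service"],
}

# Flatten the mapping into the set of deployed containers.
deployed = {c for containers in production_deployment.values() for c in containers}

missing = container_diagram - deployed   # in the Container diagram, never deployed
unknown = deployed - container_diagram   # deployed, but absent from the Container diagram

print(sorted(missing))  # ['Search Service']
print(sorted(unknown))  # []
```

A non-empty `missing` set means the Deployment diagram is incomplete; a non-empty `unknown` set means the Container diagram is out of date. Either way, the two diagrams have drifted apart.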

Step 4: Document the Traffic Flow

Trace how traffic enters the system and flows through the infrastructure:

  • Where does DNS resolve to?
  • What CDN or load balancer receives incoming requests?
  • How are requests routed to container instances?
  • How are requests routed between container instances (service mesh, load balancer, direct)?

Step 5: Document Data Replication

For every data store, document the replication configuration:

  • Is there a primary-replica setup?
  • Is replication synchronous or asynchronous?
  • Are replicas in different availability zones?
  • Are replicas in different regions?
  • What is the RPO (Recovery Point Objective) implied by the replication lag?
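The last point can be made concrete: with asynchronous replication, the worst-case RPO is roughly the replication lag at the moment of failure. A small illustrative calculation (the replica names and lag figures below are assumptions, not from the examples):

```python
def worst_case_rpo_seconds(replication_lags: dict[str, float]) -> float:
    """Worst-case Recovery Point Objective implied by asynchronous replication:
    on failover, writes within the lag window may not have reached the replica."""
    return max(replication_lags.values())

lags = {
    "rds-read-replica-us-west-2": 5.0,   # seconds behind the primary
    "redis-replica-us-east-1c": 0.5,
}
print(worst_case_rpo_seconds(lags))  # 5.0
```

Annotating each replication edge with its lag, as the examples later in this guide do, is what makes this number readable directly off the diagram.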

Step 6: Annotate Failure Domains

The most valuable analysis a Deployment diagram enables is failure domain analysis: "If this availability zone goes down, what components are affected?" Annotate the diagram to make failure domains explicit: group components by availability zone, show which components have cross-AZ redundancy, and identify single points of failure.
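Failure domain analysis can be sketched as a simple query over component placements. The placements below are hypothetical:

```python
# Map each component to the availability zones it runs in (hypothetical placement).
placements = {
    "API Gateway":     {"us-east-1a", "us-east-1b"},
    "Order Service":   {"us-east-1a", "us-east-1b"},
    "Session Cache":   {"us-east-1b", "us-east-1c"},
    "Batch Scheduler": {"us-east-1a"},            # single-AZ: a risk to flag
}

def impact_of_az_failure(az: str) -> dict[str, list[str]]:
    """Classify components as degraded (lost one of several AZs) or down (lost their only AZ)."""
    degraded, down = [], []
    for component, zones in placements.items():
        if az in zones:
            (down if zones == {az} else degraded).append(component)
    return {"degraded": sorted(degraded), "down": sorted(down)}

print(impact_of_az_failure("us-east-1a"))
# {'degraded': ['API Gateway', 'Order Service'], 'down': ['Batch Scheduler']}
```

Any component that appears under "down" for a single-zone failure is a single point of failure and deserves a prominent annotation on the diagram.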


Real-World Example 1: E-Commerce Platform (AWS Production Deployment)

Infrastructure Overview

The e-commerce platform runs on AWS in the us-east-1 region, distributed across three availability zones (us-east-1a, us-east-1b, us-east-1c). The system uses EKS (Elastic Kubernetes Service) for container orchestration, RDS for managed PostgreSQL databases, ElastiCache for Redis, MSK (Managed Streaming for Kafka) for the message broker, and CloudFront for global CDN.

Deployment Node Hierarchy

AWS Region: us-east-1
├── CloudFront Distribution (CDN)
│   └── Web Application (React SPA): static assets cached at edge
│
├── Availability Zone: us-east-1a
│   ├── Public Subnet
│   │   └── Application Load Balancer (ALB): receives HTTPS traffic on port 443
│   └── Private Subnet
│       ├── EKS Worker Node (m5.2xlarge)
│       │   ├── API Gateway Pod (Kong): 2 replicas in this AZ
│       │   ├── Order Service Pod: 2 replicas in this AZ
│       │   └── Payment Service Pod: 2 replicas in this AZ
│       └── RDS Primary Instance (db.r6g.2xlarge, PostgreSQL 16)
│           ├── Orders Database (primary)
│           ├── Customer Database (primary)
│           └── Payment Database (primary)
│
├── Availability Zone: us-east-1b
│   └── Private Subnet
│       ├── EKS Worker Node (m5.2xlarge)
│       │   ├── API Gateway Pod: 2 replicas
│       │   ├── Order Service Pod: 2 replicas
│       │   ├── Product Catalog Service Pod: 2 replicas
│       │   └── Customer Service Pod: 2 replicas
│       ├── RDS Read Replica (db.r6g.xlarge)
│       │   └── Asynchronous replication from primary in us-east-1a
│       └── ElastiCache Node (cache.r6g.large)
│           └── Session Cache (Redis 7)
│
├── Availability Zone: us-east-1c
│   └── Private Subnet
│       ├── EKS Worker Node (m5.2xlarge)
│       │   ├── Notification Service Pod: 2 replicas
│       │   ├── Product Catalog Service Pod: 2 replicas
│       │   └── Customer Service Pod: 2 replicas
│       └── ElastiCache Node (cache.r6g.large)
│           └── Session Cache (Redis 7): replica of us-east-1b node
│
├── MSK Cluster (Multi-AZ, 3 brokers, one per AZ)
│   └── Message Broker (Apache Kafka): replicated across all 3 AZs
│
└── Elasticsearch Service (3-node cluster, one node per AZ)
    └── Search Index

Traffic Flow

User's browser
    │
    ▼ HTTPS (port 443)
CloudFront Distribution
    │ Serves static assets (JS, CSS, images) from edge cache
    │ Forwards dynamic API requests to ALB
    ▼ HTTPS (port 443)
Application Load Balancer
    │ SSL termination
    │ Routes /api/* to API Gateway target group
    │ Health check: GET /health, 30s interval
    ▼ HTTP (port 8000), round-robin across 6 API Gateway pods (2 per AZ)
API Gateway (Kong)
    │ JWT validation
    │ Rate limiting (per-customer, per-IP)
    │ Routes to backend services by path prefix
    ▼ HTTP (internal), routes to appropriate service
Order Service / Customer Service / Product Catalog Service
    │
    ▼ SQL (PostgreSQL wire protocol, port 5432)
RDS Primary Instance

Failure Domain Analysis

Single AZ failure (e.g., us-east-1a goes down):

  • ALB automatically routes traffic to pods in us-east-1b and us-east-1c
  • EKS Kubernetes scheduler reschedules evicted pods to remaining AZs
  • RDS fails over to a standby replica in us-east-1b or us-east-1c (automatic Multi-AZ failover, ~60 seconds)
  • MSK continues with 2 of 3 brokers available (Kafka tolerates broker loss with replication factor ≥ 2)
  • ElastiCache Redis fails over to replica node (~30 seconds, some cache misses expected)
  • User impact: Increased latency during failover (30-90 seconds), no data loss for committed transactions
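The Kafka tolerance claim above follows from the relationship between replication factor and min.insync.replicas: with acks=all, a partition stays writable as long as at least min.insync.replicas copies are in sync. A small sketch (min.insync.replicas=2 is an assumed setting, not stated in the example):

```python
def tolerable_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Brokers that can fail while producers using acks=all keep writing:
    a partition remains writable while at least min.insync.replicas copies are in sync."""
    return max(replication_factor - min_insync_replicas, 0)

# With replication factor 3 and min.insync.replicas=2, one broker can fail
# with no loss of write availability or acknowledged messages.
print(tolerable_broker_failures(3, 2))  # 1
```

This is why the diagram's "3 brokers, one per AZ, replication factor 3" annotation is an availability statement, not just an inventory detail.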

RDS primary failure:

  • Automatic Multi-AZ failover to standby replica (~60-120 seconds)
  • Application reconnects automatically via RDS endpoint (DNS update)
  • Read replicas are promoted independently if needed
  • User impact: Brief period of write unavailability during failover

MSK broker failure (one of three):

  • Kafka partition leadership automatically reassigned to remaining brokers
  • No message loss for partitions with replication factor 3
  • User impact: None for properly configured consumers

What the Diagram Reveals

Multi-AZ redundancy is explicit. The diagram shows exactly which components have multi-AZ redundancy (EKS pods distributed across AZs, RDS Multi-AZ, MSK cluster spanning all three AZs, ElastiCache replicated across two AZs) and which do not (the ALB is a single logical entity, but AWS manages its availability internally). This makes the availability story visible and verifiable.

The public/private subnet boundary. The ALB is in the public subnet (internet-facing). All application pods and databases are in private subnets (no direct internet access). This network security boundary, which is architecturally significant but invisible in the Container diagram, is made explicit in the Deployment diagram.

The CloudFront CDN layer. Static assets (the React SPA's JavaScript bundles, CSS, images) are served from CloudFront's global edge network, not from origin servers in us-east-1. This means users in Europe get the SPA's static assets from a European edge node, not from a server in Virginia. This performance architecture is invisible in the Container diagram but clearly visible in the Deployment diagram.

Read replica for analytics offloading. The diagram shows a read replica that is separate from the primary RDS instance. This signals that analytics queries, reporting, and batch operations are routed to the replica rather than the primary, a critical performance and availability decision.


Real-World Example 2: Banking Mobile Application (Multi-Region Production Deployment)

Infrastructure Overview

The banking mobile application is deployed across two AWS regions for high availability and disaster recovery: us-east-1 (primary) and us-west-2 (secondary/DR). This multi-region architecture is driven by regulatory requirements for data availability and the bank's RTO (Recovery Time Objective) of less than 15 minutes.

Primary Region: us-east-1

AWS Region: us-east-1 (Primary)
├── Route 53 (DNS with health-check-based failover)
│   ├── mobile-api.bank.com → ALB in us-east-1 (primary)
│   └── mobile-api.bank.com → ALB in us-west-2 (failover, activated on health check failure)
│
├── AWS WAF + Shield Advanced
│   └── Applied to ALB: DDoS protection, SQL injection prevention, rate limiting
│
├── Availability Zone: us-east-1a
│   ├── ALB (primary; routes to us-east-1a and us-east-1b pods)
│   └── Private Subnet
│       ├── EKS Worker Node (r5.2xlarge, memory-optimized for banking workloads)
│       │   ├── Mobile API (BFF) Pod: 3 replicas in this AZ
│       │   ├── Authentication Service Pod: 2 replicas in this AZ
│       │   └── Transaction Service Pod: 3 replicas in this AZ
│       └── RDS Primary (db.r6g.4xlarge, PostgreSQL 16, encrypted at rest)
│           └── Transaction Database: Multi-AZ with synchronous standby in us-east-1b
│
├── Availability Zone: us-east-1b
│   └── Private Subnet
│       ├── EKS Worker Node
│       │   ├── Mobile API (BFF) Pod: 3 replicas
│       │   ├── Account Service Pod: 2 replicas
│       │   └── Card Service Pod: 2 replicas
│       ├── RDS Standby (synchronous replication from us-east-1a primary)
│       └── ElastiCache Primary Node (r6g.xlarge, Redis 7, TLS + auth enabled)
│           └── Session Store: primary node
│
├── Availability Zone: us-east-1c
│   └── Private Subnet
│       ├── EKS Worker Node
│       │   ├── Notification Service Pod: 2 replicas
│       │   └── Account Service Pod: 2 replicas
│       └── ElastiCache Replica Node
│           └── Session Store: replica of us-east-1b node
│
├── Amazon MSK (Multi-AZ, 3 brokers)
│   └── Event Stream (Kafka)
│
└── AWS PrivateLink Endpoints
    ├── → Core Banking System (on-premises, connected via AWS Direct Connect)
    └── → Fraud Detection System (separate AWS account, VPC peering)

Secondary Region: us-west-2 (Disaster Recovery)

AWS Region: us-west-2 (DR: warm standby)
├── ALB (inactive; only receives traffic during failover)
│
├── EKS Cluster (scaled to 0 pods in standby; scaled up during failover)
│   └── All pods: Mobile API, Authentication, Transaction, Account, Card, Notification
│       ├── Deployment manifests identical to us-east-1
│       └── Scaled up in <10 minutes via automated runbook
│
├── RDS Read Replica (asynchronous replication from us-east-1 primary)
│   ├── Transaction Database: ~5 second replication lag
│   └── Promoted to primary during failover (manual or automated)
│
├── ElastiCache (empty; Session Store rebuilt from RDS on startup)
│
└── MSK Cluster (separate, not replicated; events replayed from us-east-1 during failover)

Network Security Topology

Internet
    │
    ▼ HTTPS (port 443)
Route 53 DNS (health-check routing)
    │
    ▼
AWS WAF + Shield Advanced
    │
    ▼
Application Load Balancer (public subnet)
    │ SSL termination (ACM certificate)
    │ Security group: allows 443 inbound from 0.0.0.0/0
    ▼
EKS Pods (private subnet)
    │ Security group: allows traffic only from ALB security group
    │ No public IP addresses
    ▼
RDS / ElastiCache (isolated subnet; no internet gateway)
    │ Security group: allows traffic only from EKS node security group
    │ Encrypted in transit (TLS) and at rest (AES-256)
    ▼
Core Banking System (on-premises via AWS Direct Connect)
    └── Private connectivity: traffic never traverses the public internet

What the Diagram Reveals

The DR topology is explicit. The diagram shows that the secondary region is a warm standby: infrastructure exists, but pods are scaled to zero. This communicates the RTO: pods can be scaled up in under ten minutes, but the process requires action (either manual or automated). This is not a hot standby (instant failover) but is significantly better than cold standby (infrastructure must be provisioned from scratch).

Replication lag as an explicit RPO signal. The RDS read replica in us-west-2 has a "~5 second replication lag" annotation. This communicates the Recovery Point Objective: in a worst-case failover, up to 5 seconds of transaction data could be lost. This is an explicit, visible commitment to stakeholders who need to understand the system's data durability guarantees.

AWS Direct Connect for Core Banking. The private connectivity to the on-premises Core Banking System, via AWS Direct Connect rather than the public internet, is significant both architecturally and for security. The Deployment diagram makes this explicit: traffic between the AWS-hosted mobile application and the on-premises Core Banking System never traverses the public internet. This is a compliance requirement for banking systems in many jurisdictions.

Security group layering. The network security topology section of the diagram shows a three-layer security group configuration: the ALB accepts traffic from the internet; the EKS pods accept traffic only from the ALB; the databases accept traffic only from the EKS nodes. This defense-in-depth configuration is architecturally significant and should be explicitly shown.


Real-World Example 3: SaaS Project Management Tool (Kubernetes Production Deployment)

Infrastructure Overview

The SaaS platform runs on GCP (Google Cloud Platform) using GKE (Google Kubernetes Engine) in the us-central1 region, with global traffic routing via Google Cloud Load Balancing and Cloud CDN. It uses Cloud SQL for managed PostgreSQL and Cloud Memorystore for Redis.

Kubernetes Cluster Architecture

GCP Project: prod-saas-platform
├── Google Cloud Load Balancing (global, anycast IP)
│   ├── Cloud CDN (serves React SPA static assets from edge)
│   └── HTTPS Load Balancer → GKE Ingress Controller
│
├── GCP Region: us-central1
│   ├── GKE Cluster: prod-cluster (regional cluster, 3 AZs)
│   │   ├── Node Pool: api-pool (n2-standard-8, 3-12 nodes, autoscaling)
│   │   │   └── Namespace: production
│   │   │       ├── API Server Deployment (6 pods, 2 per AZ)
│   │   │       │   ├── Liveness probe: GET /healthz, 10s interval
│   │   │       │   ├── Readiness probe: GET /ready, 5s interval
│   │   │       │   ├── Resources: 2 CPU / 4GB RAM per pod
│   │   │       │   └── HPA: min 6, max 30, target CPU 60%
│   │   │       ├── Authentication Service Deployment (3 pods)
│   │   │       │   └── HPA: min 3, max 15, target CPU 60%
│   │   │       ├── Billing Service Deployment (2 pods)
│   │   │       ├── Search Service Deployment (3 pods)
│   │   │       └── File Service Deployment (2 pods)
│   │   │
│   │   ├── Node Pool: worker-pool (n2-standard-4, 2-8 nodes, autoscaling)
│   │   │   └── Namespace: production
│   │   │       ├── Notification Service Deployment (3 pods)
│   │   │       └── Integration Service Deployment (3 pods)
│   │   │
│   │   └── Node Pool: system-pool (n2-standard-2, 3 nodes, fixed)
│   │       └── System components: Ingress controller, Cert Manager,
│   │           Prometheus, Grafana, Jaeger
│   │
│   ├── Cloud SQL Instance (db-custom-16-65536, PostgreSQL 16)
│   │   ├── Primary Database: prod-primary (us-central1-a)
│   │   │   └── Primary Database (multi-tenant, RLS)
│   │   └── Read Replica: prod-replica-1 (us-central1-b)
│   │       └── Serves reporting and analytics queries
│   │
│   ├── Cloud Memorystore (Redis 7, Standard Tier)
│   │   ├── Instance: prod-redis-primary (us-central1-a, 16GB)
│   │   │   ├── Job Queue (Bull)
│   │   │   └── Real-Time Cache
│   │   └── Replica: prod-redis-replica (us-central1-b)
│   │       └── Automatic failover in ~30 seconds
│   │
│   └── Google Cloud Pub/Sub
│       └── Analytics Event Stream
│           └── Subscription: analytics-consumer → BigQuery export
│
└── Global Resources
    ├── Cloud Storage Bucket: prod-attachments (multi-region, US)
    │   └── File attachments: versioning enabled, lifecycle rules configured
    ├── Elasticsearch Service (Elastic Cloud, us-central1, 3-node cluster)
    │   └── Search Index
    └── Secret Manager
        └── Stores: database credentials, Stripe API keys, Auth0 secrets,
            SendGrid API keys, GitHub OAuth secrets

Kubernetes Ingress and Service Mesh

Internet
    │
    ▼ HTTPS (443), HTTP (80 → 443 redirect)
Google Cloud Load Balancer (global, managed SSL with Google-managed certificate)
    │
    ▼ HTTP (8080), internal
GKE Ingress Controller (nginx-ingress, in system-pool)
    │
    │ Routes by path:
    │ /api/v1/*          → API Server Service (ClusterIP)
    │ /api/v1/search/*   → Search Service (ClusterIP)
    │ /api/v1/files/*    → File Service (ClusterIP)
    │ /socket.io/*       → API Server Service (sticky sessions via cookie)
    │
    ▼
Kubernetes Services (ClusterIP) → Pod Endpoints (kube-proxy round-robin)
    │
    ▼ TLS (PostgreSQL wire protocol, port 5432)
Cloud SQL Auth Proxy (sidecar container in each pod)
    │
    ▼
Cloud SQL Primary Instance

Namespace and Resource Isolation

Kubernetes Namespaces:
├── production     Live customer workloads
├── staging        Pre-production validation
├── monitoring     Prometheus, Grafana, Jaeger, AlertManager
├── ingress        Nginx ingress controller
└── cert-manager   Certificate lifecycle management

Resource Quotas (production namespace):
├── CPU limit:     80 cores total
├── Memory limit:  160 GB total
├── Pod limit:     200 pods
└── PVC limit:     50 persistent volume claims

Network Policies (production namespace):
├── Default: deny all ingress and egress
├── Allow: ingress from ingress namespace (HTTP)
├── Allow: egress to Cloud SQL (port 5432)
├── Allow: egress to Cloud Memorystore (port 6379)
└── Allow: egress to Google APIs (HTTPS, port 443)
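The default-deny posture above can be modeled as a simple rule lookup. This is a deliberately simplified sketch; real Kubernetes NetworkPolicies match on pod labels and CIDR blocks rather than the hypothetical name/port pairs used here:

```python
# Simplified model of a default-deny egress posture with explicit allow rules.
ALLOWED_EGRESS = {
    ("cloud-sql", 5432),
    ("memorystore", 6379),
    ("google-apis", 443),
}

def egress_permitted(destination: str, port: int) -> bool:
    """Default deny: traffic passes only if an explicit allow rule matches."""
    return (destination, port) in ALLOWED_EGRESS

print(egress_permitted("cloud-sql", 5432))   # True
print(egress_permitted("example.com", 443))  # False
```

The design point is that the permitted set is enumerable: a security reviewer can audit exactly what the production namespace may talk to, which is precisely what the Deployment diagram's network policy section enables.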

What the Diagram Reveals

Three distinct node pools with different purposes. The cluster uses three node pools: api-pool for API workloads (higher CPU, higher count), worker-pool for background workers (lower CPU, lower count), and system-pool for infrastructure components (fixed size, cannot be scaled down). This resource segregation ensures that a traffic spike that consumes all api-pool nodes does not starve the system components or workers. The Deployment diagram makes this deliberate segregation visible.

HPA configuration as availability signal. The Horizontal Pod Autoscaler configuration for the API Server (min 6, max 30, target CPU 60%) communicates specific operational commitments: the system always has at least 6 pods available (baseline capacity), can scale to 30 pods under peak load, and triggers scaling at 60% CPU utilization (leaving headroom before pods become overloaded). These numbers are not arbitrary; they represent capacity planning decisions that the Deployment diagram makes explicit.
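The scaling behavior these numbers produce follows the documented Kubernetes HPA rule: desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured min/max bounds. A sketch using the API Server's configuration:

```python
import math

def hpa_desired_replicas(current: int, current_cpu_pct: int, target_cpu_pct: int,
                         min_r: int = 6, max_r: int = 30) -> int:
    """Kubernetes HPA scaling rule: desired = ceil(current * currentMetric / targetMetric),
    then clamped into [min_r, max_r]. Defaults match the API Server's HPA above."""
    desired = math.ceil(current * current_cpu_pct / target_cpu_pct)
    return max(min_r, min(desired, max_r))

print(hpa_desired_replicas(6, 90, 60))   # 9: six pods at 90% CPU scale out to nine
print(hpa_desired_replicas(6, 30, 60))   # 6: the floor of six holds even at low load
```

Running the numbers like this is a quick sanity check during capacity planning: it shows how far above the 60% target utilization must climb before the max of 30 pods is reached.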

The Cloud SQL Auth Proxy sidecar. Connecting to Cloud SQL does not use a direct TCP connection; it goes through the Cloud SQL Auth Proxy, which runs as a sidecar container alongside each application pod. This proxy handles authentication, authorization, and encryption automatically. Its presence in the Deployment diagram explains to engineers why database connection strings do not contain standard credentials and why connections go to localhost rather than a database hostname.

Network policies as security controls. The network policies section shows a default-deny posture with explicit allow rules. This zero-trust network configuration, in which pods cannot communicate with anything not explicitly permitted, is a significant security control that is invisible in the Container diagram. The Deployment diagram makes it visible and auditable.


Real-World Example 4: Ride-Sharing Platform (Multi-Region Active-Active Deployment)

Infrastructure Overview

The ride-sharing platform requires the most aggressive availability posture of the examples in this guide. An active-active multi-region deployment across three AWS regions ensures that even a complete regional failure does not interrupt service for riders and drivers in unaffected geographies.

Multi-Region Architecture

Global Traffic Management
├── AWS Route 53 (GeoDNS + latency-based routing)
│   ├── api.rideshare.com → us-east-1 (serves US East traffic)
│   ├── api.rideshare.com → eu-west-1 (serves European traffic)
│   └── api.rideshare.com → ap-southeast-1 (serves APAC traffic)
│
└── AWS Global Accelerator (anycast IP, TCP/UDP acceleration)
    └── Routes mobile client connections to nearest healthy region

Region: us-east-1 (US Primary)
├── Availability Zone: us-east-1a
│   ├── EKS Worker Nodes (c5.4xlarge, compute-optimized)
│   │   ├── Passenger API Pod: 5 replicas in this AZ
│   │   ├── Driver API Pod: 5 replicas in this AZ
│   │   ├── Matching Service Pod: 3 replicas (Go, high CPU)
│   │   └── Trip Service Pod: 3 replicas
│   └── ElastiCache Node: Location Store primary (r6g.2xlarge)
│
├── Availability Zone: us-east-1b
│   ├── EKS Worker Nodes
│   │   ├── Passenger API Pod: 5 replicas
│   │   ├── Driver API Pod: 5 replicas
│   │   ├── Routing Service Pod: 3 replicas (Python, Google Maps integration)
│   │   └── Payment Service Pod: 3 replicas
│   └── ElastiCache Node: Location Store replica
│       └── Asynchronous replication from us-east-1a primary
│
├── Availability Zone: us-east-1c
│   ├── EKS Worker Nodes
│   │   ├── Matching Service Pod: 3 replicas
│   │   ├── Pricing Service Pod: 3 replicas
│   │   ├── Onboarding Service Pod: 2 replicas
│   │   └── Communication Service Pod: 2 replicas
│   └── ElastiCache Node: Route Cache
│
├── MSK Cluster (3 brokers, one per AZ, replication factor 3)
│   └── Event Stream (Kafka)
│       └── Topics: TripEvents, PaymentEvents, DriverLocationUpdates
│
├── Aurora PostgreSQL Cluster (Multi-AZ)
│   ├── Writer Instance: us-east-1a (db.r6g.4xlarge)
│   │   ├── Trips Database
│   │   ├── Driver Database
│   │   └── Passenger Database
│   ├── Reader Instance 1: us-east-1b
│   └── Reader Instance 2: us-east-1c
│
└── Firebase Realtime Database (Google-managed, multi-region)
    └── Real-Time Data Store: driver locations, ride status

Region: eu-west-1 (European region; active, serves EU traffic)
├── [Same structure as us-east-1, scaled proportionally to EU traffic]
├── Aurora Global Database secondary cluster
│   ├── Asynchronous replication from us-east-1 global cluster writer
│   └── Replication lag: < 1 second
└── Data residency: EU passenger and driver data stored only in eu-west-1
    └── GDPR compliance: EU PII does not replicate to non-EU regions

Region: ap-southeast-1 (APAC region; active, serves APAC traffic)
└── [Same structure as us-east-1, scaled proportionally to APAC traffic]

Location Service Architecture (Special Case)

The Location Service has a unique deployment architecture because of its extreme performance requirements: it must process thousands of GPS updates per second and serve sub-100ms proximity queries.

Location Service Deployment
├── Redis Cluster (ElastiCache, cluster mode enabled)
│   ├── 6 shards × (1 primary + 2 replicas) = 18 nodes per region
│   ├── Geospatial index: GEOADD / GEORADIUS commands
│   ├── Data model: driverID → {latitude, longitude, timestamp, status}
│   ├── Key TTL: 30 seconds (stale drivers auto-expire)
│   └── Update throughput: ~50,000 writes/second at peak
│
├── Location Service Pods (Go, 10 replicas per AZ)
│   ├── Receives driver location via Firebase (batched every 3 seconds)
│   ├── Writes to Redis Cluster (<5ms per write target)
│   └── Serves proximity queries to Matching Service (<10ms target)
│
└── Firebase Realtime Database
    ├── Driver apps write location here (mobile SDK handles connectivity)
    └── Location Service subscribes to Firebase for location events
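The location store's contract (30-second TTL expiry plus radius queries) can be illustrated with a small in-process sketch. This is not the Redis implementation; it is a pure-Python stand-in for what the geospatial index (GEOADD / GEORADIUS) provides, with all names hypothetical:

```python
import math
import time

def _haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points in kilometres.
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2
    return 6371 * 2 * math.asin(math.sqrt(a))

class LocationStore:
    """Hypothetical in-process sketch of the Redis-backed store's behavior."""

    def __init__(self, ttl_seconds=30):
        self.ttl = ttl_seconds
        self._drivers = {}  # driver_id -> (lat, lon, written_at)

    def update(self, driver_id, lat, lon, now=None):
        self._drivers[driver_id] = (lat, lon, now if now is not None else time.time())

    def nearby(self, lat, lon, radius_km, now=None):
        now = now if now is not None else time.time()
        hits = []
        for driver_id, (dlat, dlon, written) in self._drivers.items():
            if now - written > self.ttl:   # stale drivers auto-expire
                continue
            if _haversine_km(lat, lon, dlat, dlon) <= radius_km:
                hits.append(driver_id)
        return hits

store = LocationStore(ttl_seconds=30)
store.update("driver-1", 40.7128, -74.0060, now=100.0)
store.update("driver-2", 40.7130, -74.0050, now=100.0)
store.update("driver-3", 40.7300, -73.9000, now=60.0)   # written 40s ago: stale
print(sorted(store.nearby(40.7128, -74.0060, radius_km=1.0, now=100.0)))
# -> ['driver-1', 'driver-2']
```

The real system pushes this logic into Redis precisely because a single-process dictionary cannot sustain ~50,000 writes/second with replicated, sharded storage.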

What the Diagram Reveals

Active-active vs. active-passive clarity. The multi-region deployment is explicitly labeled as active-active: all three regions serve live traffic simultaneously, not just one. This is a critical distinction from the Banking example's active-passive DR setup. The Deployment diagram communicates the difference immediately.

GDPR data residency as infrastructure constraint. The eu-west-1 region section explicitly notes that EU passenger and driver data is stored only in eu-west-1 and does not replicate to non-EU regions. This is a legal compliance requirement (GDPR data residency) with direct infrastructure implications: Aurora Global Database replication is configured to exclude PII fields. The Deployment diagram is the right place to document this constraint because it is implemented at the infrastructure level.

Aurora Global Database for sub-second cross-region replication. The use of Aurora Global Database with "< 1 second" replication lag is specifically called out. This is not standard RDS replication; it is a specific AWS service designed for cross-region workloads. Naming the specific technology communicates the operational commitment: in a cross-region failover, at most 1 second of write data could be lost.

The Location Service's Redis Cluster detail. The Location Service section shows a Redis Cluster configuration with 6 shards and 18 total nodes, with specific throughput targets (50,000 writes/second). This level of detail is appropriate for a component with such unusual performance characteristics. The Deployment diagram justifies the infrastructure investment and communicates the performance engineering decisions made for this specific component.


Real-World Example 5: Healthcare Patient Portal (HIPAA-Compliant AWS Deployment)

Infrastructure Overview

The healthcare portal operates in a HIPAA-compliant AWS environment. HIPAA compliance imposes specific infrastructure requirements: encryption in transit and at rest everywhere, comprehensive audit logging, network isolation, backup and recovery procedures, and Business Associate Agreements (BAAs) with all cloud service providers.

HIPAA-Compliant Architecture

AWS Region: us-east-1 (HIPAA-compliant, BAA signed with AWS)
│
├── AWS Organizations: dedicated AWS account for PHI workloads
│   └── Separate account from non-PHI workloads (account-level isolation)
│
├── VPC: 10.0.0.0/16 (HIPAA VPC; no shared resources with other accounts)
│   │
│   ├── Public Subnets (10.0.1.0/24, 10.0.2.0/24, 10.0.3.0/24; one per AZ)
│   │   ├── AWS WAF (HIPAA-required: prevents injection attacks on PHI)
│   │   └── ALB (HTTPS only; HTTP redirects to HTTPS; TLS 1.2 minimum)
│   │
│   ├── Application Subnets (10.0.11.0/24, 10.0.12.0/24, 10.0.13.0/24)
│   │   └── EKS Worker Nodes (m5.2xlarge, HIPAA-hardened AMI)
│   │       │
│   │       ├── Patient Web Portal Pod (2 replicas per AZ = 6 total)
│   │       ├── Patient API Pod (3 replicas per AZ = 9 total)
│   │       ├── Clinical API Pod (2 replicas per AZ = 6 total)
│   │       ├── Epic Integration Service Pod (2 replicas per AZ = 6 total)
│   │       ├── Appointment Service Pod (2 replicas per AZ = 6 total)
│   │       ├── Prescription Service Pod (2 replicas per AZ = 6 total)
│   │       ├── Secure Messaging Service Pod (2 replicas per AZ = 6 total)
│   │       ├── FHIR API Service Pod (2 replicas per AZ = 6 total)
│   │       └── Notification Service Pod (2 replicas per AZ = 6 total)
│   │
│   ├── Data Subnets (10.0.21.0/24, 10.0.22.0/24, 10.0.23.0/24)
│   │   └── No internet gateway; isolated from the public internet
│   │       │
│   │       ├── RDS Aurora PostgreSQL Cluster (encrypted at rest, AES-256)
│   │       │   ├── Writer: us-east-1a (db.r6g.2xlarge)
│   │       │   │   ├── Patient Database (PHI; contains PII)
│   │       │   │   └── Message Store (PHI; field-level encrypted content)
│   │       │   ├── Reader: us-east-1b (for reporting queries)
│   │       │   └── Automated backups: 35-day retention, encrypted
│   │       │       └── PITR (Point-in-Time Recovery) enabled
│   │       │
│   │       └── ElastiCache Redis (TLS + AUTH, encrypted in transit + at rest)
│   │           ├── Primary: us-east-1a
│   │           └── Replica: us-east-1b
│   │               └── Patient Data Cache (TTL: 15 minutes, PHI-safe expiry)
│   │
│   ├── AWS PrivateLink Endpoints (no traffic leaves AWS network)
│   │   ├── → com.amazonaws.us-east-1.s3 (S3 access without internet)
│   │   ├── → com.amazonaws.us-east-1.sqs (SQS access)
│   │   ├── → com.amazonaws.us-east-1.secretsmanager
│   │   └── → com.amazonaws.us-east-1.logs (CloudWatch)
│   │
│   └── Transit Gateway Connection
│       └── → Hospital On-Premises Network (via AWS Direct Connect)
│           └── Epic EHR System (HL7 FHIR API over private connection)
│
├── AWS S3 (PHI bucket; HIPAA-compliant configuration)
│   ├── Bucket policy: deny HTTP (HTTPS only)
│   ├── Versioning: enabled (HIPAA requires version history)
│   ├── MFA delete: enabled
│   ├── Server-side encryption: AES-256
│   └── Access logging: all access logged to audit bucket
│
├── AWS CloudWatch Logs (Audit Log)
│   ├── Log group: /hipaa/phi-access (immutable; resource policy prevents deletion)
│   ├── Log group: /hipaa/api-access (all API access to PHI endpoints)
│   ├── Retention: 7 years (HIPAA minimum)
│   └── Export: daily export to S3 for long-term archival
│
├── AWS SQS + SNS (Event Stream; PHI-safe)
│   ├── All queues: server-side encryption with KMS key
│   └── Message content: event metadata only, no PHI in message payloads
│       └── PHI retrieved by consumers via API using IDs in event payloads
│
└── AWS KMS (Key Management Service)
    ├── Customer-managed key: hipaa-phi-key (annual rotation)
    ├── Encrypts: RDS, ElastiCache, S3, SQS, CloudWatch Logs, EBS volumes
    └── Key usage logged: all encrypt/decrypt operations audited

Network Traffic Audit Trail

All inbound requests:
    ALB → CloudWatch Logs (access log: timestamp, source IP, request,
          response code, latency, user agent)

All API access to PHI:
    Application pods → CloudWatch Logs /hipaa/phi-access
    (patient ID, clinician ID, operation, timestamp, source IP,
     data elements accessed; HIPAA minimum-necessary standard)

All database access:
    Aurora → CloudWatch Logs (PostgreSQL audit extension)
    (query text, user, database, timestamp, rows affected)

All S3 access to PHI:
    S3 → CloudWatch Logs (server access logging)
    (requester, operation, key, timestamp, status)
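The /hipaa/phi-access record format above can be sketched as a structured log entry. Field names here are illustrative, following the fields the audit trail calls out (who, what, when, from where, and which data elements, per the minimum-necessary standard):

```python
import datetime
import json

def phi_access_record(patient_id, clinician_id, operation, source_ip, elements):
    """Build a hypothetical /hipaa/phi-access log line as JSON."""
    return json.dumps({
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "patientId": patient_id,
        "clinicianId": clinician_id,
        "operation": operation,
        "sourceIp": source_ip,
        "dataElements": elements,   # only the fields actually accessed
    })

rec = json.loads(phi_access_record("pat-123", "dr-9", "READ", "10.0.11.14",
                                   ["allergies", "medications"]))
print(rec["operation"], rec["dataElements"])
```

Emitting records like this to an immutable log group is what turns "audit logging enabled" into evidence an auditor can inspect.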

What the Diagram Reveals

HIPAA compliance as infrastructure. The Deployment diagram is dense with HIPAA compliance annotations: separate AWS account for PHI workloads, encryption annotations on every data store, no internet gateway on data subnets, AWS PrivateLink to prevent traffic leaving the AWS network, 7-year log retention, immutable audit logs, MFA delete on S3. These are not incidental details; they are infrastructure-implemented compliance controls. The Deployment diagram is the compliance evidence document for these controls.

The KMS key as a central security element. The KMS customer-managed key appears explicitly as an infrastructure element: it encrypts RDS, ElastiCache, S3, SQS, CloudWatch Logs, and EBS volumes. Showing the KMS key in the Deployment diagram communicates that encryption is centrally managed and auditable, not a patchwork of separately configured encryption settings.

PHI-free message payloads. The SQS/SNS section explicitly notes "message content: event metadata only, no PHI in message payloads." This is a deliberate design decision that reduces the PHI surface area: if an SQS message were accidentally exposed or misconfigured, it would contain only IDs, not actual patient data. The decision has infrastructure implications: consumers must make API calls to retrieve PHI using the IDs in events, rather than reading PHI directly from the events themselves.
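A sketch of the pattern, with hypothetical names and a stubbed Patient API call: the queue message carries identifiers and metadata only, and the consumer resolves PHI on demand.

```python
import json

def build_event(event_type, patient_id, appointment_id):
    """Publish-side: the message body contains opaque IDs, never PHI."""
    return json.dumps({
        "type": event_type,
        "patientId": patient_id,        # opaque identifier, not PHI
        "appointmentId": appointment_id,
        "schemaVersion": 1,
    })

def fetch_patient(patient_id):
    # Stand-in for an authenticated, audited call to the Patient API.
    return {"patientId": patient_id, "record": "retrieved-via-api"}

def handle(message):
    """Consume-side: parse the event, then fetch PHI through the API."""
    event = json.loads(message)
    patient = fetch_patient(event["patientId"])   # PHI retrieved on demand
    return event["type"], patient["patientId"]

msg = build_event("AppointmentBooked", "pat-123", "appt-456")
print(handle(msg))   # -> ('AppointmentBooked', 'pat-123')
```

If `msg` leaked from a misconfigured queue, an attacker would learn only that an appointment event occurred for an opaque ID.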

Direct Connect for Epic connectivity. The connection to the on-premises Epic EHR system goes through AWS Direct Connect via a Transit Gateway, providing dedicated private network connectivity. This is a HIPAA-relevant infrastructure decision: HL7 FHIR messages containing PHI never traverse the public internet. The Deployment diagram makes this compliance control explicit.


Common Mistakes and How to Avoid Them

Mistake 1: The Flat Deployment Diagram

A Deployment diagram that shows all containers as flat boxes on the same level without nesting them inside deployment nodes fails to communicate the physical topology. It is the deployment diagram equivalent of the three-box Container diagram: technically present but informationally empty.

Why it happens. Drawing nested deployment nodes is more work than drawing flat boxes.

How to avoid it. Apply the nesting hierarchy: containers run inside Kubernetes pods, which run inside nodes, which run inside availability zones, which run inside regions. Show this hierarchy. The nesting is where the availability and failure domain information lives.

Mistake 2: Omitting Replica Counts

A Deployment diagram that shows a container without indicating how many replicas are running omits one of the most important pieces of operational information. Is this a single instance (single point of failure) or ten replicas (highly available)? The diagram cannot answer the question without replica annotations.

Why it happens. Replica counts feel like operational configuration details, not architecture.

How to avoid it. Treat replica counts as architectural information. They determine availability, capacity, and blast radius, all of which are architectural properties of the system.

Mistake 3: Ignoring the Data Layer

Deployment diagrams that show where application services run but not where databases and caches run omit the most failure-sensitive components of the architecture. Database placement (which availability zone, what replication configuration, what backup policy) is often the most critical reliability information in the diagram.

Why it happens. Data infrastructure feels like DBA territory rather than architecture.

How to avoid it. Databases, caches, and message brokers are containers in the C4 Model and must appear in the Deployment diagram with their placement, replication configuration, and backup policy.

Mistake 4: Conflating the Container and Deployment Diagrams

Showing internal service code structure (packages, classes, modules) in the Deployment diagram, or showing infrastructure elements (Kubernetes nodes, EC2 instances) in the Container diagram, mixes the two levels of abstraction.

How to avoid it. The Container diagram shows what runs. The Deployment diagram shows where it runs. Keep these concerns strictly separate.

Mistake 5: Single-Environment Diagrams Presented as Production

Drawing a single Deployment diagram without labeling the environment, or drawing the development/staging environment and presenting it as production. Readers who make decisions based on a staging diagram assuming it reflects production will make incorrect decisions.

How to avoid it. Every Deployment diagram must be explicitly labeled with the environment it describes. Production and staging must be separate diagrams if their topologies differ.

Mistake 6: Omitting Security and Network Topology

Deployment diagrams that show containers in deployment nodes but omit load balancers, security groups, network boundaries, and VPC topology miss most of the security architecture. For regulated systems, this information is not optional.

How to avoid it. Include load balancers, security groups, network boundaries, and VPC/subnet structure as infrastructure nodes in the Deployment diagram. For HIPAA, PCI-DSS, SOC 2, or similar compliance contexts, these elements are required for the diagram to serve as compliance evidence.


Deployment Diagrams Across Environments

A mature deployment diagram practice maintains diagrams for multiple environments, each clearly labeled and accurately reflecting the actual configuration of that environment.

Production

The most detailed and most important diagram. Must be accurate, current, and include: replica counts, instance types, replication configuration, backup policies, network topology, and security configuration. This is the diagram used for incident response, capacity planning, and compliance audits.

Staging

Simplified topology compared to production: typically single-AZ rather than multi-AZ, smaller instance types, no cross-region replication. Must accurately reflect the actual staging configuration so engineers understand the gap between staging and production (which represents the risk surface of deployments).

Development

Often the simplest diagram: it may show Docker Compose or a local Kubernetes configuration. Useful for onboarding engineers onto the local development setup. Shows the entire system running on a developer's laptop or in a shared development cluster.

Disaster Recovery

If the system has a formal DR environment (separate from the production secondary region), it warrants its own Deployment diagram showing the DR topology, the RTO/RPO commitments it supports, and the runbook steps required to activate it.


Deployment Diagrams as Compliance Evidence

For systems operating in regulated industries (healthcare under HIPAA, financial services under PCI-DSS and SOC 2, government under FedRAMP), the Deployment diagram serves a dual purpose: it is both an operational reference document and a compliance evidence artifact.

Auditors reviewing HIPAA or PCI-DSS compliance routinely request architecture diagrams showing:

  • Network segmentation and isolation of sensitive data
  • Encryption in transit and at rest for all sensitive data
  • Access controls and authentication mechanisms
  • Audit logging and monitoring configuration
  • Backup and recovery procedures

A well-maintained Deployment diagram provides precisely this information. Rather than writing a separate compliance narrative document, teams can reference the Deployment diagram as the authoritative record of these controls.

For this compliance use case, Deployment diagram annotations should be explicit and precise:

  • "Encrypted in transit (TLS 1.2+)" and "Encrypted at rest (AES-256)" rather than just "encrypted"
  • "7-year retention, immutable" for audit logs rather than just "audit logging enabled"
  • "No PHI in message payloads" for message queues rather than just "uses SQS"

These precise annotations make the diagram legible to auditors who are looking for specific controls, not just general architectural intent.


Tooling for Deployment Diagrams

Structurizr DSL

Structurizr supports deployment diagrams natively in its DSL. The deployment node hierarchy, container instance placement, and infrastructure nodes can all be expressed as code.

deploymentEnvironment "Production" {
    deploymentNode "AWS us-east-1" {
        tags "Amazon Web Services - Region"

        deploymentNode "us-east-1a" {
            tags "Amazon Web Services - Availability Zone"

            deploymentNode "EKS Worker Node" {
                tags "Amazon Web Services - EC2 Instance"

                containerInstance apiGateway {
                    properties {
                        "replicas" "2"
                        "instance-type" "m5.2xlarge"
                    }
                }

                containerInstance orderService {
                    properties {
                        "replicas" "2"
                    }
                }
            }

            deploymentNode "RDS Primary" {
                tags "Amazon Web Services - RDS"
                containerInstance ordersDb
            }
        }

        deploymentNode "us-east-1b" {
            tags "Amazon Web Services - Availability Zone"
            // ... similar structure
        }
    }
}

AWS Architecture Diagrams

AWS provides official architecture diagram icons and templates in draw.io, Lucidchart, and other tools. Using official AWS icons for deployment nodes (EC2, RDS, ElastiCache, EKS, etc.) produces diagrams that are immediately recognizable to AWS-experienced engineers. These cloud-specific icons can be combined with C4 Model container notation to produce hybrid diagrams that are both C4-compliant and cloud-specific.

Terraform and IaC as Ground Truth

For teams using infrastructure-as-code (Terraform, Pulumi, CloudFormation, CDK), the IaC code is the authoritative source of truth for the deployed infrastructure. Deployment diagrams should be maintained in sync with the IaC code and, ideally, generated or validated from it.

Tools like Inframap and Pluralith can generate topology diagrams from Terraform state, providing an auto-generated view of the deployed infrastructure. These generated diagrams are not identical to manually maintained C4 Deployment diagrams: they typically lack the annotation, organization, and narrative clarity of a manually maintained diagram. But they serve as a useful complement and a source of truth for keeping the manual diagram accurate.


Frequently Asked Questions

How detailed should a Deployment diagram be?

The right level of detail is "enough to answer the questions its audience will ask." For an on-call engineer: which availability zone is the primary database in, and what is the failover procedure? For a security reviewer: which components are in public subnets, and which are in private subnets? For a capacity planner: how many replicas of each service are running, and what are the auto-scaling limits?

A Deployment diagram that cannot answer these questions is too abstract. A Deployment diagram that shows every Kubernetes resource manifest setting, every security group rule, and every IAM policy is too detailed; that level of detail belongs in the IaC code, not the diagram.

Should Kubernetes-internal resources (Services, Ingresses, ConfigMaps) appear in the Deployment diagram?

Kubernetes Services (ClusterIP, NodePort, LoadBalancer) that are responsible for routing traffic between containers should appear as infrastructure nodes: they are part of the traffic routing topology. Kubernetes Ingresses should appear as infrastructure nodes that route external traffic into the cluster. ConfigMaps and Secrets are configuration mechanisms, not topology elements, and should not appear in the Deployment diagram.

How do you show auto-scaling in a Deployment diagram?

Show the current steady-state configuration with an annotation indicating the scaling policy: "3 replicas (HPA: min 3, max 15, target CPU 60%)." This communicates both the current state and the operational boundaries. Alternatively, show the minimum and maximum as a range: "3-15 replicas."
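Behind such an annotation sits the scaling rule documented for the Kubernetes HPA: desired replicas = ceil(current × currentMetric / targetMetric), clamped to the min/max bounds. A small sketch (function name is illustrative):

```python
import math

def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas):
    """Core HPA scaling rule: scale proportionally to metric pressure,
    then clamp to the configured bounds."""
    desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, desired))

# With the annotation "3 replicas (HPA: min 3, max 15, target CPU 60%)":
print(desired_replicas(3, 90, 60, 3, 15))    # CPU at 90% -> scale out to 5
print(desired_replicas(3, 30, 60, 3, 15))    # CPU at 30% -> floor holds at 3
print(desired_replicas(10, 120, 60, 3, 15))  # sustained overload -> capped at 15
```

Annotating min, max, and target on the diagram therefore fully determines this behavior without reproducing the manifest.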

How often should Deployment diagrams be updated?

Deployment diagrams should be updated whenever the infrastructure topology changes: new services added, scaling configurations changed, database instances resized, network topology modified, new availability zones added, or DR configuration changed. In teams using infrastructure-as-code, the Deployment diagram update should be part of the same pull request as the IaC change.


Conclusion

The Deployment diagram is the C4 Model's most operationally concrete artifact. It transforms the logical architecture of the Container diagram into the physical reality of running infrastructure. It answers the questions that matter most in production: where things run, how many replicas, with what redundancy, across which failure domains, with what network controls, and with what compliance properties.

The five examples in this guide (e-commerce, banking, SaaS, ride-sharing, and healthcare) illustrate how Deployment diagrams serve different purposes in different contexts:

The e-commerce example demonstrates how multi-AZ redundancy, CloudFront CDN placement, and the public/private subnet boundary are made explicit in the Deployment diagram: invisible in the Container diagram but essential for reliability analysis.

The banking example demonstrates how multi-region DR topology, replication lag as an explicit RPO signal, Direct Connect for on-premises connectivity, and defense-in-depth security group layering are communicated through the Deployment diagram.

The SaaS example demonstrates how Kubernetes node pool segregation, HPA auto-scaling configuration, Cloud SQL Auth Proxy as a sidecar, and network policy zero-trust posture appear as deployment-level architectural decisions.

The ride-sharing example demonstrates how active-active multi-region topology, GDPR data residency constraints, Aurora Global Database replication lag commitments, and the Location Service's extreme performance engineering are expressed at the deployment level.

The healthcare example demonstrates how HIPAA compliance controls (separate AWS accounts, KMS customer-managed keys, PHI-free message payloads, Direct Connect for Epic connectivity, and immutable audit logs) are implemented as infrastructure and documented in the Deployment diagram.

Across all five examples, the Deployment diagram is not simply a documentation artifact; it is an architectural reasoning tool. It makes visible the trade-offs between availability, cost, performance, and compliance that determine how a system behaves under real-world conditions. It bridges the gap between architectural intent and operational reality.

For engineering teams that invest in maintaining accurate, annotated Deployment diagrams alongside their infrastructure-as-code, the return is real: faster incident response, clearer reliability analysis, more productive capacity planning discussions, and architecture documentation that serves as compliance evidence without additional effort.

The principles that make Deployment diagrams excellent are simple:

  • Show the nesting hierarchy: containers inside nodes, inside AZs, inside regions
  • Annotate replica counts on every container instance
  • Document replication configuration on every data store
  • Show traffic routing from DNS through load balancers to container instances
  • Make failure domains explicit through AZ and region grouping
  • Include security topology: public vs. private subnets, security groups, network policies
  • Label the environment prominently: production is not staging
  • Keep it current: a stale Deployment diagram is worse than none

The Deployment diagram is where architecture meets reality. For any system that runs in production and matters to its users, that meeting deserves to be documented.

