A monolithic Terraform state is not just a technical inconvenience. It is an organizational risk. When every domain shares a single state, the blast radius of any apply spans the entire infrastructure, and the incentive to make quick manual changes in the AWS console rather than wait for a full plan cycle becomes hard to resist. That incentive was exactly what had driven the drift this client was dealing with. Before we could enforce any governance policy, we needed to understand the full dependency structure and design a decomposition within which teams could operate safely.
- State audit and domain segmentation: We audited the existing Terraform state structure and dependency graph in full, mapping every resource, identifying tightly coupled domains, and surfacing the undocumented manual changes in AWS that had diverged from state. The audit revealed drift across multiple resource types, introduced incrementally and never reconciled. We used this map to design a domain-based segmentation strategy, isolating VPC, RDS, EKS, and shared services into independent state files with clearly defined ownership boundaries. The decomposition was executed with Terraform state move operations and required zero resource recreation: the infrastructure itself did not change, only the way it was managed.
- Environment-specific execution pipelines: With state segmented by domain, we created environment-specific execution pipelines for each isolated module. Engineers modifying the RDS domain ran plan and apply against that state only, with no risk of inadvertently touching VPC or EKS configuration in the same operation. Each pipeline produced a plan scoped to its domain, and that plan output had to be reviewed before any apply was permitted, making every proposed change visible and deliberate. The blast radius of any single infrastructure change dropped from the entire platform to a bounded, well-understood domain.
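
To make the state move step more concrete, here is a minimal sketch of how a list of resource addresses (for example, the output of `terraform state list`) might be grouped by domain and turned into `terraform state mv` commands for review. It is an illustration under stated assumptions, not the tooling used on this engagement: the type-to-domain mapping, the state file names, and the helper names are all hypothetical, and the `-state` / `-state-out` flags assume local copies of the state files rather than a remote backend.

```python
import shlex
from collections import defaultdict

# Hypothetical mapping of resource types to target domains (an assumption
# for illustration; a real mapping would come out of the state audit).
DOMAIN_TYPES = {
    "vpc": {"aws_vpc", "aws_subnet", "aws_route_table", "aws_nat_gateway"},
    "rds": {"aws_db_instance", "aws_db_subnet_group", "aws_rds_cluster"},
    "eks": {"aws_eks_cluster", "aws_eks_node_group"},
}
DEFAULT_DOMAIN = "shared-services"  # anything not claimed by a domain above

# Invert the mapping once so lookups by resource type are direct.
TYPE_TO_DOMAIN = {t: d for d, types in DOMAIN_TYPES.items() for t in types}


def resource_type(address):
    """Return the resource type of a state address, or None for data sources.

    Handles simple addresses such as "aws_vpc.main" or
    "module.db.aws_db_instance.primary". Addresses whose index keys
    contain dots are not handled by this sketch.
    """
    parts = address.split(".")
    i = 0
    # Skip leading "module.<name>" pairs to reach the resource itself.
    while i + 1 < len(parts) and parts[i] == "module":
        i += 2
    if i >= len(parts) or parts[i] == "data":
        return None  # data sources are re-read on plan; nothing to move
    return parts[i]


def plan_moves(addresses, source_state="monolith.tfstate"):
    """Group addresses by domain and build one state mv command per resource."""
    moves = defaultdict(list)
    for addr in addresses:
        rtype = resource_type(addr)
        if rtype is None:
            continue
        domain = TYPE_TO_DOMAIN.get(rtype, DEFAULT_DOMAIN)
        cmd = [
            "terraform", "state", "mv",
            f"-state={source_state}",
            f"-state-out={domain}.tfstate",  # hypothetical per-domain file
            addr, addr,  # the address is kept unchanged in the new state
        ]
        moves[domain].append(shlex.join(cmd))
    return dict(moves)


if __name__ == "__main__":
    sample = [
        "aws_vpc.main",
        "module.db.aws_db_instance.primary",
        "aws_s3_bucket.logs",
        "data.aws_region.current",
    ]
    for domain, commands in plan_moves(sample).items():
        print(f"# {domain}")
        for command in commands:
            print(command)
```

The sketch only prints commands; it does not run them. Generating the full move list up front lets it be reviewed, and optionally rehearsed with the `-dry-run` flag of `terraform state mv`, before any state is touched. Each destination configuration must also declare the moved resources, so that a follow-up `terraform plan` in every domain reports no changes.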

