Reasoned Position
Pull request cost estimation requires deterministic, offline architecture because the evaluation context - CI/CD pipelines during code review - imposes constraints that runtime cost analysis does not face. The tool must produce identical results for identical Terraform plan JSON inputs regardless of when or where it runs, cannot depend on network access to cloud pricing APIs, and must operate in environments that lack cloud credentials. These architectural constraints force trade-offs: directional cost feedback instead of billing precision, JSON plan parsing instead of HCL evaluation, and acceptance that some cost components remain invisible until runtime. This is not a limitation to overcome but a design boundary that defines what class of tool can exist in this execution context.
Infrastructure decisions get made in pull requests. Cost feedback arrives in billing dashboards. This timing mismatch creates irreversible mistakes - instance families chosen, regions selected, NAT gateways deployed - all before cost implications surface. The gap between decision and feedback is not a tooling problem to solve with better dashboards. It represents a fundamental architectural constraint that demands different design choices.
The Problem: Cost Feedback Arrives Too Late
Infrastructure-as-code workflows compress decision timelines. Engineers select instance types, configure storage classes, choose regions, and deploy load balancers through pull requests merged in minutes. Cost consequences manifest days or weeks later when billing cycles close or budget alerts trigger.
Consider the decision sequence for adding a NAT gateway. The engineer evaluates connectivity requirements, reviews Terraform documentation, selects a configuration, submits a PR, receives approval, and merges. Total elapsed time: 2-4 hours. Cost feedback arrives when the next AWS bill closes - 45-60 days after the decision became irreversible.
This is not about dollar amounts. It’s about irreversibility windows. Instance families lock in CPU architectures. Disk classes determine IOPS characteristics. Regional deployments establish data gravity. These decisions compound - each subsequent choice builds on previous infrastructure selections, creating dependency graphs that resist reversal.
Cost is not loud. It is final. By the time financial signals appear in observability platforms, the infrastructure decisions that generated those costs have been deployed, integrated, and load-tested. Rollback means rewriting dependent services, migrating data, or accepting the cost permanently.
Why Existing Tools Miss This Moment
Billing dashboards solve historical analysis. They aggregate spending patterns, identify cost trends, and enable attribution across teams and projects. This works for optimization - finding waste after it accumulates, negotiating reserved instances, right-sizing over-provisioned resources.
Budget alerts operate on trailing indicators. They trigger when spending crosses thresholds, providing reactive feedback after costs have materialized. Useful for containment, ineffective for prevention.
Live pricing APIs offer real-time cost data. They query cloud provider APIs during deployment, calculate projected costs, and present estimates to engineers. This approach introduces external dependencies - network connectivity, authentication credentials, API availability, and latency. Each dependency breaks determinism.
Runtime agents model actual behavior. They observe traffic patterns, measure resource utilization, and project costs based on usage. This captures autoscaling dynamics, traffic spikes, and workload characteristics that static analysis cannot see. But runtime agents require deployed infrastructure. They provide feedback only after the irreversible decisions have already been made.
These tools solve legitimate problems - observability, optimization, attribution. They operate in the correct phase of the infrastructure lifecycle for their design goals. They do not address the code review decision point where infrastructure choices become permanent.
The Design Constraint: Determinism Over Completeness
Pull request workflows demand deterministic validation. Same input produces same output. Every time. Without network calls, without external dependencies, without authentication, without variance.
This constraint is not negotiable. Code review gates require consistent evaluation criteria. A pull request approved today using one cost estimate cannot fail tomorrow using different estimates from API price changes. Engineers cannot debug CI failures caused by transient network issues in pricing services. Platform teams cannot maintain infrastructure pipelines dependent on third-party API uptime.
Determinism requires embedded data. Pricing information gets captured as snapshot data, versioned alongside code, updated deliberately rather than dynamically. When cloud providers change pricing, the change propagates through explicit snapshot updates with review and validation, not silently through API responses.
This creates a fundamental trade-off: determinism versus completeness. Live APIs provide comprehensive, current pricing across all services and regions. Embedded snapshots provide stable, reproducible estimates for a bounded set of resources. The choice determines system architecture.
For pull request cost analysis, determinism wins. The alternative - API-dependent estimates - creates three failure modes that break CI workflows:
Price Drift: APIs change prices mid-pipeline. A PR approved with one estimate fails re-validation with different estimates from price updates.
Outage Cascade: Third-party API downtime blocks all infrastructure PRs. Entire deployment pipelines stall waiting for pricing services to recover.
Authentication Sprawl: Every CI environment needs provider credentials. Credential rotation, permission scoping, and secret management become deployment dependencies.
Embedded snapshots accept staleness in exchange for reliability. Prices lag real-time by days or weeks. This is deliberate - the lag represents the update cycle where pricing changes get reviewed, validated, and propagated. During that window, all environments see identical estimates.
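As a concrete sketch, an embedded snapshot can be nothing more than a versioned data file checked into the repository. The schema and rates below are illustrative assumptions, not any specific tool's format:

# pricing_snapshot.py - hypothetical embedded snapshot, versioned with the code.
# Updated through reviewed commits, never fetched at estimation time.
PRICING_SNAPSHOT = {
    "snapshot_version": "2024-03-01",  # changes only via deliberate updates
    "currency": "USD",
    "hourly_rates": {
        "aws_instance": {
            "t3.xlarge": {"us-east-1": 0.1664},
            "m5.2xlarge": {"us-east-1": 0.384},
        },
        "aws_db_instance": {
            "db.r5.xlarge": {"us-east-1": 0.50},
        },
    },
}

When a provider changes a price, the update shows up as a diff in review like any other code change.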
The implementation details matter. Terraform plan JSON provides normalized infrastructure representation. The plan format captures resolved resource configurations after module expansion, variable interpolation, and conditional evaluation. This normalized representation eliminates HCL parsing ambiguity - count parameters, for_each loops, and dynamic blocks all resolve to concrete resource instances in the plan.
No network access. No IAM credentials. No telemetry. The cost estimation process operates entirely on snapshot data and plan JSON. This makes it embeddable in CI systems, reproducible across environments, and independent of external service availability.
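A minimal sketch of that flow, assuming the hypothetical snapshot structure above and a single known region - a real tool would need per-resource region resolution and a richer SKU lookup:

import json

HOURS_PER_MONTH = 730  # common approximation: 24 hours x ~30.4 days

def estimate_monthly_baseline(plan_path: str, snapshot: dict) -> float:
    """Sum baseline monthly cost for resources a plan will create.

    A pure function of two local files: no network, no credentials.
    """
    with open(plan_path) as f:
        plan = json.load(f)

    total = 0.0
    for change in plan.get("resource_changes", []):
        if "create" not in change["change"]["actions"]:
            continue
        after = change["change"]["after"] or {}
        sku = after.get("instance_type") or after.get("instance_class")
        rate = (snapshot["hourly_rates"]
                .get(change["type"], {})
                .get(sku, {})
                .get("us-east-1"))  # region assumed fixed for this sketch
        if rate is not None:
            total += rate * HOURS_PER_MONTH
    return total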
What Offline Cost Estimation Can and Cannot See
Static analysis has clear boundaries. Plan JSON captures infrastructure intent - the resources Terraform will create, modify, or destroy. This represents baseline infrastructure cost: instance types selected, disk sizes configured, load balancer deployments, NAT gateway counts.
Static analysis cannot model runtime behavior. Autoscaling policies define capacity ranges, not actual utilization. Traffic patterns determine data transfer costs, not subnet configurations. Usage-based services like Lambda or API Gateway bill on invocations, not deployment declarations.
The distinction matters for setting appropriate expectations. If a cost depends on runtime behavior, it does not belong in PR-time validation. Attempting to estimate autoscaling costs from Terraform configuration produces false precision - ranges like “between $50 and $5,000 monthly” provide no decision value. Better to acknowledge the uncertainty boundary explicitly.
Consider three infrastructure changes:
Change A: Adding a t3.xlarge instance. Static analysis sees instance type and pricing tier. Estimate: $121/month baseline compute cost. This is deterministic - the instance exists or it doesn’t.
Change B: Enabling autoscaling from 2-10 instances. Static analysis sees capacity bounds, not utilization. Runtime behavior determines actual cost anywhere in that range. Static analysis should surface the scaling configuration without pretending to predict outcomes.
Change C: Deploying a Lambda function. Plan JSON shows memory allocation and timeout configuration, not invocation frequency. Cost depends entirely on runtime traffic. Static analysis provides no useful signal.
Offline estimation provides value for Change A, surfaces configuration for Change B without false estimates, and correctly identifies Change C as runtime-dependent. This aligns feedback with what engineers can reason about during code review.
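One way to encode that boundary is an explicit classification per resource type, so the tool only emits dollar figures where they are deterministic. The mapping below is a hypothetical illustration, not an exhaustive table:

from enum import Enum

class CostClass(Enum):
    STATIC = "static"    # Change A: price follows directly from configuration
    BOUNDED = "bounded"  # Change B: capacity range known, utilization unknown
    RUNTIME = "runtime"  # Change C: cost emerges entirely from traffic

COST_CLASS_BY_TYPE = {
    "aws_instance": CostClass.STATIC,            # e.g. t3.xlarge: ~$0.1664/hr x 730 hr ≈ $121/mo
    "aws_db_instance": CostClass.STATIC,
    "aws_autoscaling_group": CostClass.BOUNDED,  # surface the 2-10 range, estimate nothing
    "aws_lambda_function": CostClass.RUNTIME,    # no useful PR-time signal
}

def classify(resource_type: str) -> CostClass:
    # Unknown types default to runtime-dependent: silence beats a false estimate.
    return COST_CLASS_BY_TYPE.get(resource_type, CostClass.RUNTIME)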
The subset of infrastructure amenable to static cost analysis represents 40-60% of typical cloud spending - compute instances, managed databases, persistent storage, networking infrastructure. The remainder depends on usage patterns, traffic volume, and runtime behavior that emerge after deployment.
This is not a limitation to overcome. It represents the correct boundary between design-time validation and runtime observation.
Why Directional Accuracy Is the Right Trade-Off
Cost estimates at code review time do not require cent-level precision. Engineers evaluating infrastructure changes need to understand magnitude and direction, not exact dollar amounts.
A PR adding three m5.2xlarge instances provides a clear signal: it increases baseline compute spend by approximately $800 monthly. Whether the actual cost lands at $784 or $812, depending on partial-hour billing and regional price differences, is irrelevant to the review decision. The order of magnitude - hundreds monthly, not tens or thousands - shapes the conversation.
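The arithmetic is deliberately coarse: assuming an on-demand rate near $0.37/hour (exact regional rates vary), 3 × $0.37/hour × 730 hours ≈ $810 per month - squarely in the hundreds regardless of which rate actually applies.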
Relative deltas matter more than absolute values. “This change increases projected monthly cost by 40%” provides actionable signal. “This change costs $1,247.39 monthly” provides false precision that will never match actual billing.
False precision is dangerous. It suggests confidence the estimation process cannot deliver. An engineer seeing “$1,247.39” expects that number to appear in billing. When actual costs vary by 15-20% due to runtime factors, the estimation appears broken. Better to communicate “$1,200-1,400 monthly baseline” and acknowledge the uncertainty.
The estimation process operates under multiple sources of variance:
Pricing Staleness: Embedded snapshots lag provider pricing by days to weeks. Cloud providers change pricing unpredictably - AWS modifies hundreds of prices monthly.
Regional Differences: Resources deployed across regions experience different pricing, but plan JSON may not specify regions until runtime.
Commitment Discounts: Reserved instances, savings plans, and volume commitments affect realized costs but not list pricing.
Usage Patterns: Partial-hour billing, storage operations, and data transfer all depend on runtime behavior.
Given these variance sources, pursuing cent-level accuracy is wasted effort. The goal is directional guidance: Is this change small (< $100 monthly), medium ($100-1,000), or large (> $1,000)? Does it increase or decrease spending? By what approximate magnitude?
Engineers can reason about these questions during code review. They cannot reason about exact billing line items before deployment.
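A sketch of that directional framing - the band cutoffs mirror the ones above, and the ±15% spread is an illustrative choice, not a calibrated value:

def directional_summary(old_monthly: float, new_monthly: float) -> str:
    """Report a magnitude band and relative delta, never a point estimate."""
    delta = new_monthly - old_monthly
    band = ("small" if abs(delta) < 100
            else "medium" if abs(delta) < 1000
            else "large")
    pct = f"{delta / old_monthly:+.0%}" if old_monthly else "n/a"
    # Communicate a range (illustrative +/-15%), not cent-level precision.
    low, high = new_monthly * 0.85, new_monthly * 1.15
    return f"{band} change, {pct}: ~${low:,.0f}-${high:,.0f}/month baseline"

For a project going from $3,000 to $4,200 monthly, directional_summary(3000, 4200) yields "large change, +40%: ~$3,570-$4,830/month baseline" - the same 40% signal described above, with the uncertainty stated instead of hidden.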
Why JSON-Only (and Not HCL Parsing)
Terraform HCL provides human-readable infrastructure definitions. Engineers write modules, compose resources, and leverage language features like count, for_each, and dynamic blocks. This expressiveness creates parsing ambiguity.
Consider a module that provisions databases:
module "databases" {
  source         = "./modules/db"
  count          = var.environment == "production" ? 3 : 1
  instance_class = lookup(var.instance_map, var.region, "db.t3.medium")
}
What does this cost? The answer depends on variable evaluation, conditional resolution, and map lookups. Parsing the HCL file directly provides no deterministic answer - the actual resource count and instance class emerge from variable inputs at plan time.
Terraform plan JSON resolves this ambiguity. After variable interpolation, conditional evaluation, and module expansion, the plan contains concrete resource specifications:
{
  "resource_changes": [
    {
      "type": "aws_db_instance",
      "change": {
        "after": {
          "instance_class": "db.r5.xlarge",
          ...
        }
      }
    },
    ...
  ]
}
This normalized representation eliminates interpretation. Each resource appears with resolved configuration. No variables, no conditionals, no ambiguity. Cost estimation operates on concrete resource specifications, not infrastructure intent.
The alternative - parsing HCL directly - requires reimplementing Terraform’s evaluation logic. Variable resolution, function evaluation, module composition, and dynamic block expansion all become parsing responsibilities. This introduces maintenance burden, version compatibility issues, and subtle behavioral differences from actual Terraform execution.
More critically, HCL parsing breaks determinism. Terraform’s evaluation logic evolves across versions. Language features get added, function behavior changes, module resolution updates. Maintaining parsing compatibility across Terraform versions becomes a tracking problem that grows with language complexity.
Plan JSON isolates cost estimation from Terraform internals. As long as the plan JSON schema remains stable (which it has across Terraform 0.13-1.x), cost estimation continues working regardless of HCL language changes.
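That stability keeps the consuming code trivial. A sketch of the extraction step against the excerpt above - a walk over already-resolved structures, with none of Terraform's evaluation logic reimplemented:

def resolved_instance_classes(plan: dict) -> list[str]:
    """Collect concrete instance classes from parsed plan JSON."""
    classes = []
    for change in plan.get("resource_changes", []):
        after = change.get("change", {}).get("after") or {}
        if change.get("type") == "aws_db_instance" and "instance_class" in after:
            # count, for_each, and conditionals were expanded at plan time,
            # so each planned instance appears here as its own entry.
            classes.append(after["instance_class"])
    return classes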
This creates an open question: Is there a safe middle ground where limited HCL parsing provides value without compromising determinism? Or is JSON-only the correct long-term boundary? The answer likely depends on specific use cases and acceptable trade-off frontiers.
Where This Fits (and Where It Doesn’t)
PR-time cost estimation occupies a specific niche in infrastructure tooling. It does not replace:
FinOps Platforms: These provide comprehensive cost analytics, showback/chargeback, optimization recommendations, and executive reporting. They operate on realized costs with full billing data.
Billing Analytics: Cost allocation, trend analysis, and budget management require historical spending data and organizational hierarchy. PR-time tools cannot access this information.
Runtime Optimization: Right-sizing recommendations, idle resource detection, and utilization analysis all require observing actual workload behavior over time.
PR-time cost estimation enables different outcomes:
Review Conversation: Cost becomes visible during architectural discussion, not after deployment. Engineers can evaluate trade-offs between instance types, storage classes, and regional deployments with cost implications surfaced.
Mistake Prevention: Catching accidental resource duplication, misconfigured instance counts, or unintended regional deployments before they deploy prevents the entire cycle of deploy → discover → rollback.
Cultural Shift: Making cost visible during code review normalizes cost conversations. Engineers develop intuition about infrastructure pricing through repeated exposure during their daily workflow.
The value is not cost savings - that’s downstream benefit. The value is decision timing. Moving cost feedback from billing cycles to code review changes when engineers have optionality. Before merge, everything is reversible. After deployment, reversal requires migration effort.
Cost as a Code Review Concern
Code review evaluates correctness, maintainability, security, and performance. Cost deserves equivalent visibility. An architectural decision that doubles infrastructure spend should trigger the same review rigor as a performance regression or security vulnerability.
The challenge is not technical - cost estimation tools exist. The challenge is cultural. Many engineering organizations treat cost as an operations concern, not an engineering concern. Someone else reads billing dashboards, negotiates reserved instances, and optimizes spending. Engineers focus on feature velocity.
This separation works when infrastructure costs remain small relative to revenue. It breaks when infrastructure becomes a primary cost driver. Treating cost as someone else’s problem during design guarantees misalignment between architectural decisions and financial constraints.
Making cost visible during code review does not mean every PR needs financial approval. It means engineers see cost implications of their infrastructure choices at the moment those choices are negotiable. A comment showing “$1,200 monthly increase” provides context. The engineer might realize a smaller instance type suffices, or the increase might be justified for performance requirements. Either way, the decision happens with cost visibility, not cost ignorance.
Treating cost as code is less about saving money and more about reducing regret. Infrastructure decisions compound over time. Each choice constrains future options. The gap between decision and feedback determines how many compounding cycles occur before correction becomes possible. Shortening that gap - bringing cost feedback into code review - reduces the accumulation of irreversible mistakes.