Reading time: 7 min
Tags: Legacy Modernization, Software Engineering, Architecture, Risk Management, Testing

The Strangler Fig Pattern: Modernize Legacy Software in Safe, Small Steps

A practical guide to modernizing legacy software with the Strangler Fig pattern: replace parts incrementally using routing seams, a data plan, and quality gates to reduce risk.

Modernizing a legacy system is rarely blocked by a lack of ideas. It is blocked by risk: breaking critical flows, stalling delivery, or discovering too late that the old system held messy business rules no one documented.

The Strangler Fig pattern is a way to modernize without betting the company on a single rewrite. Instead of replacing everything at once, you replace one slice at a time, while the legacy system continues to run.

This post explains how to apply the pattern in a practical, team-friendly way: where to start, how to route safely, how to handle data, and what quality gates keep incremental cutovers from becoming incremental outages.

What the Strangler Fig pattern is (and why it works)

The pattern is named after the strangler fig, a plant that germinates on a host tree, grows around it, and eventually takes its place. In software terms, you build new functionality around the edges of an existing system, then gradually route more traffic to the new parts until the legacy system can be removed.

What makes it work is not the architecture diagram. It is the ability to change one bounded behavior at a time while keeping a stable contract for users and other systems.

In practice, this means you introduce a controlled “decision point” in front of a legacy capability (often an endpoint, a UI route, a job, or a message consumer). That decision point sends requests to either the legacy implementation or the new implementation, based on explicit rules.

Key Takeaways
  • Start with a seam that has clear inputs and outputs, not the most painful code.
  • Make routing a first-class feature: measurable, reversible, and testable.
  • Choose a data strategy per slice (read-only, dual-write, sync) and document it.
  • Use quality gates (parity checks, monitoring, rollback) before increasing traffic.

Find the right seam to start

A “seam” is where you can intercept requests or responsibilities cleanly. Good seams reduce the number of hidden dependencies you must replicate to ship value.

Look for seams with these traits:

  • Stable contract: The inputs and outputs are already understood (even if imperfect) and can be captured in tests.
  • High leverage: Replacing it improves delivery speed, reliability, or cost in a noticeable way.
  • Limited blast radius: A failure is containable with a rollback to the legacy path.
  • Observable behavior: You can measure success and failure (latency, errors, business outcomes).

A common trap is choosing the gnarliest module because it feels satisfying. Early slices should build confidence and infrastructure: routing, metrics, deployment patterns, and a shared definition of “parity.”

Add a thin routing layer you can trust

The routing layer is the “switch” that decides where work goes. It can live at different levels depending on your system: an API gateway, a reverse proxy, a web controller, a queue router, or even a UI navigation rule.

Whatever form it takes, treat it as production-critical. The routing layer should be boring and heavily tested, because every future incremental replacement depends on it.

What good routing rules look like

Routing rules should be explicit and explainable. If a rule cannot be described in one sentence, it is likely too complex to operate safely.

  • By endpoint or path: Route /v2/invoices to the new service while keeping legacy on /invoices.
  • By tenant or customer segment: Route internal users first, then a small set of low-risk customers.
  • By percentage: Start with 1 percent of traffic, then increase after verification.
  • By capability flag: Route only requests that use a specific feature (for example, “subscription discounts”).
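
The four rule types above can be sketched in a few lines. This is a minimal illustration, not a prescribed design: the names Request, RoutingRules, bucket, and choose_target are hypothetical, and hashing the tenant name into a stable bucket is one assumed way to make percentage rollouts repeatable for the same caller.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    path: str
    tenant: str
    features: frozenset = frozenset()


@dataclass(frozen=True)
class RoutingRules:
    new_paths: frozenset = frozenset()     # by endpoint or path
    new_tenants: frozenset = frozenset()   # by tenant or customer segment
    new_features: frozenset = frozenset()  # by capability flag
    percent_new: int = 0                   # by percentage, 0..100


def bucket(key: str) -> int:
    """Map a key to a stable bucket in 0..99 so a caller stays on one path."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_target(req: Request, rules: RoutingRules) -> str:
    """Return "new" or "legacy"; every branch can be described in one sentence."""
    if req.path in rules.new_paths:
        return "new"
    if req.tenant in rules.new_tenants:
        return "new"
    if req.features & rules.new_features:
        return "new"
    if bucket(req.tenant) < rules.percent_new:
        return "new"
    return "legacy"
```

With percent_new left at 0, only the explicit path, tenant, and feature rules send traffic to the new implementation, which keeps the default safe.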

Crucially, the routing layer needs a fast rollback. Rollback should be a configuration change, not a redeploy, and it should be rehearsed.
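
One way to make rollback a configuration change is to keep the traffic share in a small file that the router re-reads. The sketch below assumes a JSON file with a single hypothetical key, percent_new; the function names are illustrative, and a real system might use a feature-flag service or a config store instead.

```python
import json
import os
import tempfile
from pathlib import Path


def load_percent_new(config_path: Path) -> int:
    """Read the share of traffic for the new path; fail safe to 0 (all legacy)."""
    try:
        raw = json.loads(config_path.read_text(encoding="utf-8"))
        percent = int(raw["percent_new"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing, unreadable, or malformed config means "use legacy".
        return 0
    return max(0, min(100, percent))


def set_percent_new(config_path: Path, percent: int) -> None:
    """Write the setting via a temp file and rename so readers never see half a file."""
    fd, tmp_name = tempfile.mkstemp(dir=config_path.parent)
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        json.dump({"percent_new": percent}, tmp)
    os.replace(tmp_name, config_path)


def rollback(config_path: Path) -> None:
    """Send all traffic back to the legacy path without a redeploy."""
    set_percent_new(config_path, 0)
```

Failing safe to 0 is a deliberate choice here: a broken or missing config should never silently send traffic to the newer, less proven path.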

The flow often looks like this:

Client request
  -> Router (rules + telemetry + rollback)
      -> Legacy implementation (baseline)
      -> New implementation (incremental replacement)
  -> Response (with parity checks in early stages)
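
The flow above can be sketched as a small in-process router that counts what it does. The Router class, its metric names, and the fallback behavior are assumptions made for illustration; in particular, falling back to legacy after a failure is only reasonable when the operation is safe to repeat.

```python
from collections import Counter
from typing import Callable, Dict


class Router:
    """Sends each request to one implementation and counts what happened."""

    def __init__(
        self,
        legacy: Callable[[Dict], Dict],
        new: Callable[[Dict], Dict],
        use_new: Callable[[Dict], bool],
    ) -> None:
        self.legacy = legacy
        self.new = new
        self.use_new = use_new
        self.metrics = Counter()

    def handle(self, request: Dict) -> Dict:
        if not self.use_new(request):
            self.metrics["legacy.requests"] += 1
            return self.legacy(request)
        self.metrics["new.requests"] += 1
        try:
            return self.new(request)
        except Exception:
            # Assumption: the operation is safe to repeat on the legacy path
            # (for example, a pure calculation). Do not fall back like this
            # for writes that may already have been partially applied.
            self.metrics["new.errors"] += 1
            self.metrics["legacy.fallbacks"] += 1
            return self.legacy(request)
```

The counters answer the two questions that matter during a rollout: how much traffic is on the new path, and how often it fails.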

Plan the data story early

Data is where “incremental” can quietly turn into “impossible.” If your new component cannot read the right data reliably, or cannot write without creating inconsistencies, cutover stalls.

Before you build, choose a data approach for the specific slice you are replacing. You do not need one grand strategy for the whole company on day one, but you do need a clear plan per slice.

Three practical data strategies

  • Read-only first: The new component reads from legacy data sources but does not write. This is the safest start for reporting, search, and validation services.
  • Dual-write with reconciliation: During migration, writes go to both systems, and you run reconciliation checks to detect drift. This requires a plan for idempotency and conflict handling.
  • Sync then cutover: You backfill data into the new store, run both in parallel for a while, then flip writes to the new system. You keep the legacy store as a fallback until confidence is high.
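
For the dual-write strategy, the two supporting pieces mentioned above are idempotent writes and a reconciliation check. The following is a minimal sketch that uses in-memory dicts as stand-ins for the two stores; apply_write and reconcile are hypothetical names, and a real reconciliation job would page through records rather than load them all.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, object]


def apply_write(
    store: Dict[str, Record],
    applied_keys: Set[str],
    idempotency_key: str,
    record_id: str,
    record: Record,
) -> bool:
    """Apply a write once; a retry with the same key is a no-op."""
    if idempotency_key in applied_keys:
        return False
    store[record_id] = dict(record)
    applied_keys.add(idempotency_key)
    return True


def reconcile(
    legacy: Dict[str, Record], new: Dict[str, Record]
) -> List[Tuple[str, str]]:
    """Return (record_id, problem) pairs describing drift between the stores."""
    drift = []
    for record_id in sorted(legacy.keys() | new.keys()):
        if record_id not in new:
            drift.append((record_id, "missing in new"))
        elif record_id not in legacy:
            drift.append((record_id, "missing in legacy"))
        elif legacy[record_id] != new[record_id]:
            drift.append((record_id, "fields differ"))
    return drift
```

An empty result from reconcile is the signal that the two stores agree for the records checked; anything else is a queue of items to triage before moving forward.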

Whatever strategy you choose, document the “source of truth” for each field during each phase. Teams often fail here by assuming source-of-truth is obvious. It is not obvious once two systems can accept writes.
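
One lightweight way to document this is a table that lives next to the code and can be checked automatically. The phases and field names below are purely illustrative assumptions, as are the helper names owner and undocumented_fields.

```python
from typing import Dict, Iterable, List

# Hypothetical example: which system owns each field in each migration phase.
SOURCE_OF_TRUTH: Dict[str, Dict[str, str]] = {
    "read_only": {"price": "legacy", "promotion": "legacy"},
    "dual_write": {"price": "legacy", "promotion": "new"},
    "cutover": {"price": "new", "promotion": "new"},
}


def owner(phase: str, field_name: str) -> str:
    """Return the owning system, or raise KeyError if it was never decided."""
    return SOURCE_OF_TRUTH[phase][field_name]


def undocumented_fields(fields: Iterable[str], phase: str) -> List[str]:
    """List fields that have no documented owner in the given phase."""
    documented = SOURCE_OF_TRUTH.get(phase, {})
    return [name for name in fields if name not in documented]
```

Running undocumented_fields against the real schema in a test is a cheap way to catch a field that nobody assigned an owner to before two systems start accepting writes.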

Quality gates: tests, monitoring, and rollback

Incremental replacement works when you can prove the new component behaves acceptably before you send it more traffic. That proof comes from a small set of repeatable gates.

Here is a copyable checklist you can use for each slice:

  1. Contract captured: Inputs, outputs, and error conditions are written down and covered by automated tests.
  2. Baseline metrics: You know current latency, error rate, and key business metrics for the legacy path.
  3. Shadow or parity testing: The new path can process real requests (without affecting users) and results are compared.
  4. Observability: Dashboards and alerts exist for the router and the new component (not just the legacy system).
  5. Rollback rehearsal: The team has practiced flipping traffic back and knows what “recovered” looks like.
  6. Operational runbook: A short “if X then Y” playbook exists for on-call or incident response.
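
Parts of this checklist can be turned into an automated gate that runs before each traffic increase. The sketch below is one possible shape; PathStats, blockers, and the thresholds (a minimum of 1000 requests, p95 latency within 1.2 times the baseline) are assumptions to be replaced with values your team agrees on.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PathStats:
    requests: int
    errors: int
    p95_latency_ms: float

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def blockers(
    baseline: PathStats,
    candidate: PathStats,
    parity_mismatches: int,
    min_requests: int = 1000,
    max_latency_ratio: float = 1.2,
) -> List[str]:
    """Return reasons not to increase traffic; an empty list means go ahead."""
    reasons = []
    if candidate.requests < min_requests:
        reasons.append("not enough traffic on the new path to judge")
    if candidate.error_rate > baseline.error_rate:
        reasons.append("error rate is above the legacy baseline")
    if candidate.p95_latency_ms > baseline.p95_latency_ms * max_latency_ratio:
        reasons.append("p95 latency exceeds the allowed ratio")
    if parity_mismatches > 0:
        reasons.append("unresolved parity mismatches")
    return reasons
```

Returning reasons rather than a bare boolean keeps the decision explainable, which helps when the gate says no.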

Notice what is not on the list: “finish the perfect architecture.” The goal is safe replacement, not a ceremony.

A concrete example: modernizing checkout

Imagine a small e-commerce platform with a legacy monolith. Checkout is fragile and hard to change. The team wants to modernize, but a full rewrite would take too long and risks breaking revenue-critical flows.

They pick a seam: the “pricing and promotions” calculation, which is currently embedded in checkout. It has clear inputs (cart, customer, shipping) and a deterministic output (line items, discounts, totals). It is also a frequent source of bugs.

They introduce a routing layer inside the monolith at the point where totals are computed. For most users, it calls the legacy method. For employees and a small pilot customer group, it calls a new pricing service.

For the data plan, they start read-only: the new service reads product and promotion data from the same database as the monolith through a narrow, read-only interface, which avoids dual writes early on.

They also add parity checks: on a sampled subset of requests, they compute totals both ways and compare the results, logging any differences with enough context to debug.
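
A sampled parity check of this kind might look like the sketch below. It assumes the legacy result is the one returned to the user while the new result is only compared; price_with_parity, the 5 percent sample rate, and the shape of the totals are hypothetical choices for illustration.

```python
import logging
import random
from decimal import Decimal
from typing import Callable, Dict

logger = logging.getLogger("pricing.parity")
Totals = Dict[str, Decimal]


def log_mismatch(cart: Dict, legacy: Totals, new: Totals) -> None:
    """Default handler: log both results with the cart that produced them."""
    logger.warning("pricing mismatch cart=%r legacy=%r new=%r", cart, legacy, new)


def price_with_parity(
    cart: Dict,
    legacy_price: Callable[[Dict], Totals],
    new_price: Callable[[Dict], Totals],
    sample_rate: float = 0.05,
    rng: Callable[[], float] = random.random,
    on_mismatch: Callable[[Dict, Totals, Totals], None] = log_mismatch,
) -> Totals:
    """Serve the legacy totals; on a sample, also run the new path and compare."""
    served = legacy_price(cart)
    if rng() < sample_rate:
        try:
            candidate = new_price(cart)
        except Exception:
            # A failure in the shadow path must never affect the user.
            logger.exception("new pricing failed cart=%r", cart)
            return served
        if candidate != served:
            on_mismatch(cart, served, candidate)
    return served
```

Passing rng and on_mismatch as parameters keeps the function easy to test: a test can force sampling on and collect mismatches in a list.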

After parity stabilizes, they expand traffic gradually. Only then do they tackle the next slice: payment authorization, which has a different data and risk profile. Over time, checkout becomes a composition of replaceable components, and the monolith shrinks instead of being “replaced all at once.”

Common mistakes to avoid

  • No clear definition of parity: Teams say “it works” without specifying which behaviors must match and which can change (rounding, tax edge cases, error messages).
  • Routing without observability: If you cannot tell how much traffic is on the new path, or how it is performing, you are migrating blind.
  • Dual-write too early: Writing to two systems before you have idempotency and reconciliation increases complexity fast.
  • Moving shared logic accidentally: The new component may depend on “small helper functions” that actually contain business rules. Discover them deliberately and test them.
  • Never deleting: The pattern only pays off if you remove legacy paths. Put decommissioning tasks on the roadmap as part of “done.”
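
To make the first mistake harder to commit, a definition of parity can be written down as code. This is a sketch under stated assumptions: the field names in MUST_MATCH are hypothetical, and the choice to compare money at cent precision while ignoring everything else (such as error messages) is an example policy, not a recommendation for every system.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict

# Assumption: only these fields must match; all others may differ.
MUST_MATCH = ("subtotal", "discount", "total")
CENT = Decimal("0.01")


def normalize(result: Dict[str, object]) -> Dict[str, Decimal]:
    """Keep only the fields that must match, rounded to whole cents."""
    return {
        name: Decimal(str(result[name])).quantize(CENT, rounding=ROUND_HALF_UP)
        for name in MUST_MATCH
    }


def at_parity(legacy: Dict[str, object], new: Dict[str, object]) -> bool:
    """True when the two results agree on everything the team said must match."""
    return normalize(legacy) == normalize(new)
```

Whatever the policy is, keeping it in one reviewed place means "it works" has a definition the whole team can read and change deliberately.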

When NOT to use this approach

The Strangler Fig pattern is powerful, but it is not a universal fit. Consider alternatives if:

  • The system is tiny: If the legacy application is small and well-understood, a straightforward rewrite may be cheaper than maintaining two paths.
  • You cannot create a seam: Some systems have no stable contracts and no intercept points, especially if they are tightly coupled to a vendor or a black-box process.
  • Regulatory or safety constraints require a single validated implementation: Running parallel logic and routing may add unacceptable compliance overhead.
  • Your team cannot operate the routing layer: If there is no on-call coverage, no monitoring culture, or no ability to roll back quickly, the approach can increase risk.

If you do proceed, start with a small slice that builds operational confidence, not just code.

Conclusion

The Strangler Fig pattern is a strategy for turning modernization into a sequence of controlled decisions. You identify a seam, introduce routing, choose a data plan, prove parity with quality gates, and expand traffic only when the evidence supports it.

The end goal is not “new tech.” The end goal is a system you can change safely, one slice at a time, until the legacy parts are no longer needed.

FAQ

How small should the first slice be?

Small enough to ship in weeks, not quarters, and narrow enough that you can define parity clearly. A good first slice often improves infrastructure (routing, telemetry, deployment) while delivering a modest user-visible benefit.

Where should the router live?

Put it where you can make decisions with the most context and the safest rollback. For APIs, that might be the edge gateway. For internal behavior, it can be inside the application. The key is that it is observable, testable, and reversible.

Do we need to migrate the database first?

Not necessarily. Many teams start with read-only access or shared reads and postpone database migration until they have stabilized the new component. Plan for data early, but migrate only when a slice requires it.

What is the best way to validate parity?

Use a combination: automated contract tests for known cases, and parity comparisons on real traffic for unknown cases. Sampled comparisons are often enough early on, as long as you capture and triage mismatches systematically.

How do we avoid “two systems forever”?

Make decommissioning an explicit deliverable for each slice: remove legacy code paths, delete unused tables or jobs when safe, and update runbooks and ownership. If deletion is always postponed, the organization pays compounding operational cost.

This post was generated by software for the Artificially Intelligent Blog. It follows a standardized template for consistency.