EHR Platform

12TB database migration to AWS RDS without downtime

Eliminating $480K in annual cost and operational risk. The EHR platform relied on a large, self-managed MariaDB Enterprise deployment that supported critical healthcare workloads. As scale increased following the acquisition, the system became both a financial burden and an operational risk.


Enterprise reliability without enterprise waste

A US-based EHR platform serving thousands of private medical practices had built its infrastructure on a self-hosted MariaDB Enterprise cluster running on EC2. For years, it worked. But following an acquisition into a larger health tech group, the pressure to standardize, harden, and prepare the stack for audit became unavoidable. The database, 12TB of PHI-regulated clinical and billing data, was the most critical and most fragile component in the entire platform. It was also costing $40,000 per month in enterprise vendor support. The company needed a path from fragile dependency to a managed, auditable infrastructure asset. And it had to happen without downtime.

Quick facts

EHR Platform

Private medical practices in the USA

The cloud-based platform covers the full clinical workflow: scheduling, charting, e-prescribing, billing, insurance claims, and telehealth, all handling PHI data under HIPAA.

~$480,000

Support cost eliminated

By migrating from self-hosted MariaDB Enterprise on EC2 to Amazon RDS, we completely eliminated the $40,000 per month vendor support contract.

MariaDB → Amazon RDS

Live replication enabled a zero-downtime migration of 12TB of production data. Instead of a dump-and-restore approach, we synchronized Amazon RDS as a replica of the live production database and used HAProxy to route traffic during the staged cutover.

"ITsyndicate transformed our most critical infrastructure component into a predictable, managed system without disrupting a single day of operations."

Michael Torres

CTO

What we did for the EHR Platform

Designing a zero-risk migration strategy

A 12TB production database in a HIPAA-regulated environment cannot be migrated inside a maintenance window. The stakes for PHI data continuity, insurance claim processing, and real-time scheduling across thousands of practices made downtime commercially and legally unacceptable. The existing MariaDB Enterprise cluster on EC2 had no managed failover, no automated point-in-time recovery, and a replication topology that had grown organically over the years. Before we wrote a single line of Terraform, we needed to understand exactly what we were working with and where the failure modes were.

Infrastructure and replication audit: We mapped the full topology of the existing MariaDB Enterprise cluster, including replication lag behavior, failover procedures, backup schedule, and the specific EC2 instance configurations that could affect RDS parity. This audit surfaced several undocumented dependencies in the application's connection handling that would have caused silent failures if we had proceeded without addressing them. Based on these findings, we pushed back on the initial timeline, and the client agreed.
Phased migration design with HAProxy as the control layer: Rather than a direct cutover, we designed a staged approach. HAProxy was introduced as a routing intermediary between the application layer and the database, giving us precise control over traffic during the transition. We defined RPO/RTO targets up front, modeled the replication lag thresholds that would trigger a go/no-go decision, and documented the rollback procedure before beginning execution.
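The go/no-go decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used on this project: the threshold values, the `CutoverThresholds` class, and the `cutover_decision` function are all hypothetical, and the lag samples are assumed to come from periodic readings of the replica's reported lag in seconds, with None standing for a reading where replication was not running.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class CutoverThresholds:
    """Hypothetical go/no-go limits; real values follow from the agreed RPO/RTO."""
    max_lag_seconds: float = 5.0       # worst single sample allowed
    max_mean_lag_seconds: float = 1.0  # average over the observation window
    min_samples: int = 30              # refuse to decide on too little data


def cutover_decision(lag_samples, thresholds=CutoverThresholds()):
    """Return (decision, reason) for a window of replication lag samples.

    decision is "go" or "no-go". A sample of None means the replica reported
    no lag value, which always blocks the cutover.
    """
    if len(lag_samples) < thresholds.min_samples:
        return "no-go", "not enough samples"
    if any(sample is None for sample in lag_samples):
        return "no-go", "replication not running for part of the window"
    if max(lag_samples) > thresholds.max_lag_seconds:
        return "no-go", "lag spike above limit"
    if mean(lag_samples) > thresholds.max_mean_lag_seconds:
        return "no-go", "mean lag above limit"
    return "go", "lag within thresholds"


if __name__ == "__main__":
    steady_window = [0.2] * 30
    print(cutover_decision(steady_window))
    print(cutover_decision([0.2] * 29 + [9.0]))
```

Writing the thresholds down as data before execution begins keeps the cutover call mechanical: the team argues about the numbers in advance rather than during the switch.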

Strategic compliance and operational positioning

The migration was not just a cost and risk exercise; it was a step toward audit-grade infrastructure readiness. Following the acquisition, the platform faced increased scrutiny on disaster recovery posture, encryption practices, and access controls. Moving to Amazon RDS gave the engineering and compliance teams a foundation they could demonstrate directly to auditors.

HIPAA-aligned controls built into RDS configuration: We configured RDS with encryption at rest using AWS KMS, TLS-enforced connections, IAM database authentication, and CloudTrail-integrated audit logging. These controls replaced a patchwork of manual configurations on the self-hosted cluster that were difficult to evidence in an audit context. Our team proactively recommended enabling Performance Insights and automated minor version upgrades; both were initially out of scope and were added with negligible extra effort.
Disaster recovery documentation and runbook delivery: We delivered a full DR runbook covering point-in-time recovery procedures, Multi-AZ failover testing steps, and RPO/RTO baselines. The internal team had previously operated without formal DR documentation for the database layer. This was one of the first things flagged in the pre-acquisition infrastructure review, and we treated it as a deliverable with equal weight to the migration itself.
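One way to keep controls like these evidenceable is to check them as data. The sketch below is an assumption-laden illustration rather than the client's configuration: it inspects a dictionary shaped like one entry of the boto3 `describe_db_instances` response, the expected values and the 7-day retention baseline are hypothetical, and the instance name and key ARN are placeholders. TLS enforcement is not covered here because it is set through the database parameter group rather than on the instance description.

```python
# Field names follow the boto3 `describe_db_instances` response.
# The expected values are illustrative assumptions, not the client's settings.
REQUIRED_CONTROLS = {
    "StorageEncrypted": True,                  # encryption at rest
    "MultiAZ": True,                           # managed failover
    "IAMDatabaseAuthenticationEnabled": True,  # IAM database authentication
    "PerformanceInsightsEnabled": True,
    "AutoMinorVersionUpgrade": True,
}

MIN_BACKUP_RETENTION_DAYS = 7  # hypothetical baseline for point-in-time recovery


def find_control_gaps(db_instance):
    """Return a sorted list of human-readable gaps for one instance description."""
    gaps = []
    for field, expected in REQUIRED_CONTROLS.items():
        if db_instance.get(field) != expected:
            gaps.append(f"{field} should be {expected}")
    if not db_instance.get("KmsKeyId"):
        gaps.append("KmsKeyId is missing")
    if db_instance.get("BackupRetentionPeriod", 0) < MIN_BACKUP_RETENTION_DAYS:
        gaps.append(f"BackupRetentionPeriod below {MIN_BACKUP_RETENTION_DAYS} days")
    return sorted(gaps)


# Placeholder description with one deliberate gap (IAM authentication disabled).
example_instance = {
    "DBInstanceIdentifier": "ehr-primary",  # hypothetical name
    "StorageEncrypted": True,
    "KmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/example",
    "MultiAZ": True,
    "IAMDatabaseAuthenticationEnabled": False,
    "PerformanceInsightsEnabled": True,
    "AutoMinorVersionUpgrade": True,
    "BackupRetentionPeriod": 14,
}

if __name__ == "__main__":
    for gap in find_control_gaps(example_instance):
        print(gap)
```

In practice the dictionary would come from the AWS API or an AWS Config export; keeping the check itself free of network calls makes it easy to run against saved snapshots during an audit.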

Enterprise database migration: FAQ

How do you migrate a 12TB production database without downtime?

Use live replication, not dump-and-restore.

At 12TB, a traditional dump-and-restore approach would have required a multi-hour maintenance window, unacceptable for a healthcare platform with real-time clinical workflows. Instead, we connected Amazon RDS as a replica to the live production master, allowed it to fully synchronize over several days, validated row-level consistency, and performed the final traffic switch during a low-traffic window. The cutover itself took minutes. HAProxy gave us precise control over traffic routing so we could execute and, if needed, roll back cleanly.
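To make "validated row-level consistency" concrete, here is a minimal sketch of one common approach: hash rows in primary-key order, chunk by chunk, on both sides and compare the digests. It is not the validation procedure used in this migration. The function names are hypothetical, and two in-memory SQLite databases stand in for the source and the RDS replica so the example runs on its own.

```python
import hashlib
import sqlite3
from itertools import zip_longest


def chunk_checksums(conn, table, key, chunk_size):
    """Return [(first_key, last_key, sha256_hex), ...] for rows ordered by key.

    `table` and `key` are interpolated into SQL, so they must come from
    trusted code, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    key_index = [column[0] for column in cursor.description].index(key)
    checksums = []
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        digest = hashlib.sha256()
        for row in rows:
            digest.update(repr(row).encode("utf-8"))
        checksums.append((rows[0][key_index], rows[-1][key_index], digest.hexdigest()))
    return checksums


def mismatched_chunks(source, target, table, key, chunk_size=1000):
    """Return the (first_key, last_key) ranges whose checksums differ."""
    source_sums = chunk_checksums(source, table, key, chunk_size)
    target_sums = chunk_checksums(target, table, key, chunk_size)
    bad_ranges = []
    for src, dst in zip_longest(source_sums, target_sums):
        if src != dst:
            bad_ranges.append((src or dst)[:2])
    return bad_ranges


if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    for conn in (source, target):
        conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, amount TEXT)")
        conn.executemany(
            "INSERT INTO claims VALUES (?, ?)",
            [(i, str(i * 10)) for i in range(1, 6)],
        )
    target.execute("UPDATE claims SET amount = '999' WHERE id = 4")
    print(mismatched_chunks(source, target, "claims", "id", chunk_size=2))
```

Chunking matters at this scale: a mismatch narrows the investigation to one key range instead of forcing a full-table comparison, and the check can run while replication is still applying changes elsewhere in the table.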

Why eliminate MariaDB Enterprise support rather than renew it?

MariaDB Enterprise support at $40,000/month primarily provided access to hotfixes, enterprise connectors, and vendor-backed incident response. Amazon RDS with Multi-AZ delivers automated failover, managed patching, point-in-time recovery, and AWS-backed SLAs natively.

For this client's workload profile, the managed service replaced every practical use case the enterprise support contract was serving, at a fraction of the operational cost and with less manual intervention required from the internal team.

How does Amazon RDS improve HIPAA compliance posture compared to self-hosted MariaDB?

Managed RDS converts manual compliance controls into auditable, configuration-driven ones.

On the self-hosted EC2 cluster, encryption, access controls, and audit logging were implemented manually and inconsistently documented. Amazon RDS provides encryption at rest via AWS KMS, enforced TLS for data in transit, IAM database authentication, and CloudTrail integration for access logging, all of which produce audit artifacts automatically.

For this platform, that meant the compliance team could demonstrate database-layer controls to auditors through AWS Config and CloudTrail exports rather than through manual attestations.

What is the operational difference after moving to managed RDS?

Backups become automatic, failover becomes built-in, and DR becomes testable rather than theoretical.

On the self-hosted cluster, backup verification, failover testing, and patch management required manual intervention and internal coordination. Post-migration, automated backups run on schedule with configurable retention, Multi-AZ failover is tested through AWS Fault Injection Service (formerly AWS Fault Injection Simulator) rather than simulated manually, and minor version patching is handled by RDS on a defined schedule.

The internal engineering team reclaimed meaningful capacity that database operations had previously consumed and redirected it toward product delivery.

How do you handle read replica reconfiguration during cutover?

Replicas are the last thing you move, after the primary cutover is confirmed stable.

During the migration, the existing read replicas continued serving the application from the EC2-hosted cluster until the RDS primary was confirmed healthy and stable under production traffic. Only then did we reconfigure the application's read connections to the RDS read replicas and decommission the self-hosted replicas.

This sequencing kept the read path available throughout the cutover and gave us a clean rollback option if the RDS primary had shown any unexpected behavior under load.
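The sequencing rule above ("replicas move last") can be expressed as a small guard. This is a hypothetical sketch: the `choose_read_target` function, the backend labels, and the requirement of ten consecutive passing health checks are assumptions made for illustration, not values from this project.

```python
def choose_read_target(primary_health_checks, required_consecutive=10):
    """Decide where application reads should go.

    `primary_health_checks` is a list of booleans, oldest first, recording
    whether each health check of the new RDS primary passed. Reads move to
    the RDS replicas only after the most recent `required_consecutive`
    checks have all passed; until then they stay on the EC2 replicas.
    """
    recent = primary_health_checks[-required_consecutive:]
    if len(recent) == required_consecutive and all(recent):
        return "rds-replicas"
    return "ec2-replicas"


if __name__ == "__main__":
    print(choose_read_target([True] * 9))             # too little history
    print(choose_read_target([True] * 9 + [False]))   # latest check failed
    print(choose_read_target([False] + [True] * 10))  # stable long enough
```

Because the default answer is always the existing EC2 replicas, any gap or failure in the health history leaves the read path where it was, which is what keeps the rollback option clean.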
