
Inside Merge: how we’re building the leading sync engine

Ani Katipally
Software engineer
@Merge

Our sync jobs move millions of records every day for frontier LLM providers, leading banks, and thousands of other B2B SaaS companies. 

To power this scale, our engineering team is constantly rethinking how we can deliver faster, more reliable, and more resilient integrations.

To that end, here are some of the measures we’ve recently taken to raise the bar for sync performance.

Evolving concurrency from batching to dynamic scheduling

Our initial approach to concurrency used fixed-size batches, with a “Sync Issuer” coordinating the work. This involved:

  • Processing API requests sequentially
  • Grouping substeps into fixed batches (e.g., batch size of 2)
  • Waiting for the entire batch to complete before proceeding (a rough sketch follows)
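
In code, that fixed-batch flow looked roughly like the following. This is a minimal sketch rather than our actual implementation; `process_substep` and the thread pool are illustrative stand-ins:

```python
from concurrent.futures import ThreadPoolExecutor, wait

BATCH_SIZE = 2  # the fixed batch size from the old design

def run_fixed_batches(substeps, process_substep):
    """Old flow: process substeps in fixed-size batches, blocking on each batch."""
    with ThreadPoolExecutor(max_workers=BATCH_SIZE) as pool:
        for i in range(0, len(substeps), BATCH_SIZE):
            batch = substeps[i : i + BATCH_SIZE]
            futures = [pool.submit(process_substep, step) for step in batch]
            # The whole batch must finish before the next one begins, so a
            # single slow substep stalls every substep queued behind it.
            wait(futures)
```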

This approach came with a few drawbacks. Notably, performance was constrained by the slowest batch member, and Sync Issuers were left waiting instead of making more API requests, wasting time.

This led us to adopt a fundamentally different approach: “Dynamic Node Scheduling.”

Here’s a snapshot of how it works (a code sketch follows the steps):

1. The Sync Issuer makes all API requests as quickly as possible.

2. Each result becomes a `QUEUED` sync node.

3. Up to batch-size nodes run simultaneously as `RUNNING`.

4. Completed nodes automatically trigger queued nodes.
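
A minimal sketch of this state machine is below. The names (`SyncNode`, `DynamicScheduler`, `process_node`) are hypothetical stand-ins, and a semaphore stands in for our actual scheduler:

```python
import threading
from dataclasses import dataclass
from enum import Enum

class NodeState(Enum):
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    DONE = "DONE"

@dataclass
class SyncNode:
    page: dict                          # one API result to process
    state: NodeState = NodeState.QUEUED

class DynamicScheduler:
    """Cap concurrency at `max_running`; a finishing node immediately
    frees a slot for the next QUEUED node, with no batch barrier."""

    def __init__(self, max_running: int, process_node):
        self._slots = threading.BoundedSemaphore(max_running)
        self._process_node = process_node
        self._threads: list[threading.Thread] = []

    def submit(self, node: SyncNode) -> None:
        t = threading.Thread(target=self._run, args=(node,))
        self._threads.append(t)
        t.start()

    def _run(self, node: SyncNode) -> None:
        with self._slots:                # blocks while max_running nodes are RUNNING
            node.state = NodeState.RUNNING
            self._process_node(node)
            node.state = NodeState.DONE  # releasing the slot starts the next node

    def join(self) -> None:
        for t in self._threads:
            t.join()
```

In this sketch, step 4’s “completed nodes automatically trigger queued nodes” is simply the semaphore release: no node ever waits on an unrelated batch.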

This eliminated the bottleneck caused by slow batch members and prevented idle Sync Issuer time. Taken together, these changes have sped up syncs by up to 15x.

Adopting intelligent rate limit management

Careful rate limit management makes syncs faster, as it eliminates the delays and retries associated with hitting actual rate limits.

With this in mind, we use a shared Redis cache to track API request activity across all concurrent processes. 

This allows us to:

  • Monitor usage across different rate limit types (we’ve catalogued these for each integration)
  • Coordinate between multiple processing jobs
  • Trigger throttling exceptions when usage approaches 80% of a provider’s limit
  • Schedule optimal retry timing based on encoded cooloff periods (sketched below)
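
To make this concrete, here’s a minimal sketch of the shared counter using redis-py. The key scheme, exception type, fixed per-minute window, and cooloff value are simplifying assumptions, not our production logic:

```python
import time

import redis

r = redis.Redis()  # shared cache visible to every concurrent sync process

class RateLimitApproaching(Exception):
    """Raised before we hit the provider's actual limit."""
    def __init__(self, retry_at: float):
        self.retry_at = retry_at

def reserve_request(integration: str, limit_per_minute: int, cooloff_seconds: int = 60):
    """Count this request in a shared per-minute window; back off near 80%."""
    window = int(time.time() // 60)
    key = f"ratelimit:{integration}:{window}"
    pipe = r.pipeline()
    pipe.incr(key)
    pipe.expire(key, 120)  # keep counters slightly past their window
    count, _ = pipe.execute()
    if count >= 0.8 * limit_per_minute:
        # Schedule the retry based on the integration's encoded cooloff period.
        raise RateLimitApproaching(retry_at=time.time() + cooloff_seconds)
```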

This approach lets us operate close to each provider’s limits at scale without tripping them.

For example, we recently synced 1.3 million objects for a frontier LLM provider and operated within 3% of their theoretical maximum throughput by dynamically managing rate limits and backing off at the right times.

Engineering fault-tolerant infrastructure at scale

We’ve introduced fault-tolerant state persistence to ensure sync jobs survive interruptions.

When AWS issues a termination notice—or our memory monitoring detects trouble—the system immediately serializes the job’s entire state into a JSON snapshot. This snapshot, capturing hundreds of variables, is written to Elastic File System (EFS) within the two-minute window available.

When a replacement server comes online, it retrieves the state file, reconstructs the sync environment with complete fidelity, and resumes execution without losing progress. No manual intervention required.
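
As a rough illustration, a snapshot-and-resume loop could look like the sketch below. The EFS path and state shape are hypothetical, and SIGTERM is a stand-in for the termination notice (spot interruption warnings actually arrive via the EC2 instance metadata service):

```python
import json
import os
import signal

SNAPSHOT_PATH = "/mnt/efs/sync-snapshots/job-123.json"  # hypothetical EFS mount

def snapshot_state(state: dict) -> None:
    """Serialize the job's state to EFS before the instance is reclaimed."""
    tmp = SNAPSHOT_PATH + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, SNAPSHOT_PATH)  # atomic rename: readers never see partial writes

def resume_or_start() -> dict:
    """On boot, pick up a prior snapshot if one exists; otherwise start fresh."""
    if os.path.exists(SNAPSHOT_PATH):
        with open(SNAPSHOT_PATH) as f:
            return json.load(f)
    return {"cursor": None, "records_synced": 0}

# Hypothetical wiring: treat SIGTERM as the termination notice and snapshot.
current_state = resume_or_start()
signal.signal(signal.SIGTERM, lambda signum, frame: snapshot_state(current_state))
```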

This lets us run jobs of any duration with confidence that they’ll complete.

We’ve also realized significant business benefits: by leaning further into spot instances, we’ve cut daily compute costs by 40%, and our engineers are freed from repetitive interventions in large account syncs.

Final thoughts

We’re proud of the progress so far, but our mission isn’t to be better than competitors. It’s to deliver the best sync performance possible for our customers.

With customer feedback and ongoing experimentation—whether it’s in scheduling, retry logic, or infrastructure resilience—we’ll continue to push the limits of what’s possible in data synchronization.
