
Webhooks Rate Limit - Ambient Agents

  • January 30, 2026
  • 2 replies
  • 25 views

I wanted to flag a potential scalability concern and would be keen to discuss it further.


With the introduction of Ambient Agents, more systems are integrating with Moveworks via webhooks. As we continue to add plugins across multiple systems, the volume and frequency of inbound calls will increase significantly, which raises the risk of hitting rate limits (HTTP 429 errors).
To illustrate with a concrete example:

  • One Power Automate flow pulls data from Intune and sends device status alerts to Moveworks users.
  • A second Power Automate flow also pulls data from Intune and sends device status alerts to Moveworks users.
  • A ServiceNow flow sends SAM reclamation alerts to Moveworks users.


If these plugins are scheduled to run daily, the cumulative execution time becomes quite long. While these jobs are running, additional webhook calls from external systems can still occur, increasing the likelihood of rate limiting.
We’ve already implemented a 10-second delay between each API call across all three flows. However, this approach doesn’t give us sufficient control to reliably prevent rate limit errors, especially as usage scales.


Would be interested to get an initial perspective on how we might approach this more robustly from a platform or architecture standpoint.

2 replies

Kevin Mok
  • Community Manager
  • January 30, 2026

Hey @gowthamshekar89s - Thanks for the detailed write-up, this is really helpful context.

You're right that the current 5 req/s org-scoped limit becomes a bottleneck as you scale across multiple integrations. The 10-second delay approach works but isn't sustainable in the long term, especially when you can't coordinate timing across independent systems like Power Automate and ServiceNow.

 

I've raised this internally with our platform team to understand what it would take to increase the limit or scope it per-listener instead of per-org. I don't have a timeline to share yet, but wanted you to know this is on our radar.

 

In the meantime, a few workarounds to consider:

Staggered scheduling - Instead of running all flows "daily," assign each one a specific time window. For example: Intune Flow 1 at 6am, Intune Flow 2 at 9am, ServiceNow at 12pm. This eliminates collision risk between the scheduled jobs.

Retry with exponential backoff - When you hit a 429, retry after 1s, then 2s, then 4s. Power Automate and ServiceNow should both support retry policies natively. Doesn't prevent 429s but handles them gracefully.
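A rough sketch of that retry pattern in Python, in case it helps picture the behavior (the webhook URL is a placeholder; in practice you'd configure the equivalent retry policy directly in Power Automate or ServiceNow rather than writing code):

import time
import requests

WEBHOOK_URL = "https://example.moveworks.com/hooks/device-alerts"  # placeholder listener URL

def post_with_backoff(payload, max_retries=5):
    """POST to the webhook, backing off exponentially on HTTP 429."""
    delay = 1  # seconds; doubles after each 429 (1s, 2s, 4s, ...)
    for _ in range(max_retries):
        resp = requests.post(WEBHOOK_URL, json=payload, timeout=10)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Honour the server's Retry-After header when one is provided
        retry_after = resp.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else delay)
        delay *= 2
    raise RuntimeError(f"Still rate limited after {max_retries} attempts")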

 

If you're scaling beyond that, a centralized queue (Azure Service Bus, AWS SQS) that buffers requests and drains them at a controlled rate would give you full control, though it adds infrastructure overhead.

This is how a centralized queue would work:

[Diagrams from the original reply: current state vs. proposed queue]

How it works:

  1. Producers (the flows) push messages to a queue instead of calling Moveworks directly. Fire-and-forget—they don't care about rate limits.
  2. Queue buffers all incoming requests. Azure Service Bus, AWS SQS, or even a simple database table works.
  3. Single worker reads from the queue at a controlled rate (e.g., 1 message every 250ms = 4 req/s, safely under 5). Makes the actual webhook call to Moveworks.
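To make steps 1-3 concrete, here's a minimal Python sketch. The listener URL and the enqueue/receive_one/acknowledge helpers are placeholders; you'd swap in the Azure Service Bus, SQS, or database-table equivalents:

import time
import requests

WEBHOOK_URL = "https://example.moveworks.com/hooks/device-alerts"  # placeholder listener URL
DRAIN_INTERVAL = 0.25  # one call every 250 ms ~= 4 req/s, safely under the 5 req/s limit

def producer(enqueue, payload):
    # Step 1: flows push the payload onto the queue and return immediately (fire-and-forget)
    enqueue(payload)

def worker(receive_one, acknowledge):
    # Step 3: a single worker drains the queue at a fixed rate and makes the webhook calls
    while True:
        msg = receive_one()  # returns (message_id, payload) or None when the queue is empty
        if msg is None:
            time.sleep(1)
            continue
        message_id, payload = msg
        started = time.monotonic()
        resp = requests.post(WEBHOOK_URL, json=payload, timeout=10)
        if resp.status_code == 429:
            # Leave the message unacknowledged so it is retried on a later pass
            time.sleep(float(resp.headers.get("Retry-After", 1)))
            continue
        resp.raise_for_status()
        acknowledge(message_id)  # remove from the queue only after a successful delivery
        # Pace the loop so the steady-state rate stays around 4 req/s
        time.sleep(max(0.0, DRAIN_INTERVAL - (time.monotonic() - started)))

Because the worker is then the only component calling Moveworks, the effective rate is controlled in one place no matter how many producers you add.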

gowthamshekar89s

Thanks for the suggestion, Kevin.

We had already implemented staggered scheduling with retries as a mitigation prior to this, which has helped reduce failures overall. That said, we’re still seeing approximately 5% of executions fail.

These failures are primarily caused by webhook traffic from other systems that operate outside the scheduled flows and are directly connected to Moveworks via Ambient Agents. Because these calls are unscheduled and concurrent, they can still trigger rate limiting despite the mitigations in place.