The Golden Rule: Why Your Plugin Is Slow

  • March 11, 2026

Kevin Mok

Hey,

Your plugin's response time went from 2 seconds to 8. You didn't change anything. You just added a second action to the conversation process, then a third, then a fourth. Now the user stares at a spinner and you're filing a bug about "slow plugin response times."

It's not a platform bug. It's an architecture problem, and there's one golden rule that fixes it.

The Golden Rule

Here it is:

Never execute two action activities in a row in a conversation process without collecting a slot in between.

That's it. One sentence. The single most important architecture pattern in Agent Studio.

Why it matters

Here's what actually happens when you chain actions in a conversation process. Conversation processes are deterministic code paths: the reasoner executes each activity in order, regardless of what the previous one returned. But between action activities, the reasoner still has to process the output. Each action triggers a full reasoner turn: think about the output, figure out what it means, then move on to the next action. Think → act → observe, for every single action in the chain.

Here's the thing: the reasoner was always going to run your second action. The path is predetermined. But you've handed it a full API payload it doesn't need, and it has to stop and reason about that information before continuing down a path it was already going to follow. That's wasted processing.

So three chained actions means three full reasoner turns:

  1. Reasoner fires lookup_user_calendar. Waits. Gets the response. Reads the full API payload. Thinks about what to do with it. Moves to the next activity.
  2. Reasoner fires fetch_room_list. Waits. Gets the response. Now it's reading both API payloads, thinking about both. Moves on.
  3. Reasoner fires check_room_capacity. Waits. Gets the response. Now it's reading all three API payloads plus everything else in the context window.

Each turn, the context window gets heavier. The reasoner processes all accumulated data on every turn, not just the latest response. And none of that intermediate data changes the outcome — the path was already set.
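To make the accumulation concrete, here's a small Python sketch. The payload sizes are made-up values, and `context_per_turn` is a hypothetical helper for illustration, not anything from Agent Studio. It assumes every action response stays in the context window for all later turns.

```python
from itertools import accumulate


def context_per_turn(payload_kb):
    """Return the payload volume (KB) sitting in context at each reasoner turn.

    Assumes every action response stays in the context window, so turn N
    re-reads the payloads from turns 1..N.
    """
    return list(accumulate(payload_kb))


# Hypothetical payload sizes for lookup_user_calendar, fetch_room_list,
# and check_room_capacity. Illustrative values, not measurements.
payloads_kb = [3, 4, 5]

per_turn = context_per_turn(payloads_kb)
total_read = sum(per_turn)

print(per_turn)    # [3, 7, 12]
print(total_read)  # 22
```

Under these assumptions, 12KB of payload turns into 22KB of reading, because the earlier responses get re-processed on every later turn.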

Two problems compound:

  1. Latency spikes. Three reasoner turns instead of one. Each turn includes the full think → act → observe cycle. A plugin that responded in 2 seconds with one action takes 6-8 seconds with three chained actions, because the reasoner is running three separate processing cycles.
  2. Quality drops. The reasoner has a fixed attention budget. More tokens in the context window means less focus per token. The second action's output lands in what researchers call the "lost in the middle" zone, an attention dead spot where the reasoner is measurably less likely to use data correctly. Multi-turn accumulation hurts too: Microsoft Research found a 39% performance drop in multi-turn settings.
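The latency math can be sketched as a toy model. Everything here is an assumption for illustration: `estimated_latency` is a hypothetical function, the 2-second turn cost comes from the example above, and the 0.5-second penalty per earlier turn is an invented number chosen to land inside the 6-8 second range. It is not a measurement of the platform.

```python
def estimated_latency(num_turns, seconds_per_turn=2.0, context_penalty=0.0):
    """Toy estimate of total response time in seconds.

    Each reasoner turn costs a fixed time plus a penalty that grows with
    the number of earlier turns whose output is already in context.
    """
    total = 0.0
    for turn in range(num_turns):
        total += seconds_per_turn + context_penalty * turn
    return total


print(estimated_latency(1))                       # 2.0
print(estimated_latency(3))                       # 6.0
print(estimated_latency(3, context_penalty=0.5))  # 7.5
```

The point of the model is the shape, not the numbers: turn count multiplies the base cost, and context growth adds on top of that.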

What chaining actually looks like

# Conversation Process: schedule_meeting
activities:
  - action_activity:
      action_name: lookup_user_calendar
      required_slots: [organizer]
      output_key: calendar_info
      # ^ Reasoner turn 1: fires action, reads full calendar API response

  - action_activity: # ← no slot collected before this
      action_name: fetch_room_list
      input_mapper:
        building: data.calendar_info.default_building
      output_key: rooms
      # ^ Reasoner turn 2: fires action, reads rooms API response + calendar response

  - action_activity: # ← no slot collected before this
      action_name: check_room_capacity
      input_mapper:
        rooms: data.rooms.available
        headcount: data.attendees.$LENGTH()
      output_key: suitable_rooms
      # ^ Reasoner turn 3: fires action, reads capacity response + rooms + calendar

Three actions, three reasoner turns. Each turn adds 2-5KB of raw API payload to the context. By the third turn, the reasoner is wading through 6-15KB of intermediate JSON it doesn't need to make a decision, because the decision was already made by the conversation process. Your clean plugin config is buried under data the reasoner was never going to act on.

The fix: compound actions

Move the chain out of the conversation process. A compound action wraps multiple steps into a single action that the reasoner fires once. The steps execute behind the scenes; the reasoner never sees or processes any intermediate results. It just gets back the fields it needs.

# Compound Action: prepare_meeting_room
steps:
  - action:
      action_name: lookup_user_calendar
      input_mapper:
        user_id: data.organizer.id
      output_key: calendar_info

  - action:
      action_name: fetch_room_list
      input_mapper:
        building: data.calendar_info.default_building
      output_key: rooms

  - action:
      action_name: check_room_capacity
      input_mapper:
        rooms: data.rooms.available
        headcount: data.headcount
      output_key: suitable_rooms

  - return:
      output_mapper:
        recommended_room: data.suitable_rooms[0].name
        room_capacity: data.suitable_rooms[0].capacity
        alternatives_count: data.suitable_rooms.$LENGTH() - 1

Same three API calls. But they execute inside the compound action, not as separate activities in the conversation process. The reasoner fires prepare_meeting_room, waits once, and gets back three clean fields: a room name, its capacity, and how many alternatives exist. One reasoner turn instead of three. No intermediate payloads cluttering the context window. Response time drops back to 2 seconds.
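If it helps to think of it in code, a compound action behaves roughly like a function whose intermediate variables never leave its scope. The sketch below is an analogy in plain Python, not how Agent Studio runs anything. The three stub functions, their payload shapes, the room names, and the capacities are all invented for illustration.

```python
def lookup_user_calendar(user_id):
    # Stub standing in for the real API call; the payload shape is invented.
    return {"user_id": user_id, "default_building": "HQ-1", "events": ["busy"] * 20}


def fetch_room_list(building):
    # Stub: pretend every building has the same three rooms free.
    return {"building": building, "available": ["Maple", "Oak", "Birch"]}


def check_room_capacity(rooms, headcount):
    # Stub: made-up capacities, filtered to rooms that fit the headcount.
    capacities = {"Maple": 4, "Oak": 8, "Birch": 12}
    return [
        {"name": room, "capacity": capacities[room]}
        for room in rooms
        if capacities[room] >= headcount
    ]


def prepare_meeting_room(organizer_id, headcount):
    """Run the whole chain and hand back only the fields the caller needs."""
    calendar_info = lookup_user_calendar(organizer_id)
    rooms = fetch_room_list(calendar_info["default_building"])
    suitable_rooms = check_room_capacity(rooms["available"], headcount)
    # The intermediate payloads above stay local; only three fields escape.
    return {
        "recommended_room": suitable_rooms[0]["name"],
        "room_capacity": suitable_rooms[0]["capacity"],
        "alternatives_count": len(suitable_rooms) - 1,
    }


result = prepare_meeting_room("u-123", headcount=6)
print(result)
# {'recommended_room': 'Oak', 'room_capacity': 8, 'alternatives_count': 1}
```

Like the config above, this sketch assumes at least one suitable room comes back; an empty list would fail on the `[0]` lookup, so a real version would want a fallback.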

The clean conversation process

Here's what the conversation process looks like after refactoring:

# Conversation Process: schedule_meeting (refactored)
activities:
  - action_activity:
      action_name: prepare_meeting_room # compound action
      required_slots: [organizer, attendees]
      output_key: room_prep

  - action_activity:
      action_name: create_calendar_event
      required_slots: [meeting_title, start_time, duration]
      input_mapper:
        room: data.room_prep.recommended_room
        attendees: data.attendees
      output_key: created_event

Two action activities, but there's a slot collection between them: the second activity declares required_slots (meeting_title, start_time, duration). When an activity declares required_slots, the reasoner stops and asks the user for them. That's your natural barrier. Golden Rule satisfied.

Rule of thumb: if your conversation process has two action activities back-to-back with no slot collection in between, refactor. Compound actions absorb the chain. Required slots on the next activity give you the natural checkpoint.
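You could automate that rule of thumb with a small check. This is a sketch under stated assumptions: `find_golden_rule_violations` is a hypothetical helper, not an Agent Studio feature, and it expects the conversation process as a plain list of dicts mirroring the configs in this post. It also assumes that only a slot not already collected earlier counts as a barrier, and that only action activities matter for the check.

```python
def find_golden_rule_violations(activities):
    """Return the names of action activities that follow another action
    activity without collecting any new slot first."""
    violations = []
    collected = set()
    seen_action = False
    for activity in activities:
        body = activity.get("action_activity")
        if body is None:
            continue  # assumption: only action activities matter here
        # A slot counts as a barrier only if it has not been collected yet.
        new_slots = set(body.get("required_slots", [])) - collected
        if seen_action and not new_slots:
            violations.append(body["action_name"])
        collected |= new_slots
        seen_action = True
    return violations


chained = [
    {"action_activity": {"action_name": "lookup_user_calendar",
                         "required_slots": ["organizer"]}},
    {"action_activity": {"action_name": "fetch_room_list"}},
    {"action_activity": {"action_name": "check_room_capacity"}},
]

refactored = [
    {"action_activity": {"action_name": "prepare_meeting_room",
                         "required_slots": ["organizer", "attendees"]}},
    {"action_activity": {"action_name": "create_calendar_event",
                         "required_slots": ["meeting_title", "start_time", "duration"]}},
]

print(find_golden_rule_violations(chained))
# ['fetch_room_list', 'check_room_capacity']
print(find_golden_rule_violations(refactored))
# []
```

Anything the check flags is a candidate for folding into a compound action.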

Quick Hits

Ambient agent plugins in the Marketplace — You can now install ambient agent plugins directly from the marketplace. Browse, click, deploy.

Python actions are live — No more APIthon. Write your action logic in Python and run it natively.

Forms and knowledge articles in content activities — You can finally attach forms or knowledge articles to your content activities. Long-awaited, now shipped. Check the docs!

Worth Reading

  • LLMs Get Lost In Multi-Turn Conversation — Microsoft Research tested 15 LLMs across 200K+ conversations. Performance dropped 39% in multi-turn settings. Not because the models got dumber; they anchor on early assumptions and can't recover. This is the empirical case for keeping your reasoner's context clean between actions.
  • Architecting Efficient Context-Aware Multi-Agent Framework for Production — Google's ADK team on treating context as a compiled view over structured state, not a mutable string buffer. They explicitly name "lost in the middle" as a failure mode of naive context stuffing. Sound familiar?
  • Building Durable AI Agents: A Guide to Context Engineering — Inngest's breakdown of how production agents break when context grows. The stat that got me: at 32K tokens, GPT-4o drops from 99.3% to 69.7% accuracy. Written by someone who shipped agents and watched them fail.

Join the Community

Building plugins? Come share your compound action patterns in the community.

Office Hours have been a cookout. Sign up here.

-- Kevin

Developer Advocate @ Moveworks | Agent Studio

P.S. If you've hit the chaining problem and found a creative fix, reply. I want to hear about it.