Question

Best practices for testing and promoting plugins (Sandbox vs. Production)?

  • June 2, 2026
  • 2 replies
  • 56 views

hundleymf

Hi everyone,

I am looking for input on how other teams handle plugin testing and lifecycle management. I'd appreciate hearing what is working well for you!

Our current process:

  • We build and test in our Sandbox environment. We temporarily add testers and, once testing passes, remove them to keep the environment clean and restrict access.
  • We then export the plugin to Production and run a quick smoke test to confirm it works.
  • For enhancements: we build the changes in Sandbox, export the new version to Production, and then swap them (turn off the old Production plugin and turn on the updated one). We have to keep the original plugin active in Production while we build the enhancement in Sandbox.

What we are trying to figure out:

We are looking for a simpler approach. We considered just building directly in Production and using the Launch Configuration to whitelist our testers (via "Allow selected users"), but we hit two roadblocks:

  • System Triggers (Ambient Agents): How do you safely test webhooks or scheduled triggers in Production? Since System Triggers don't support Audience Settings (you can't whitelist testers), we aren't sure how to test these without accidentally impacting the live environment.
  • Live Edits: How do you handle making updates to a conversational plugin in Production when saving your edits immediately pushes those changes live to users?

If your team has a cleaner testing approach, or if you have found workarounds for the two roadblocks above, I would love to hear your strategies.

Thanks!

2 replies

  • Known Participant
  • June 2, 2026

Hi ​@hundleymf 

To safely test triggers in a Production environment without impacting other users, it is recommended to combine the following strategies:

1. Test the Compound Action in Isolation

Since System Triggers simply execute a Compound Action, you should thoroughly validate the logic before attaching it to any active trigger.

  • Execute the test and review logs to validate:
    • Data routing
    • Conditional logic
    • Formatting

This approach allows you to validate behavior without relying on schedules or live webhook events.

 

2. Webhook / Listener Testing

When testing live webhooks end-to-end in Production, you should apply a DSL Event Filter at the Listener or Plugin level to restrict execution to specific users.

Why use a DSL Event Filter?

  • Ensures events are filtered before plugin execution
  • Keeps logs clean
  • Optimizes processing efficiency
  • Clearly separates testing logic from workflow logic

Example: Allowlist Filtering

If the incoming webhook payload contains the user’s email (e.g., parsed_body.user.email), you can apply an allowlist using the DSL IN operator:

parsed_body.user.email IN ["test1@example.com", "test2@example.com", "dev-user@example.com"]

Benefits

  • Fail-fast mechanism: Non-matching events are immediately discarded and marked as skipped in logs
  • Zero code changes for launch: Simply remove the filter when going live—no need to modify your Compound Action
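
To make the effect of the filter concrete, here is a minimal Python sketch that emulates the same allowlist check against a parsed webhook payload. It is only an emulation for reasoning about the behavior, not platform code: the function name passes_event_filter and the sample payloads are hypothetical, and exact string matching is an assumption about how the DSL IN operator compares values.

```python
# Hypothetical emulation of the DSL allowlist filter; not platform code.
ALLOWED_TESTERS = ["test1@example.com", "test2@example.com", "dev-user@example.com"]


def passes_event_filter(parsed_body: dict) -> bool:
    """Return True only when parsed_body.user.email is on the allowlist."""
    email = parsed_body.get("user", {}).get("email")
    return email in ALLOWED_TESTERS


# Sample payloads (made up for illustration).
events = [
    {"user": {"email": "test1@example.com"}},
    {"user": {"email": "someone-else@example.com"}},
    {"user": {}},  # payload without an email field
]

processed = [e for e in events if passes_event_filter(e)]
skipped = [e for e in events if not passes_event_filter(e)]
```

Only the first event would reach the plugin in this sketch; the other two are dropped before any workflow logic runs, which is the point of filtering at the Listener level.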

3. Scheduled Triggers

System Triggers bypass Launch Rules (Audience Settings). Therefore, implementing an allowlist check inside the Compound Action is essential to control execution during Production testing.

You can implement this using either:

Option 1: Script Action (Recommended)

Use a reusable Script Action to centralize allowlist logic.

Script Example:


allowed_emails = ["test1@example.com", "test2@example.com", "dev-user@example.com"]
email_to_check in allowed_emails

Compound Action Example:

steps:
  - action:
      action_name: check_allowed_tester
      output_key: is_allowed_tester
      input_args:
        email_to_check: data.user_email
        allowed_emails: '["test1@example.com", "test2@example.com"]'

  - switch:
      cases:
        - condition: data.is_allowed_tester == true
          steps:
            - action:
                action_name: process_data_action
                output_key: process_result
                input_args:
                  email: data.user_email
      default:
        steps:
          - return:
              output_mapper:
                message: "User is not an approved tester. Skipping execution."

How it works

  • The script evaluates email_to_check in allowed_emails, returning a boolean.
  • This value is stored in is_allowed_tester.
  • The switch condition evaluates the flag and executes only for approved testers.
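
One detail worth checking: the Compound Action passes allowed_emails as a quoted string, while the script example defines its own list. The plain-Python sketch below shows one way a script could accept the allowlist either as a real list or as a JSON-encoded string. The function name and the JSON-string handling are assumptions for illustration, not confirmed platform behavior, so verify how your Script Action actually receives the argument.

```python
import json


def is_allowed_tester(email_to_check, allowed_emails):
    """Return True if email_to_check is a member of allowed_emails.

    allowed_emails may be a list, or a JSON-encoded list passed as a
    string (assumption: this is how the input argument could arrive).
    """
    if isinstance(allowed_emails, str):
        # Decode first: `in` on a raw string is a substring test, which
        # would wrongly accept fragments such as "example.com".
        allowed_emails = json.loads(allowed_emails)
    return email_to_check in allowed_emails


raw_allowlist = '["test1@example.com", "test2@example.com"]'
tester_ok = is_allowed_tester("test1@example.com", raw_allowlist)
fragment_ok = is_allowed_tester("example.com", raw_allowlist)
```

Decoding before the membership test matters because a substring match against the raw string would let partial addresses through.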

Advantages

  • Highly scalable: Logic is maintained in a single reusable script
  • Flexible: Easily pass different allowlists across multiple workflows

Option 2: Native DSL (Lightweight & Fast)

For better performance and lower latency, you can implement allowlist checks directly using DSL.

Example:

steps:
  - for:
      each: record
      in: data.fetched_records
      output_key: processed_results
      steps:
        - switch:
            cases:
              - condition: 'record.email IN ["test1@example.com", "test2@example.com"]'
                steps:
                  - action:
                      action_name: process_record_action
                      output_key: action_result
                      input_args:
                        record_id: record.id

Advantages

  • Lower latency (no external script call)
  • Simpler for straightforward checks
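
For readers more comfortable with Python, the loop above can be read roughly like this. It is a sketch under stated assumptions rather than a description of how the platform runs it: process_record is a hypothetical stand-in for process_record_action, and the sketch simply leaves non-matching records out of the results, which may differ from what the for step stores under output_key.

```python
# Hypothetical Python reading of the DSL loop; not platform code.
ALLOWED = ["test1@example.com", "test2@example.com"]


def process_record(record_id):
    # Stand-in for process_record_action (body is made up).
    return {"record_id": record_id, "status": "processed"}


def run(fetched_records):
    """Process only the records whose email is on the allowlist."""
    results = []
    for record in fetched_records:
        if record["email"] in ALLOWED:
            results.append(process_record(record["id"]))
    return results


sample_records = [
    {"id": 1, "email": "test1@example.com"},
    {"id": 2, "email": "live-user@example.com"},
]
results = run(sample_records)
```

Here only the record with id 1 is processed; the live user's record is never passed to the action.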

 

While both approaches are valid, Option 1 (Script Action) is strongly recommended:

  • ✅ Centralized and reusable logic
  • ✅ Easier to manage across multiple plugins
  • ✅ More flexible for future enhancements
  • ✅ Scales better as the number of workflows grows

 

Hope this helps! Let me know if you have further questions.


hundleymf
  • Author
  • Inspiring
  • June 8, 2026

Hi ​@chetan.bhagat,

Thank you for the detailed response — this is very helpful, especially the examples around filtering and allowlists.

After reviewing and discussing this internally, I wanted to follow up on a few points and clarify our situation a bit more.

Where your approach helps

  • The allowlist/filtering strategy makes sense for testing webhooks and scheduled triggers in Production.
  • The idea of filtering early (listener level / DSL) is helpful for controlling execution and avoiding unintended impact.

Where we are still running into challenges

Our main goal is to simplify our lifecycle. Today, we do:

  • Build and test in Sandbox
  • Promote to Production
  • Do a quick validation
  • Release

In theory, this should work well. However, in practice we are seeing:

  • Different behavior between Sandbox and Production
  • Sandbox being less stable or not representing Production behavior accurately

Because of this, we end up:

  • Re-testing or debugging again in Production
  • Investigating environment differences
  • Spending more time than expected

So even when we “test once in Sandbox,” we cannot fully rely on it and still need deeper validation in Production.

Follow-up questions

  1. Production-first approach (overall lifecycle)

    • Do you or other teams actually build directly in Production as a primary workflow, or is your approach mainly for targeted testing only?
    • If you do build in Production, how do you manage complexity as the number of filters and allowlists grows?
  2. System Triggers / Scheduled jobs

    • Your suggestion to add allowlist checks inside the Compound Action makes sense.
    • How do you manage this at scale?
      • Do you keep reusable “tester check” actions across all plugins?
      • How do you ensure those checks are not accidentally left in place at go-live?
  3. Environment differences (Sandbox vs Production)

    • Have you seen similar inconsistencies between environments?
    • If yes, how do you decide when Sandbox is “good enough” to move forward vs continuing to troubleshoot there?
  4. Live edits in Production (key gap for us)

    • One of our biggest concerns with a Production-first approach is that edits go live immediately.
    • How does your team handle:
      • Updating conversational plugins safely
      • Avoiding partial or broken experiences while making changes
    • Do you rely on cloning/versioning patterns (duplicate plugin, then swap), or another approach?

What we are trying to solve

We are not necessarily looking to replace Sandbox entirely, but to:

  • Reduce duplicate testing effort
  • Reduce time spent troubleshooting environment differences
  • Find a cleaner and more predictable promotion model

Your trigger testing approach helps with part of this, but we are still trying to understand the broader lifecycle strategy teams are using successfully.

Appreciate any additional insight you can share — especially around how you balance safety vs speed in Production.

Thanks again!