From 7bc510354152c0d72237e68369c7414a6159337a Mon Sep 17 00:00:00 2001
From: Rodribm10 <rodrigobm10@me.com>
Date: Sun, 19 Apr 2026 11:03:22 -0300
Subject: [PATCH] fix(captain): cap max_turns at 15 + restore
 scenario->orchestrator handoff

Runaway incident: Daniela (reservation scenario) entered a tool-calling
loop, invoking faq_lookup with the same query dozens of times per
second. The job sat at 'Performing' in Sidekiq for minutes, with 1 of
12 busy. The root cause was two interacting factors:

1. The previous commit removed scenario_agent.register_handoffs(
   assistant_agent) to prevent ping-pong. In practice, the scenario LLM
   uses handoff_to_orchestrator as a safety valve when it cannot
   advance. Without it, the LLM kept calling other available tools
   (faq_lookup) indefinitely.

2. max_turns was 100. A runaway loop could burn 100 LLM + tool cycles
   before Sidekiq's timeout fired, so the token spend from a single bad
   message could blow a day's budget.

Both are addressed:
- max_turns: 100 -> 15. Plenty for normal flows; hard ceiling on any
  runaway. A looping LLM now runs out of turns after 15 cycles and the
  run ends instead of looping further.
- scenario -> orchestrator handoff: re-registered. Ping-pong risk is
  contained by max_turns AND by explicit prompt rules in the scenario
  instruction forbidding gratuitous handoffs (added to the Daniela
  prompt in an earlier commit).
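
To illustrate why the cap bounds the damage, here is a minimal toy
model of a turn loop. It is a sketch only: run_with_cap, the lambdas
and the result hash are hypothetical names invented for this example,
not the real runner API, and the model assumes one tool call per turn.

```ruby
# Hypothetical toy model, not the real runner. One turn is one simulated
# LLM decision followed by at most one tool call.
def run_with_cap(max_turns:, decide:)
  tool_calls = 0
  max_turns.times do |turn|
    action = decide.call(turn)
    return { status: :completed, turns: turn + 1, tool_calls: tool_calls } if action == :final_answer

    tool_calls += 1 # any non-final action counts as one tool call
  end
  # The cap was reached without a final answer: stop instead of looping on.
  { status: :max_turns_exceeded, turns: max_turns, tool_calls: tool_calls }
end

# A confused agent that keeps calling the same tool forever.
stuck_agent = ->(_turn) { :faq_lookup }
# A healthy agent that answers after three tool calls.
healthy_agent = ->(turn) { turn < 3 ? :faq_lookup : :final_answer }

RUNAWAY_OLD = run_with_cap(max_turns: 100, decide: stuck_agent)
RUNAWAY_NEW = run_with_cap(max_turns: 15, decide: stuck_agent)
HEALTHY = run_with_cap(max_turns: 15, decide: healthy_agent)
```

In this model the stuck agent's cost scales directly with the cap,
while the healthy flow finishes well under either limit, so lowering
the cap only affects the runaway case.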

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .../captain/assistant/agent_runner_service.rb | 23 ++++++++++++-------
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/enterprise/app/services/captain/assistant/agent_runner_service.rb b/enterprise/app/services/captain/assistant/agent_runner_service.rb
index 9daf769e0..95cca90cc 100644
--- a/enterprise/app/services/captain/assistant/agent_runner_service.rb
+++ b/enterprise/app/services/captain/assistant/agent_runner_service.rb
@@ -47,7 +47,11 @@ class Captain::Assistant::AgentRunnerService
     runner = add_usage_metadata_callback(runner)
     runner = add_callbacks_to_runner(runner) if @callbacks.any?
     install_instrumentation(runner)
-    result = runner.run(message_to_process, context: context, max_turns: 100)
+    # max_turns is the hard safety cap: each "turn" = one LLM call + optional tool calls.
+    # 100 allowed runaway loops (LLM calling faq_lookup indefinitely when confused).
+    # 15 is plenty for normal flows (greeting -> handoff -> data collection -> tool calls -> reply)
+    # while keeping a ceiling on token spend per message.
+    result = runner.run(message_to_process, context: context, max_turns: 15)
 
     process_agent_result(result, original_query: message_to_process)
   rescue StandardError => e
@@ -373,14 +377,17 @@ class Captain::Assistant::AgentRunnerService
     assistant_agent = build_orchestrator_agent_with_memory
     scenario_agents = @assistant.scenarios.enabled.map(&:agent)
 
-    # Orchestrator can hand off INTO any scenario. Scenarios do NOT hand off
-    # back to the orchestrator — that creates a ping-pong where the scenario
-    # calls handoff_to_jasmine mid-flow, the orchestrator resumes the turn,
-    # and responses get duplicated or routed through the FAQ guardrail. When
-    # a customer changes topic mid-scenario, pick_starting_agent on the next
-    # turn already routes back to the orchestrator based on conversation
-    # state — no manual handoff needed from the scenario side.
+    # Bidirectional handoff: orchestrator -> scenarios AND scenarios -> orchestrator.
+    # Historical note: removing the back-edge looks attractive (prevents ping-pong)
+    # but in practice the scenario LLM uses handoff_to_orchestrator as a "fallback"
+    # when it gets confused. Without that fallback, the LLM keeps calling other
+    # available tools (faq_lookup, etc.) in a loop. This happened in a real incident,
+    # where Daniela called faq_lookup dozens of times in a runaway loop. Keep the edge.
+    # Ping-pong is instead contained by max_turns in generate_response AND by
+    # explicit prompt rules in the scenario instruction forbidding gratuitous
+    # handoffs.
     assistant_agent.register_handoffs(*scenario_agents) if scenario_agents.any?
+    scenario_agents.each { |scenario_agent| scenario_agent.register_handoffs(assistant_agent) }
 
     [assistant_agent] + scenario_agents
   end