Observed incident 2026-04-19 14:34: ResponseBuilderJob sat 156s 'Performing' in Sidekiq without ever emitting [Captain V2] Agent result, while the client waited on WhatsApp. The runner.run() call never returned — presumably an HTTP hang on the LLM side (OpenAI slow, network flake, or retry storm inside ruby-llm). Post-hoc protections (tool_loop_detected, max_turns) can't fire because they only inspect result after run() returns. Adding a 45s hard timeout on the run() block guarantees we bail out, trigger bot_handoff, and respond to the client instead of hanging forever. Rescue Timeout::Error separately so the log message is specific and the user-facing message says "demorou mais do que o esperado". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|---|---|---|
| .. | ||
| app | ||
| config | ||
| lib | ||
| LICENSE | ||
| tasks_railtie.rb | ||