From bcf41ad15f8a383f06982aea9820a2094337d8f7 Mon Sep 17 00:00:00 2001
From: Rodribm10
Date: Sun, 19 Apr 2026 09:06:35 -0300
Subject: [PATCH] fix(captain-memory): guard memory recall from blocking agent worker
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Real-world test triggered a Sidekiq worker hang on conv 67 after a
message was routed through Daniela: two ResponseBuilderJobs (msg 1318
and 1319) started, emitted typing_on, then never returned. Sidekiq
showed 2/12 workers stuck for 10+ minutes, apparently indefinitely.

Root cause likely: Agents::Runner evaluates the orchestrator
instructions lambda multiple times per turn, and our wrapped lambda
calls MemoryPromptInjector#append_memory_block each time. Inside,
RecallService invokes the OpenAI embedding API (2s timeout) and
pgvector. Ruby's Timeout.timeout has documented holes on net/http
syscalls: if the embedding API stalls at the socket level, the worker
hangs forever even though the timeout "fired".

Two fixes:

1. Per-message cache in the injector instance: the same message_text
   is embedded and queried once, not N times per turn. This sharply
   reduces network calls and DB queries during a single agent run.
   Every call after the first returns the cached block instantly.

2. Absolute rescue at the top level of append_memory_block:
   rescue StandardError => e, log, and return base_prompt. Even if the
   whole memory pipeline raises, the base system prompt passes through
   and the agent keeps responding. Memory is NEVER allowed to block a
   response. That was already the design intent, but the lambda caller
   path didn't honor it rigorously enough.

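For illustration only, the two fixes can be sketched outside Rails as
below. This is a minimal sketch, not the code in this patch:
SketchInjector, the `recall` callable (standing in for RecallService
plus PromptInjectionService) and the `recall_calls` counter are
hypothetical names introduced here.

```ruby
require 'logger'

# Hypothetical stand-in for MemoryPromptInjector. `recall` is an assumed
# callable that takes the message text and returns a memory block string.
class SketchInjector
  attr_reader :recall_calls

  def initialize(recall:, logger: Logger.new(nil))
    @recall = recall
    @logger = logger
    @cache = {}
    @recall_calls = 0 # counts real lookups, for demonstration only
  end

  # Returns base_prompt plus the memory block, or base_prompt alone when
  # the block is empty or the lookup raises.
  def append_memory_block(base_prompt, message_text)
    block = memory_block_for(message_text)
    return base_prompt if block.nil? || block.empty?

    [base_prompt, block].join("\n\n")
  rescue StandardError => e
    @logger.error("memory recall failed: #{e.class}: #{e.message}")
    base_prompt
  end

  private

  # Looks the block up once per distinct message text.
  def memory_block_for(message_text)
    key = message_text.to_s
    return @cache[key] if @cache.key?(key)

    @recall_calls += 1
    @cache[key] = @recall.call(key)
  end
end
```

In this sketch a lookup that raises is not cached, so each later
evaluation with the same text retries it. The rescue only covers
errors that are actually raised; a call that never returns is not
addressed by it.
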
Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../assistant/memory_prompt_injector.rb       | 34 ++++++++++++++-----
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/enterprise/app/services/captain/assistant/memory_prompt_injector.rb b/enterprise/app/services/captain/assistant/memory_prompt_injector.rb
index 0fab1c499..4ff522ef8 100644
--- a/enterprise/app/services/captain/assistant/memory_prompt_injector.rb
+++ b/enterprise/app/services/captain/assistant/memory_prompt_injector.rb
@@ -1,6 +1,7 @@
 class Captain::Assistant::MemoryPromptInjector
   def initialize(conversation:)
     @conversation = conversation
+    @memory_block_cache = {}
   end
 
   def recall_enabled?
@@ -14,24 +15,39 @@ class Captain::Assistant::MemoryPromptInjector
   # Wraps the given base system prompt with a block
   # when recall is enabled and memories are found. Degrades gracefully:
   # returns the untouched base prompt on any failure or absent context.
+  # Caches the memory block per-message-text within the injector instance so
+  # Agents::Runner evaluating instructions multiple times per turn does not
+  # re-hit EmbeddingService or pgvector on every call.
   def append_memory_block(base_prompt, message_text)
     return base_prompt unless recall_enabled?
     return base_prompt if @conversation&.contact.blank?
 
-    memories = Captain::ContactMemories::RecallService.new(
-      contact: @conversation.contact,
-      query_text: message_text,
-      unit_id: resolve_unit_id
-    ).call
+    block = memory_block_for(message_text)
+    return base_prompt if block.blank?
 
-    memory_block = Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
-    return base_prompt if memory_block.blank?
-
-    [base_prompt, memory_block].join("\n\n")
+    [base_prompt, block].join("\n\n")
+  rescue StandardError => e
+    # Absolute guard: memory recall NEVER blocks or breaks the agent response.
+    Rails.logger.error("[Captain V2] MemoryPromptInjector unexpected failure: #{e.class}: #{e.message}")
+    base_prompt
   end
 
   private
 
+  def memory_block_for(message_text)
+    key = message_text.to_s
+    return @memory_block_cache[key] if @memory_block_cache.key?(key)
+
+    memories = Captain::ContactMemories::RecallService.new(
+      contact: @conversation.contact,
+      query_text: key,
+      unit_id: resolve_unit_id
+    ).call
+
+    @memory_block_cache[key] =
+      Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
+  end
+
   def resolve_unit_id
     return nil if @conversation.blank?