fix(captain-memory): guard memory recall from blocking agent worker

Real-world test triggered a Sidekiq worker hang on conv 67 after a message was routed through Daniela: two ResponseBuilderJobs (msg 1318 and 1319) started, emitted typing_on, then never returned. Sidekiq showed 2/12 workers stuck for 10+ minutes, effectively indefinitely.

Likely root cause: Agents::Runner evaluates the orchestrator instructions lambda multiple times per turn, and our wrapped lambda calls MemoryPromptInjector#append_memory_block each time. Inside, RecallService invokes the OpenAI embedding API (2s timeout) and pgvector. Ruby's Timeout.timeout has documented holes on net/http syscalls: if the embedding API stalls at the socket level, the worker hangs forever even though the timeout "fired".

Two fixes:

1. Per-message cache in the injector instance: the same message_text is embedded and queried once, not N times per turn. This sharply reduces network calls and DB queries during a single agent run. Every call after the first returns the cached block instantly.

2. Absolute rescue at the top level of append_memory_block: rescue StandardError => e; return base_prompt. Even if the whole memory pipeline throws, the base system prompt passes through and the agent keeps responding. Memory is NEVER allowed to block a response. That was already the design intent, but the lambda caller path didn't honor it rigorously enough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
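The two fixes can be tried in isolation with a minimal sketch. `SketchInjector`, its `recall:` callable, and the `recall_calls` counter are hypothetical stand-ins rather than the real `Captain::Assistant::MemoryPromptInjector` or `RecallService`. The sketch only assumes that the instructions lambda is evaluated several times with the same message text.

```ruby
# Hypothetical stand-in for the injector; all names here are illustrative.
class SketchInjector
  attr_reader :recall_calls

  # `recall` is any callable mapping message text to a memory block (or nil).
  # In the real class this role is played by the recall and prompt services.
  def initialize(recall:)
    @recall = recall
    @cache = {}
    @recall_calls = 0
  end

  def append_memory_block(base_prompt, message_text)
    block = memory_block_for(message_text)
    return base_prompt if block.nil? || block.empty?

    [base_prompt, block].join("\n\n")
  rescue StandardError
    # Any exception in the memory pipeline degrades to the base prompt.
    base_prompt
  end

  private

  def memory_block_for(message_text)
    key = message_text.to_s
    # key? (rather than ||=) means nil or empty results are cached too.
    return @cache[key] if @cache.key?(key)

    @recall_calls += 1
    @cache[key] = @recall.call(key)
  end
end

# Simulate a runner evaluating the instructions lambda three times per turn.
injector = SketchInjector.new(
  recall: ->(text) { "<memoria_cliente>likes #{text}</memoria_cliente>" }
)
instructions = -> { injector.append_memory_block("BASE", "tea") }
results = Array.new(3) { instructions.call }

# Simulate a recall that raises on every call.
failing = SketchInjector.new(recall: ->(_text) { raise IOError, "embedding API down" })
fallbacks = Array.new(2) { failing.append_memory_block("BASE", "tea") }
```

In this sketch a successful recall runs once for three evaluations, while a recall that raises is never cached and is therefore retried on each evaluation. The rescue also only helps when the recall raises; a call that never returns is outside what the sketch covers.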
2026-04-19 09:06:35 -03:00 · 2026-04-19 09:06:35 -03:00 · bcf41ad15f
commit bcf41ad15f
parent 6330bec857
1 changed file with 25 additions and 9 deletions
--- a/enterprise/app/services/captain/assistant/memory_prompt_injector.rb
+++ b/enterprise/app/services/captain/assistant/memory_prompt_injector.rb
@@ -1,6 +1,7 @@
 class Captain::Assistant::MemoryPromptInjector
  def initialize(conversation:)
    @conversation = conversation
+    @memory_block_cache = {}
  end

  def recall_enabled?
@@ -14,24 +15,39 @@ class Captain::Assistant::MemoryPromptInjector
  # Wraps the given base system prompt with a <memoria_cliente> block
  # when recall is enabled and memories are found. Degrades gracefully:
  # returns the untouched base prompt on any failure or absent context.
+  # Caches the memory block per-message-text within the injector instance so
+  # Agents::Runner evaluating instructions multiple times per turn does not
+  # re-hit EmbeddingService or pgvector on every call.
  def append_memory_block(base_prompt, message_text)
    return base_prompt unless recall_enabled?
    return base_prompt if @conversation&.contact.blank?

-    memories = Captain::ContactMemories::RecallService.new(
-      contact: @conversation.contact,
-      query_text: message_text,
-      unit_id: resolve_unit_id
-    ).call
+    block = memory_block_for(message_text)
+    return base_prompt if block.blank?

-    memory_block = Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
-    return base_prompt if memory_block.blank?
-
-    [base_prompt, memory_block].join("\n\n")
+    [base_prompt, block].join("\n\n")
+  rescue StandardError => e
+    # Absolute guard: memory recall NEVER blocks or breaks the agent response.
+    Rails.logger.error("[Captain V2] MemoryPromptInjector unexpected failure: #{e.class}: #{e.message}")
+    base_prompt
  end

  private

+  def memory_block_for(message_text)
+    key = message_text.to_s
+    return @memory_block_cache[key] if @memory_block_cache.key?(key)
+
+    memories = Captain::ContactMemories::RecallService.new(
+      contact: @conversation.contact,
+      query_text: key,
+      unit_id: resolve_unit_id
+    ).call
+
+    @memory_block_cache[key] =
+      Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
+  end
+
  def resolve_unit_id
    return nil if @conversation.blank?