fix(captain-memory): guard memory recall from blocking agent worker
A real-world test triggered a Sidekiq worker hang on conversation 67 after a message was routed through Daniela: two ResponseBuilderJobs (msg 1318 and 1319) started, emitted typing_on, then never returned. Sidekiq showed 2/12 workers stuck for 10+ minutes, with no sign of recovering.

Likely root cause: Agents::Runner evaluates the orchestrator instructions lambda multiple times per turn, and our wrapped lambda calls MemoryPromptInjector#append_memory_block each time. Inside, RecallService invokes the OpenAI embedding API (2s timeout) and pgvector. Ruby's Timeout.timeout has documented holes on net/http syscalls: if the embedding API stalls at the socket level, the worker can hang forever even though the timeout "fired".

Two fixes:

1. Per-message cache in the injector instance: the same message_text is embedded and queried once, not N times per turn. This sharply reduces network calls and DB queries during a single agent run; every call after the first returns the cached block instantly.

2. Absolute rescue at the top level of append_memory_block: rescue StandardError => e; return base_prompt. Even if the whole memory pipeline throws, the base system prompt passes through and the agent keeps responding. Memory is NEVER allowed to block a response. That was already the design intent, but the lambda caller path didn't honor it rigorously enough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
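The two fixes described above can be sketched in isolation, without Rails. This is a minimal illustration under stated assumptions, not the code from this commit: `SketchInjector` and the `recall` callable are hypothetical stand-ins for MemoryPromptInjector and the RecallService plus PromptInjectionService pipeline.

```ruby
require 'timeout'

# Hypothetical stand-in for MemoryPromptInjector. `recall` is any callable
# that maps message text to a memory block (or nil); it is not the real service.
class SketchInjector
  attr_reader :recall_calls

  def initialize(recall)
    @recall = recall
    @cache = {}
    @recall_calls = 0
  end

  def append_memory_block(base_prompt, message_text)
    block = memory_block_for(message_text)
    return base_prompt if block.nil? || block.empty?

    [base_prompt, block].join("\n\n")
  rescue StandardError
    # Fix 2: any exception in the memory pipeline degrades to the base prompt.
    base_prompt
  end

  private

  # Fix 1: compute the block once per distinct message text.
  def memory_block_for(message_text)
    key = message_text.to_s
    return @cache[key] if @cache.key?(key)

    @recall_calls += 1
    @cache[key] = @recall.call(key)
  end
end

# Three evaluations of the same message reach the recall callable once.
ok = SketchInjector.new(->(text) { "<memoria_cliente>#{text}</memoria_cliente>" })
3.times { ok.append_memory_block('base', 'hola') }
puts ok.recall_calls

# A recall that raises still yields the untouched base prompt.
failing = SketchInjector.new(->(_text) { raise Timeout::Error, 'embedding stalled' })
puts failing.append_memory_block('base', 'hola')
```

In this sketch the rescue only covers failures that surface as exceptions (Timeout::Error is a StandardError subclass, so it is caught); the cache is what reduces how often the network path is entered at all.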
parent 6330bec857
commit bcf41ad15f
@@ -1,6 +1,7 @@
 class Captain::Assistant::MemoryPromptInjector
   def initialize(conversation:)
     @conversation = conversation
+    @memory_block_cache = {}
   end

   def recall_enabled?
@@ -14,24 +15,39 @@ class Captain::Assistant::MemoryPromptInjector
   # Wraps the given base system prompt with a <memoria_cliente> block
   # when recall is enabled and memories are found. Degrades gracefully:
   # returns the untouched base prompt on any failure or absent context.
+  # Caches the memory block per-message-text within the injector instance so
+  # Agents::Runner evaluating instructions multiple times per turn does not
+  # re-hit EmbeddingService or pgvector on every call.
   def append_memory_block(base_prompt, message_text)
     return base_prompt unless recall_enabled?
     return base_prompt if @conversation&.contact.blank?

-    memories = Captain::ContactMemories::RecallService.new(
-      contact: @conversation.contact,
-      query_text: message_text,
-      unit_id: resolve_unit_id
-    ).call
+    block = memory_block_for(message_text)
+    return base_prompt if block.blank?

-    memory_block = Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
-    return base_prompt if memory_block.blank?
-
-    [base_prompt, memory_block].join("\n\n")
+    [base_prompt, block].join("\n\n")
+  rescue StandardError => e
+    # Absolute guard: memory recall NEVER blocks or breaks the agent response.
+    Rails.logger.error("[Captain V2] MemoryPromptInjector unexpected failure: #{e.class}: #{e.message}")
+    base_prompt
   end

   private

+  def memory_block_for(message_text)
+    key = message_text.to_s
+    return @memory_block_cache[key] if @memory_block_cache.key?(key)
+
+    memories = Captain::ContactMemories::RecallService.new(
+      contact: @conversation.contact,
+      query_text: key,
+      unit_id: resolve_unit_id
+    ).call
+
+    @memory_block_cache[key] =
+      Captain::ContactMemories::PromptInjectionService.new(memories: memories).call
+  end
+
   def resolve_unit_id
     return nil if @conversation.blank?
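One detail of memory_block_for is worth illustrating: the lookup checks `key?` instead of relying on `||=`, which has the effect of caching a nil or blank result too, so a contact with no memories does not trigger a fresh recall on every evaluation. A small sketch of the difference, where the `lookup` lambda is a hypothetical placeholder for the recall pipeline:

```ruby
calls = 0

# Hypothetical recall that finds nothing and returns nil.
lookup = lambda do |_key|
  calls += 1
  nil
end

# Memoizing with ||= re-runs the lookup whenever the stored value is nil.
cache = {}
3.times { cache['hola'] ||= lookup.call('hola') }
calls_with_or_equals = calls

# Memoizing behind a key? check stores the nil and short-circuits afterwards.
calls = 0
cache = {}
3.times do
  next if cache.key?('hola')

  cache['hola'] = lookup.call('hola')
end
calls_with_key_check = calls

puts calls_with_or_equals # 3
puts calls_with_key_check # 1
```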