iachat/enterprise/app/models/captain
Rodribm10 fa758e4848 feat(captain): hierarchical model routing + conversation-level memory cache
Two orthogonal cost optimizations to the Captain agent pipeline:

1. Hierarchical model routing (optimization A)

Captain::Scenario now overrides agent_model to read a dedicated
InstallationConfig CAPTAIN_OPEN_AI_MODEL_SCENARIO, falling back to the
global CAPTAIN_OPEN_AI_MODEL used by the orchestrator (Assistant).

Rationale: the orchestrator (Jasmine) does cheap triage (is this a
reservation intent? a greeting? escalate to human?) — a smaller model
handles this well. Scenarios (Daniela — reserva) run complex flows with
tool calling, strict taxonomies, and JSON schema output — they benefit
from a stronger model.

Config in this install: CAPTAIN_OPEN_AI_MODEL=gpt-4o-mini (orchestrator)
and CAPTAIN_OPEN_AI_MODEL_SCENARIO=gpt-4o (scenarios). Estimated ~60%
cost reduction vs everything on gpt-4o, preserving quality where it
matters for the business flow.
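
The fallback described above can be sketched as follows. This is a minimal illustration, not the code in scenario.rb: `FakeInstallationConfig`, `OrchestratorSketch`, and `ScenarioSketch` are hypothetical stand-ins, and the inheritance between them is an assumption made only to keep the example short. Only the two config key names come from the commit message.

```ruby
# Hypothetical hash-backed stand-in for the app's InstallationConfig lookup.
class FakeInstallationConfig
  def initialize(values)
    @values = values
  end

  # Returns the configured value, or nil when the key is missing or blank.
  def fetch(name)
    value = @values[name]
    value.nil? || value.strip.empty? ? nil : value
  end
end

# Hypothetical orchestrator: always reads the global model key.
class OrchestratorSketch
  def initialize(config)
    @config = config
  end

  def agent_model
    @config.fetch('CAPTAIN_OPEN_AI_MODEL')
  end
end

# Hypothetical scenario: prefers its dedicated key, then falls back to
# the orchestrator's global key via super.
class ScenarioSketch < OrchestratorSketch
  def agent_model
    @config.fetch('CAPTAIN_OPEN_AI_MODEL_SCENARIO') || super
  end
end

config = FakeInstallationConfig.new(
  'CAPTAIN_OPEN_AI_MODEL' => 'gpt-4o-mini',
  'CAPTAIN_OPEN_AI_MODEL_SCENARIO' => 'gpt-4o'
)
puts OrchestratorSketch.new(config).agent_model
puts ScenarioSketch.new(config).agent_model
```

Treating a blank value as missing is a design choice of this sketch: it keeps an empty scenario setting from silently selecting no model at all.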

2. Conversation-level memory cache (optimization B)

MemoryPromptInjector now persists the computed memory block on
conversation.custom_attributes['captain_cached_memory_block']. The first
turn computes the block once (embedding + pgvector query + XML
formatting); subsequent turns reuse it. The customer's profile does not
change during an open conversation, so re-running the pipeline on every
turn was pure waste.

Graceful fallbacks:
- Cache write failure → per-service-instance in-memory fallback still
  applies.
- Cache read failure → fresh recall runs (no regression).
- Contact mismatch → invalidates cache, fresh recall runs.

When a new conversation starts, custom_attributes is empty → fresh
recall populates the cache for that conversation's lifetime.
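
The cache behaviour and the three fallbacks listed above can be sketched like this. It is an illustration under stated assumptions, not the MemoryPromptInjector implementation: `ConversationSketch` and `MemoryInjectorSketch` are hypothetical names, the recall step is passed in as a lambda instead of running a real embedding and pgvector query, and storing `contact_id` next to the block to detect a mismatch is an assumption about how invalidation could work.

```ruby
# Hypothetical stand-in for a conversation record.
class ConversationSketch
  attr_accessor :contact_id, :custom_attributes

  def initialize(contact_id, custom_attributes)
    @contact_id = contact_id
    @custom_attributes = custom_attributes
  end
end

# Hypothetical injector that caches the memory block per conversation.
class MemoryInjectorSketch
  CACHE_KEY = 'captain_cached_memory_block'

  attr_reader :recall_count

  # recall: callable that performs the expensive recall for a contact id.
  def initialize(recall)
    @recall = recall
    @recall_count = 0
    @local = {} # per-instance fallback, used when the cache write fails
  end

  def memory_block(conversation)
    cached = read_cache(conversation)
    return cached if cached

    contact_id = conversation.contact_id
    return @local[contact_id] if @local.key?(contact_id)

    block = fresh_recall(contact_id)
    @local[contact_id] = block
    write_cache(conversation, block)
    block
  end

  private

  # Returns the cached block, or nil on a miss, a contact mismatch,
  # or any read error (the caller then runs a fresh recall).
  def read_cache(conversation)
    entry = conversation.custom_attributes[CACHE_KEY]
    return nil unless entry.is_a?(Hash)
    return nil unless entry['contact_id'] == conversation.contact_id

    entry['block']
  rescue StandardError
    nil
  end

  # Overwrites any stale entry; a write failure is swallowed so the
  # in-memory fallback still serves later turns on this instance.
  def write_cache(conversation, block)
    conversation.custom_attributes[CACHE_KEY] = {
      'contact_id' => conversation.contact_id,
      'block' => block
    }
  rescue StandardError
    nil
  end

  def fresh_recall(contact_id)
    @recall_count += 1
    @recall.call(contact_id)
  end
end
```

A new conversation starts with an empty attributes hash, so its first call misses the cache, runs the recall once, and every later call on that conversation returns the stored block.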

Estimated ~80% reduction in embedding + pgvector calls during
multi-turn conversations.
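
One way to read that estimate, as a sketch only: if every turn previously triggered one recall and the cache now limits this to one recall per conversation, the fraction of recalls avoided is 1 - 1/N for a conversation of N turns. The helper name `recall_reduction` is hypothetical, and the commit does not state the conversation length behind its figure.

```ruby
# Fraction of recall runs avoided when a single recall serves a whole
# conversation of `turns` turns (assumes one recall per turn before caching).
def recall_reduction(turns)
  raise ArgumentError, 'turns must be a positive integer' unless turns.is_a?(Integer) && turns.positive?

  1.0 - (1.0 / turns)
end

[1, 2, 5, 10].each do |turns|
  puts format('%2d turns: %.0f%% fewer recalls', turns, recall_reduction(turns) * 100)
end
```

Under this assumption the ~80% figure corresponds to conversations of about five turns; single-turn conversations see no saving.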

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 09:47:15 -03:00
lifecycle feat(lifecycle): add MinInterval and CustomerReplied guards 2026-04-15 01:49:22 -03:00
assistant_response.rb feat: legacy features to ruby llm (#12994) 2025-12-11 14:17:28 +05:30
assistant.rb feat: Add configurable orchestrator prompt for Captain assistants with UI editor. 2026-02-27 11:57:59 -03:00
brand.rb chore(style): fix rubocop offenses and update typing indicators 2026-02-25 15:06:58 -03:00
contact_memory.rb feat(captain-memory): add Captain::ContactMemory model with scopes and lifecycle methods 2026-04-18 23:53:33 -03:00
conversation_insight.rb feat: Add configurable orchestrator prompt for Captain assistants with UI editor. 2026-02-27 11:57:59 -03:00
custom_tool.rb chore(style): fix rubocop offenses and update typing indicators 2026-02-25 15:06:58 -03:00
document.rb chore(style): fix rubocop offenses and update typing indicators 2026-02-25 15:06:58 -03:00
gallery_item.rb feat(units): allow one Pix unit to link to multiple inboxes (N:N) 2026-02-26 21:33:23 -03:00
lifecycle.rb feat(lifecycle): add Captain::Lifecycle::Config model 2026-04-15 01:14:19 -03:00
notification_template.rb refactor: move notification templates from units to inboxes 2026-03-01 22:17:27 -03:00
pix_charge.rb feat: Captain::PixCharge posts an internal note when a PIX is generated 2026-04-14 20:09:20 -03:00
report_snapshot.rb feat(captain): improve suite photo search accuracy with AI guidance 2026-02-26 23:04:28 -03:00
reservation.rb feat(lifecycle): wire Captain::Reservation lifecycle hooks 2026-04-15 01:37:23 -03:00
scenario.rb feat(captain): hierarchical model routing + conversation-level memory cache 2026-04-19 09:47:15 -03:00
unit_inbox.rb feat(captain): improve suite photo search accuracy with AI guidance 2026-02-26 23:04:28 -03:00
unit.rb feat(lifecycle): add concierge_* accessors to Captain::Unit 2026-04-15 01:23:40 -03:00