iachat

Rodribm10 fa758e4848 feat(captain): hierarchical model routing + conversation-level memory cache	2026-04-19 09:47:15 -03:00

Two orthogonal cost optimizations to the Captain agent pipeline:

1. Hierarchical model routing (optimization A)

Captain::Scenario now overrides agent_model to read a dedicated InstallationConfig CAPTAIN_OPEN_AI_MODEL_SCENARIO, falling back to the global CAPTAIN_OPEN_AI_MODEL used by the orchestrator (Assistant).

Rationale: the orchestrator (Jasmine) does cheap triage (is this a reservation intent? a greeting? escalate to human?), and a smaller model handles this well. Scenarios (Daniela - reserva) run complex flows with tool calling, strict taxonomies, and JSON schema output, so they benefit from a stronger model.

Config in this install: CAPTAIN_OPEN_AI_MODEL=gpt-4o-mini (orchestrator) and CAPTAIN_OPEN_AI_MODEL_SCENARIO=gpt-4o (scenarios). Estimated ~60% cost reduction vs everything on gpt-4o, preserving quality where it matters for the business flow.

2. Conversation-level memory cache (optimization B)

MemoryPromptInjector now persists the computed memory block on conversation.custom_attributes[captain_cached_memory_block]. First turn computes once (embedding + pgvector query + XML formatting); subsequent turns reuse. The customer's profile does not change during an open conversation, so re-running the pipeline on every turn was pure waste.

Graceful fallbacks:
- Cache write failure → per-service-instance in-memory fallback still applies.
- Cache read failure → fresh recall runs (no regression).
- Contact mismatch → invalidates cache, fresh recall runs.

When a new conversation starts, custom_attributes is empty → fresh recall populates the cache for that conversation's lifetime. Estimated ~80% reduction in embedding + pgvector calls during multi-turn conversations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
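The two optimizations in this commit message can be sketched in plain Ruby. This is an illustration under stated assumptions, not the code in the commit: plain Hashes stand in for InstallationConfig and for conversation.custom_attributes, the method names scenario_model and memory_block are hypothetical, and the cached entry's shape (a contact_id stored next to the block) is assumed in order to show the contact-mismatch invalidation. The cache read/write failure fallbacks are not modeled.

```ruby
# Plain-Ruby sketch. Hashes stand in for InstallationConfig and for
# conversation.custom_attributes; all method names are hypothetical.

# Optimization A: prefer the scenario-specific model, fall back to the
# global model when the scenario key is missing or blank.
def scenario_model(config)
  model = config['CAPTAIN_OPEN_AI_MODEL_SCENARIO'].to_s.strip
  model.empty? ? config['CAPTAIN_OPEN_AI_MODEL'] : model
end

# Optimization B: reuse the memory block cached on the conversation and
# recompute only on a miss or when the cached contact differs.
# The { 'contact_id', 'block' } entry shape is an assumption.
def memory_block(custom_attributes, contact_id)
  key = 'captain_cached_memory_block'
  cached = custom_attributes[key]
  if cached.is_a?(Hash) && cached['contact_id'] == contact_id
    return cached['block']
  end

  # Expensive path (embedding + pgvector query + XML formatting in the
  # commit message); here it is whatever block the caller passes in.
  block = yield
  custom_attributes[key] = { 'contact_id' => contact_id, 'block' => block }
  block
end

# Usage: the second turn reuses the cached block, so the recall block
# passed to memory_block runs only once for the same contact.
config = { 'CAPTAIN_OPEN_AI_MODEL' => 'gpt-4o-mini',
           'CAPTAIN_OPEN_AI_MODEL_SCENARIO' => 'gpt-4o' }
attrs = {}
recalls = 0
2.times { memory_block(attrs, 42) { recalls += 1; '<memory/>' } }
puts scenario_model(config)
puts recalls
```

Storing the contact id next to the block keeps invalidation a simple equality check on read, at the cost of one extra field in custom_attributes.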
..
captain	feat(captain): hierarchical model routing + conversation-level memory cache	2026-04-19 09:47:15 -03:00
channel	feat(enterprise): add voice conference API (#13064)	2025-12-15 15:11:59 -08:00
concerns	feat(lifecycle): inject concierge context into Captain orchestrator prompt	2026-04-15 09:25:16 -03:00
enterprise	feat: bypass user limit validation to allow unlimited agents	2026-02-25 21:40:18 -03:00
account_saml_settings.rb	feat: update users on SAML setup and destroy [CW-2958][CW-5612] (#12346)	2025-09-15 21:20:22 +05:30
agent_capacity_policy.rb	feat: Add agent capacity controllers (#12200)	2025-08-26 19:12:58 -07:00
applied_sla.rb	Chore/merge upstream 4.8.0 (#150)	2025-11-19 16:25:58 -03:00
article_embedding.rb	feat: legacy features to ruby llm (#12994)	2025-12-11 14:17:28 +05:30
captain_inbox.rb	chore(style): fix rubocop offenses and update typing indicators	2026-02-25 15:06:58 -03:00
company.rb	chore(style): fix rubocop offenses and update typing indicators	2026-02-25 15:06:58 -03:00
copilot_message.rb	feat: Update UI for Copilot (#11561)	2025-06-02 22:02:03 -05:00
copilot_thread.rb	feat: Add support for more tool, standardize copilot chat service (#11560)	2025-05-23 01:07:07 -07:00
custom_role.rb	feat: Add APIs to manage custom roles in Chatwoot (#9995)	2024-08-23 17:18:28 +05:30
inbox_capacity_limit.rb	feat: Add agent capacity controllers (#12200)	2025-08-26 19:12:58 -07:00
sla_event.rb	feat: Conversation API to return applied_sla and sla_events (#9174)	2024-04-01 23:30:07 +05:30
sla_policy.rb	fix: Prevent SLA deletion timeouts by moving to async job (#12944)	2025-12-10 12:28:47 +05:30