Adds the value 'openai_hermes_gateway' to CAPTAIN_LLM_PROVIDER without touching
the existing options (openai_api and openai_codex_oauth remain intact).
When enabled, Captain calls the Hermes Agent running in local HTTP gateway
mode (CAPTAIN_HERMES_GATEWAY_URL, default http://host.docker.internal:9877).
Hermes handles multi-model routing (Codex/Anthropic/Gemini) using its own
OAuth credentials in ~/.hermes/auth.json, so Captain does not need to do OAuth itself.
New settings in installation_config.yml:
- CAPTAIN_HERMES_GATEWAY_URL: gateway URL (default http://host.docker.internal:9877)
- CAPTAIN_HERMES_GATEWAY_MODEL: model in <provider>/<model> format
- CAPTAIN_HERMES_GATEWAY_API_KEY: optional; a dummy value is used when the local gateway does not require a key
Embeddings and the Files API keep pointing at the traditional OpenAI API via
legacy_openai_settings, because the Hermes Gateway does not expose those endpoints.
Specs cover: dummy key, custom api_key override, custom model, defaults,
trailing-slash stripping, light_model per provider, and the hermes_gateway? predicate.
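
A minimal sketch of how the defaults, dummy key, and trailing-slash stripping covered by the specs could fit together. The module name, the plain-hash input, and the dummy key string are assumptions for illustration only; the real logic lives in ProviderConfig and reads from installation config.

```ruby
# Hypothetical helper, not the actual ProviderConfig implementation.
module HermesGatewaySettings
  DEFAULT_URL = 'http://host.docker.internal:9877'
  # Placeholder value: the commit does not state the real dummy string.
  DUMMY_API_KEY = 'hermes-gateway'

  # settings: a Hash keyed by the config names listed above.
  def self.resolve(settings)
    url = settings['CAPTAIN_HERMES_GATEWAY_URL'].to_s.strip
    url = DEFAULT_URL if url.empty?
    key = settings['CAPTAIN_HERMES_GATEWAY_API_KEY'].to_s.strip

    {
      api_base: url.sub(%r{/+\z}, ''),              # strip trailing slashes
      api_key: key.empty? ? DUMMY_API_KEY : key,    # dummy when none is configured
      model: settings['CAPTAIN_HERMES_GATEWAY_MODEL'] # expected as <provider>/<model>
    }
  end
end
```

Normalizing the URL once at resolution time means callers can append paths without checking for a double slash.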
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The anon key had no INSERT permission on reserva_hotel.unidades: RLS
requires authenticated + tenant_member, which the anon key does not satisfy.
The direct POST failed without useful feedback.
Fix: RPC reserva_hotel.provision_unidade(...) with SECURITY DEFINER,
which performs an idempotent upsert that bypasses RLS, with tenant and
marca validations inside the function. EXECUTE is granted to anon.
The service now calls /rpc/provision_unidade instead of POST /unidades.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
An after_commit (on: :create) hook on Captain::Unit enqueues
ProvisionUnitInSupabaseJob, which upserts the unit into reserva_hotel.unidades
via Supabase REST (UNIQUE on tenant_id+chatwoot_unit_id) and stores the IDs on
Captain::Unit (supabase_unit_id, supabase_tenant_id, supabase_marca_id).
Without this, creating a new unit in the Pix panel did not enable the roleta:
the row was missing in Supabase and OfferService fell into "tenant não resolvido"
("tenant not resolved").
Includes rake captain:reprovision_unit_in_supabase[id] + provision_all
for manual reconciliation and retroactive migration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes two layers of problems identified in end-to-end testing:
1. Embeddings failed with HTTP 404 (/codex/v1/embeddings does not exist).
Fix: Captain::Llm::EmbeddingService always uses the traditional OpenAI API
via Llm::Config.with_api_key(legacy_settings). ProviderConfig exposes
legacy_openai_settings for this.
2. The Codex server occasionally responds with response.failed +
code=server_error (transient instability). The client now retries
up to 2x with exponential backoff (0.5s, 1.5s) on retryable errors:
HTTP 5xx, server_error in response.failed, or an unfinished stream.
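
The retry policy in item 2 can be sketched roughly as follows. RetryableError, with_retries, and the injectable sleeper are hypothetical names used for illustration; the actual client may classify errors and wait differently.

```ruby
# Hypothetical error class standing in for HTTP 5xx, server_error in
# response.failed, or an unfinished stream.
class RetryableError < StandardError; end

# Delays taken from the commit message: two retries, 0.5s then 1.5s.
BACKOFF_SECONDS = [0.5, 1.5].freeze

# Runs the block, retrying on RetryableError once per configured delay.
# The sleeper is injectable so the policy can be exercised without waiting.
def with_retries(delays: BACKOFF_SECONDS, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield attempt
  rescue RetryableError
    raise if attempt >= delays.length # retries exhausted, surface the error

    sleeper.call(delays[attempt])
    attempt += 1
    retry
  end
end
```

Non-retryable errors (for example HTTP 4xx) are not rescued here, so they propagate on the first attempt.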
Other fixes in this step:
- Scenario#agent_model: in Codex mode, ignores CAPTAIN_OPEN_AI_MODEL_SCENARIO
(which may hold a legacy gpt-4o value) and uses ProviderConfig.model.
- ExtractionService/ContradictionCheckerService/TranslateQueryService:
replace the hardcoded gpt-4o-mini/gpt-4.1-nano constants with
ProviderConfig.light_model (respecting the active provider).
- ProviderConfig.DEFAULT_CODEX_MODEL is now gpt-5.2 (recognized by
RubyLLM; gpt-5.4 is not in the gem's catalog).
Validated end to end: WhatsApp → Chatwoot → Jasmine → handoff to Daniela
→ faq_lookup with working embedding → reply with correct prices.
Docs in docs/captain-codex-oauth.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the openai_api | openai_codex_oauth toggle. By default it keeps the
legacy behavior (traditional OpenAI API key). When switched to
openai_codex_oauth, the clients (RubyLLM + Agents gem) point at the
internal proxy at http://localhost:3000/codex,
configurable via CAPTAIN_CODEX_PROXY_URL.
- Captain::Llm::ProviderConfig: single source of truth for api_key,
api_base and model, based on CAPTAIN_LLM_PROVIDER
- config/initializers/ai_agents.rb refactored
- lib/llm/config.rb refactored
- 8 ProviderConfig specs passing
- Safe fallback: dummy api_key ('codex-oauth') when using the proxy
(the proxy ignores Authorization and uses its internal OAuth)
Does NOT touch Llm::LegacyBaseOpenAiService (PDF/Files API). It
always stays on the traditional API because the Codex endpoint does not
expose the Files API.
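
As a rough illustration of the toggle, the sketch below picks api_key and api_base from a plain hash. The module name and the CAPTAIN_OPEN_AI_API_KEY lookup are assumptions; the real ProviderConfig also resolves the model and reads from installation config.

```ruby
# Hypothetical sketch, not the actual Captain::Llm::ProviderConfig.
module ProviderSettings
  DEFAULT_CODEX_PROXY_URL = 'http://localhost:3000/codex'
  CODEX_DUMMY_KEY = 'codex-oauth' # the proxy ignores Authorization

  def self.resolve(settings)
    case settings.fetch('CAPTAIN_LLM_PROVIDER', 'openai_api')
    when 'openai_codex_oauth'
      {
        api_key: CODEX_DUMMY_KEY,
        api_base: settings.fetch('CAPTAIN_CODEX_PROXY_URL', DEFAULT_CODEX_PROXY_URL)
      }
    else
      # Legacy default: traditional API key, no api_base override.
      { api_key: settings['CAPTAIN_OPEN_AI_API_KEY'], api_base: nil }
    end
  end
end
```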
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Codex endpoint returns HTTP 400 "Instructions are required" when the
field is missing. We now always include the field: a single-space string
when the request has no system message.
Validated end to end: curl → /codex/v1/chat/completions → proxy translates
→ Codex returns streaming SSE → proxy aggregates → Chat Completions JSON.
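
A small sketch of the fallback, assuming the proxy receives Chat Completions style messages as hashes with :role and :content. The function name and message shape are hypothetical.

```ruby
# Builds the instructions string from system messages, falling back to a
# single space so the field is never absent or blank.
def instructions_for(messages)
  system_text = messages
                .select { |m| m[:role] == 'system' }
                .map { |m| m[:content].to_s }
                .join("\n")
  system_text.strip.empty? ? ' ' : system_text
end
```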
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- RetentionSummaryBadge in the "Previous conversations" sidebar:
tiered status (First contact / Active / Recurring / Sleeping /
At risk / Inactive) + counts of interactions, one-shots, Pix.
- Retention tab in Captain Reports: KpiCards, FlowCard, CohortMatrix
(12x13 heatmap with CSV export).
- Five new filters on the contacts list: recurring, last interaction,
days since, interactions count, reservations paid.
- Full pt_BR + en i18n under CAPTAIN_REPORTS.RETENTION.*
- Spec for InteractionCalculatorService covering gap behavior,
one-shot classification, internal-label exclusion, multi-conversation
grouping across the 30h window.
- Docs: docs/captain-retention-indicators.md with business rules,
column reference, endpoint shape, and backup SQL queries.
Consolidates this branch's April 2026 work into one block ready for
testing on staging before merging into main.
## Semantic memory fixes
- ExtractionService: Principle Zero + Golden Rule (completed action vs. intent).
- Daniela_Reservas scenario: Step 0 classification (inquiry/intent/out of scope).
## Roleta da Sorte (end-to-end)
- Supabase schema + 7 atomic RPCs (server-side, idempotent).
- Services: Offer, Redeem, WeeklyReport.
- Jobs: OfferRouletteJob (hooked into ConfirmationService after a paid Pix),
NotifyRevealed + fallback Scheduler.
- Manual tool GenerateRoletaLinkTool + public endpoint /roleta/notify.
- Dashboard /captain/roleta with Redemption + Report + anomaly detection.
## Reclamacoes_Ouvidoria scenario
- P1-P4 triage, LAST framework, three-level listening, self-check.
- No material compensation; detecting a frustrated customer raises the priority.
## Analytics
- Conversion funnel /captain/funnel: 5 stages via regex, zero LLM.
- Churn detector via ChurnOutreach* (cron on business days, 10:00-17:00 BRT).
## Pre-existing work included
- Captain Executive Reports (ceo_digest, mattermost_delivery).
- get_reserva_preco_tool, Lifecycle adjustments, Reservations UI polish.
## Other
- .gitignore: patterns for credentials.
- Idempotent scenario migrations.
- Full pt_BR+en i18n for roleta/funnel.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem observed in a real test on 2026-04-19 at 11:24:
the user gave Daniela suite+date+time. Instead of calling
generate_pix, Daniela called handoff_to_jasmine. Jasmine replied
"Vou te transferir pra Daniela..." ("I'll transfer you to Daniela..."),
which was false: the conversation stayed stuck with Jasmine.
Sequence inside ONE single run:
jasmine.handoff_to_daniela_reservas_agent
-> daniela.handoff_to_jasmine (!)
-> jasmine replies "vou te transferir..."
Daniela's prompt contains "🚨 NUNCA FAÇA HANDOFF DE VOLTA PRA JASMINE"
("never hand off back to Jasmine"), but the LLM ignores the prohibition
when the tool is registered.
The only robust fix is to not register the tool.
Historically we were afraid to remove the back-edge because without it
Daniela (when confused) got stuck in a loop calling faq_lookup,
an incident that burned real credits. That fear no longer applies:
commit f3f8a8d5c added TOOL_LOOP_THRESHOLD=3 +
MAX_TURNS_PER_MESSAGE=15, which trigger an automatic bot_handoff on
any tool loop. Runaway protection now exists through a
DIFFERENT path, so we can safely remove the back-edge.
Expected effect:
- the scenario finishes the reply on its own (no ping-pong)
- a confused/looping scenario -> rate limit cuts it off -> a human takes over
Memory: updated feedback_never_touch_captain_without_safety_caps.md
to reflect the new invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three layers of protection against runaway token burn in AgentRunnerService:
1. MAX_TURNS_PER_MESSAGE = 15
Cap within a single run() call. It was already applied;
now extracted as a named constant.
2. MAX_TURNS_PER_CONVERSATION = 30
Cap over the lifetime of the conversation. Counter stored in
conversation.custom_attributes['captain_turn_count']. On reaching it,
triggers an automatic bot_handoff and replies with a
handoff-to-human message.
3. TOOL_LOOP_THRESHOLD = 3
Detects the same (tool_name, args) invoked 3+ times in the result
of a single run (a symptom of the faq_lookup loop that burned tokens
on 2026-04-19). On detection: triggers bot_handoff and aborts the turn.
trigger_bot_handoff! calls conversation.bot_handoff! when
available, removing the conversation from the automated pipeline.
Motivation: two real incidents of burned OpenAI credit on
2026-04-19. See memory/feedback_never_touch_captain_without_safety_caps.md
for the full invariants.
Tests updated: mock_result now stubs :messages (used by the
new tool_loop_detected?) and the expected max_turns is 15.
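
The tool-loop check in layer 3 could look roughly like the sketch below. The message shape (hashes with a :tool_calls array of :name/:arguments pairs) is an assumption for illustration; the real result object from the Agents gem may expose tool calls differently.

```ruby
TOOL_LOOP_THRESHOLD = 3

# Returns true when the same (tool name, arguments) pair appears at least
# `threshold` times across the messages of a single run.
def tool_loop_detected?(messages, threshold: TOOL_LOOP_THRESHOLD)
  calls = messages.flat_map { |message| message[:tool_calls] || [] }
  counts = calls.map { |call| [call[:name], call[:arguments]] }.tally
  counts.values.any? { |count| count >= threshold }
end
```

Keying on both name and arguments means repeated calls to the same tool with different queries are not flagged, only identical invocations.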
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
User feedback revealed a fundamental design issue: the memory model was
accumulating contradictory "Prefere X" facts because a single choice was
being treated as a permanent preference. Result: 3 different
"Prefere suite X" entries coexisting, all at 90% confidence, with
reservation patterns over time (2hrs, 4hrs, pernoite) all claiming to be
the customer's "preferred" duration.
Corrections:
1. ExtractionService prompt — preferencia now requires EXPLICIT
declaration words ("prefiro", "gosto mais de", "sempre escolho",
"adoro", "favorita"). A mere choice in one conversation is NO LONGER
extracted as preferencia — instead it goes to padrao_comportamental
WITH THE DATE in the content (e.g. "Reservou Alexa para pernoite em
23/05/2026"). This makes memory temporal and auditable instead of
imposing fake consistency.
2. Reference date is passed to the LLM prompt via the latest message
timestamp, used as the anchor date the LLM must embed in every
padrao_comportamental content.
3. ContradictionCheckerService — dual threshold:
- cosine distance < 0.15 → auto-supersede without LLM (pure duplicate)
- cosine distance 0.15 to 0.6 → ask the LLM whether it contradicts, supersede if yes
- cosine distance > 0.6 → ignore, unrelated facts
Previously only the middle band existed, so near-duplicate facts like
two "aniversário 23/05" entries or three "prefere suite X" entries
were never cleaned up.
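
The three bands can be summarized with a small sketch. The constant and method names are hypothetical, and treating the value as a cosine distance (lower means more similar) with both 0.15 and 0.6 falling in the middle band is an assumption about the boundaries.

```ruby
DUPLICATE_MAX_DISTANCE = 0.15
RELATED_MAX_DISTANCE = 0.6

# Maps a cosine distance between a new fact and an existing one to an action.
def contradiction_action(distance)
  if distance < DUPLICATE_MAX_DISTANCE
    :auto_supersede # near-duplicate, no LLM call needed
  elsif distance <= RELATED_MAX_DISTANCE
    :ask_llm        # related enough that the LLM decides on contradiction
  else
    :ignore         # unrelated facts
  end
end
```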
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Also fixes double-scheduling bug in scheduler_spec and delivery_spec caused by
after_create_commit hook firing while rules already exist — reservation is now
created before rules in setup so the hook finds nothing to schedule.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Orchestrates guards → render (Liquid) → send pipeline for one delivery.
Handles skip, reschedule, sent, failed states and re-enqueues on reschedule.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implement guards following the same pass/reschedule/too_stale pattern as QuietHours.
Also fix belongs_to :conversation on Delivery to use class_name: '::Conversation' to avoid namespace resolution failure inside Captain::Lifecycle module.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure function mapping reservation events to timestamps; used by Scheduler (T9) to compute fire_at.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Pull Request Template
## Description
## Type of change
typo fix
## How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide
instructions so we can reproduce. Please also list any relevant details
for your test configuration.
## Checklist:
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged and published in downstream
modules
## Description
Instruments captain v2
## Type of change
- [x] New feature (non-breaking change which adds functionality)
## How Has This Been Tested?
Local testing:
<img width="864" height="510" alt="image"
src="https://github.com/user-attachments/assets/855ebce5-e8b8-4d22-b0bb-0d413769a6ab"
/>
## Checklist:
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged and published in downstream
modules
---------
Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
## Summary
This PR reduces duplicate failure noise for audio transcription jobs
that fail with permanent HTTP 400 responses, and fixes a file-format
edge case causing intermittent 400s.
Sentry issue: [CHATWOOT-99E /
6660541334](https://chatwoot-p3.sentry.io/issues/6660541334/)
## Confirmed root cause
For some attachments, the stored filename had no extension (example:
`speech`, content type `audio/mpeg`).
When the temporary transcription upload file was created without an
extension, OpenAI returned:
`Unrecognized file format` (HTTP 400).
## Scope of changes
1. `Messages::AudioTranscriptionJob`
- Keeps `discard_on Faraday::BadRequestError` to avoid retry storms on
permanent request errors.
- Adds explicit Rails warning logs for discarded jobs with
attachment/job/status context.
2. `Messages::AudioTranscriptionService`
- Keeps guaranteed temp file cleanup via `ensure`.
- Ensures temp upload files include an extension when the original
filename has none, derived from blob `content_type`.
- This addresses intermittent failures like extensionless `audio/mpeg`
files.
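
A minimal sketch of the extension fallback described above, assuming a small hand-written content-type map. The map contents, constant, and method names are illustrative and not the service's actual code.

```ruby
require 'tempfile'

# Illustrative mapping only; the real service may derive extensions differently.
EXTENSION_BY_CONTENT_TYPE = {
  'audio/mpeg' => '.mp3',
  'audio/wav' => '.wav',
  'audio/ogg' => '.ogg',
  'audio/mp4' => '.m4a'
}.freeze

# Keeps the filename's own extension, otherwise derives one from content_type.
def upload_extension(filename, content_type)
  ext = File.extname(filename.to_s)
  return ext unless ext.empty?

  EXTENSION_BY_CONTENT_TYPE.fetch(content_type, '')
end

# Yields a temp file whose path carries the extension and always cleans it up.
def with_transcription_tempfile(filename, content_type)
  base = File.basename(filename.to_s, '.*')
  file = Tempfile.new([base, upload_extension(filename, content_type)])
  yield file
ensure
  file&.close
  file&.unlink
end
```

For a blob stored as `speech` with content type `audio/mpeg`, the temp path ends in `.mp3`, which gives the transcription API a recognizable format hint.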
## Reproduction
Enable audio transcription for an account and process an audio
attachment whose stored filename has no extension (for example `speech`)
but valid audio content type (`audio/mpeg`).
Before this fix, OpenAI transcription could return HTTP 400
`Unrecognized file format` for that attachment while similar attachments
with extensions succeeded.
## Testing
Ran:
`bundle exec rubocop
enterprise/app/jobs/messages/audio_transcription_job.rb
enterprise/app/services/messages/audio_transcription_service.rb`
Result: both modified files pass lint with no offenses.
## Linear Ticket:
https://linear.app/chatwoot/issue/CW-6081/review-feedback
## Description
Assignment V2 Service Enhancements
- Enable Assignment V2 on plan upgrade
- Fix UI issue with fair distribution policy display
- Add advanced assignment feature flag and enhance Assignment V2
capabilities
## Type of change
- [ ] Bug fix (non-breaking change which fixes an issue)
## How Has This Been Tested?
This has been tested using the UI.
## Checklist:
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> **Medium Risk**
> Changes auto-assignment execution paths, rate limiting defaults, and
feature-flag gating (including premium plan behavior), which could
affect which conversations get assigned and when. UI rewires inbox
settings and policy flows, so regressions are possible around
navigation/linking and feature visibility.
>
> **Overview**
> **Adds a new premium `advanced_assignment` feature flag** and uses it
to gate capacity/balanced assignment features in the UI (sidebar entry,
settings routes, assignment-policy landing cards) and backend
(Enterprise balanced selector + capacity filtering).
`advanced_assignment` is marked premium, included in Business plan
entitlements, and auto-synced in Enterprise accounts when
`assignment_v2` is toggled.
>
> **Improves Assignment V2 policy UX** by adding an inbox-level
“Conversation Assignment” section (behind `assignment_v2`) that can
link/unlink an assignment policy, navigate to create/edit policy flows
with `inboxId` query context, and show an inbox-link prompt after
creating a policy. The policy form now defaults to enabled, disables the
`balanced` option with a premium badge/message when unavailable, and
inbox lists support click-to-navigate.
>
> **Tightens/adjusts auto-assignment behavior**: bulk assignment now
requires `inbox.enable_auto_assignment?`, conversation ordering uses the
attached `assignment_policy` priority, and rate limiting uses
`assignment_policy` config with an infinite default limit while still
tracking assignments. Tests and i18n strings are updated accordingly.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
23bc03bf75ee4376071e4d7fc7cd564c601d33d7. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
---------
Co-authored-by: Pranav <pranav@chatwoot.com>
Co-authored-by: iamsivin <iamsivin@gmail.com>
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
The index is already added in production.
Adds a new reporting API that returns conversation counts grouped by
channel type and first response time buckets (0-1h, 1-4h, 4-8h, 8-24h,
24h+).
- GET /api/v2/accounts/:id/reports/first_response_time_distribution
- Uses SQL aggregation to handle large datasets efficiently
- Adds composite index on reporting_events for query performance
Tested on production workload.
Request: GET
`/api/v2/accounts/1/reports/first_response_time_distribution?since=<since>&until=<until>`
Response payload:
```
{
"Channel::WebWidget": {
"0-1h": 120,
"1-4h": 85,
"4-8h": 32,
"8-24h": 12,
"24h+": 3
},
"Channel::Email": {
"0-1h": 12,
"1-4h": 28,
"4-8h": 45,
"8-24h": 35,
"24h+": 10
},
"Channel::FacebookPage": {
"0-1h": 50,
"1-4h": 30,
"4-8h": 15,
"8-24h": 8,
"24h+": 2
}
}
```
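
To make the bucket boundaries concrete, here is an in-memory sketch of the grouping. The endpoint itself aggregates in SQL; the names below are hypothetical, and the choice that an exact boundary (for example exactly 1h) falls into the next bucket is an assumption.

```ruby
# Upper bounds in hours for each bucket; anything beyond the last is "24h+".
FRT_BUCKETS = [['0-1h', 1], ['1-4h', 4], ['4-8h', 8], ['8-24h', 24]].freeze

def frt_bucket(seconds)
  hours = seconds / 3600.0
  label, = FRT_BUCKETS.find { |_, upper_bound| hours < upper_bound }
  label || '24h+'
end

# events: array of [channel_type, first_response_time_in_seconds] pairs.
def frt_distribution(events)
  events.each_with_object(Hash.new { |h, k| h[k] = Hash.new(0) }) do |(channel, seconds), acc|
    acc[channel][frt_bucket(seconds)] += 1
  end
end
```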
---------
Co-authored-by: Muhsin Keloth <muhsinkeramam@gmail.com>
## Description
When we migrated to RubyLLM, images weren't being sent properly in
RubyLLM format to the model, so it did not understand images.
## Type of change
- [x] Bug fix (non-breaking change which fixes an issue)
## How Has This Been Tested?
specs + local testing
Current behaviour on staging:
<img width="772" height="1012" alt="image"
src="https://github.com/user-attachments/assets/7b7d360f-dea4-48af-b20b-ee4c98a38a85"
/>
local testing with fix:
<img width="792" height="1216" alt="image"
src="https://github.com/user-attachments/assets/5ef82452-015e-4bda-a68f-884d00acb014"
/>
## Checklist:
- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [x] Any dependent changes have been merged and published in downstream
modules
---------
Co-authored-by: Sojan Jose <sojan@pepalo.com>
## Description
- Replaces Stripe Checkout session flow with direct card charging for AI
credit top-ups
- Adds a two-step confirmation modal (select package → confirm purchase)
for better UX
- Creates Stripe invoice directly and charges the customer's default
payment method immediately
## Type of change
- [ ] New feature (non-breaking change which adds functionality)
## How Has This Been Tested?
- Using the specs
- UI manual test cases
<img width="945" height="580" alt="image"
src="https://github.com/user-attachments/assets/52bdad46-cd0e-4927-b13f-54c6b6353bcc"
/>
<img width="945" height="580" alt="image"
src="https://github.com/user-attachments/assets/231bc7e9-41ac-440d-a93d-cba45a4d3e3e"
/>
## Checklist:
- [ ] My code follows the style guidelines of this project
- [ ] I have performed a self-review of my code
- [ ] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules
---------
Co-authored-by: Shivam Mishra <scm.mymail@gmail.com>
We’ve been watching Sidekiq workers climb from ~600 MB at boot to
1.4–1.5 GB after an hour whenever attachment-heavy jobs run. This PR is
an experiment to curb that growth by streaming attachments instead of
loading the whole blob into Ruby: reply-mailer inline attachments,
Telegram uploads, and audio transcriptions now read/write in chunks. If
this keeps RSS stable in production we’ll keep it; otherwise we’ll roll
it back and keep digging.
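
The general idea of chunked copying, as opposed to reading a whole blob into one Ruby string, can be sketched like this. The method name and chunk size are assumptions; the PR's actual call sites (mailer attachments, Telegram uploads, transcription) are not shown here.

```ruby
require 'stringio'

CHUNK_SIZE = 64 * 1024 # illustrative size, not taken from the PR

# Copies from any readable IO to any writable IO one chunk at a time, so at
# most one chunk is held in memory instead of the full attachment.
def copy_in_chunks(source, destination, chunk_size: CHUNK_SIZE)
  total = 0
  while (chunk = source.read(chunk_size))
    destination.write(chunk)
    total += chunk.bytesize
  end
  total
end
```

With Active Storage, `blob.download` yields chunks when given a block, which is one way such a loop could be fed without materializing the full file.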