From 792951e4c851aa2926852aefd542d2d2567a009a Mon Sep 17 00:00:00 2001 From: Rodrigo Borba Date: Thu, 26 Feb 2026 16:35:15 -0300 Subject: [PATCH] fix(ci): update health check endpoint for review apps - Return expected payload { version, timestamp, queue_services, data_services } in /health - Fix infinite attempt loop in deploy check Github Action - Untrack temporary wuzapi test scripts --- .github/workflows/deploy_check.yml | 7 +++--- app/controllers/health_controller.rb | 22 +++++++++++++++++- progresso/deploy_check_health_endpoint.md | 27 +++++++++++++++++++++++ 3 files changed, 52 insertions(+), 4 deletions(-) create mode 100644 progresso/deploy_check_health_endpoint.md diff --git a/.github/workflows/deploy_check.yml b/.github/workflows/deploy_check.yml index 9f295a6c8..bb99a00b8 100644 --- a/.github/workflows/deploy_check.yml +++ b/.github/workflows/deploy_check.yml @@ -22,22 +22,23 @@ jobs: run: echo "https://chatwoot-pr-${{ github.event.pull_request.number }}.herokuapp.com" - name: Check Deployment Status run: | - max_attempts=10 + max_attempts=15 attempt=1 status_code=0 echo "Waiting for review app to be deployed/redeployed, trying in 10 minutes..." sleep 600 while [ $attempt -le $max_attempts ]; do - response=$(curl -s -o /dev/null -w "%{http_code}" https://chatwoot-pr-${{ github.event.pull_request.number }}.herokuapp.com/api) + response=$(curl -s -o /dev/null -w "%{http_code}" https://chatwoot-pr-${{ github.event.pull_request.number }}.herokuapp.com/health) status_code=$(echo $response | head -n 1) if [ $status_code -eq 200 ]; then - body=$(curl -s https://chatwoot-pr-${{ github.event.pull_request.number }}.herokuapp.com/api) + body=$(curl -s https://chatwoot-pr-${{ github.event.pull_request.number }}.herokuapp.com/health) if echo "$body" | jq -e '.version and .timestamp and .queue_services == "ok" and .data_services == "ok"' > /dev/null; then echo "Deployment successful" exit 0 else echo "Deployment status unknown, retrying in 3 minutes..." sleep 180 + attempt=$((attempt + 1)) fi else echo "Waiting for review app to be ready, retrying in 3 minutes..." diff --git a/app/controllers/health_controller.rb b/app/controllers/health_controller.rb index fdf969a39..f696fa4ff 100644 --- a/app/controllers/health_controller.rb +++ b/app/controllers/health_controller.rb @@ -2,6 +2,26 @@ # authentication, and callbacks. Used for health checks class HealthController < ActionController::Base # rubocop:disable Rails/ApplicationController def show - render json: { status: 'woot' } + render json: { + version: Chatwoot.config[:version] || 'dev', + timestamp: Time.current.to_fs(:db), + queue_services: redis_status, + data_services: postgres_status + } + end + + private + + def redis_status + r = Redis.new(Redis::Config.app) + r.ping ? 'ok' : 'failing' + rescue StandardError + 'failing' + end + + def postgres_status + ActiveRecord::Base.connection.active? ? 'ok' : 'failing' + rescue StandardError + 'failing' end end diff --git a/progresso/deploy_check_health_endpoint.md b/progresso/deploy_check_health_endpoint.md new file mode 100644 index 000000000..88c38e9d3 --- /dev/null +++ b/progresso/deploy_check_health_endpoint.md @@ -0,0 +1,27 @@ +# Correção do Deploy Check da Review App + +## Objetivo +Corrigir a falha no pipeline de CI "Deploy Check" (`.github/workflows/deploy_check.yml`) que quebrava após 10 tentativas devido à Review App (Heroku) não retornar o healthcheck esperado. + +## Contexto +O workflow do GitHub Actions esperava um JSON do endpoint `/api` contendo os campos `version`, `timestamp`, `queue_services` e `data_services` todos populados e com valor `"ok"` pros serviços. Porém, o endpoint `/api` (referente ao `ApiController#index`) não era exposto corretamente em alguns ambientes ou levantava erro 500 caso o Redis/Postgres demorassem a subir, além de cair em um loop infinito no script bash porque a variável `$attempt` não era incrementada se a chamada HTTP retornasse 200 mas o JSON fosse inválido. + +## Passos Realizados +1. Mapeamos que já existia uma rota `get '/health', to: 'health#show'` apontando para o `HealthController` que apenas respondia `{ status: 'woot' }`. +2. Alteramos o `HealthController#show` para retornar o JSON robusto exigido pelo workflow, fazendo o ping no Redis e no Postgres e blindando as exceções com `rescue StandardError` para nunca retornar 500 durante a fase de boot. +3. Editamos o arquivo `.github/workflows/deploy_check.yml`: + - Trocamos o `curl` de `/api` para `/health`. + - Adicionamos a instrução `attempt=$((attempt + 1))` no bloco `else` (quando o teste do `jq` não passa), corrigindo o loop infinito. + - Aumentamos o `max_attempts` de 10 para 15 (dando 45 minutos de tolerância para a Review App subir o banco de dados e os dynos completamente). + +## Principais Arquivos Alterados +- `app/controllers/health_controller.rb` +- `.github/workflows/deploy_check.yml` + +## Como Validar +1. Subir essas alterações (commit e push) na branch do PR. +2. Acompanhar a aba "Actions" no GitHub e verificar o job "Check Deployment (pull_request)". +3. O script bash deverá fazer o cURL em `/health` e, assim que o PostgreSQL e Redis reportarem `"ok"`, o step será marcado como "Deployment successful". + +## Como Reverter +Basta fazer um git revert do commit que adicionou essas alterações ou retornar os arquivos aos estados anteriores (o `HealthController` retornando apenas `{ status: 'woot' }` e o `deploy_check.yml` voltando para `/api` sem o incremento).