iachat/lib/tasks
Vinay Keerthi ef54f07d5b
feat: Add company backfill migration for existing contacts (Part 1) (#12657)
## Description

Implements company backfill migration infrastructure for existing
contacts. This is **Part 1 of 2** for the company model production
rollout as described in
[CW-5726](https://linear.app/chatwoot/issue/CW-5726/company-model-setting-it-up-on-production).

Creates jobs and services to associate existing contacts with companies
based on their email domains, filtering out free email providers (gmail,
yahoo, etc.) and disposable addresses.
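
As a rough illustration of the domain-based filtering described above, here is a minimal plain-Ruby sketch. It is not the service from this PR: the constant lists, `business_domain`, and `group_by_company` are hypothetical names, and the real implementation relies on gems rather than hard-coded sets.

```ruby
require 'set'

# Hypothetical sample lists for illustration only; the actual service
# gets this data from gems instead of hard-coded sets.
FREE_PROVIDERS = Set['gmail.com', 'yahoo.com', 'outlook.com'].freeze
DISPOSABLE_DOMAINS = Set['mailinator.com'].freeze

# Returns the lowercased domain when the address looks like a business
# email, or nil for malformed, free-provider, or disposable addresses.
def business_domain(email)
  return nil unless email.is_a?(String)

  local, domain = email.strip.downcase.split('@', 2)
  return nil if local.nil? || local.empty?
  return nil if domain.nil? || domain.empty? || domain.include?('@')
  return nil if FREE_PROVIDERS.include?(domain) || DISPOSABLE_DOMAINS.include?(domain)

  domain
end

# Groups contact emails by company domain, dropping non-business addresses.
def group_by_company(emails)
  emails.group_by { |email| business_domain(email) }
        .reject { |domain, _| domain.nil? }
end
```

Each key of the resulting hash would correspond to one candidate company, and each value to the contacts that could be associated with it.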
 

**What's included:**
- Business email detector service with ValidEmail2 (uses
`disposable_domain?` to avoid DNS lookups)
- Per-account batch job to process contacts for one account
- Orchestrator job to iterate all accounts
- Rake task: `bundle exec rake companies:backfill`
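
The split between the orchestrator and the per-account job can be sketched as follows. This is a simplified, in-memory stand-in under stated assumptions: `Contact`, `Account`, `backfill_account`, and `backfill_all` are hypothetical names, the batch size is arbitrary, and the real jobs run as background jobs against the database rather than as plain method calls.

```ruby
# Hypothetical in-memory stand-ins for the real models.
Contact = Struct.new(:email, :company_domain)
Account = Struct.new(:id, :contacts)

# Per-account step: walk the account's contacts in fixed-size batches and
# assign a company domain where the detector block returns one.
# Returns the number of contacts that were updated.
def backfill_account(account, batch_size: 2)
  updated = 0
  account.contacts.each_slice(batch_size) do |batch|
    batch.each do |contact|
      next unless contact.company_domain.nil? # already associated

      domain = yield(contact.email)
      next if domain.nil? # not a business email

      contact.company_domain = domain
      updated += 1
    end
  end
  updated
end

# Orchestrator step: iterate over all accounts and run the per-account
# step for each. Returns a hash of account id => updated contact count.
def backfill_all(accounts, &detector)
  accounts.to_h { |account| [account.id, backfill_account(account, &detector)] }
end
```

Keeping the per-account work in its own unit means one slow or failing account does not have to block the rest of the backfill.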

~~*NOTE*: I'm using a hard-coded approach to determine if something is a
"business" email by filtering out emails that are usually personal. I've
also added domains that are common to some of our customers' regions.
This should be simpler. I looked into `ValidEmail2` and I couldn't find
anything to dictate whether an email is a personal email or a business
one. I don't think the approach used in the frontend is valid here.~~
UPDATE: Using `email_provider_info` gem instead.


**Pending - Part 2 (separate PR):** Real-time company creation for new
contacts

## Type of change

- [x] New feature (non-breaking change which adds functionality)

## How Has This Been Tested?

```bash
# Run all new tests
bundle exec rspec spec/enterprise/services/companies/business_email_detector_service_spec.rb \
                   spec/enterprise/jobs/migration/company_account_batch_job_spec.rb \
                   spec/enterprise/jobs/migration/company_backfill_job_spec.rb

# Run RuboCop
bundle exec rubocop enterprise/app/services/companies/business_email_detector_service.rb \
                     enterprise/app/jobs/migration/company_account_batch_job.rb \
                     enterprise/app/jobs/migration/company_backfill_job.rb \
                     lib/tasks/companies.rake
```

**Performance optimization:**
- Uses `disposable_domain?` instead of `disposable?` to avoid DNS MX
lookups. A tcpdump analysis showed that `disposable?` was making
network calls for every email, causing a 100x slowdown.
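
To illustrate why the domain-only check is cheaper, here is a small plain-Ruby sketch. It does not reproduce the gem's internals: `SAMPLE_DISPOSABLE`, `domain_of`, `disposable_by_domain?`, and `disposable_with_mx?` are hypothetical, and the injected `resolver` stands in for a real DNS MX lookup.

```ruby
require 'set'

# Hypothetical sample list for illustration only.
SAMPLE_DISPOSABLE = Set['mailinator.com'].freeze

# Extracts the lowercased domain part, or nil when there is no "@".
def domain_of(email)
  email.to_s.downcase.split('@', 2)[1]
end

# Domain-only check: a single in-memory set lookup, no network access.
def disposable_by_domain?(email)
  domain = domain_of(email)
  !domain.nil? && SAMPLE_DISPOSABLE.include?(domain)
end

# Check that also inspects MX hosts. The resolver is any callable that
# takes a domain and returns a list of MX host names; in production this
# would be a network round trip for every non-disposable address.
def disposable_with_mx?(email, resolver)
  return true if disposable_by_domain?(email)

  domain = domain_of(email)
  return false if domain.nil? || domain.empty?

  resolver.call(domain).any? { |host| SAMPLE_DISPOSABLE.include?(host) }
end
```

In a backfill over many contacts, most addresses are not on the disposable list, so the MX-based variant would call the resolver for nearly every email, while the domain-only variant never leaves the process.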

## Checklist:

- [x] My code follows the style guidelines of this project
- [x] I have performed a self-review of my code
- [x] I have commented on my code, particularly in hard-to-understand
areas
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] New and existing unit tests pass locally with my changes
- [ ] Any dependent changes have been merged and published in downstream
modules

---------

Co-authored-by: Sojan Jose <sojan@pepalo.com>
2025-11-03 20:03:47 +05:30

| Name | Last commit | Date |
| --- | --- | --- |
| dev | feat: Add development variant toggle rake task (#11696) | 2025-06-10 09:47:59 -04:00 |
| ops | feat: add ops task to purge orphan conversations (#12279) | 2025-08-27 14:42:11 +02:00 |
| .keep | Initial Commit | 2019-08-14 15:18:44 +05:30 |
| asset_clean.rake | chore: Upgrade Tailwind CSS to 3.3.2 (#7380) | 2023-06-26 11:27:16 -07:00 |
| auto_annotate_models.rake | chore: Refactor Response Bot Data Schema (#8011) | 2023-10-01 19:31:38 -07:00 |
| build.rake | feat: Vite + vue 3 💚 (#10047) | 2024-10-02 00:36:30 -07:00 |
| captain_chat.rake | feat(ee): Captain custom http tools (#12584) | 2025-10-06 07:53:15 -07:00 |
| companies.rake | feat: Add company backfill migration for existing contacts (Part 1) (#12657) | 2025-11-03 20:03:47 +05:30 |
| db_enhancements.rake | fix: use supported access method for schema_format in Rails 7 (#11576) | 2025-05-27 15:34:59 -06:00 |
| generate_test_data.rake | chore: Generate test data for bulk insertion (#11229) | 2025-05-06 11:13:11 +05:30 |
| instance_id.rake | feat: Report cwctl events to hub (#8009) | 2023-10-10 09:16:03 +05:30 |
| ip_lookup.rake | fix: ip-lookup database lazy loading for all environments (#8052) | 2024-10-22 23:18:30 -07:00 |
| mfa.rake | fix: Session controller to not generate auth tokens before mfa verification (#12487) | 2025-09-23 19:13:47 +05:30 |
| seed_reports_data.rake | feat: label reports overview (#11194) | 2025-06-11 14:35:46 +05:30 |
| sidekiq_tasks.rake | feat: Add rake task to clear ActionCable jobs (#9307) | 2024-04-30 08:03:40 -07:00 |
| swagger.rake | feat: Update swagger to openapi 3.0.4, update request payloads with examples (#11533) | 2025-05-22 17:57:12 +07:00 |