Visitor Enrichment
Overview
Visitor Enrichment identifies company information about website visitors when the widget loads, before any conversation begins. This enables personalized experiences from the first interaction.
How It Works
Feature Flag
Enrichment is controlled per site by the enrich_all_visitors flag, which is checked via AgentConfigResolver.enrich_all_visitors.
API Endpoint
POST /api/visitor/impression
Request
{
  "siteName": "example.com",
  "sessionId": "sess_abc123",
  "personId": "posthog_distinct_id",
  "visitorIp": "1.2.3.4",
  "browserRevealData": {
    "ip": "1.2.3.4",
    "userAgent": "..."
  },
  "snitcherSessionId": "snitcher_id"
}
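As a rough illustration, a client could assemble this request body as follows. The helper name build_impression_payload is hypothetical. The field names come from the example request above, but which fields are optional is an assumption, not something this page confirms.

```python
import json
from typing import Any, Optional


def build_impression_payload(
    site_name: str,
    session_id: str,
    person_id: Optional[str] = None,
    visitor_ip: Optional[str] = None,
    browser_reveal_data: Optional[dict] = None,
    snitcher_session_id: Optional[str] = None,
) -> dict:
    """Build a JSON body for POST /api/visitor/impression.

    Assumption: fields other than siteName and sessionId may be omitted.
    """
    payload: dict = {"siteName": site_name, "sessionId": session_id}
    optional: dict = {
        "personId": person_id,
        "visitorIp": visitor_ip,
        "browserRevealData": browser_reveal_data,
        "snitcherSessionId": snitcher_session_id,
    }
    # Leave out anything the caller did not supply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


body = build_impression_payload(
    "example.com",
    "sess_abc123",
    person_id="posthog_distinct_id",
    visitor_ip="1.2.3.4",
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body with any HTTP client.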
Response
{
  "status": "accepted",
  "enrichment_triggered": true,
  "message": "Enrichment started in background"
}
Status Values
| Status | Meaning |
|---|---|
| accepted | Enrichment triggered in background |
| already_enriched | Visitor already enriched (deduplication) |
| skipped | Feature disabled for this site |
| error | An error occurred |
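A caller may want to branch on these status values. The sketch below is one way to do that. ImpressionStatus and enrichment_started are hypothetical names introduced for illustration, and only the four statuses listed in the table are assumed to exist.

```python
from enum import Enum


class ImpressionStatus(str, Enum):
    """The status values documented for the impression endpoint."""

    ACCEPTED = "accepted"
    ALREADY_ENRICHED = "already_enriched"
    SKIPPED = "skipped"
    ERROR = "error"


def enrichment_started(response: dict) -> bool:
    """Return True only when this call started a new background enrichment.

    Raises ValueError if the status is not one of the documented values.
    """
    status = ImpressionStatus(response["status"])
    return status is ImpressionStatus.ACCEPTED and bool(
        response.get("enrichment_triggered")
    )


sample = {
    "status": "accepted",
    "enrichment_triggered": True,
    "message": "Enrichment started in background",
}
print(enrichment_started(sample))
```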
Key Files
Backend
| File | Purpose |
|---|---|
| ixchat/enrichment/impression_enricher.py | Main enrichment logic |
| ixchat/enrichment/unified_enricher.py | Multi-source enrichment pipeline |
| ixchat/enrichment/redis_cache.py | Deduplication cache |
| api/search/routes/visitor.py | API endpoint |
Enrichment Sources
The system can query multiple enrichment sources:
| Source | Data Provided |
|---|---|
| IP-based services | Company name, domain, industry |
| Browser reveal | More accurate IP, device info |
| Snitcher | Session-based company identification |
Sources are configured in source_config.py.
Data Flow
- Widget loads → Sends impression request
- Feature check → Is enrich_all_visitors enabled?
- Deduplication → Check Redis cache for existing enrichment
- Background enrichment → Query sources asynchronously
- Storage → Save to visitors table in Supabase
- Session merge → Data available when conversation starts
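The flow above can be sketched end to end with in-memory stand-ins. Everything here is an assumption made for illustration: the dictionaries replace the real feature-flag resolver, Redis cache, and Supabase table, and handle_impression, enrich, and query_sources are hypothetical names rather than the functions in impression_enricher.py.

```python
import asyncio

# In-memory stand-ins (assumptions) for the flag resolver, Redis, and Supabase.
ENABLED_SITES = {"example.com"}
CACHE: dict = {}
VISITORS: dict = {}
BACKGROUND_TASKS: set = set()


async def query_sources(visitor_ip: str) -> dict:
    """Placeholder for the multi-source lookup; returns canned data."""
    await asyncio.sleep(0)
    return {"company_name": "Acme Corp", "domain": "acme.com"}


async def enrich(site: str, person_id: str, visitor_ip: str) -> None:
    data = await query_sources(visitor_ip)
    VISITORS[(site, person_id)] = data  # storage step
    CACHE[f"enrichment:{person_id}"] = {"enrichment_status": "completed"}


async def handle_impression(site: str, person_id: str, visitor_ip: str) -> dict:
    if site not in ENABLED_SITES:  # feature check
        return {"status": "skipped", "enrichment_triggered": False}
    cached = CACHE.get(f"enrichment:{person_id}")  # deduplication
    if cached and cached["enrichment_status"] == "completed":
        return {"status": "already_enriched", "enrichment_triggered": False}
    # Background enrichment: respond immediately, do the work in a task.
    task = asyncio.create_task(enrich(site, person_id, visitor_ip))
    BACKGROUND_TASKS.add(task)  # hold a reference until the task finishes
    task.add_done_callback(BACKGROUND_TASKS.discard)
    return {"status": "accepted", "enrichment_triggered": True}


async def demo() -> list:
    first = await handle_impression("example.com", "p1", "1.2.3.4")
    await asyncio.gather(*BACKGROUND_TASKS)  # let the background work finish
    second = await handle_impression("example.com", "p1", "1.2.3.4")
    third = await handle_impression("other.com", "p2", "1.2.3.4")
    return [first["status"], second["status"], third["status"]]


statuses = asyncio.run(demo())
print(statuses)  # ['accepted', 'already_enriched', 'skipped']
```

The point of the design is that the endpoint answers before any enrichment source is queried, so widget load time does not depend on third-party lookups.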
Deduplication
To avoid redundant API calls, enrichments are cached:
async def is_already_enriched(person_id: str | None) -> bool:
    cached = await get_cached_enrichment(person_id)
    return cached is not None and cached.enrichment_status == "completed"
Cache key: enrichment:{person_id}
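To make the check concrete, here is a self-contained sketch with a dictionary standing in for Redis. The CachedEnrichment class, the cache_key helper, the JSON encoding of cache entries, and the "pending" status value are all assumptions; only the key format and the "completed" comparison come from this page.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Optional

FAKE_REDIS: dict = {}  # stand-in for Redis: key -> JSON string


@dataclass
class CachedEnrichment:
    enrichment_status: str


def cache_key(person_id: str) -> str:
    return f"enrichment:{person_id}"


async def get_cached_enrichment(person_id: Optional[str]) -> Optional[CachedEnrichment]:
    if person_id is None:  # nothing to look up without a person ID
        return None
    raw = FAKE_REDIS.get(cache_key(person_id))
    return CachedEnrichment(**json.loads(raw)) if raw else None


async def is_already_enriched(person_id: Optional[str]) -> bool:
    cached = await get_cached_enrichment(person_id)
    # Only a completed enrichment counts as a duplicate.
    return cached is not None and cached.enrichment_status == "completed"


FAKE_REDIS[cache_key("p1")] = json.dumps({"enrichment_status": "completed"})
FAKE_REDIS[cache_key("p2")] = json.dumps({"enrichment_status": "pending"})


async def demo() -> list:
    return [await is_already_enriched(p) for p in ("p1", "p2", None)]


results = asyncio.run(demo())
print(results)  # [True, False, False]
```

Under these assumptions, an entry that exists but is not completed does not block a new enrichment attempt.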
Enriched Data
Stored in the visitors table:
| Field | Description |
|---|---|
| person_id | PostHog distinct_id |
| site_domain | Site where visitor was seen |
| email | If captured later |
| enrichment_data | JSON with company info |
enrichment_data structure:
{
  "company_name": "Acme Corp",
  "domain": "acme.com",
  "sector": "Technology",
  "sub_sector": "SaaS",
  "company_description": "..."
}
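Code that reads this JSON may prefer a typed view over raw dictionary access. A minimal sketch follows, assuming every field can be missing and that unknown keys should be ignored. EnrichmentData is a hypothetical class, not one from the codebase.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EnrichmentData:
    company_name: Optional[str] = None
    domain: Optional[str] = None
    sector: Optional[str] = None
    sub_sector: Optional[str] = None
    company_description: Optional[str] = None

    @classmethod
    def from_json(cls, raw: str) -> "EnrichmentData":
        data = json.loads(raw)
        known = {f.name for f in fields(cls)}
        # Ignore unknown keys so extra source fields do not break parsing.
        return cls(**{k: v for k, v in data.items() if k in known})


raw = '{"company_name": "Acme Corp", "domain": "acme.com", "sector": "Technology", "extra": 1}'
info = EnrichmentData.from_json(raw)
print(info.company_name, info.sector, info.sub_sector)
```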
Using Enriched Data
When a conversation starts, enriched data is automatically loaded:
# In chatbot._prepare_query_state
existing_visitor_data = await storage.get_visitor_profile_data(site_domain, person_id)
if existing_visitor_data:
    # Merge into visitor_profile
    deserialized.visitor_profile = current_profile.model_copy(update=profile_updates)
This enables:
- Personalized greetings ("Welcome back!")
- Sector-specific content
- Skip qualification questions
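The merge step in the excerpt above builds a set of updates and copies the profile with them applied. The sketch below shows one possible merge policy, using dataclasses.replace as a stand-in for the model's model_copy(update=...). The VisitorProfile fields and the rule "fill only empty fields" are assumptions, not confirmed behavior of chatbot._prepare_query_state.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class VisitorProfile:
    company_name: Optional[str] = None
    sector: Optional[str] = None
    email: Optional[str] = None


def merge_enrichment(profile: VisitorProfile, stored: dict) -> VisitorProfile:
    """Fill empty profile fields from stored data without overwriting existing values."""
    updates = {
        key: value
        for key, value in stored.items()
        if hasattr(profile, key)
        and getattr(profile, key) is None
        and value is not None
    }
    # Returns a new object; the original profile is left untouched.
    return replace(profile, **updates)


current = VisitorProfile(email="jane@acme.com")
merged = merge_enrichment(
    current,
    {"company_name": "Acme Corp", "sector": "Technology", "email": "old@acme.com"},
)
print(merged)
```

Keeping values already present in the session profile means data captured during the conversation is never replaced by older stored data. Whether the real merge works this way is not stated here.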
Debugging
Backend logs use the [Impression Enricher] and [Visitor Impression] prefixes:
[Visitor Impression] site=example.com, session=sess_abc..., person=yes, ip=yes
[Impression Enricher] Feature enabled for example.com
[Impression Enricher] Starting enrichment pipeline
[Impression Enricher] Enrichment completed: company=Acme Corp
Privacy Considerations
- IP addresses are hashed before storage
- Enrichment data is scoped to the site
- Visitors can be excluded via consent management
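This page does not say how IP addresses are hashed. As one possible approach, a keyed hash (HMAC-SHA256) makes it harder to recover an address by hashing every possible IPv4 value, which is feasible for a plain unkeyed hash. The function name hash_ip and the secret below are hypothetical.

```python
import hashlib
import hmac


def hash_ip(ip: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest of the IP address as a hex string."""
    return hmac.new(secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secret for the example; load a real one from configuration.
SECRET = b"replace-with-a-real-secret"

digest = hash_ip("1.2.3.4", SECRET)
print(len(digest))  # 64
```

The same IP and secret always produce the same digest, so hashed values can still be compared without storing the raw address.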
Testing
poetry run pytest packages/ixchat/tests/test_visitor_profiling.py -v
poetry run pytest apps/api/search/tests/test_visitor_impression.py -v
Related
- Returning Visitor Recognition - How enriched data persists
- In-Chat Booking Email - Email capture flow