November 2025 - Multi-Tier Enrichment & Interest Signals Detection¶
Context¶
Platform now has basic visitor profiling. November focused on building a comprehensive B2B enrichment pipeline and intelligent interest detection to automate lead qualification.
Active clients: AB Tasty, Pennylane, Skaleet, Skello, + Rose website
Technical Challenge¶
Primary Problems:

1. Single-source enrichment insufficient: IP-based enrichment only identifies ~35% of visitors
2. Interest detection is manual: no automated way to detect high-intent visitors
3. Dialog state not tracked: conversation markers (demo requests, objections) are not extracted
4. No dashboard: clients cannot analyze conversations
Business Context: Manual lead qualification doesn't scale. Need automated signals to prioritize follow-up.
Hypothesis 1: Multi-Tier Enrichment Pipeline¶
"A cascading enrichment pipeline with multiple data sources and intelligent fallback will achieve >70% visitor identification."
Development Work¶
Architecture Design:
Visitor Request
        ↓
┌────────────────────────────────────────────────────────────┐
│                      UNIFIED ENRICHER                      │
├────────────────────────────────────────────────────────────┤
│ Tier 1: Redis Cache        → Hit? Return cached profile    │
│    ↓ miss                                                  │
│ Tier 2: Supabase Lookup    → Known IP hash? Return account │
│    ↓ miss                                                  │
│ Tier 3: Browser Reveal     → Client-side data available?   │
│    ↓ miss                                                  │
│ Tier 4: Snitcher Radar API → Session-based identification  │
│    ↓ miss                                                  │
│ Tier 5: Enrich.so API      → Server-side IP enrichment     │
└────────────────────────────────────────────────────────────┘
        ↓
Merged VisitorProfile with enrichment_source tracking
Key Technical Decisions:

- IP-address abstraction: the domain is a RESULT of enrichment, not an input. Different sources provide different confidence levels.
- Store/skip IP hash mapping: controls whether to cache the IP→account mapping based on source confidence (see the sketch below):
  - Browser Reveal (high confidence) → Store
  - Snitcher (session-based) → Store
  - Enrich.so (IP-only) → Skip (may be a shared IP)
- Enrichment tier tracking: each profile records which tier provided data, for analytics.
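A minimal sketch of that store/skip decision, assuming the source labels used elsewhere in this section (the mapping and function names are illustrative, not the production code):

# Illustrative only: whether an IP-hash → account mapping may be cached,
# keyed by the confidence of the enrichment source.
CACHEABLE_SOURCES = {
    "browser_reveal": True,   # high confidence (client-side reveal)
    "snitcher": True,         # session-based identification
    "enrichso": False,        # IP-only enrichment: may be a shared IP
}

def should_store_ip_mapping(source: str) -> bool:
    """Return True if the IP hash → account mapping is safe to cache."""
    return CACHEABLE_SOURCES.get(source, False)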
Implementation:
class UnifiedEnricher:
    async def enrich(
        self,
        visitor_id: str,
        ip: str,
        session_id: str | None = None,
        browser_reveal_data: VisitorProfile | None = None,
    ) -> EnrichmentResult:
        # Tier 1: Redis cache (10ms)
        cached = await self.redis_cache.get(visitor_id)
        if cached:
            return EnrichmentResult(profile=cached, source="cache")

        # Tier 2: Supabase known accounts (50ms)
        ip_hash = hash_ip(ip)
        account = await self.supabase.get_by_ip_hash(ip_hash)
        if account:
            return EnrichmentResult(profile=account, source="supabase")

        # Tier 3: Browser Reveal data (0ms - already available client-side)
        if browser_reveal_data:
            return EnrichmentResult(profile=browser_reveal_data, source="browser_reveal")

        # Tier 4: Snitcher Radar API (200ms)
        if session_id:
            snitcher_result = await self.snitcher.identify(session_id)
            if snitcher_result:
                await self._cache_mapping(ip_hash, snitcher_result)
                return EnrichmentResult(profile=snitcher_result, source="snitcher")

        # Tier 5: Enrich.so fallback (500ms)
        enrichso_result = await self.enrichso.enrich(ip)
        return EnrichmentResult(profile=enrichso_result, source="enrichso")
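A hedged usage sketch; the identifiers below are placeholders and the constructor wiring (cache and API clients) is not shown in this section:

# Assumes an already-constructed enricher with its cache and API clients injected.
result = await enricher.enrich(
    visitor_id="v_123",     # placeholder visitor ID
    ip="203.0.113.42",      # documentation IP range
    session_id="sess_abc",  # enables the Snitcher tier
)
print(result.source)  # one of: cache, supabase, browser_reveal, snitcher, enrichso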
Testing & Validation¶
Method: Production A/B test with conversion tracking

- Compared identification rates across tiers
- Measured latency impact per tier
- Tracked enrichment quality by source
Results¶
| Metric | Single API | 5-Tier Pipeline |
|---|---|---|
| Identification Rate | 35% | 72% |
| Average Latency | 800ms | 150ms (cached) / 400ms (full) |
| Data Completeness | 40% | 78% |
| Cache Hit Rate | N/A | 65% |
Conclusion¶
SUCCESS: Multi-tier pipeline more than doubles identification rate while reducing average latency through intelligent caching.
Hypothesis 2: Dynamic Interest Signals Detection¶
"LLM-based analysis of conversation content can detect buying signals and propose demos at the optimal moment."
Development Work¶
Technical Challenge: Demo proposals were not landing at the right time:

- Too soon: visitor still exploring, feels pushy → bounce
- Too late: visitor already left or lost interest
- Too often: repeated proposals annoy users
Solution: Detect interest signals in real-time to propose demo at optimal moment.
Interest Signals Framework:
@dataclass
class InterestSignal:
    signal_type: str   # pricing_inquiry, demo_request, technical_deep_dive
    confidence: float  # 0.0 - 1.0
    weight: float      # From the matching signal definition (used in scoring below)
    evidence: str      # Quote from conversation
    action: str        # propose_demo, escalate, continue

class InterestSignalsDetector:
    def __init__(self, signal_definitions: List[SignalDefinition], llm):
        """Signal definitions are configurable per client/industry."""
        self.signals = signal_definitions
        self.llm = llm

    async def detect(self, conversation: Conversation) -> List[InterestSignal]:
        """Analyze conversation for buying signals."""
        prompt = self._build_detection_prompt(conversation)
        result = await self.llm.analyze(prompt)
        return self._parse_signals(result)
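One plausible shape for _build_detection_prompt, which is not shown above. The prompt wording and attribute names are assumptions consistent with the per-client configuration below; per-signal weights are attached from the definitions in _parse_signals rather than requested from the LLM:

def _build_detection_prompt(self, conversation: Conversation) -> str:
    """Illustrative: embed the configured signal definitions into the prompt."""
    signal_lines = "\n".join(
        f"- {s.name}: patterns like {', '.join(s.patterns)} (action: {s.action})"
        for s in self.signals
    )
    transcript = "\n".join(f"{m.role}: {m.content}" for m in conversation.messages)
    return (
        "Analyze this sales conversation and report any buying signals.\n"
        f"Only use these signal types:\n{signal_lines}\n\n"
        f"Conversation:\n{transcript}\n\n"
        "Return a JSON list of {signal_type, confidence, evidence, action}."
    )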
Configurable Signal Definitions:
# Per-client signal configuration
signals:
  - name: pricing_inquiry
    patterns: ["how much", "pricing", "cost", "budget"]
    weight: 0.7
    action: propose_demo
  - name: technical_deep_dive
    patterns: ["integration", "API", "implementation"]
    weight: 0.5
    action: continue
  - name: competitor_mention
    patterns: ["compared to", "vs", "alternative"]
    weight: 0.8
    action: escalate
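A minimal loader sketch turning this YAML into the SignalDefinition objects the detector expects (field names match the config above; the dataclass and file path are assumptions):

import yaml
from dataclasses import dataclass
from typing import List

@dataclass
class SignalDefinition:
    name: str
    patterns: List[str]
    weight: float
    action: str

def load_signal_definitions(path: str) -> List[SignalDefinition]:
    """Parse a per-client YAML file into signal definitions."""
    with open(path) as f:
        config = yaml.safe_load(f)
    return [SignalDefinition(**entry) for entry in config["signals"]]

# e.g. detector = InterestSignalsDetector(load_signal_definitions("signals/client.yaml"), llm)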
Demo Proposal Logic:
async def should_propose_demo(signals: List[InterestSignal]) -> bool:
    """Threshold-based demo proposal decision."""
    total_score = sum(s.confidence * s.weight for s in signals)
    return total_score >= DEMO_THRESHOLD  # Configurable per client
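Putting the detector and the threshold together, a hedged sketch (the threshold value and the per-turn hook are illustrative):

DEMO_THRESHOLD = 1.2  # illustrative; configured per client in production

async def handle_turn(detector: InterestSignalsDetector, conversation: Conversation) -> bool:
    """Detect signals on the latest turn and decide whether to propose a demo."""
    signals = await detector.detect(conversation)
    return await should_propose_demo(signals)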
Testing & Validation¶
Method: Manual labeling + automated testing

- Labeled 200 conversations for interest signals
- Compared LLM detection vs human labels
- Measured false positive/negative rates
Results¶
Signal Detection Accuracy:
| Signal Type | Precision | Recall | F1 Score |
|---|---|---|---|
| Demo Request | 92% | 88% | 0.90 |
| Pricing Inquiry | 87% | 91% | 0.89 |
| Technical Deep-Dive | 78% | 82% | 0.80 |
| Competitor Mention | 85% | 79% | 0.82 |
Business Impact (Conversion Rate):
November baseline established: 2.89% (interaction to form submitted)
| Metric | November 2025 | December 2025 |
|---|---|---|
| Conversion Rate | 2.89% | 3.15% (+9%) |
| Demo timing | Signal-based | Signal-based |
The conversion improvement was measured as Interest Signals Detection matured in production.
Conclusion¶
SUCCESS: LLM-based signal detection achieves >85% accuracy for high-value signals. Established conversion baseline of 2.89% with signal-based demo proposals replacing manual/random timing.
Hypothesis 3: Dialog State Extraction¶
"Extracting structured markers from conversation (demo requested, objections raised, next steps) enables automated workflow triggers."
Development Work¶
Dialog State Model:
class DialogSupervisionState(BaseModel):
    demo_requested: bool = False
    demo_booked: bool = False
    email_captured: bool = False
    objections: List[str] = []
    next_steps: List[str] = []
    conversation_stage: ConversationStage = ConversationStage.DISCOVERY
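The ConversationStage enum referenced above is not defined in this section; a minimal sketch, where stages beyond DISCOVERY are assumptions:

from enum import Enum

class ConversationStage(str, Enum):
    DISCOVERY = "discovery"    # the only stage shown in the original snippet
    EVALUATION = "evaluation"  # assumed later funnel stage
    DECISION = "decision"      # assumed later funnel stage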
Extraction Node in LangGraph:
async def dialog_state_extractor_node(state: IXChatState) -> dict:
    """Extract dialog markers from conversation history."""
    prompt = DIALOG_STATE_EXTRACTION_PROMPT.format(
        history=format_history(state.messages)
    )
    result = await llm.with_structured_output(DialogSupervisionLLMOutput).ainvoke(prompt)
    return {
        "dialog_supervision_state": result,
        "should_propose_demo": result.demo_interest_score > THRESHOLD,
    }
Integration with Conversation Storage:

- Dialog state persisted with each conversation
- Enables dashboard queries (e.g., "all conversations with demo requests")
- Triggers N8N webhooks for CRM integration (see the sketch below)
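A hedged sketch of the N8N trigger, assuming one webhook URL per client and the dialog state model above (URL handling, payload shape, and function name are assumptions):

import httpx

async def trigger_crm_webhook(webhook_url: str, conversation_id: str,
                              state: DialogSupervisionState) -> None:
    """Illustrative: notify the client's N8N workflow when a demo is requested."""
    if not state.demo_requested:
        return
    payload = {
        "conversation_id": conversation_id,
        "demo_requested": state.demo_requested,
        "email_captured": state.email_captured,
        "objections": state.objections,
        "next_steps": state.next_steps,
    }
    async with httpx.AsyncClient() as client:
        await client.post(webhook_url, json=payload, timeout=10)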
Testing & Validation¶
Method: Integration tests + production validation

- Created test conversations with known markers
- Validated extraction accuracy
- Monitored webhook trigger reliability
Results¶
| Marker | Extraction Accuracy |
|---|---|
| Demo Requested | 94% |
| Email Captured | 99% |
| Objections | 81% |
| Next Steps | 76% |
Conclusion¶
SUCCESS: Dialog state extraction enables automated workflow triggers with high accuracy for key markers.
Hypothesis 4: Client Dashboard with Real-Time Analytics¶
"A dedicated dashboard will enable clients to analyze conversations, identify patterns, and measure ROI."
Development Work¶
Dashboard Architecture:
Client Dashboard (React + TailwindCSS)
├── Authentication (Supabase Auth)
├── Domain Selection (multi-tenant)
├── Statistics Tiles
│ ├── Active Visitors
│ ├── Conversations
│ ├── Email Capture Rate
│ ├── Demo Bookings
│ └── Engagement Rate
├── Conversations Table
│ ├── Filtering (date, intent, account)
│ ├── Sorting (messages, activity)
│ └── Pagination
├── Visitor Details Panel
│ ├── Profile Information
│ ├── Conversation History
│ └── Interest Signals
└── Accounts View
├── Company Aggregation
└── Multi-visitor Tracking
Key Features:

- Environment filtering (production/staging/test)
- Date range selection with custom ranges
- Intent-based filtering
- Admin-only features (test environment, hidden accounts)
Database Schema (Supabase):
-- Dashboard access control
CREATE TABLE backoffice_users (
    id UUID PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    domains TEXT[] NOT NULL,  -- Accessible domains
    is_admin BOOLEAN DEFAULT FALSE,
    user_id UUID REFERENCES auth.users(id)
);

-- Helper used by RLS policies for domain-based access
CREATE FUNCTION get_accessible_domains()
RETURNS TEXT[] AS $$
    SELECT CASE
        WHEN EXISTS (SELECT 1 FROM backoffice_users WHERE user_id = auth.uid() AND is_admin)
            THEN ARRAY(SELECT DISTINCT domain FROM site_configs)
        ELSE (SELECT domains FROM backoffice_users WHERE user_id = auth.uid())
    END;
$$ LANGUAGE SQL SECURITY DEFINER;
Testing & Validation¶
Method: User acceptance testing with pilot clients

- Deployed to AB Tasty and Pennylane
- Gathered feedback on usability
- Iterated on filtering and display
Results¶
| Feature | User Satisfaction |
|---|---|
| Statistics Overview | 4.5/5 |
| Conversation Search | 4.2/5 |
| Visitor Details | 4.0/5 |
| Account Aggregation | 4.3/5 |
Conclusion¶
SUCCESS: Dashboard provides actionable insights. Clients can now self-serve conversation analysis.
Hypothesis 5: Suggested Answers & Follow-ups for Conversation Depth¶
"Pre-computed suggested responses and follow-up questions will increase conversation depth by reducing friction for users to continue engaging."
Development Work¶
Technical Challenge: Users often didn't know what to ask next after initial response. Conversation would end after 1-2 turns.
Implementation:

- Suggested Answers: LLM-generated relevant responses users can click to send
- Follow-up Questions: proactive next questions to continue the conversation
- Click-to-send functionality for frictionless interaction
- Context-aware generation based on conversation history
Architecture:
interface SuggestedContent {
  answers: string[];    // Pre-computed responses
  followUps: string[];  // Next questions to ask
}

async function generateSuggestions(
  conversation: Message[],
  visitorContext: VisitorProfile
): Promise<SuggestedContent> {
  // LLM generates contextual suggestions
  // based on conversation flow and visitor intent
}
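On the generation side, a minimal Python sketch in the same style as the other LLM nodes in this document; the prompt wording, output model, and module-level llm are assumptions:

from typing import List
from pydantic import BaseModel

class SuggestedContentOutput(BaseModel):
    answers: List[str]     # pre-computed replies the visitor can click to send
    follow_ups: List[str]  # proactive next questions

async def generate_suggestions(history: str, visitor_summary: str) -> SuggestedContentOutput:
    """Illustrative: context-aware suggestion generation via structured output."""
    prompt = (
        "Given this conversation and visitor context, propose three short replies "
        "the visitor might send next and two follow-up questions.\n\n"
        f"Conversation:\n{history}\n\nVisitor:\n{visitor_summary}"
    )
    return await llm.with_structured_output(SuggestedContentOutput).ainvoke(prompt)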
Testing & Validation¶
Method: A/B test comparing with/without suggestions

- Measured conversation depth (turns per conversation)
- Tracked click-through rate on suggestions
Results¶
| Metric | October 2025 | November 2025 | Change |
|---|---|---|---|
| Engaged Conversations (2+ turns) | 29.24% | 39.27% | +34% |
Conclusion¶
SUCCESS: Suggested answers & follow-ups directly drove +34% improvement in conversation depth.
Additional Development¶
Safari Compatibility¶
- Fixed Supabase connection issues specific to Safari 18
- Enhanced error classification for browser-specific bugs
Dialog Supervision Enhancement¶
- Dynamic LLM output model creation for signal descriptions
- Preserved dialog state across conversation turns
Measured Business Impact¶
| Metric | October 2025 | November 2025 | Change |
|---|---|---|---|
| Initial Engagement | 1.24% | 1.73% | +40% |
| Engaged Conversations (2+ turns) | 29.24% | 39.27% | +34% |
| Visitor Identification | 35% | 72% | +106% |
Root Cause Analysis:

- +40% engagement: dynamic questions with per-page configuration (Oct) + enrichment (Nov)
- +34% engaged conversations: suggested answers & follow-ups
- +106% identification: 5-tier enrichment pipeline
R&D Activities¶
- 5-tier enrichment pipeline (UnifiedEnricher with cascading fallback)
- Interest signals detection (LLM-based buying signal analysis)
- Dialog state extraction (conversation marker detection)
- Suggested answers & follow-ups (conversation depth optimization)
Other Development¶
- Client dashboard
- Database schema & RLS
- Safari compatibility fixes
- Testing & validation
Next Work (December)¶
- Build Intent Router for multi-agent orchestration
- Implement agent configuration system (per-client prompts)
- Prepare JEI (Jeune Entreprise Innovante) documentation
- Refine graph structure with parallel node execution
