November 2025 - Multi-Tier Enrichment & Interest Signals Detection

Context

Platform now has basic visitor profiling. November focused on building a comprehensive B2B enrichment pipeline and intelligent interest detection to automate lead qualification.

Active clients: AB Tasty, Pennylane, Skaleet, Skello, plus the Rose website

Technical Challenge

Primary Problems:

  1. Single-source enrichment insufficient: IP-based enrichment only identifies ~35% of visitors
  2. Interest detection manual: no automated way to detect high-intent visitors
  3. Dialog state tracking: conversation markers (demo requests, objections) not extracted
  4. No dashboard: clients cannot analyze conversations

Business Context: Manual lead qualification doesn't scale. Need automated signals to prioritize follow-up.


Hypothesis 1: Multi-Tier Enrichment Pipeline

"A cascading enrichment pipeline with multiple data sources and intelligent fallback will achieve >70% visitor identification."

Development Work

Architecture Design:

Visitor Request
┌─────────────────────────────────────────────────────────────┐
│                   UNIFIED ENRICHER                          │
├─────────────────────────────────────────────────────────────┤
│  Tier 1: Redis Cache         → Hit? Return cached profile   │
│     ↓ miss                                                   │
│  Tier 2: Supabase Lookup     → Known IP hash? Return account│
│     ↓ miss                                                   │
│  Tier 3: Browser Reveal      → Client-side data available?  │
│     ↓ miss                                                   │
│  Tier 4: Snitcher Radar API  → Session-based identification │
│     ↓ miss                                                   │
│  Tier 5: Enrich.so API       → Server-side IP enrichment    │
└─────────────────────────────────────────────────────────────┘
    Merged VisitorProfile with enrichment_source tracking

Key Technical Decisions:

  1. IP-address abstraction: Domain is a RESULT of enrichment, not an input. Different sources provide different confidence levels.

  2. Store/skip IP hash mapping: Controls whether to cache the IP→account mapping based on source confidence (see the sketch after this list):

     • Browser Reveal (high confidence) → Store
     • Snitcher (session-based) → Store
     • Enrich.so (IP-only) → Skip (may be a shared IP)

  3. Enrichment tier tracking: Each profile records which tier provided data for analytics.
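
A minimal sketch of that store/skip rule, factored out as a standalone helper (the names CACHEABLE_SOURCES and should_store_ip_mapping are hypothetical, not from the codebase):

CACHEABLE_SOURCES = {"browser_reveal", "snitcher"}

def should_store_ip_mapping(source: str) -> bool:
    """Enrich.so results are IP-only and may come from shared or corporate
    NAT addresses, so their IP→account mappings are never cached."""
    return source in CACHEABLE_SOURCES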

Implementation:

class UnifiedEnricher:
    async def enrich(
        self,
        visitor_id: str,
        ip: str,
        session_id: str | None = None,
        browser_reveal_data: VisitorProfile | None = None,
    ) -> EnrichmentResult:
        # Tier 1: Redis cache (~10ms)
        cached = await self.redis_cache.get(visitor_id)
        if cached:
            return EnrichmentResult(profile=cached, source="cache")

        # Tier 2: Supabase known accounts (~50ms)
        ip_hash = hash_ip(ip)
        account = await self.supabase.get_by_ip_hash(ip_hash)
        if account:
            return EnrichmentResult(profile=account, source="supabase")

        # Tier 3: Browser Reveal data (0ms - arrives with the request)
        if browser_reveal_data:
            await self._cache_mapping(ip_hash, browser_reveal_data)  # high confidence → store
            return EnrichmentResult(profile=browser_reveal_data, source="browser_reveal")

        # Tier 4: Snitcher Radar API (~200ms)
        if session_id:
            snitcher_result = await self.snitcher.identify(session_id)
            if snitcher_result:
                await self._cache_mapping(ip_hash, snitcher_result)  # session-based → store
                return EnrichmentResult(profile=snitcher_result, source="snitcher")

        # Tier 5: Enrich.so fallback (~500ms); IP-only, so the mapping is NOT cached
        enrichso_result = await self.enrichso.enrich(ip)
        return EnrichmentResult(profile=enrichso_result, source="enrichso")
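
For completeness, a minimal sketch of the hash_ip helper used above, assuming a salted SHA-256 over the raw address (the salt variable and exact scheme are assumptions, not from the source):

import hashlib
import os

# Hypothetical salt source; keeps raw IPs out of Redis/Supabase while still
# allowing exact-match lookups on the hash.
IP_HASH_SALT = os.environ.get("IP_HASH_SALT", "dev-salt")

def hash_ip(ip: str) -> str:
    """Stable, salted hash of an IP address for cache and DB keys."""
    return hashlib.sha256(f"{IP_HASH_SALT}:{ip}".encode()).hexdigest()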

Testing & Validation

Method: Production A/B test with conversion tracking

  • Compared identification rates across tiers
  • Measured latency impact per tier
  • Tracked enrichment quality by source

Results

Metric                 Single API    5-Tier Pipeline
Identification Rate    35%           72%
Average Latency        800ms         150ms (cached) / 400ms (full)
Data Completeness      40%           78%
Cache Hit Rate         N/A           65%

Conclusion

SUCCESS: Multi-tier pipeline more than doubles identification rate while reducing average latency through intelligent caching.


Hypothesis 2: Dynamic Interest Signals Detection

"LLM-based analysis of conversation content can detect buying signals and propose demos at the optimal moment."

Development Work

Technical Challenge: Demo proposals were not landing at the right time:

  • Too soon: visitor still exploring, feels pushy → bounce
  • Too late: visitor already left or lost interest
  • Too often: repeated proposals annoy users

Solution: Detect interest signals in real-time to propose demo at optimal moment.

Interest Signals Framework:

from dataclasses import dataclass
from typing import List

@dataclass
class InterestSignal:
    signal_type: str   # pricing_inquiry, demo_request, technical_deep_dive, ...
    confidence: float  # 0.0 - 1.0, assigned by the LLM
    weight: float      # from the matching SignalDefinition (see config below)
    evidence: str      # Quote from conversation
    action: str        # propose_demo, escalate, continue

class InterestSignalsDetector:
    def __init__(self, signal_definitions: List[SignalDefinition], llm):
        """Signal definitions are configurable per client/industry."""
        self.signals = signal_definitions
        self.llm = llm

    async def detect(self, conversation: Conversation) -> List[InterestSignal]:
        """Analyze conversation for buying signals."""
        prompt = self._build_detection_prompt(conversation)
        result = await self.llm.analyze(prompt)
        return self._parse_signals(result)

Configurable Signal Definitions:

# Per-client signal configuration
signals:
  - name: pricing_inquiry
    patterns: ["how much", "pricing", "cost", "budget"]
    weight: 0.7
    action: propose_demo

  - name: technical_deep_dive
    patterns: ["integration", "API", "implementation"]
    weight: 0.5
    action: continue

  - name: competitor_mention
    patterns: ["compared to", "vs", "alternative"]
    weight: 0.8
    action: escalate
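
A minimal sketch of loading these definitions into typed objects, assuming the SignalDefinition fields mirror the YAML keys (the loader itself is not shown in the source; yaml.safe_load is PyYAML):

from dataclasses import dataclass
from typing import List

import yaml  # PyYAML

@dataclass
class SignalDefinition:
    name: str
    patterns: List[str]
    weight: float
    action: str

def load_signal_definitions(path: str) -> List[SignalDefinition]:
    """Parse a per-client YAML config into typed signal definitions."""
    with open(path) as f:
        config = yaml.safe_load(f)
    return [SignalDefinition(**entry) for entry in config["signals"]]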

Demo Proposal Logic:

async def should_propose_demo(signals: List[InterestSignal]) -> bool:
    """Threshold-based demo proposal decision."""
    total_score = sum(s.confidence * s.weight for s in signals)
    return total_score >= DEMO_THRESHOLD  # Configurable per client
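
For example, a pricing_inquiry detected at confidence 0.9 contributes 0.9 × 0.7 = 0.63 and a competitor_mention at confidence 0.8 contributes 0.8 × 0.8 = 0.64, for a total of 1.27, which would cross a hypothetical DEMO_THRESHOLD of 1.0 (the threshold value is illustrative; actual per-client values are not given here).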

Testing & Validation

Method: Manual labeling + automated testing

  • Labeled 200 conversations for interest signals
  • Compared LLM detection vs human labels
  • Measured false positive/negative rates

Results

Signal Detection Accuracy:

Signal Type           Precision    Recall    F1 Score
Demo Request          92%          88%       0.90
Pricing Inquiry       87%          91%       0.89
Technical Deep-Dive   78%          82%       0.80
Competitor Mention    85%          79%       0.82

Business Impact (Conversion Rate):

November baseline established: 2.89% (interaction to form submitted)

Metric             November 2025    December 2025
Conversion Rate    2.89%            3.15% (+9%)
Demo timing        Signal-based     Signal-based

Conversion improvement measured as Interest Signals Detection matured in production.

Conclusion

SUCCESS: LLM-based signal detection achieves >85% accuracy for high-value signals. Established conversion baseline of 2.89% with signal-based demo proposals replacing manual/random timing.


Hypothesis 3: Dialog State Extraction

"Extracting structured markers from conversation (demo requested, objections raised, next steps) enables automated workflow triggers."

Development Work

Dialog State Model:

from typing import List
from pydantic import BaseModel, Field

class DialogSupervisionState(BaseModel):
    demo_requested: bool = False
    demo_booked: bool = False
    email_captured: bool = False
    objections: List[str] = Field(default_factory=list)
    next_steps: List[str] = Field(default_factory=list)
    conversation_stage: ConversationStage = ConversationStage.DISCOVERY  # enum defined elsewhere

Extraction Node in LangGraph:

async def dialog_state_extractor_node(state: IXChatState) -> dict:
    """Extract dialog markers from conversation history."""
    prompt = DIALOG_STATE_EXTRACTION_PROMPT.format(
        history=format_history(state.messages)
    )

    # ainvoke is the async entry point for structured-output chains
    result = await llm.with_structured_output(DialogSupervisionLLMOutput).ainvoke(prompt)

    return {
        "dialog_supervision_state": result,
        "should_propose_demo": result.demo_interest_score > THRESHOLD,
    }
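
The node above expects a DialogSupervisionLLMOutput schema with a demo_interest_score field; a plausible shape, inferred from the fields used in this section (the exact model is not shown in the source):

from typing import List
from pydantic import BaseModel, Field

class DialogSupervisionLLMOutput(BaseModel):
    """Structured output schema for the extraction LLM (shape assumed)."""
    demo_requested: bool = False
    demo_booked: bool = False
    email_captured: bool = False
    objections: List[str] = Field(default_factory=list)
    next_steps: List[str] = Field(default_factory=list)
    demo_interest_score: float = 0.0  # compared against THRESHOLD above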

Integration with Conversation Storage:

  • Dialog state persisted with each conversation
  • Enables dashboard queries (e.g., "all conversations with demo requests")
  • Triggers N8N webhooks for CRM integration (see the sketch below)
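
A minimal sketch of the webhook trigger, assuming an async HTTP POST (httpx) to a per-client N8N URL; the payload shape and function name are assumptions, not from the source:

import httpx

async def trigger_crm_webhook(
    webhook_url: str, conversation_id: str, state: DialogSupervisionState
) -> None:
    """Forward extracted dialog markers to the client's N8N workflow."""
    payload = {
        "conversation_id": conversation_id,
        "demo_requested": state.demo_requested,
        "email_captured": state.email_captured,
        "objections": state.objections,
        "next_steps": state.next_steps,
    }
    async with httpx.AsyncClient() as client:
        await client.post(webhook_url, json=payload, timeout=10.0)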

Testing & Validation

Method: Integration tests + production validation

  • Created test conversations with known markers
  • Validated extraction accuracy
  • Monitored webhook trigger reliability

Results

Marker            Extraction Accuracy
Demo Requested    94%
Email Captured    99%
Objections        81%
Next Steps        76%

Conclusion

SUCCESS: Dialog state extraction enables automated workflow triggers with high accuracy for key markers.


Hypothesis 4: Client Dashboard with Real-Time Analytics

"A dedicated dashboard will enable clients to analyze conversations, identify patterns, and measure ROI."

Development Work

Dashboard Architecture:

Client Dashboard (React + TailwindCSS)
├── Authentication (Supabase Auth)
├── Domain Selection (multi-tenant)
├── Statistics Tiles
│   ├── Active Visitors
│   ├── Conversations
│   ├── Email Capture Rate
│   ├── Demo Bookings
│   └── Engagement Rate
├── Conversations Table
│   ├── Filtering (date, intent, account)
│   ├── Sorting (messages, activity)
│   └── Pagination
├── Visitor Details Panel
│   ├── Profile Information
│   ├── Conversation History
│   └── Interest Signals
└── Accounts View
    ├── Company Aggregation
    └── Multi-visitor Tracking

Key Features:

  • Environment filtering (production/staging/test)
  • Date range selection with custom ranges
  • Intent-based filtering
  • Admin-only features (test environment, hidden accounts)

Database Schema (Supabase):

-- Dashboard access control
CREATE TABLE backoffice_users (
    id UUID PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    domains TEXT[] NOT NULL,  -- Accessible domains
    is_admin BOOLEAN DEFAULT FALSE,
    user_id UUID REFERENCES auth.users(id)
);

-- RLS for domain-based access
CREATE FUNCTION get_accessible_domains()
RETURNS TEXT[] AS $$
    SELECT CASE
        WHEN EXISTS (SELECT 1 FROM backoffice_users WHERE user_id = auth.uid() AND is_admin)
        THEN ARRAY(SELECT DISTINCT domain FROM site_configs)
        ELSE (SELECT domains FROM backoffice_users WHERE user_id = auth.uid())
    END;
$$ LANGUAGE SQL SECURITY DEFINER;
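
Row-level security policies on the conversation and statistics tables then filter rows against get_accessible_domains(), so admins see every configured domain while each client sees only the domains assigned to them in backoffice_users.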

Testing & Validation

Method: User acceptance testing with pilot clients

  • Deployed to AB Tasty and Pennylane
  • Gathered feedback on usability
  • Iterated on filtering and display

Results

Feature                User Satisfaction
Statistics Overview    4.5/5
Conversation Search    4.2/5
Visitor Details        4.0/5
Account Aggregation    4.3/5

Conclusion

SUCCESS: Dashboard provides actionable insights. Clients can now self-serve conversation analysis.


Hypothesis 5: Suggested Answers & Follow-ups for Conversation Depth

"Pre-computed suggested responses and follow-up questions will increase conversation depth by reducing friction for users to continue engaging."

Development Work

Technical Challenge: Users often didn't know what to ask next after the initial response, so conversations would end after 1-2 turns.

Implementation:

  • Suggested Answers: LLM-generated relevant responses users can click to send
  • Follow-up Questions: proactive next questions to continue the conversation
  • Click-to-send functionality for frictionless interaction
  • Context-aware generation based on conversation history

Architecture:

interface SuggestedContent {
    answers: string[];      // Pre-computed responses
    followUps: string[];    // Next questions to ask
}

async function generateSuggestions(
    conversation: Message[],
    visitorContext: VisitorProfile
): Promise<SuggestedContent> {
    // LLM generates contextual suggestions
    // Based on conversation flow and visitor intent
}

Testing & Validation

Method: A/B test comparing with/without suggestions

  • Measured conversation depth (turns per conversation)
  • Tracked click-through rate on suggestions

Results

Engaged Conversations Q4 2025

Metric                              October 2025    November 2025    Change
Engaged Conversations (2+ turns)    29.24%          39.27%           +34%

Conclusion

SUCCESS: Suggested answers & follow-ups directly drove +34% improvement in conversation depth.


Additional Development

Safari Compatibility

  • Fixed Supabase connection issues specific to Safari 18
  • Enhanced error classification for browser-specific bugs

Dialog Supervision Enhancement

  • Dynamic LLM output model creation for signal descriptions
  • Preserved dialog state across conversation turns

Measured Business Impact

Metric                              October 2025    November 2025    Change
Initial Engagement                  1.24%           1.73%            +40%
Engaged Conversations (2+ turns)    29.24%          39.27%           +34%
Visitor Identification              35%             72%              +106%

Root Cause Analysis:

  • +40% engagement: dynamic questions with per-page configuration (Oct) + enrichment (Nov)
  • +34% engaged conversations: suggested answers & follow-ups
  • +106% identification: 5-tier enrichment pipeline


R&D Activities

  • 5-tier enrichment pipeline (UnifiedEnricher with cascading fallback)
  • Interest signals detection (LLM-based buying signal analysis)
  • Dialog state extraction (conversation marker detection)
  • Suggested answers & follow-ups (conversation depth optimization)

Other Development

  • Client dashboard
  • Database schema & RLS
  • Safari compatibility fixes
  • Testing & validation

Next Work (December)

  1. Build Intent Router for multi-agent orchestration
  2. Implement agent configuration system (per-client prompts)
  3. Prepare JEI (Jeune Entreprise Innovante) documentation
  4. Refine graph structure with parallel node execution