November 2025 - Multi-Tier Enrichment & Interest Signals Detection

Context

Platform now has basic visitor profiling. November focused on building a comprehensive B2B enrichment pipeline and intelligent interest detection to automate lead qualification.

Active clients: AB Tasty, Pennylane, Skaleet, Skello, plus the Rose website

Technical Challenge

Primary Problems:

  1. Single-source enrichment insufficient: IP-based enrichment only identifies ~35% of visitors
  2. Interest detection manual: no automated way to detect high-intent visitors
  3. Dialog state tracking: conversation markers (demo requests, objections) not extracted
  4. No dashboard: clients cannot analyze conversations

Business Context: Manual lead qualification doesn't scale. Need automated signals to prioritize follow-up.


Hypothesis 1: Multi-Tier Enrichment Pipeline

"A cascading enrichment pipeline with multiple data sources and intelligent fallback will achieve >70% visitor identification."

Development Work

Architecture Design:

Visitor Request
┌─────────────────────────────────────────────────────────────┐
│                   UNIFIED ENRICHER                          │
├─────────────────────────────────────────────────────────────┤
│  Tier 1: Redis Cache         → Hit? Return cached profile   │
│     ↓ miss                                                   │
│  Tier 2: Supabase Lookup     → Known IP hash? Return account│
│     ↓ miss                                                   │
│  Tier 3: Browser Reveal      → Client-side data available?  │
│     ↓ miss                                                   │
│  Tier 4: Snitcher Radar API  → Session-based identification │
│     ↓ miss                                                   │
│  Tier 5: Enrich.so API       → Server-side IP enrichment    │
└─────────────────────────────────────────────────────────────┘
    Merged VisitorProfile with enrichment_source tracking

Key Technical Decisions:

  1. IP-address abstraction: Domain is a RESULT of enrichment, not an input. Different sources provide different confidence levels.

  2. Store/skip IP hash mapping: Controls whether to cache the IP→account mapping based on source confidence (see the sketch after this list):

     • Browser Reveal (high confidence) → Store
     • Snitcher (session-based) → Store
     • Enrich.so (IP-only) → Skip (may be a shared IP)

  3. Enrichment tier tracking: Each profile records which tier provided data for analytics.
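
A minimal sketch of that store/skip rule, factored out as a standalone helper (the names CACHEABLE_SOURCES and should_store_ip_mapping are hypothetical, not from the codebase):

CACHEABLE_SOURCES = {"browser_reveal", "snitcher"}

def should_store_ip_mapping(source: str) -> bool:
    """Enrich.so results are IP-only and may come from shared or corporate
    NAT addresses, so their IP→account mappings are never cached."""
    return source in CACHEABLE_SOURCES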

Implementation:

class UnifiedEnricher:
    async def enrich(
        self,
        visitor_id: str,
        ip: str,
        session_id: str | None = None,
        browser_reveal_data: VisitorProfile | None = None,
    ) -> EnrichmentResult:
        # Tier 1: Redis cache (~10ms)
        cached = await self.redis_cache.get(visitor_id)
        if cached:
            return EnrichmentResult(profile=cached, source="cache")

        # Tier 2: Supabase known accounts (~50ms)
        ip_hash = hash_ip(ip)
        account = await self.supabase.get_by_ip_hash(ip_hash)
        if account:
            return EnrichmentResult(profile=account, source="supabase")

        # Tier 3: Browser Reveal data (0ms - arrives with the request)
        if browser_reveal_data:
            await self._cache_mapping(ip_hash, browser_reveal_data)  # high confidence → store
            return EnrichmentResult(profile=browser_reveal_data, source="browser_reveal")

        # Tier 4: Snitcher Radar API (~200ms)
        if session_id:
            snitcher_result = await self.snitcher.identify(session_id)
            if snitcher_result:
                await self._cache_mapping(ip_hash, snitcher_result)  # session-based → store
                return EnrichmentResult(profile=snitcher_result, source="snitcher")

        # Tier 5: Enrich.so fallback (~500ms); IP-only, so the mapping is NOT cached
        enrichso_result = await self.enrichso.enrich(ip)
        return EnrichmentResult(profile=enrichso_result, source="enrichso")
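
For completeness, a minimal sketch of the hash_ip helper used above, assuming a salted SHA-256 over the raw address (the salt variable and exact scheme are assumptions, not from the source):

import hashlib
import os

# Hypothetical salt source; keeps raw IPs out of Redis/Supabase while still
# allowing exact-match lookups on the hash.
IP_HASH_SALT = os.environ.get("IP_HASH_SALT", "dev-salt")

def hash_ip(ip: str) -> str:
    """Stable, salted hash of an IP address for cache and DB keys."""
    return hashlib.sha256(f"{IP_HASH_SALT}:{ip}".encode()).hexdigest()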

Testing & Validation

Method: Production A/B test with conversion tracking

  • Compared identification rates across tiers
  • Measured latency impact per tier
  • Tracked enrichment quality by source

Results

Metric                 Single API    5-Tier Pipeline
Identification Rate    35%           72%
Average Latency        800ms         150ms (cached) / 400ms (full)
Data Completeness      40%           78%
Cache Hit Rate         N/A           65%

Conclusion

SUCCESS: Multi-tier pipeline more than doubles identification rate while reducing average latency through intelligent caching.


Hypothesis 2: Dynamic Interest Signals Detection

"LLM-based analysis of conversation content can detect buying signals and propose demos at the optimal moment."

Development Work

Technical Challenge: Demo proposals were not landing at the right time:

  • Too soon: visitor still exploring, feels pushy → bounce
  • Too late: visitor already left or lost interest
  • Too often: repeated proposals annoy users

Solution: Detect interest signals in real-time to propose demo at optimal moment.

Interest Signals Framework:

from dataclasses import dataclass
from typing import List

@dataclass
class InterestSignal:
    signal_type: str   # pricing_inquiry, demo_request, technical_deep_dive, ...
    confidence: float  # 0.0 - 1.0, assigned by the LLM
    weight: float      # from the matching SignalDefinition (see config below)
    evidence: str      # Quote from conversation
    action: str        # propose_demo, escalate, continue

class InterestSignalsDetector:
    def __init__(self, signal_definitions: List[SignalDefinition], llm):
        """Signal definitions are configurable per client/industry."""
        self.signals = signal_definitions
        self.llm = llm

    async def detect(self, conversation: Conversation) -> List[InterestSignal]:
        """Analyze conversation for buying signals."""
        prompt = self._build_detection_prompt(conversation)
        result = await self.llm.analyze(prompt)
        return self._parse_signals(result)

Configurable Signal Definitions:

# Per-client signal configuration
signals:
  - name: pricing_inquiry
    patterns: ["how much", "pricing", "cost", "budget"]
    weight: 0.7
    action: propose_demo

  - name: technical_deep_dive
    patterns: ["integration", "API", "implementation"]
    weight: 0.5
    action: continue

  - name: competitor_mention
    patterns: ["compared to", "vs", "alternative"]
    weight: 0.8
    action: escalate
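
A minimal sketch of loading these definitions into typed objects, assuming the SignalDefinition fields mirror the YAML keys (the loader itself is not shown in the source; yaml.safe_load is PyYAML):

from dataclasses import dataclass
from typing import List

import yaml  # PyYAML

@dataclass
class SignalDefinition:
    name: str
    patterns: List[str]
    weight: float
    action: str

def load_signal_definitions(path: str) -> List[SignalDefinition]:
    """Parse a per-client YAML config into typed signal definitions."""
    with open(path) as f:
        config = yaml.safe_load(f)
    return [SignalDefinition(**entry) for entry in config["signals"]]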

Demo Proposal Logic:

async def should_propose_demo(signals: List[InterestSignal]) -> bool:
    """Threshold-based demo proposal decision."""
    total_score = sum(s.confidence * s.weight for s in signals)
    return total_score >= DEMO_THRESHOLD  # Configurable per client
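
For example, a pricing_inquiry detected at confidence 0.9 contributes 0.9 × 0.7 = 0.63 and a competitor_mention at confidence 0.8 contributes 0.8 × 0.8 = 0.64, for a total of 1.27, which would cross a hypothetical DEMO_THRESHOLD of 1.0 (the threshold value is illustrative; actual per-client values are not given here).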

Testing & Validation

Method: Manual labeling + automated testing

  • Labeled 200 conversations for interest signals
  • Compared LLM detection vs human labels
  • Measured false positive/negative rates

Results

Signal Detection Accuracy:

Signal Type           Precision    Recall    F1 Score
Demo Request          92%          88%       0.90
Pricing Inquiry       87%          91%       0.89
Technical Deep-Dive   78%          82%       0.80
Competitor Mention    85%          79%       0.82

Business Impact (Conversion Rate):

November baseline established: 2.89% (interaction to form submitted)

Metric             November 2025    December 2025
Conversion Rate    2.89%            3.15% (+9%)
Demo timing        Signal-based     Signal-based

Conversion improvement measured as Interest Signals Detection matured in production.

Conclusion

SUCCESS: LLM-based signal detection achieves >85% accuracy for high-value signals. Established conversion baseline of 2.89% with signal-based demo proposals replacing manual/random timing.


Hypothesis 3: Dialog State Extraction

"Extracting structured markers from conversation (demo requested, objections raised, next steps) enables automated workflow triggers."

Development Work

Dialog State Model:

from typing import List
from pydantic import BaseModel, Field

class DialogSupervisionState(BaseModel):
    demo_requested: bool = False
    demo_booked: bool = False
    email_captured: bool = False
    objections: List[str] = Field(default_factory=list)
    next_steps: List[str] = Field(default_factory=list)
    conversation_stage: ConversationStage = ConversationStage.DISCOVERY  # enum defined elsewhere

Extraction Node in LangGraph:

async def dialog_state_extractor_node(state: IXChatState) -> dict:
    """Extract dialog markers from conversation history."""
    prompt = DIALOG_STATE_EXTRACTION_PROMPT.format(
        history=format_history(state.messages)
    )

    # ainvoke is the async entry point for structured-output chains
    result = await llm.with_structured_output(DialogSupervisionLLMOutput).ainvoke(prompt)

    return {
        "dialog_supervision_state": result,
        "should_propose_demo": result.demo_interest_score > THRESHOLD,
    }
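
The node above expects a DialogSupervisionLLMOutput schema with a demo_interest_score field; a plausible shape, inferred from the fields used in this section (the exact model is not shown in the source):

from typing import List
from pydantic import BaseModel, Field

class DialogSupervisionLLMOutput(BaseModel):
    """Structured output schema for the extraction LLM (shape assumed)."""
    demo_requested: bool = False
    demo_booked: bool = False
    email_captured: bool = False
    objections: List[str] = Field(default_factory=list)
    next_steps: List[str] = Field(default_factory=list)
    demo_interest_score: float = 0.0  # compared against THRESHOLD above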

Integration with Conversation Storage:

  • Dialog state persisted with each conversation
  • Enables dashboard queries (e.g., "all conversations with demo requests")
  • Triggers N8N webhooks for CRM integration (see the sketch below)
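
A minimal sketch of the webhook trigger, assuming an async HTTP POST (httpx) to a per-client N8N URL; the payload shape and function name are assumptions, not from the source:

import httpx

async def trigger_crm_webhook(
    webhook_url: str, conversation_id: str, state: DialogSupervisionState
) -> None:
    """Forward extracted dialog markers to the client's N8N workflow."""
    payload = {
        "conversation_id": conversation_id,
        "demo_requested": state.demo_requested,
        "email_captured": state.email_captured,
        "objections": state.objections,
        "next_steps": state.next_steps,
    }
    async with httpx.AsyncClient() as client:
        await client.post(webhook_url, json=payload, timeout=10.0)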

Testing & Validation

Method: Integration tests + production validation

  • Created test conversations with known markers
  • Validated extraction accuracy
  • Monitored webhook trigger reliability

Results

Marker            Extraction Accuracy
Demo Requested    94%
Email Captured    99%
Objections        81%
Next Steps        76%

Conclusion

SUCCESS: Dialog state extraction enables automated workflow triggers with high accuracy for key markers.


Hypothesis 4: Client Dashboard with Real-Time Analytics

"A dedicated dashboard will enable clients to analyze conversations, identify patterns, and measure ROI."

Development Work

Dashboard Architecture:

Client Dashboard (React + TailwindCSS)
├── Authentication (Supabase Auth)
├── Domain Selection (multi-tenant)
├── Statistics Tiles
│   ├── Active Visitors
│   ├── Conversations
│   ├── Email Capture Rate
│   ├── Demo Bookings
│   └── Engagement Rate
├── Conversations Table
│   ├── Filtering (date, intent, account)
│   ├── Sorting (messages, activity)
│   └── Pagination
├── Visitor Details Panel
│   ├── Profile Information
│   ├── Conversation History
│   └── Interest Signals
└── Accounts View
    ├── Company Aggregation
    └── Multi-visitor Tracking

Key Features:

  • Environment filtering (production/staging/test)
  • Date range selection with custom ranges
  • Intent-based filtering
  • Admin-only features (test environment, hidden accounts)

Database Schema (Supabase):

-- Dashboard access control
CREATE TABLE backoffice_users (
    id UUID PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    domains TEXT[] NOT NULL,  -- Accessible domains
    is_admin BOOLEAN DEFAULT FALSE,
    user_id UUID REFERENCES auth.users(id)
);

-- RLS for domain-based access
CREATE FUNCTION get_accessible_domains()
RETURNS TEXT[] AS $$
    SELECT CASE
        WHEN EXISTS (SELECT 1 FROM backoffice_users WHERE user_id = auth.uid() AND is_admin)
        THEN ARRAY(SELECT DISTINCT domain FROM site_configs)
        ELSE (SELECT domains FROM backoffice_users WHERE user_id = auth.uid())
    END;
$$ LANGUAGE SQL SECURITY DEFINER;
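
Row-level security policies on the conversation and statistics tables then filter rows against get_accessible_domains(), so admins see every configured domain while each client sees only the domains assigned to them in backoffice_users.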

Testing & Validation

Method: User acceptance testing with pilot clients

  • Deployed to AB Tasty and Pennylane
  • Gathered feedback on usability
  • Iterated on filtering and display

Results

Feature                User Satisfaction
Statistics Overview    4.5/5
Conversation Search    4.2/5
Visitor Details        4.0/5
Account Aggregation    4.3/5

Conclusion

SUCCESS: Dashboard provides actionable insights. Clients can now self-serve conversation analysis.


Hypothesis 5: Suggested Answers & Follow-ups for Conversation Depth

"Pre-computed suggested responses and follow-up questions will increase conversation depth by reducing friction for users to continue engaging."

Development Work

Technical Challenge: Users often didn't know what to ask next after the initial response, so conversations would end after 1-2 turns.

Implementation:

  • Suggested Answers: LLM-generated relevant responses users can click to send
  • Follow-up Questions: proactive next questions to continue the conversation
  • Click-to-send functionality for frictionless interaction
  • Context-aware generation based on conversation history

Architecture:

interface SuggestedContent {
    answers: string[];      // Pre-computed responses
    followUps: string[];    // Next questions to ask
}

async function generateSuggestions(
    conversation: Message[],
    visitorContext: VisitorProfile
): Promise<SuggestedContent> {
    // LLM generates contextual suggestions
    // Based on conversation flow and visitor intent
}

Testing & Validation

Method: A/B test comparing with/without suggestions

  • Measured conversation depth (turns per conversation)
  • Tracked click-through rate on suggestions

Results

Engaged Conversations Q4 2025

Metric                              October 2025    November 2025    Change
Engaged Conversations (2+ turns)    29.24%          39.27%           +34%

Conclusion

SUCCESS: Suggested answers & follow-ups directly drove +34% improvement in conversation depth.


Additional Development

Safari Compatibility

  • Fixed Supabase connection issues specific to Safari 18
  • Enhanced error classification for browser-specific bugs

Dialog Supervision Enhancement

  • Dynamic LLM output model creation for signal descriptions
  • Preserved dialog state across conversation turns

Measured Business Impact

Metric                              October 2025    November 2025    Change
Initial Engagement                  1.24%           1.73%            +40%
Engaged Conversations (2+ turns)    29.24%          39.27%           +34%
Visitor Identification              35%             72%              +106%

Root Cause Analysis:

  • +40% engagement: dynamic questions with per-page configuration (Oct) + enrichment (Nov)
  • +34% engaged conversations: suggested answers & follow-ups
  • +106% identification: 5-tier enrichment pipeline


R&D Activities

  • 5-tier enrichment pipeline (UnifiedEnricher with cascading fallback)
  • Interest signals detection (LLM-based buying signal analysis)
  • Dialog state extraction (conversation marker detection)
  • Suggested answers & follow-ups (conversation depth optimization)

Other Development

  • Client dashboard
  • Database schema & RLS
  • Safari compatibility fixes
  • Testing & validation

Next Work (December)

  1. Build Intent Router for multi-agent orchestration
  2. Implement agent configuration system (per-client prompts)
  3. Prepare JEI (Jeune Entreprise Innovante) documentation
  4. Refine graph structure with parallel node execution