
Message Evaluation

Overview

Message Evaluation allows visitors to rate bot responses with thumbs up/down buttons. This feedback helps track response quality and identify areas for improvement.

User Experience

After each bot message, visitors see evaluation buttons:

[Bot's response here]

            [👍]  [👎]

Clicking a button records the evaluation and may open a follow-up popup for more detailed feedback.

Rating Categories

When a visitor clicks thumbs down, they can provide more detail by selecting one of the following ratings:

| Rating | Meaning |
| --- | --- |
| Accurate & OK | Response was correct and helpful |
| Accurate but Not OK | Correct but poorly presented/unhelpful |
| Not Accurate but OK | Incorrect but reasonable attempt |
| Not Accurate & Not OK | Incorrect and unhelpful |
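The four labels combine two yes/no judgments: whether the response was accurate, and whether it was acceptable as delivered. The sketch below makes that explicit. Only the label strings come from this page; `RatingAxes`, `RATING_AXES`, and `toRating` are hypothetical names introduced for illustration and are not part of the codebase.

```typescript
// The four labels are taken from the rating table; the rest is illustrative.
type EvaluationRating =
    | 'Accurate & OK'
    | 'Accurate but Not OK'
    | 'Not Accurate but OK'
    | 'Not Accurate & Not OK';

// Hypothetical decomposition of a rating into its two underlying judgments.
type RatingAxes = {
    accurate: boolean; // was the response factually correct?
    ok: boolean;       // was the response acceptable as delivered?
};

const RATING_AXES: Record<EvaluationRating, RatingAxes> = {
    'Accurate & OK': { accurate: true, ok: true },
    'Accurate but Not OK': { accurate: true, ok: false },
    'Not Accurate but OK': { accurate: false, ok: true },
    'Not Accurate & Not OK': { accurate: false, ok: false },
};

// Reverse lookup: build the label from the two judgments.
function toRating(accurate: boolean, ok: boolean): EvaluationRating {
    if (accurate) {
        return ok ? 'Accurate & OK' : 'Accurate but Not OK';
    }
    return ok ? 'Not Accurate but OK' : 'Not Accurate & Not OK';
}
```

Splitting the label this way could make it easier to aggregate results per axis (for example, the share of responses judged inaccurate) without parsing label strings.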

Key Files

Frontend

| File | Purpose |
| --- | --- |
| shared/src/components/EvaluationButtons.tsx | Thumbs up/down buttons |
| shared/src/components/EvaluationPopup.tsx | Detailed feedback popup |

Component Props

interface EvaluationButtonsProps {
    onEvaluate: (rating: 'thumbs_up' | 'thumbs_down') => void;
    isEvaluated: boolean;
    evaluationRating?:
        | 'Accurate & OK'
        | 'Accurate but Not OK'
        | 'Not Accurate but OK'
        | 'Not Accurate & Not OK'
        | null;
    primaryColor: string;
}

Visual States

The buttons have different visual states:

| State | Appearance |
| --- | --- |
| Default | Outline icons |
| Hover | Slight scale up |
| Active (thumbs up) | Filled with theme color |
| Active (thumbs down) | Filled with theme color |

Analytics Events

| Event | Trigger | Data |
| --- | --- | --- |
| rw_message_evaluated | Button clicked | rating, message_id, session_id |
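As a rough illustration of how this event could be assembled, the sketch below maps camelCase arguments onto the snake_case property names listed for `rw_message_evaluated`. The event name and property names come from this page; the `CaptureFn` type, the string-typed IDs, and this standalone `trackMessageEvaluated` function are assumptions and may differ from the project's actual `Analytics` wrapper.

```typescript
// Event name and property names are taken from the Analytics Events section.
const MESSAGE_EVALUATED_EVENT = 'rw_message_evaluated';

type ThumbRating = 'thumbs_up' | 'thumbs_down';

// Assumption: IDs are strings; the real types may differ.
type MessageEvaluatedProperties = {
    rating: ThumbRating;
    message_id: string;
    session_id: string;
};

// Hypothetical injection point for the analytics client, for example a
// thin wrapper around an analytics SDK's capture(event, properties) call.
type CaptureFn = (event: string, properties: Record<string, unknown>) => void;

function trackMessageEvaluated(
    capture: CaptureFn,
    args: { rating: ThumbRating; messageId: string; sessionId: string },
): void {
    // Map camelCase arguments to the snake_case event properties.
    const properties: MessageEvaluatedProperties = {
        rating: args.rating,
        message_id: args.messageId,
        session_id: args.sessionId,
    };
    capture(MESSAGE_EVALUATED_EVENT, properties);
}

// Usage with a stand-in capture function that just logs the event.
trackMessageEvaluated(
    (event, properties) => console.log(event, properties),
    { rating: 'thumbs_up', messageId: 'msg-1', sessionId: 'sess-1' },
);
```

Passing the capture function in as a parameter keeps the sketch independent of any particular analytics SDK and makes it straightforward to test.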

Current Limitations

Frontend-only tracking

Evaluations are currently tracked only through frontend analytics (PostHog); they are not sent to the backend. As a result:

  • Evaluations appear in PostHog dashboards
  • They are not stored in conversation history
  • They are not available for LLM training or fine-tuning

Implementation

<EvaluationButtons
    onEvaluate={(rating) => {
        // Track in analytics
        Analytics.trackMessageEvaluated({
            rating,
            messageId: message.id,
            sessionId: sessionId,
        });
    }}
    isEvaluated={message.isEvaluated}
    evaluationRating={message.evaluationRating}
    primaryColor={themeColor}
/>

Styling

  • Icons use smooth transitions (cubic-bezier(0.4, 0, 0.2, 1))
  • Active state includes subtle fill with 15% opacity
  • Scale animation on active state (1.1x)
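The styling rules above could be expressed as a small helper along the following lines. The easing curve, 15% fill opacity, and 1.1x scale come from this page; the helper names (`hexToRgba`, `evaluationIconStyle`), the 200ms duration, the `stroke` handling, and the assumption that `primaryColor` is a 6-digit hex string are illustrative and not taken from the actual component.

```typescript
// Values from the Styling section.
const TRANSITION_EASING = 'cubic-bezier(0.4, 0, 0.2, 1)';
const ACTIVE_FILL_OPACITY = 0.15;
const ACTIVE_SCALE = 1.1;

// Convert a 6-digit hex color such as "#3366ff" to an rgba() string.
function hexToRgba(hex: string, alpha: number): string {
    const digits = /^#?([0-9a-f]{6})$/i.exec(hex.trim())?.[1];
    if (digits === undefined) {
        throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
    }
    const value = parseInt(digits, 16);
    const r = (value >> 16) & 0xff;
    const g = (value >> 8) & 0xff;
    const b = value & 0xff;
    return `rgba(${r}, ${g}, ${b}, ${alpha})`;
}

type IconStyle = {
    transition: string;
    transform: string;
    fill: string;
    stroke: string;
};

// Hypothetical helper; the 200ms duration is an assumption.
function evaluationIconStyle(primaryColor: string, isActive: boolean): IconStyle {
    return {
        transition: `all 200ms ${TRANSITION_EASING}`,
        // Active icons are scaled up slightly.
        transform: isActive ? `scale(${ACTIVE_SCALE})` : 'scale(1)',
        // Active icons get a subtle theme-colored fill; default icons are outline only.
        fill: isActive ? hexToRgba(primaryColor, ACTIVE_FILL_OPACITY) : 'none',
        stroke: isActive ? primaryColor : 'currentColor',
    };
}
```

For example, `evaluationIconStyle('#3366ff', true).fill` evaluates to `rgba(51, 102, 255, 0.15)`.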

Testing

cd frontend/shared
pnpm test EvaluationButtons
pnpm test EvaluationPopup

Future Improvements

Potential enhancements:

  1. Backend storage - Store evaluations with conversations
  2. Langfuse integration - Link evaluations to traces for analysis
  3. Feedback loops - Use evaluations to improve prompts
  4. Text feedback - Allow free-form comments