Prompt & Skill Development Workflow

This document describes the workflow for developing and testing prompts and skills using local files, AI assistance, and the rose-langfuse tool.

Overview

When you run the development server with just dev development, the system uses local prompt and skill files instead of fetching them from Langfuse. This enables an AI-assisted development workflow in which you can:

  • Edit prompts and skills locally in your IDE
  • Use AI (like Claude) to modify both prompts/skills AND code together
  • Test changes immediately with unit tests
  • Push verified changes to Langfuse when ready

flowchart LR
    A[Edit locally] --> B[Test with pytest]
    B --> C{Tests pass?}
    C -->|Yes| D[rose-langfuse prompt status]
    C -->|No| A
    D --> E[rose-langfuse prompt diff]
    E --> F[rose-langfuse prompt push]
    F --> G[rose-langfuse prompt bump]

Local Development Setup

Start the Development Server

cd backend
just dev development

With IX_ENVIRONMENT=development, the system reads prompts and skills from local files:

backend/apps/shared_data/prompts/website-agent/
├── main.md                          # Main system prompt
├── skills/
│   ├── response_handling/
│   │   └── pricing/
│   │       └── SKILL.md
│   └── clients/
│       └── abtasty.com/
│           └── skills/
│               └── pricing/
│                   └── SKILL.md
└── ...
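
As a rough illustration of the switch described above, the following sketch shows how an environment check and the directory layout could fit together. The function names are hypothetical and the path rules are inferred only from the tree shown here; the real loader may work differently.

```python
import os
from pathlib import Path
from typing import Mapping

# Root of the local prompt tree shown above.
PROMPT_ROOT = Path("backend/apps/shared_data/prompts/website-agent")


def uses_local_files(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when prompts should be read from disk, not Langfuse."""
    # Assumption: only the exact value "development" enables local files.
    return environ.get("IX_ENVIRONMENT") == "development"


def shared_skill_path(category: str, name: str) -> Path:
    """Path of a shared skill, e.g. skills/response_handling/pricing/SKILL.md."""
    return PROMPT_ROOT / "skills" / category / name / "SKILL.md"


def client_skill_path(client: str, name: str) -> Path:
    """Path of a client-specific skill, e.g. skills/clients/abtasty.com/skills/pricing/SKILL.md."""
    return PROMPT_ROOT / "skills" / "clients" / client / "skills" / name / "SKILL.md"


if __name__ == "__main__":
    print(uses_local_files({"IX_ENVIRONMENT": "development"}))
    print(shared_skill_path("response_handling", "pricing"))
    print(client_skill_path("abtasty.com", "pricing"))
```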

AI-Assisted Development

Because prompts and skills are plain Markdown files in your codebase, you can use AI assistants (like Claude Code) to:

  1. Modify prompts and code together - Make coherent changes across the entire system
  2. Refactor skill definitions - Update SKILL.md files while adjusting related Python code
  3. Add new skills - Create new skill files with proper structure and metadata
  4. Test changes - Run unit tests to verify behavior before pushing

Example workflow with AI:

User: "Update the pricing skill to handle enterprise pricing differently"

AI:
1. Reads the current SKILL.md
2. Modifies the skill definition
3. Updates any related Python code if needed
4. Runs unit tests to verify
5. Reports ready for push

Testing with Spy Tests

Use pytest to test prompts and skills locally before pushing to Langfuse.

Run Skill Tests

cd backend
poetry run pytest -m unit packages/ixskills/

What Spy Tests Verify

  • Skill selection logic matches expected intents
  • Prompt rendering produces valid output
  • Skill metadata is correctly parsed
  • Integration between skills and the router works correctly

Example Test Pattern

import pytest

# `skill_selector` and `mock_context` are assumed to be provided by the
# project's test setup; they are not defined in this snippet.

@pytest.mark.unit
async def test_pricing_skill_selected_for_pricing_query(mocker):
    """Verify pricing skill is selected for pricing-related queries."""
    # Test uses local SKILL.md files
    result = await skill_selector.select_skill(
        query="How much does it cost?",
        context=mock_context,
    )
    assert result.skill_name == "pricing"
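
This page does not define "spy test", so the sketch below assumes the usual meaning: a spy wraps a real function, leaves its behavior unchanged, and records how it was called. The selector and router here are toy stand-ins written for illustration, not the project's actual code.

```python
from unittest.mock import Mock


def select_skill(query: str) -> str:
    """Toy keyword-based selector standing in for the real skill selector."""
    pricing_words = ("cost", "price", "pricing")
    return "pricing" if any(w in query.lower() for w in pricing_words) else "general"


def answer(query: str, selector=select_skill) -> str:
    """Route a query through the selector and report which skill handled it."""
    return f"handled by {selector(query)}"


# The spy delegates to the real function while recording every call.
spy = Mock(wraps=select_skill)
reply = answer("How much does it cost?", selector=spy)

# Check both the interaction (what the router asked) and the outcome.
spy.assert_called_once_with("How much does it cost?")
assert reply == "handled by pricing"
```

Asserting on the recorded call as well as the result lets a test catch cases where the right skill is chosen for the wrong reason, for example because the router altered the query before passing it on.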

Syncing with Langfuse

Once your local changes are tested and ready, use rose-langfuse prompt to push them to Langfuse.

Step 1: Check Status

See which files have local modifications:

rose-langfuse prompt status

Output shows sync status for each file:

| Status | Meaning |
| --- | --- |
| SYNCED | Local matches Langfuse |
| LOCAL_MODIFIED | Local has unpushed changes |
| NEW_LOCAL | New file, not yet in Langfuse |
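
To make the three statuses concrete, here is a minimal sketch of how they could be derived by comparing content hashes. This is an assumption about the general approach, not a description of how rose-langfuse computes them; `classify` and `digest` are hypothetical names.

```python
import hashlib
from typing import Optional


def digest(text: str) -> str:
    """Stable fingerprint of a prompt or skill body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify(local: str, remote: Optional[str]) -> str:
    """Map a local file and its remote counterpart (None if absent) to a status."""
    if remote is None:
        return "NEW_LOCAL"
    if digest(local) == digest(remote):
        return "SYNCED"
    return "LOCAL_MODIFIED"


if __name__ == "__main__":
    remote_files = {"main.md": "You are a helpful agent."}
    local_files = {
        "main.md": "You are a helpful agent.",
        "skills/response_handling/pricing/SKILL.md": "# Pricing",
    }
    for name, body in local_files.items():
        print(name, classify(body, remote_files.get(name)))
```

A comparison like this cannot tell whether the local or the remote side changed; it only detects that the two differ.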

Step 2: Review Changes

See the actual diff before pushing:

rose-langfuse prompt diff

This shows a unified diff comparing your local changes against what's in Langfuse.
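
For readers unfamiliar with the format, a unified diff can be produced with Python's standard difflib module. The sketch below only illustrates the output format; the `langfuse/` and `local/` labels and the `prompt_diff` helper are illustrative choices, not the tool's actual output.

```python
import difflib


def prompt_diff(remote: str, local: str, name: str) -> str:
    """Return a unified diff from the remote version to the local version."""
    lines = difflib.unified_diff(
        remote.splitlines(keepends=True),
        local.splitlines(keepends=True),
        fromfile=f"langfuse/{name}",
        tofile=f"local/{name}",
    )
    return "".join(lines)


if __name__ == "__main__":
    remote = "Quote list prices.\nOffer a demo.\n"
    local = "Quote list prices.\nRoute enterprise requests to sales.\n"
    print(prompt_diff(remote, local, "pricing/SKILL.md"))
```

Lines starting with `-` exist only in the Langfuse version, lines starting with `+` exist only locally, and identical inputs produce an empty diff.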

Step 3: Push Changes

Upload your changes to Langfuse:

rose-langfuse prompt push

This creates new versions in Langfuse with the latest label.

Step 4: Bump to Label

Set Langfuse to use your new versions with a specific label:

# Apply 'development' label (default)
rose-langfuse prompt bump

# Apply 'production' label for production deployment
rose-langfuse prompt bump --label production

The bump command applies a label to the latest versions without creating new versions. This is how you control which version the system uses in each environment.
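
The difference between push and bump can be modeled as a list of versions plus a mapping from labels to version numbers. The toy class below is a sketch of that mental model under the behavior described above; it is not the Langfuse data model or client API, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptHistory:
    versions: List[str] = field(default_factory=list)    # version n is versions[n - 1]
    labels: Dict[str, int] = field(default_factory=dict)  # label -> version number

    def push(self, text: str) -> int:
        """Create a new version and move the 'latest' label to it."""
        self.versions.append(text)
        number = len(self.versions)
        self.labels["latest"] = number
        return number

    def bump(self, label: str = "development") -> int:
        """Point a label at the latest version without creating a version."""
        number = self.labels["latest"]
        self.labels[label] = number
        return number

    def resolve(self, label: str) -> str:
        """Return the prompt text an environment using this label would see."""
        return self.versions[self.labels[label] - 1]


history = PromptHistory()
history.push("v1: quote list prices")
history.bump("production")   # production -> version 1
history.push("v2: route enterprise requests to sales")
history.bump()               # development -> version 2, production unchanged
```

In this model, production keeps serving version 1 until it is bumped explicitly, which is why a push alone does not change what a labeled environment uses.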

Complete Workflow Example

# 1. Start development server
cd backend
just dev development

# 2. Edit prompts/skills (or ask AI to help)
# ... make changes to SKILL.md files ...

# 3. Run tests
poetry run pytest -m unit packages/ixskills/

# 4. Check what needs syncing
rose-langfuse prompt status

# 5. Review the diff
rose-langfuse prompt diff

# 6. Push to Langfuse
rose-langfuse prompt push --yes

# 7. Set development label for testing
rose-langfuse prompt bump --label development

# 8. After validation, promote to production
rose-langfuse prompt bump --label production
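
If you want to script steps 3 to 7, one option is a small runner that stops at the first failing command so that nothing is pushed when tests fail. This is a sketch, assuming each command signals failure with a non-zero exit code; `run_steps` and `run_shell` are hypothetical helpers, and the commands are copied from the steps above.

```python
import shlex
import subprocess
from typing import Callable, List, Sequence

STEPS = [
    "poetry run pytest -m unit packages/ixskills/",
    "rose-langfuse prompt status",
    "rose-langfuse prompt diff",
    "rose-langfuse prompt push --yes",
    "rose-langfuse prompt bump --label development",
]


def run_shell(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(argv, check=False).returncode


def run_steps(
    steps: Sequence[str],
    runner: Callable[[Sequence[str]], int] = run_shell,
) -> List[str]:
    """Run steps in order; stop at the first non-zero exit code.

    Returns the steps that completed successfully.
    """
    completed: List[str] = []
    for step in steps:
        if runner(shlex.split(step)) != 0:
            break
        completed.append(step)
    return completed
```

Passing the runner as a parameter makes the sequence easy to dry-run: supply a function that prints the command and returns 0. Promotion to production is deliberately left out so that it stays a manual step after validation.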

Environment Labels

| Label | Purpose | When to Use |
| --- | --- | --- |
| development | Active development | Default label for bump, for dev/staging testing |
| production | Production-ready | After validating in development |
| latest | Most recent version | For pulling/comparing (read-only) |

Best Practices

1. Test Before Pushing

Always run unit tests before pushing changes:

poetry run pytest -m unit packages/ixskills/

2. Use Meaningful Commits

Commit prompt/skill changes with the code that depends on them:

git add backend/apps/shared_data/prompts/
git add backend/packages/ixskills/
git commit -m "feat: add enterprise pricing handling to pricing skill"

3. Validate in Development First

Always bump to development first, test in staging, then bump to production:

rose-langfuse prompt bump --label development
# ... test in staging environment ...
rose-langfuse prompt bump --label production

4. Coordinate with Team

Use rose-langfuse prompt status before making changes to avoid conflicts:

rose-langfuse prompt status
# Check if anyone else has pushed changes
rose-langfuse prompt pull  # If needed

Troubleshooting

Local Changes Not Taking Effect

Ensure you're running with IX_ENVIRONMENT=development:

just dev development  # Correct
just dev staging      # Uses Langfuse, not local files

Tests Failing After Skill Changes

  1. Check that the skill metadata is valid YAML
  2. Verify that the skill category matches the expected values
  3. Run the specific test with verbose output:

poetry run pytest -v packages/ixskills/tests/test_skill_selector.py
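
The first two checks can also be done by hand before running the tests. The sketch below assumes that SKILL.md stores its metadata as a front matter block between two `---` lines and that it has a `category` key; neither is confirmed on this page. It handles only flat `key: value` lines and is not a YAML parser, so use a real YAML library for anything more complex.

```python
from typing import Dict, Iterable


def read_front_matter(text: str) -> Dict[str, str]:
    """Extract flat key: value pairs from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' line")
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("missing closing '---' line")
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        meta[key.strip()] = value.strip()
    return meta


def category_is_allowed(meta: Dict[str, str], allowed: Iterable[str]) -> bool:
    """Check the hypothetical 'category' key against the allowed values."""
    return meta.get("category") in set(allowed)


if __name__ == "__main__":
    sample = "---\nname: pricing\ncategory: response_handling\n---\n# Pricing\n"
    meta = read_front_matter(sample)
    print(meta, category_is_allowed(meta, ["response_handling"]))
```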

Push Rejected

If push fails due to version mismatch:

# Pull latest changes first
rose-langfuse prompt pull

# Resolve any conflicts
# Then push again
rose-langfuse prompt push