Prompt & Skill Development Workflow

This document describes the workflow for developing and testing prompts and skills using local files, AI assistance, and the cli-langfuse-sync tool.

Overview

When you start the development server in development mode (just dev development), the system uses local prompt and skill files instead of fetching them from Langfuse. This enables an AI-assisted development workflow in which you can:

  • Edit prompts and skills locally in your IDE
  • Use AI (like Claude) to modify prompts, skills, and code together
  • Test changes immediately with unit tests
  • Push verified changes to Langfuse when ready

flowchart LR
    A[Edit locally] --> B[Test with pytest]
    B --> C{Tests pass?}
    C -->|Yes| D[cli-langfuse-sync status]
    C -->|No| A
    D --> E[cli-langfuse-sync diff]
    E --> F[cli-langfuse-sync push]
    F --> G[cli-langfuse-sync bump]

Local Development Setup

Start the Development Server

cd backend
just dev development

With IX_ENVIRONMENT=development, the system reads prompts and skills from local files:

backend/apps/shared_data/prompts/website-agent/
├── main.md                          # Main system prompt
├── skills/
│   ├── response_handling/
│   │   └── pricing/
│   │       └── SKILL.md
│   └── clients/
│       └── abtasty.com/
│           └── skills/
│               └── pricing/
│                   └── SKILL.md
└── ...
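
To make the switch between local files and Langfuse concrete, here is a minimal sketch of how a loader might branch on the environment. Only IX_ENVIRONMENT and the directory layout come from this document; `load_prompt`, `PROMPT_ROOT`, and the `fetch_from_langfuse` callable are hypothetical names used for illustration, not the project's actual API.

```python
import os
from pathlib import Path
from typing import Callable

# Assumption: prompts live under this root, as in the tree above.
PROMPT_ROOT = Path("backend/apps/shared_data/prompts/website-agent")


def load_prompt(
    relative_path: str,
    fetch_from_langfuse: Callable[[str], str],
    root: Path = PROMPT_ROOT,
) -> str:
    """Return prompt text from disk in development, otherwise from Langfuse.

    `fetch_from_langfuse` is a hypothetical callable standing in for
    whatever remote client the real system uses.
    """
    if os.environ.get("IX_ENVIRONMENT") == "development":
        return (root / relative_path).read_text(encoding="utf-8")
    return fetch_from_langfuse(relative_path)
```

Passing the remote fetcher in as a parameter keeps the sketch free of network calls, so the branch itself can be exercised in a unit test.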

AI-Assisted Development

Because prompts and skills are plain Markdown files in your codebase, you can use AI assistants (like Claude Code) to:

  1. Modify prompts and code together - Make coherent changes across the entire system
  2. Refactor skill definitions - Update SKILL.md files while adjusting related Python code
  3. Add new skills - Create new skill files with proper structure and metadata
  4. Test changes - Run unit tests to verify behavior before pushing

Example workflow with AI:

User: "Update the pricing skill to handle enterprise pricing differently"

AI:
1. Reads the current SKILL.md
2. Modifies the skill definition
3. Updates any related Python code if needed
4. Runs unit tests to verify
5. Reports ready for push

Testing with Spy Tests

Use pytest to test prompts and skills locally before pushing to Langfuse.

Run Skill Tests

cd backend
poetry run pytest -m unit packages/ixskills/

What Spy Tests Verify

  • Skill selection logic matches expected intents
  • Prompt rendering produces valid output
  • Skill metadata is correctly parsed
  • Integration between skills and the router works correctly

Example Test Pattern

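import pytest

# skill_selector and mock_context are assumed to come from the project's
# test setup (fixtures or imports), which this excerpt does not show.
# Async tests also need an asyncio pytest plugin (for example pytest-asyncio).


@pytest.mark.unit
async def test_pricing_skill_selected_for_pricing_query(mocker):
    """Verify the pricing skill is selected for pricing-related queries."""
    # Runs against the local SKILL.md files.
    result = await skill_selector.select_skill(
        query="How much does it cost?",
        context=mock_context,
    )
    assert result.skill_name == "pricing"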
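
The test pattern above does not show what makes a test a "spy" test. One common reading, assumed here, is that a spy wraps the real object so that calls are recorded while the original behaviour still runs. The sketch below illustrates that idea with the standard library only; `KeywordSelector` is a hypothetical stand-in and not the project's selector.

```python
import asyncio
from unittest.mock import AsyncMock


class KeywordSelector:
    """Hypothetical stand-in for the project's skill selector."""

    async def select_skill(self, query: str) -> str:
        return "pricing" if "cost" in query.lower() else "general"


async def select_with_spy(query: str) -> str:
    selector = KeywordSelector()
    # wraps= forwards each call to the real method and records it.
    spy = AsyncMock(wraps=selector.select_skill)
    selector.select_skill = spy

    result = await selector.select_skill(query)

    # The spy saw exactly one awaited call with the original query.
    spy.assert_awaited_once_with(query)
    return result


if __name__ == "__main__":
    print(asyncio.run(select_with_spy("How much does it cost?")))
```

Because the spy delegates to the real method, the assertion on the result and the assertion on the recorded call can both be made in the same test.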

Syncing with Langfuse

Once your local changes are tested and ready, use cli-langfuse-sync to push them to Langfuse.

Step 1: Check Status

See which files have local modifications:

cli-langfuse-sync status

Output shows sync status for each file:

| Status | Meaning |
| --- | --- |
| SYNCED | Local matches Langfuse |
| LOCAL_MODIFIED | Local has unpushed changes |
| NEW_LOCAL | New file, not yet in Langfuse |
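
As a rough illustration of how the three statuses relate, the sketch below classifies one file from its local text and its Langfuse counterpart. Comparing the full text is an assumption; the real tool may compare hashes, versions, or timestamps instead, and `classify` is a hypothetical name.

```python
from typing import Optional


def classify(local: str, remote: Optional[str]) -> str:
    """Map a local file and its Langfuse counterpart to a status name.

    `remote` is None when Langfuse has no entry for the file yet.
    Only the three statuses listed above are modelled here.
    """
    if remote is None:
        return "NEW_LOCAL"
    if local == remote:
        return "SYNCED"
    return "LOCAL_MODIFIED"
```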

Step 2: Review Changes

See the actual diff before pushing:

cli-langfuse-sync diff

This shows a unified diff comparing your local changes against what's in Langfuse.
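
For readers unfamiliar with the format, a unified diff between the Langfuse text and the local text can be produced with Python's standard difflib module. This is a sketch of the output format only, not the tool's implementation; `prompt_diff` and the `langfuse/` and `local/` header prefixes are illustrative choices.

```python
import difflib


def prompt_diff(remote: str, local: str, name: str) -> str:
    """Return a unified diff from the Langfuse text to the local text."""
    lines = difflib.unified_diff(
        remote.splitlines(keepends=True),
        local.splitlines(keepends=True),
        fromfile=f"langfuse/{name}",
        tofile=f"local/{name}",
    )
    # An empty string means the two texts are identical.
    return "".join(lines)
```

Lines prefixed with `-` exist only in Langfuse, and lines prefixed with `+` exist only in the local file.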

Step 3: Push Changes

Upload your changes to Langfuse:

cli-langfuse-sync push

This creates new versions in Langfuse with the latest label.

Step 4: Bump to Label

Apply a label to your new versions so that the matching environment picks them up:

# Apply 'development' label (default)
cli-langfuse-sync bump

# Apply 'production' label for production deployment
cli-langfuse-sync bump --label production

The bump command applies a label to the latest versions without creating new versions. This is how you control which version the system uses in each environment.
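
The difference between push and bump is easier to see in a toy model. The sketch below keeps versions and labels in memory: push creates a version and moves latest, while bump only moves a label. `PromptHistory` is a hypothetical class written for this illustration and does not reflect how Langfuse or the CLI store data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptHistory:
    """Toy in-memory model of one prompt's versions and labels."""

    versions: List[str] = field(default_factory=list)
    labels: Dict[str, int] = field(default_factory=dict)  # label -> version number

    def push(self, text: str) -> int:
        """Create a new version and point `latest` at it."""
        self.versions.append(text)
        number = len(self.versions)  # versions are numbered from 1
        self.labels["latest"] = number
        return number

    def bump(self, label: str = "development") -> int:
        """Point `label` at the latest version; no version is created."""
        number = self.labels["latest"]
        self.labels[label] = number
        return number

    def resolve(self, label: str) -> str:
        """Return the text an environment using `label` would see."""
        return self.versions[self.labels[label] - 1]


history = PromptHistory()
history.push("pricing v1")
history.bump("production")
history.push("pricing v2")
history.bump()  # moves only the development label
```

After these calls, production still resolves to the first version, while development and latest resolve to the second.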

Complete Workflow Example

# 1. Start development server
cd backend
just dev development

# 2. Edit prompts/skills (or ask AI to help)
# ... make changes to SKILL.md files ...

# 3. Run tests
poetry run pytest -m unit packages/ixskills/

# 4. Check what needs syncing
cli-langfuse-sync status

# 5. Review the diff
cli-langfuse-sync diff

# 6. Push to Langfuse
cli-langfuse-sync push --yes

# 7. Set development label for testing
cli-langfuse-sync bump --label development

# 8. After validation, promote to production
cli-langfuse-sync bump --label production
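
Steps 3 to 7 can also be chained in a small script that stops at the first failure. The sketch below is one way to do that, assuming every command returns a non-zero exit code on failure (this is not confirmed for cli-langfuse-sync, and a diff command may use non-zero to signal differences) and that it is run from the backend directory. `run_workflow` and `run_command` are hypothetical helpers. Promotion to production is left out on purpose, since it follows manual validation.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands copied from the steps above, in order.
STEPS: List[List[str]] = [
    ["poetry", "run", "pytest", "-m", "unit", "packages/ixskills/"],
    ["cli-langfuse-sync", "status"],
    ["cli-langfuse-sync", "diff"],
    ["cli-langfuse-sync", "push", "--yes"],
    ["cli-langfuse-sync", "bump", "--label", "development"],
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_workflow(
    run: Callable[[Sequence[str]], int] = run_command,
) -> List[List[str]]:
    """Run each step in order, stopping at the first non-zero exit code.

    Returns the commands that were actually executed.
    """
    executed: List[List[str]] = []
    for command in STEPS:
        executed.append(command)
        if run(command) != 0:
            break
    return executed
```

The runner is injectable so the ordering and stop-on-failure logic can be tested without invoking the real tools.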

Environment Labels

| Label | Purpose | When to Use |
| --- | --- | --- |
| development | Active development | Default bump label after a push; for dev/staging testing |
| production | Production-ready | After validating in development |
| latest | Most recent version | For pulling/comparing (read-only) |

Best Practices

1. Test Before Pushing

Always run unit tests before pushing changes:

poetry run pytest -m unit packages/ixskills/

2. Use Meaningful Commits

Commit prompt/skill changes with the code that depends on them:

git add backend/apps/shared_data/prompts/
git add backend/packages/ixskills/
git commit -m "feat: add enterprise pricing handling to pricing skill"

3. Validate in Development First

Always bump to development first, test in staging, then bump to production:

cli-langfuse-sync bump --label development
# ... test in staging environment ...
cli-langfuse-sync bump --label production

4. Coordinate with Team

Use cli-langfuse-sync status before making changes to avoid conflicts:

cli-langfuse-sync status
# Check if anyone else has pushed changes
cli-langfuse-sync pull  # If needed

Troubleshooting

Local Changes Not Taking Effect

Ensure you're running with IX_ENVIRONMENT=development:

just dev development  # Correct
just dev staging      # Uses Langfuse, not local files

Tests Failing After Skill Changes

  1. Check that the skill metadata is valid YAML
  2. Verify that the skill category matches one of the expected values
  3. Run the specific test with verbose output:

poetry run pytest -v packages/ixskills/tests/test_skill_selector.py
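
A quick structural check on the metadata can narrow down the first item before running the tests. The sketch below assumes that SKILL.md files begin with a front-matter block delimited by `---` lines, which this document does not state, and it handles only flat `key: value` pairs rather than full YAML. `read_front_matter` and the field names in the usage note are hypothetical.

```python
from typing import Dict


def read_front_matter(markdown: str) -> Dict[str, str]:
    """Extract flat `key: value` pairs from a leading `---` block.

    Raises ValueError if the block is missing, unterminated, or contains
    a line without a colon. Nested YAML is out of scope for this sketch.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("missing closing '---'") from None

    metadata: Dict[str, str] = {}
    for number, line in enumerate(lines[1:end], start=2):
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"line {number}: expected 'key: value'")
        metadata[key.strip()] = value.strip()
    return metadata
```

For example, a file that starts with a block containing `name: pricing` and `category: response_handling` would yield a dictionary with those two entries; for real validation, a full YAML parser is the safer choice.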

Push Rejected

If a push fails because of a version mismatch:

# Pull latest changes first
cli-langfuse-sync pull

# Resolve any conflicts
# Then push again
cli-langfuse-sync push