For skill creators
See which skills work,
which don't, and why.
Your skill works for you. But every user talks differently. selftune gives you a comparison grid with confidence scores, trigger rates, and miss patterns — then lets your users' real signals improve your skills for everyone.
$ npx skills add selftune-dev/selftune
Your default surface
Numbers you can trust
All your skills, side by side. Confidence scores, trigger rates, session counts. Instantly see which skills need attention and which are healthy.
Confidence scores
Every skill gets a confidence score based on real session data. See at a glance which skills are solid and which need work.
Trigger rates
How often does your skill fire when it should? selftune tracks trigger rates and surfaces miss patterns — the queries that should have matched but didn't.
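As a rough illustration of what a trigger rate and a miss pattern mean here, the sketch below computes both from a handful of labeled queries. The `Observation` record, its field names, and the sample queries are hypothetical and only for illustration; they are not selftune's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One query seen in a session (hypothetical shape, for illustration only)."""
    query: str
    should_trigger: bool  # the skill was the right tool for this query
    triggered: bool       # the skill actually fired


def trigger_rate(observations):
    """Return (rate, misses) over the queries that should have triggered."""
    expected = [o for o in observations if o.should_trigger]
    if not expected:
        return None, []  # no relevant queries, so the rate is undefined
    misses = [o.query for o in expected if not o.triggered]
    rate = (len(expected) - len(misses)) / len(expected)
    return rate, misses


# Made-up sample data: three queries should have triggered, one of them did not.
observations = [
    Observation("make a slide deck", True, True),
    Observation("build me a presentation", True, False),
    Observation("what's the weather", False, False),
    Observation("create slides for Monday", True, True),
]
rate, misses = trigger_rate(observations)
print(f"trigger rate: {rate:.2f}")
print("misses:", misses)
```

Queries that were never meant to trigger the skill are left out of the denominator, so an unrelated query does not drag the rate down.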
Grade distributions
See how each skill performs across sessions. Grade distributions show whether a skill is consistently good or wildly inconsistent.
The moat
Your users make your skills better.
Add a config file to your skill package. Users who install it can opt in to share anonymous signals: what triggered, what missed, and how well it worked. selftune aggregates those signals across 20+ users and generates evolution proposals you could never get from testing alone.
Without selftune
You test with your vocabulary. Your users talk differently. You never know what's missing.
Personal mode
Your skills improve from your own sessions. Triggers match how you talk. But you're still just one person.
Crowdsourced
Anonymous signals from real users. Miss patterns you'd never discover. Evolution proposals from how everyone talks.
Detect. Evolve. Validate. Ship.
From detection to deploy in one command. Every change is backed by evidence from real sessions.
See what's missing
The comparison grid shows which skills are underperforming. Drill into any skill to see specific miss patterns — the exact queries that should have triggered your skill but didn't.
- Miss pattern detection
- Query-level evidence
- Confidence scoring
Fix it in one command
'selftune evolve my-skill' rewrites the description based on real usage data. The new version is validated against your eval set before it goes live. The old version is backed up automatically.
- Evidence-based rewrites
- Eval set validation
- Automatic backup
Crowdsource from your users
Add a selftune config to your skill package. Users who install it can opt in to share anonymous signals. You'll see aggregate proposals across 20+ contributors — miss patterns and evolution ideas from how everyone talks.
- Opt-in only
- Anonymized signals
- Aggregate evolution proposals
How the contribution pipeline protects users
Explicit opt-in only
Users must actively opt in to share signals. The default is off. No silent data collection — ever.
Anonymized before it leaves the machine
selftune strips raw prompts and session content on the user's machine. Only aggregate signals are shared: trigger rates, miss patterns, confidence scores. You never see individual user data.
Minimum 20 contributors for proposals
Evolution proposals are only generated when signals from 20+ users are aggregated. This prevents any single user's patterns from being identifiable.
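The threshold can be pictured as a simple gate on the number of distinct contributors. This is a minimal sketch under assumed names: `MIN_CONTRIBUTORS`, `can_generate_proposals`, and the shape of the signal dictionaries are all hypothetical, and it is not selftune's implementation.

```python
MIN_CONTRIBUTORS = 20  # threshold stated above; the constant name is hypothetical


def can_generate_proposals(signals, minimum=MIN_CONTRIBUTORS):
    """Return True only when enough distinct contributors have shared signals."""
    # contributor_id stands in for an opaque, anonymous token (an assumption).
    contributors = {s["contributor_id"] for s in signals}
    return len(contributors) >= minimum


# 19 distinct contributors, one of whom shared twice: still below the threshold.
signals = [{"contributor_id": i, "missed": i % 3 == 0} for i in range(19)]
signals.append({"contributor_id": 0, "missed": True})
print(can_generate_proposals(signals))

# A 20th distinct contributor opens the gate.
signals.append({"contributor_id": 19, "missed": False})
print(can_generate_proposals(signals))
```

Counting distinct contributors rather than raw signals means one very active user cannot push a skill over the threshold alone.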
You review every change
Crowdsourced proposals are suggestions, not automatic deploys. You review, approve, reject, or modify every evolution before it ships.
Numbers you can trust.
From how everyone talks.
Start with the free CLI. Upgrade to Team when you're ready for crowdsourced evolution.
$ npx skills add selftune-dev/selftune
Skill observability updates
New features, skill tuning patterns, and ecosystem news. No spam. Unsubscribe anytime.
Monthly or less