selftune

Self-Improving Skills

selftune watches the skill layer for the moments when a skill should fire but doesn't. Consumers get invisible improvement. Creators get the data to fix those misses. And when creators opt their users in, everyone's skills get better.

The Problem

Your agent has infinite knowledge and zero habits

Skills are how you teach your agent — marketing workflows, PDF generation, compliance checks, research pipelines. But skill descriptions are written based on what you think you'll say, not what you actually say. The gap means skills miss, and nobody notices.

Every correction you make is lost by the next session. selftune turns those corrections into permanent improvements — learning from real usage, validating every change, rolling back if anything regresses.

Our Approach

One product. Two surfaces.

Consumers and creators have different information appetites. Consumers want outcomes — install it, forget it. Creators want evidence — confidence scores, trigger rates, comparison grids. Same product, same data. Different default surfaces.

And when creators opt their users into the contribution pipeline, anonymous usage signals flow back — enabling crowdsourced skill evolution that no amount of personal testing can match.

Who We Built This For

Anyone who teaches their agent how to work

You use skills

You want your agent to just work. selftune runs in the background, improves your skills from real sessions, and never asks you to look at a dashboard. Seen and not heard.

You build skills

You publish skills that others install. selftune gives you a comparison grid with confidence scores, trigger rates, and miss patterns. When your users opt in to share signals, you get evolution proposals powered by how everyone talks — not just you.
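To make "trigger rates" and "miss patterns" concrete, here is a minimal sketch of how such numbers could be derived from logged requests. Everything in it is an assumption made for illustration: the SessionRecord shape, the function names, and the word-count heuristic are hypothetical and do not describe selftune's actual schema or algorithm.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SessionRecord:
    """One logged request. Hypothetical shape, not selftune's real schema."""
    query: str
    expected_skill: str          # skill that should have fired
    fired_skill: Optional[str]   # skill that actually fired, or None


def trigger_rate(records: List[SessionRecord], skill: str) -> float:
    """Share of requests meant for `skill` on which it actually fired."""
    relevant = [r for r in records if r.expected_skill == skill]
    if not relevant:
        return 0.0
    hits = sum(1 for r in relevant if r.fired_skill == skill)
    return hits / len(relevant)


def miss_patterns(records: List[SessionRecord], skill: str,
                  top: int = 3) -> List[Tuple[str, int]]:
    """Most frequent words in requests where `skill` should have fired but did not."""
    words: Counter = Counter()
    for r in records:
        if r.expected_skill == skill and r.fired_skill != skill:
            words.update(r.query.lower().split())
    return words.most_common(top)


# Illustrative data only: a "pdf" skill that fires on one of three relevant requests.
records = [
    SessionRecord("make a pdf of this report", "pdf", "pdf"),
    SessionRecord("export the report as a document", "pdf", None),
    SessionRecord("export slides as a document", "pdf", None),
    SessionRecord("check this for compliance", "compliance", "compliance"),
]

pdf_rate = trigger_rate(records, "pdf")
pdf_misses = miss_patterns(records, "pdf", top=10)
```

In this toy data the missed requests say "export" and "document" rather than "pdf", which is the kind of gap between a written description and real phrasing that a creator would want surfaced. A real implementation would likely filter stop words and weigh phrases, not single words.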

Comparison

Why not just rewrite skills manually?

Approach | Problem
Rewrite the description yourself | No data on what users actually say. No validation. No regression detection.
Add "ALWAYS invoke when..." directives | Brittle. One agent rewrite away from breaking.
Force-load skills on every prompt | Doesn't fix the description. Expensive band-aid.
selftune | Learns from real usage, rewrites descriptions to match how you work, validates against eval sets, and rolls back automatically on regressions.
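The "validate against an eval set, roll back on regression" idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not selftune's pipeline: EvalCase, would_trigger, pass_rate, and apply_or_roll_back are hypothetical names, and the word-overlap matcher is a toy stand-in for however an agent really decides which skill to fire.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvalCase:
    """A labeled request: should the skill fire on this query? (hypothetical shape)"""
    query: str
    should_trigger: bool


def would_trigger(description: str, query: str) -> bool:
    """Toy routing stand-in: fire if the description and query share any word."""
    return bool(set(description.lower().split()) & set(query.lower().split()))


def pass_rate(description: str, eval_set: List[EvalCase]) -> float:
    """Fraction of eval cases where the routing decision matches the label."""
    if not eval_set:
        raise ValueError("eval_set must not be empty")
    correct = sum(
        1 for case in eval_set
        if would_trigger(description, case.query) == case.should_trigger
    )
    return correct / len(eval_set)


def apply_or_roll_back(current: str, candidate: str,
                       eval_set: List[EvalCase]) -> str:
    """Keep the candidate only if it strictly beats the current description."""
    if pass_rate(candidate, eval_set) > pass_rate(current, eval_set):
        return candidate
    return current  # regression or no gain: stay on the known-good description


# Illustrative eval set and descriptions (made up for this sketch).
eval_set = [
    EvalCase("generate a pdf from this page", True),
    EvalCase("export the report as a document", True),
    EvalCase("summarize this email", False),
]
current = "generate pdf files"
better = "generate or export pdf document files"
worse = "handle this"

kept_after_good_change = apply_or_roll_back(current, better, eval_set)
kept_after_bad_change = apply_or_roll_back(current, worse, eval_set)
```

The point of the gate is that a rewrite has to prove itself against labeled requests before it replaces the description that already works. A hand edit has no such check, which is the "no validation, no regression detection" problem in the first row of the table.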

Different Layer

MCP solved connection. selftune solves usage.

Langfuse, LangSmith, and OpenLIT trace LLM calls. selftune operates at the skill layer and uniquely captures consumer usage signals that flow back to creators through a privacy-preserving relay. The crowdsourced evolution loop is how skills improve for everyone.

Dimension | selftune | Langfuse | LangSmith | OpenLIT
Observes | Skill triggers, missed fires, description drift | LLM calls, token usage | Agent traces, chain steps | Infrastructure metrics
Diagnoses | Why a skill didn't fire for a real user request | Latency and cost | Chain failures | System bottlenecks
Improves | Rewrites descriptions, bodies, and routing tables; 3-gate validation, auto-rollback
Validates | 3-gate pipeline, baseline comparison, auto-rollback | Custom evals
Runs | Locally, zero deps, zero API keys | Self-host or cloud | Cloud required | Helm chart
Price | Free (MIT) | Freemium | Paid | Free

These tools are complementary. Langfuse, LangSmith, and OpenLIT trace what happens inside the LLM. selftune makes sure the right skill fires in the first place, then improves it from how you actually work.

Open source. Self-improving.

One npx command. No API keys, no configuration. Your skills start learning from your next session. MIT licensed, forever.