
The State of Agent Skills in 2026

Selftune Team

The Numbers

The agent skill ecosystem grew faster than most people expected. Here is where things stand as of early 2026.

Marketplace Scale:

  • 270,000+ skills published across agent marketplaces
  • The Agent Skills standard now unifies 17+ platforms under a common specification
  • New skills are being published at an accelerating rate, with no signs of plateau

Agent Adoption:

  • Claude Code accounts for approximately 4% of all GitHub commits (roughly 135,000 commits per day) and generates over $1B in ARR
  • Codex onboarded 1 million+ developers in its first month of availability
  • OpenCode has reached 2.5 million monthly active users and 112,000 GitHub stars
  • Multiple additional agent CLIs are in active development or early access

Market Size:

  • The AI agent market was valued at $7.84 billion in 2025
  • Projected to reach $52.62 billion by 2030, a 46.3% CAGR
  • Developer tooling represents one of the fastest-growing segments

These are not projections from a pitch deck. These are observed numbers from public data, platform reporting, and market analysis.

What the Numbers Do Not Tell You

Market size and adoption figures describe growth. They do not describe quality. And on quality, the data tells a different story.

Not a single skill learns from its users. Across all 270K+ published skills, we could not identify one with systematic trigger-rate monitoring, language adaptation, or evidence-based description optimization. Skills are written once and frozen.

No marketplace measures skill effectiveness. Download counts are the primary quality proxy. Downloads measure initial interest. They do not measure whether the skill actually works when a user needs it.

No standard quality metrics exist. There is no agreed-upon way to measure skill trigger accuracy, description quality, or match reliability. Each marketplace uses different signals, none of which directly measure the thing that matters: does the right skill fire for the right prompt?

This gap is not a minor oversight. It is a structural deficit in how the ecosystem operates.

The Quality Infrastructure Gap

Every other layer of the development stack has quality infrastructure:

  Layer          Quality Tooling
  Code           Linters, type checkers, test frameworks
  APIs           Monitoring, error tracking, SLOs
  ML Models      Eval suites, A/B testing, drift detection
  Agent Skills   (nothing)

Skills are the only production component in the modern development stack with no quality feedback loop. Developers publish and hope. Users install and discover. When things do not work, both sides move on.

What We Are Measuring

SelfTune has started collecting data on skill quality across the ecosystems we instrument. Early findings (from our internal testing and early adopter deployments):

  • Description-intent mismatch is present in the majority of skills we have analyzed. Most descriptions use technical language that diverges from how users naturally phrase requests.
  • Trigger accuracy varies wildly. Some well-crafted skills achieve reliable invocation. Many do not.
  • Single-description strategies leave coverage gaps. Skills with one description string cannot cover the natural variation in how users express the same intent.
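The coverage-gap point can be illustrated with a crude token-overlap check between a single skill description and a few natural user phrasings. This is a toy heuristic for intuition only, not the analysis SelfTune runs; the description and phrasings are invented.

```python
def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between lowercase word sets (toy similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


# A single technical description string...
description = "Parses structured invoice documents and extracts line items"

# ...versus how users might actually phrase the same intent.
user_phrasings = [
    "pull the totals out of this invoice",
    "what did I get billed for in this PDF",
    "extract line items from the invoice",
]

for phrase in user_phrasings:
    print(f"{token_overlap(description, phrase):.2f}  {phrase}")
```

Only the phrasing that happens to echo the description's wording scores meaningfully; the other two barely overlap at all, which is exactly the shape of gap a single frozen description cannot close.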

We are being deliberately conservative with specific numbers here because our dataset is still growing. We want the data to be rigorous before we publish percentages.

The Full Report Is Coming

We are compiling a comprehensive "State of Agent Skills" report with detailed analysis across platforms, skill categories, and quality metrics. It will include:

  • Trigger accuracy distributions across skill categories
  • Common description anti-patterns and their measured impact
  • Platform-specific trends and differences
  • Recommendations backed by data, not opinion

The report will be published later this quarter. If you want to be notified when it drops, watch the GitHub repository.

Help Us Build the Dataset

If you develop or use agent skills, you can contribute anonymized usage data to the research:

selftune contribute

This command packages anonymized trigger-pattern data from your local SelfTune observations and submits it to the aggregate dataset. No prompt content is shared. No personal information is included. The contribution is fully optional and transparent about what gets sent.
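To make "no prompt content is shared" concrete, here is a hypothetical sketch of what an anonymized trigger-pattern record could look like, based only on the description above. The field names and hashing scheme are assumptions for illustration, not SelfTune's actual wire format.

```python
import hashlib
import json


def anonymized_record(skill_name: str, prompt: str, fired: bool) -> dict:
    """Illustrative: retain only non-identifying trigger-pattern signals.

    The raw prompt never appears in the record; only a truncated hash
    (for deduplication) and a coarse length bucket are kept.
    """
    return {
        "skill": skill_name,
        "prompt_hash": hashlib.sha256(prompt.encode()).hexdigest()[:16],
        "prompt_tokens_bucket": min(len(prompt.split()) // 10 * 10, 100),
        "fired": fired,
    }


record = anonymized_record("pdf-summarizer", "summarize this quarterly report", True)
print(json.dumps(record, indent=2))
```

The design point is that aggregate trigger statistics survive this transformation while the prompt text does not.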

More data means more accurate analysis. More accurate analysis means better recommendations for everyone building in this ecosystem.

The Takeaway

The agent skill ecosystem is large, growing fast, and operating without quality infrastructure. That is not sustainable. As skills become critical production dependencies rather than optional conveniences, the absence of self-improvement becomes an engineering risk.

We are building the data to quantify that risk precisely. The full report will be the first systematic, evidence-based assessment of skill quality at scale.

In the meantime: if you have published skills, run selftune doctor on them. You might be surprised by what you find.