You know that sinking feeling when you swap in a cheaper model and watch your agent go from sharp to brain-dead? Yeah, me too.
Here’s the deal: every major model provider is now under pressure to show profits. Anthropic just killed third-party token usage for subscription plans—all API billing now. So if you’re running agents at scale, your token bill is about to explode.
The usual reflex is to downgrade the model. But here’s the thing—most tasks are easy. Why pay for a Ferrari to drive to the grocery store?
That’s where OpenSquilla flips the script. It’s an open-source agent that literally scores each request by difficulty, then routes it to one of four tiers. Simple stuff hits DeepSeek-Flash, hard problems still get Claude Opus. You see the routing table in the console—T0, T1, T2, T3. No black box.
Official numbers claim input tokens drop 44% compared to OpenClaw (1.7M vs 3.06M), and total cost cuts somewhere between 60% and 80%. I’d take those with a grain of salt until you measure on your own workload, but the direction is solid.
But the real kicker isn’t the routing. It’s MetaSkill.
I first saw the term and thought, “Oh great, another agent framework claiming it evolves.” Nope. MetaSkill is a protocol. It tells the agent how to chain atomic skills together into real workflows. You know the pain—everyone can write a single skill, and some auto-evolution agents even generate them, but the moment you try to stitch a dozen skills into a reliable pipeline, it falls apart. You end up manually orchestrating, debugging for days.
MetaSkill defines five patterns: sequential pipeline, parallel split and merge, multi-role voting with arbitration, conditional branching, and a few more. The agent picks the right pattern and links skills automatically. They come with seven prebuilt MetaSkills out of the box—each is a composed set of atomic skills. Need a custom one? Just tell MetaSkill Creator what you want in plain language, and it generates a new MetaSkill on the fly.
Now, does it actually work in production? I haven’t run a full-scale benchmark myself, but the architecture makes sense. The Harness layer (routing, skill injection, memory compression, tool gating, workflow orchestration) is where the next wave of agent efficiency lives—not just model upgrades. OpenSquilla is Apache-2.0, 2.6k stars on GitHub, and the install is a single uv command.
Installation is stupid simple:
uv tool install "opensquilla[recommended]"
If you don’t have uv yet, curl one line. Windows even has a no-Python portable version.
Then run the onboard wizard, drop in your API keys, choose which model for each tier, and you’re done. The smart routing is on by default. MetaSkill is also enabled out of the box—just describe your goal in plain English and let the agent figure out the skill combination.
Don’t take my word for it. Pull the repo, run it on your own data, and see if the cost drop is real. That’s the only benchmark that matters.