Inference
The current paradigm is broken.
We built what comes next.*
Frontier AI runs on rented GPUs and metered access — pay per token, pay per month, pay forever for the privilege of using something you can't bring with you. We're done with that model. Substrate-native inference is built, tested, and scaling for participant deployment. The BETA ships with bring-your-own-provider today; the native model, listed in the model picker as Alpenglow Intelligence, comes next.
Where the BETA stands
BYOK today. Native inference next.
At launch, the system is fully functional — memory, federation, skills, marketplace, the whole platform — but inference itself runs through whatever provider you bring. Connect your own API key from any major frontier-model provider (Anthropic, OpenAI, Google, xAI, DeepSeek, Mistral, MiniMax, others). Your machine talks to the provider directly. We never see your prompts, your responses, or your key.
For developers and power users, we recommend OpenRouter — one signup, every major model, one bill. Your agent picks the right model for each task. You don't pay our platform anything to use it.
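To make the BYOK path concrete, here's a minimal sketch using the openai Python package as one OpenAI-compatible client among many. The environment variable, model name, and prompt are illustrative, not part of Alpenglow's configuration; the point is that your key is read locally and the request goes straight to OpenRouter, never through our servers.

```python
# Minimal BYOK sketch: your machine calls OpenRouter directly with your own key.
import os
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # OpenRouter's OpenAI-compatible endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # your key, read from your local environment
)

response = client.chat.completions.create(
    model="openrouter/auto",  # let OpenRouter route the request, or name a model explicitly
    messages=[{"role": "user", "content": "Summarize these meeting notes."}],
)
print(response.choices[0].message.content)
```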
For privacy-first or offline workflows, run local models via Ollama, LM Studio, or any OpenAI-compatible local endpoint. Your conversations never leave your machine.
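The local path follows the same pattern, assuming Ollama's default setup: it serves an OpenAI-compatible endpoint on localhost, so pointing the same client at it keeps every request on your machine. The model name is an example; use whatever you've pulled locally.

```python
# Local-only sketch: the same client, pointed at Ollama's OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's default local endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

response = client.chat.completions.create(
    model="llama3.1",  # example: any model you've pulled with `ollama pull`
    messages=[{"role": "user", "content": "Draft a reply to this email."}],
)
print(response.choices[0].message.content)
```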
Why we won't run that race
Metered inference is rent-seeking.
Every major AI platform's revenue model assumes inference is their value to extract. Charge per query. Charge per month. Charge for tiers. Lock you in by tying their best capabilities to their hosted endpoints, so you can't bring your work anywhere else without losing it.
That model lives or dies by who owns the cheapest GPU cluster and who can extract the most margin between provider cost and customer price. The cost basis is fixed: data centers, electricity, GPU depreciation, the whole tower of capex pointed at one workload. The margin has to come from somewhere — and it comes from you, paying twice: once in subscription fees, and again in the queries that become the next model's training data.
It's a bad architecture for the economics, a worse one for the participant, and an unsustainable structure for the technology to evolve under. It treats inference as a metered utility owned by a small number of incumbents. We disagree, structurally.
What we built
Substrate-native inference. Tested. Working. Scaling.*
Alpenglow's inference is built on the same proprietary substrate that makes the rest of the platform work — the same primitive that gives us deterministic memory recall and anonymized federation. Memory, federation, and inference aren't three separate systems bolted together; they're three operations on one underlying mathematical foundation.
That architectural unification is the breakthrough. It's also what we deliberately keep off public materials. The capability is the claim; the implementation is patent-protected and stays out of marketing copy.
What we will commit to publicly:
- Frontier-quality output. The model is built and tested internally. Quality is real, not aspirational.
- Runs on your hardware. No GPU cluster required. No API dependency. Speed scales with your machine, not with our infrastructure.
- Improves through federation. The system gets better through anonymized signals from participant use — see how federation works.
- Stays local. Your prompts and outputs never leave your devices. The model doesn't phone home.
- Free, structurally. Not "free during BETA." Not "free until we figure out billing." Free as a property of how the architecture works — there's no per-query infrastructure cost for us to recover.
Status: built, tested, working in internal deployment. Scaling for participant rollout is the work that's in flight right now. There's no firm date because we won't ship until it's ready — but it's not a research project. It's the next major release after the BETA stabilizes.
What this means architecturally
Frontier labs cannot price-compete with consumer-hardware inference.
Their cost basis is GPU clusters, data centers, electricity, and people. Ours is your laptop already running. There is no amount of operational efficiency that closes that gap — the floor isn't pricing, it's architecture.
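To show the shape of that gap (and only the shape: every number below is hypothetical, not an Alpenglow or provider figure), compare the cost floors. A hosted provider's marginal cost per query is bounded below by amortized GPU time it must bill for; a local model's marginal cost is bounded by electricity on a device the participant already owns.

```python
# Back-of-envelope cost-floor sketch. All numbers are hypothetical placeholders,
# chosen to show the structure of the comparison, not real pricing.

# Hosted: amortized cluster cost divided by throughput.
gpu_hour_cost = 2.00          # $/GPU-hour (hypothetical; capex + power + ops)
queries_per_gpu_hour = 1000   # hypothetical throughput
hosted_floor = gpu_hour_cost / queries_per_gpu_hour  # $/query someone must recover

# Local: electricity on hardware the participant already runs.
device_watts = 40             # hypothetical draw during inference
seconds_per_query = 10        # hypothetical latency
electricity_price = 0.15      # $/kWh (hypothetical)
local_floor = device_watts * seconds_per_query / 3_600_000 * electricity_price

print(f"hosted floor: ${hosted_floor:.4f}/query")   # a cost that must be billed
print(f"local floor:  ${local_floor:.6f}/query")    # absorbed by power already paid for
```

The specific figures don't matter; what matters is that the hosted floor is a cost someone has to bill for, while the local floor lands on hardware and power the participant is paying for anyway.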
The frontier still wins on bleeding-edge reasoning. We're not claiming substrate-native inference is the absolute best at the absolute hardest problems on day one — that's where their capex pays off, and it should. What we are claiming is that for the overwhelming majority of automation, task flow, memory-grounded reasoning, agent work, and everyday productivity, substrate-native inference is competitive in quality, dramatically better in cost, and structurally better in privacy and continuity.
Over time, as the substrate scales and federation refines the system through real participant use, that quality gap closes for the use cases that actually matter to most people. The claim we'll commit to publicly: most participants will choose free over paid for most flows, on their own, because the math works that way. No coercion required. Just architecture meeting economics.
When our model ships
Your choice, always. We don't lock anyone in.
Most platforms with their own inference restrict access to outside providers — because their revenue depends on you running their model, not someone else's. That conflict of interest is built into the business model. The platform wins when you can't leave.
We don't have that conflict because we don't have inference revenue to defend. When Alpenglow Intelligence ships, bring-your-own-provider still works. Local Ollama still works. Switching to whatever model just got released this morning still works. Nothing about our native inference landing makes any other path harder.
And federation benefits the network regardless of which model produced the work. The success patterns, reasoning paths, and procedural learnings flow back as anonymized signal whether the underlying inference came from us, from Anthropic, from OpenAI, or from a model running locally on your machine. Your participation strengthens the federation independently of your inference choice.
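As a purely hypothetical illustration (Alpenglow's actual wire format isn't public, and this isn't it), a model-agnostic federation signal could carry only derived, content-free fields, with prompts, outputs, identity, and provider name deliberately absent:

```python
# Hypothetical sketch of a model-agnostic, anonymized federation signal.
# NOT Alpenglow's actual format; it only illustrates the properties claimed
# in the text: no prompt or output content, no user identity, and no record
# of which model produced the work.
from dataclasses import dataclass

@dataclass(frozen=True)
class FederationSignal:
    skill_id: str      # which skill or flow the pattern belongs to
    outcome: str       # e.g. "success" | "retry" | "abandoned"
    steps_taken: int   # length of the reasoning path, not its content
    duration_ms: int   # coarse timing, useful for refining procedures

signal = FederationSignal(skill_id="calendar.reschedule", outcome="success",
                          steps_taken=4, duration_ms=2300)
```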
The bright lines
Things you'll never see from Alpenglow.
- No metered inference. No per-token billing, ever — for our model or anyone else's.
- No tiered access. No "Pro" subscription that unlocks higher-quality models. Same architecture for everyone.
- No platform routing. Your queries don't pass through our servers on the way to a provider. There is no per-query path through our infrastructure.
- No data sale. We don't sell your prompts, your conversations, or anything derived from them. Not to advertisers, not to research firms, not to AI labs.
- No usage caps. If your provider lets you run a query, we let you run it through Alpenglow.
- No vendor lock-in. Switch providers any time. Run locally any time. Use our native inference any time. We're indifferent to your model choice and we'll stay that way.
Get on the BETA.
Bring whatever provider key you already have. Run the system end-to-end. The native model lands when it's ready — and you'll be running on it without changing anything else about how you work.
Request access