Base URL →
https://api.passy.ai/v1Models →
https://api.passy.ai/v1/modelsWhy Passy?
Premium models
Keep using frontier models like ChatGPT, Claude and Gemini
Open models
Use
passy/open models like Deepseek, Kimi, Qwen, Glm & more🇪🇺 EU hosting
All of our models are configured for Zero Data Retention & EU by default
OpenAI & Anthropic compatible
Drop in replacement for any AI SDK and use it in Pi, Cursor, Kilo, Warp & more
We only track prompt inputs and outputs needed for cost analysis
Ready to Start?
Quickstart
Send your first request and see all endpoints
SDKs
SDK guides for Vercel, OpenAI & Anthtropic
Tools & Agents
Tool guides for Cursor, Claude Code, Pi & more
Migrate flawlessly to passy in 1 minute with no downtime
Handy Guides
Setup your Passy.ai walletCreate Account (X)
Sign up to Passy.ai
Create API Key
Send your first request
Invite Member (X)
Add your team
Create Workspace (X)
Seperate projects and set limits
Need Help?
WhatsApp Support
Chat with us instantly
Book a Call
Meet with us
Feedback
Tell us more
We are a personal, founder-led company.
When you work with Passy, you talk with us directly, you can chat with us instantly on WhatsApp, or book a live meeting on our calendar within minutes.
We love to know exactly who you are, what you’re working on and what your team needs so we can help you better!
When you work with Passy, you talk with us directly, you can chat with us instantly on WhatsApp, or book a live meeting on our calendar within minutes.
We love to know exactly who you are, what you’re working on and what your team needs so we can help you better!
Problems We Solve
Too many providers
Too many providers
The Problem: Setting up individual accounts across multiple AI providers (Azure for ChatGPT, AWS for Claude, Vertex for Gemini) takes days of legal onboarding, corporate billing setup, account configuration, model access requests, rate limit increases, and key provisioning. Because of so many different models releases and overtaking top position for different tasks week-to-week, teams face intense vendor lock-in or massive manual code re-writes trying to remain modular. Switching smoothly between premium and open-source models usually results in system downtime and a lot of work.Passy’s Solution: One unified API key that grants immediate, single-click access to every major model layout. Passy serves as a routing infrastructure: your applications maintain a single code dependency, while your team changes models or providers on the fly without changing backend code.
Surprise bills
Surprise bills
The Problem: Standard cloud provider consoles lack fine-grained spending controls. When engineering teams build autonomous agents or complex CI/CD tooling, a single logical bug can cause an agent to enter an infinite processing loop exhausting thousands of € in token costs over a single weekend with zero early warning.Passy’s Solution: Granular, multi-tier budget controls. Passy allows project leads to establish hard spending limits, automated alerts, and overflow controls across four distinct structural dimensions:
- Total Workspace
- Team Member
- **Per Key **
- Overflow Limits
- Balance Alerts
Untrustworthy Limits
Untrustworthy Limits
The Problem: Traditional SaaS “unlimited” coding plans are inherently unstable. Shifting rate limits, unexpected throttling, and sudden price changes prevent developers from maintaining a predictable day-to-day workflow. Meanwhile, trying to self-host open-source models onto local company GPUs creates massive engineering overhead, high infrastructure bills, and compliance liabilities.Passy’s Solution: A highly predictable, cost-optimized European GPU cluster. By hosting open models directly on our hardware, Passy only covers core hosting overhead, translating efficiency into compounding value:
- Passy Pro Subscriptions (€29/mo, billed annually): Includes 1M tokens per day on Passy models plus an additional 10% structural discount.
- Zero Gateway Overhead: All non-subscription users automatically receive a 5% built-in discount on Passy models, entirely nullifying the standard 5% gateway routing fee.
Locking in your developers❗
Locking in your developers❗
The Problem: The AI developer landscape is shifting rapidly. Engineers want the freedom to jump between Cursor, Claude Code, Cline, Warp, Open Code, Cline, and new emerging experimental agents. Forcing a development team onto a single corporate-mandated tool destroys engineering velocity and hurts your team directly!Passy’s Solution: Total developer tooling freedom. Because Passy acts as a drop-in endpoint matching standard provider schemas, your developers can configure their workspace to use any IDE extension or agent framework they prefer. You control the wallet and compliance; your developers choose their tools.
Slow responses
Slow responses
The Problem: Standard public cloud endpoints handle millions of concurrent global requests, causing heavy API latency fluctuations and strict initial usage tiers that throttle developers during high-velocity sprint cycles.Passy’s Solution: Isolated, hyper-focused compute. Because Passy infrastructure is provisioned specifically for our ecosystem, we offer massive API rate limits without demanding initial spending histories or multi-tiered verification delays. For maximum optimization, teams can run the Passy CLI a local gateway CLI tool that handles cryptographic security and routing rules directly on the local machine, reducing network overhead to a bare minimum giving you the safest and fastest access.
Local and private GPU endpoints
Local and private GPU endpoints
The Problem: Enterprise organizations utilizing custom-trained models or local private GPU nodes have no clean way to manage, audit, or access their siloed infrastructure alongside external public frontier models. This fragments team workflows and breaks regulatory reporting structures.Passy’s Solution: Hybrid endpoint federation. Passy permits organizations to securely link their private, self-hosted endpoints directly into our centralized dashboard ecosystem. By simply whitelisting
passy.ai within your corporate virtual private cloud (VPC) or local private networks, engineers can consume internal local assets and external premium models through the exact same secure key maintaining flawless audit visibility for legal and compliance teams.Developer Tool Compatibility
Works with your existing setup - Just change two lines of code.Cursor
Kilo Code
Cline
Claude Code
Open Code
Pi Agent
Why Choose Passy
- Devs
- Teams
- Enterprises
Fast Responses
Low-latency & fast inference for smooth coding.
Reliable Pricing
Top-Up only and 10% discount with Passy Pro on
passy/ modelsPlug and Play with any Agent and SDK
Works with your existing editors & tools.
Zero Data Retention
Your data stays yours!

.png?fit=max&auto=format&n=kWt04qZJ2feLSVvu&q=85&s=025addac6f17d281b5a7490171c947c5)