Skip to main content
Passy.ai is the AI wallet for your apps, developers and agents.
Base URLhttps://api.passy.ai/v1
Modelshttps://api.passy.ai/v1/models

Why Passy?

Premium models

Keep using frontier models like ChatGPT, Claude and Gemini

Open models

Use passy/open models like Deepseek, Kimi, Qwen, Glm & more

🇪🇺 EU hosting

All of our models are configured for Zero Data Retention & EU by default

OpenAI & Anthropic compatible

Drop in replacement for any AI SDK and use it in Pi, Cursor, Kilo, Warp & more
We only track prompt inputs and outputs needed for cost analysis

Ready to Start?

Quickstart

Send your first request and see all endpoints

SDKs

SDK guides for Vercel, OpenAI & Anthtropic

Tools & Agents

Tool guides for Cursor, Claude Code, Pi & more
Migrate flawlessly to passy in 1 minute with no downtime

Handy Guides

Setup your Passy.ai wallet

Create Account (X)

Sign up to Passy.ai

Create API Key

Send your first request

Invite Member (X)

Add your team

Create Workspace (X)

Seperate projects and set limits

Need Help?

WhatsApp Support

Chat with us instantly

Book a Call

Meet with us

Feedback

Tell us more
We are a personal, founder-led company.

When you work with Passy, you talk with us directly, you can chat with us instantly on WhatsApp, or book a live meeting on our calendar within minutes.

We love to know exactly who you are, what you’re working on and what your team needs so we can help you better!

Problems We Solve

The Problem: Setting up individual accounts across multiple AI providers (Azure for ChatGPT, AWS for Claude, Vertex for Gemini) takes days of legal onboarding, corporate billing setup, account configuration, model access requests, rate limit increases, and key provisioning. Because of so many different models releases and overtaking top position for different tasks week-to-week, teams face intense vendor lock-in or massive manual code re-writes trying to remain modular. Switching smoothly between premium and open-source models usually results in system downtime and a lot of work.Passy’s Solution: One unified API key that grants immediate, single-click access to every major model layout. Passy serves as a routing infrastructure: your applications maintain a single code dependency, while your team changes models or providers on the fly without changing backend code.
The Problem: Standard cloud provider consoles lack fine-grained spending controls. When engineering teams build autonomous agents or complex CI/CD tooling, a single logical bug can cause an agent to enter an infinite processing loop exhausting thousands of € in token costs over a single weekend with zero early warning.Passy’s Solution: Granular, multi-tier budget controls. Passy allows project leads to establish hard spending limits, automated alerts, and overflow controls across four distinct structural dimensions:
  • Total Workspace
  • Team Member
  • **Per Key **
* Per day, week, month
  • Overflow Limits
  • Balance Alerts
The Problem: Traditional SaaS “unlimited” coding plans are inherently unstable. Shifting rate limits, unexpected throttling, and sudden price changes prevent developers from maintaining a predictable day-to-day workflow. Meanwhile, trying to self-host open-source models onto local company GPUs creates massive engineering overhead, high infrastructure bills, and compliance liabilities.Passy’s Solution: A highly predictable, cost-optimized European GPU cluster. By hosting open models directly on our hardware, Passy only covers core hosting overhead, translating efficiency into compounding value:
  • Passy Pro Subscriptions (€29/mo, billed annually): Includes 1M tokens per day on Passy models plus an additional 10% structural discount.
  • Zero Gateway Overhead: All non-subscription users automatically receive a 5% built-in discount on Passy models, entirely nullifying the standard 5% gateway routing fee.
As our network grows, economy of scale allows Passy to get cheaper over time, giving companies easy access to secure and trustworthy EU-hosted GPU power without the maintenance nightmare.
The Problem: The AI developer landscape is shifting rapidly. Engineers want the freedom to jump between Cursor, Claude Code, Cline, Warp, Open Code, Cline, and new emerging experimental agents. Forcing a development team onto a single corporate-mandated tool destroys engineering velocity and hurts your team directly!Passy’s Solution: Total developer tooling freedom. Because Passy acts as a drop-in endpoint matching standard provider schemas, your developers can configure their workspace to use any IDE extension or agent framework they prefer. You control the wallet and compliance; your developers choose their tools.
The Problem: Standard public cloud endpoints handle millions of concurrent global requests, causing heavy API latency fluctuations and strict initial usage tiers that throttle developers during high-velocity sprint cycles.Passy’s Solution: Isolated, hyper-focused compute. Because Passy infrastructure is provisioned specifically for our ecosystem, we offer massive API rate limits without demanding initial spending histories or multi-tiered verification delays. For maximum optimization, teams can run the Passy CLI a local gateway CLI tool that handles cryptographic security and routing rules directly on the local machine, reducing network overhead to a bare minimum giving you the safest and fastest access.
The Problem: Enterprise organizations utilizing custom-trained models or local private GPU nodes have no clean way to manage, audit, or access their siloed infrastructure alongside external public frontier models. This fragments team workflows and breaks regulatory reporting structures.Passy’s Solution: Hybrid endpoint federation. Passy permits organizations to securely link their private, self-hosted endpoints directly into our centralized dashboard ecosystem. By simply whitelisting passy.ai within your corporate virtual private cloud (VPC) or local private networks, engineers can consume internal local assets and external premium models through the exact same secure key maintaining flawless audit visibility for legal and compliance teams.

Developer Tool Compatibility

Works with your existing setup - Just change two lines of code.

Cursor

Kilo Code

Cline

Claude Code

Open Code

Pi Agent


Why Choose Passy

Fast Responses

Low-latency & fast inference for smooth coding.

Reliable Pricing

Top-Up only and 10% discount with Passy Pro on passy/ models

Plug and Play with any Agent and SDK

Works with your existing editors & tools.

Zero Data Retention

Your data stays yours!

Need Help?

WhatsApp Support

Book a Call

Feedback