Build a FastAPI app on Emby’s OpenAI-compatible API, using the modern uv package manager for fast Python dependency management.

What is uv?

uv is an extremely fast Python package installer and resolver, written in Rust. It’s a drop-in replacement for pip that’s 10-100x faster.

Prerequisites

  • Python 3.9+ installed
  • An Emby account with an API key

Installation

1. Install uv

curl -LsSf https://astral.sh/uv/install.sh | sh
2. Create Project

Create a new project directory and initialize it:
mkdir emby-fastapi && cd emby-fastapi
uv init
3. Install Dependencies

uv add fastapi uvicorn openai python-dotenv
This creates a pyproject.toml and installs dependencies in a virtual environment.
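After this step, pyproject.toml records the project metadata and dependencies. It will look roughly like this (exact names and version pins are illustrative and depend on your uv version):

```toml
[project]
name = "emby-fastapi"
version = "0.1.0"
requires-python = ">=3.9"
dependencies = [
    "fastapi",
    "uvicorn",
    "openai",
    "python-dotenv",
]
```

The uv.lock file alongside it pins exact versions for reproducible installs.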
4. Set Environment Variables

Create a .env file:
EMBY_API_KEY=your-api-key-here
EMBY_BASE_URL=https://dev.emby.ai/v1

Create the Application

1. Create the Emby Client

# emby_client.py
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

emby = OpenAI(
    api_key=os.getenv("EMBY_API_KEY"),
    base_url=os.getenv("EMBY_BASE_URL"),
)
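If EMBY_API_KEY is unset, the client is constructed with api_key=None and every request fails later with a confusing authentication error. A small guard that fails fast at startup can help (a sketch; require_env is a hypothetical helper, not part of any library):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail fast if it is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Simulate a loaded .env for this sketch
os.environ["EMBY_API_KEY"] = "your-api-key-here"
print(require_env("EMBY_API_KEY"))
```

With a guard like this, emby_client.py would call require_env("EMBY_API_KEY") instead of os.getenv, so a missing key is reported at import time rather than on the first request.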
2. Create the FastAPI App

# main.py
from fastapi import FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from emby_client import emby

app = FastAPI(title="Emby FastAPI")

class ChatRequest(BaseModel):
    message: str
    model: str = "gpt-4o"

class ChatResponse(BaseModel):
    reply: str

@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    try:
        completion = emby.chat.completions.create(
            model=request.model,
            messages=[{"role": "user", "content": request.message}],
        )
        return ChatResponse(reply=completion.choices[0].message.content)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.post("/chat/stream")
async def chat_stream(request: ChatRequest):
    async def generate():
        stream = emby.chat.completions.create(
            model=request.model,
            messages=[{"role": "user", "content": request.message}],
            stream=True,
        )
        for chunk in stream:
            content = chunk.choices[0].delta.content
            if content:
                yield content

    return StreamingResponse(generate(), media_type="text/plain")

@app.get("/health")
async def health():
    return {"status": "healthy"}
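The /chat/stream endpoint above returns raw text. If your clients expect Server-Sent Events instead, each chunk needs SSE framing, which is just string formatting around the generator (a sketch; to_sse is a hypothetical helper):

```python
def to_sse(chunks):
    """Wrap text chunks in Server-Sent Events framing."""
    for chunk in chunks:
        yield f"data: {chunk}\n\n"
    yield "data: [DONE]\n\n"

print("".join(to_sse(["Hel", "lo"])), end="")
```

In the endpoint, you would pass generate() through this wrapper and change the media type to text/event-stream.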
3. Run the Server

uv run uvicorn main:app --reload
Your API is now running at http://localhost:8000. Test it with:
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'

Project Structure

emby-fastapi/
├── .env
├── .python-version
├── pyproject.toml
├── uv.lock
├── emby_client.py
└── main.py

Advanced: Async Client

The synchronous client blocks the event loop inside async route handlers. To avoid that and get better performance from FastAPI, use the async client:
# emby_client.py
import os
from dotenv import load_dotenv
from openai import AsyncOpenAI

load_dotenv()

emby = AsyncOpenAI(
    api_key=os.getenv("EMBY_API_KEY"),
    base_url=os.getenv("EMBY_BASE_URL"),
)
# main.py (async version)
@app.post("/chat", response_model=ChatResponse)
async def chat(request: ChatRequest):
    completion = await emby.chat.completions.create(
        model=request.model,
        messages=[{"role": "user", "content": request.message}],
    )
    return ChatResponse(reply=completion.choices[0].message.content)
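With AsyncOpenAI, the streaming endpoint changes too: the create(..., stream=True) call must be awaited, and the stream is consumed with async for. The control flow looks like this (a minimal sketch; fake_stream stands in for the awaited API stream, which is not called here):

```python
import asyncio

async def fake_stream():
    # Stand-in for: await emby.chat.completions.create(..., stream=True)
    for piece in ["Hel", "lo", "!"]:
        yield piece

async def generate():
    # Mirrors the streaming endpoint body, but with `async for`
    async for chunk in fake_stream():
        yield chunk

async def main():
    return "".join([chunk async for chunk in generate()])

print(asyncio.run(main()))
```

In main.py, generate() keeps the same shape and is still passed to StreamingResponse unchanged.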

Available Models

Use any Emby-supported model:
# Popular choices
model = "gpt-4o"           # Fast and capable
model = "gpt-5"            # Most powerful
model = "claude-sonnet-4-5" # Anthropic's latest
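Because ChatRequest includes a model field with a default of "gpt-4o", callers can switch models per request just by setting it in the JSON body, with no server changes:

```python
import json

# The "model" key overrides the server-side default of "gpt-4o"
body = json.dumps({"message": "Summarize this.", "model": "claude-sonnet-4-5"})
print(body)
```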

Need Help?