Why Edge Functions for AI

Three reasons Edge Functions are the right choice for AI API calls:

1. **Security**: Your OpenAI or Anthropic API key lives as a server-side secret and is never exposed to the client; no one reading your frontend code can extract it.

2. **Database access**: Edge Functions run inside Supabase's infrastructure and have direct, low-latency access to your PostgreSQL database. You can fetch user context, store results, and log usage in the same function that calls the AI.

3. **Streaming support**: Edge Functions support Response streaming, which lets you send AI output to the client word-by-word — dramatically improving perceived performance for long AI responses.

Basic OpenAI Proxy Function

The simplest Edge Function: receive a prompt, call OpenAI, return the response.

```typescript
import OpenAI from "npm:openai"

const openai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") })

Deno.serve(async (req) => {
  const { prompt } = await req.json()
  
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
    max_tokens: 500
  })
  
  return new Response(
    JSON.stringify({ result: chat.choices[0].message.content }),
    { headers: { "Content-Type": "application/json" } }
  )
})
```

Deploy with `supabase functions deploy openai-proxy`, and set the secret with `supabase secrets set OPENAI_API_KEY=sk-...`.
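Once deployed, the function is reachable over HTTPS. A sketch of an example call, where `abcd1234` stands in for your own project ref, and the project's public anon key (not the OpenAI key) authorizes the request:

```shell
# Invoke the deployed function; YOUR_ANON_KEY is the project's public anon key
curl -X POST "https://abcd1234.supabase.co/functions/v1/openai-proxy" \
  -H "Authorization: Bearer YOUR_ANON_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Write a haiku about edge computing."}'
```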

Adding Authentication and Rate Limiting

Every AI Edge Function should verify the user is authenticated and check their usage limits:

```typescript
import { createClient } from "npm:@supabase/supabase-js"
Deno.serve(async (req) => {
  // Verify auth token
  const token = req.headers.get("Authorization")?.replace("Bearer ", "")
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!
  )
  const { data: { user }, error } = await supabase.auth.getUser(token)
  if (error || !user) return new Response("Unauthorized", { status: 401 })
  
  // Check rate limit (max 20 requests/hour)
  const oneHourAgo = new Date(Date.now() - 3600000).toISOString()
  const { count } = await supabase
    .from("ai_usage_log")
    .select("*", { count: "exact", head: true }) // head: true returns only the count, no rows
    .eq("user_id", user.id)
    .gte("created_at", oneHourAgo)
  
  if ((count ?? 0) >= 20) return new Response("Rate limit exceeded", { status: 429 })
  
  // ... call OpenAI and log usage
})
```
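The elided logging step can look like the sketch below. The `ai_usage_log` table is the one the rate-limit query reads; the `tokens_used` column and the `logUsage` / `hourWindowStart` names are assumptions for illustration, so adapt them to your schema:

```typescript
// Pure helper: the same one-hour window the rate-limit check uses
function hourWindowStart(now: number = Date.now()): string {
  return new Date(now - 3600000).toISOString()
}

// After the OpenAI call succeeds, record the request so future
// rate-limit checks against ai_usage_log can see it
async function logUsage(
  supabase: any, // a @supabase/supabase-js client, typed loosely in this sketch
  userId: string,
  tokensUsed: number
): Promise<void> {
  const { error } = await supabase.from("ai_usage_log").insert({
    user_id: userId,
    tokens_used: tokensUsed, // assumed column; adapt to your schema
  })
  if (error) console.error("usage log insert failed:", error.message)
}
```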

Streaming Responses to WeWeb

Streaming sends AI output to the client progressively — users see text appearing word by word instead of waiting for the full response.

In the Edge Function:
```typescript
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages,
  stream: true
})

const readable = new ReadableStream({
  async start(controller) {
    for await (const chunk of stream) {
      const text = chunk.choices[0]?.delta?.content || ""
      controller.enqueue(new TextEncoder().encode(text))
    }
    controller.close()
  }
})

return new Response(readable, {
  // These are raw text chunks, not SSE "data: ..." events, so text/plain
  // is the accurate content type; use text/event-stream only with SSE framing
  headers: { "Content-Type": "text/plain; charset=utf-8" }
})
```

In WeWeb: use a custom JavaScript action to fetch the stream URL and update a page variable character by character as chunks arrive.
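That custom JavaScript action boils down to a reader loop over the response body. A minimal sketch: `streamToVariable` is an illustrative name, and the `wwLib` call mentioned in the comment is the usual way to update a page variable from WeWeb custom JS (the variable uid is an assumption):

```typescript
// Read the Edge Function's streamed response and surface partial text.
// onChunk receives the accumulated text after every chunk; in WeWeb you
// would call wwLib.wwVariable.updateValue("<aiOutput variable uid>", text)
// inside that callback.
async function streamToVariable(
  response: Response,
  onChunk: (textSoFar: string) => void
): Promise<string> {
  const reader = response.body!.getReader()
  const decoder = new TextDecoder()
  let text = ""
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    text += decoder.decode(value, { stream: true })
    onChunk(text)
  }
  return text
}
```

Inside the action, `fetch` the function URL first, then hand the response to the loop: `const res = await fetch(functionUrl, { method: "POST", headers, body }); await streamToVariable(res, (t) => { /* update the page variable */ })`.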

Building a RAG Pipeline (Retrieval Augmented Generation)

RAG improves AI answers by injecting relevant knowledge into the prompt at query time. Architecture:

1. **Knowledge ingestion** (run once): For each document in your knowledge base, call OpenAI's embedding API to get a 1536-dimensional vector. Store vectors in Supabase using the pgvector extension.

2. **Query time**: When a user asks a question, embed it with the same embedding API, then run a similarity search in Supabase: `SELECT content, 1 - (embedding <=> query_embedding) AS similarity FROM documents ORDER BY embedding <=> query_embedding LIMIT 3` (ordering by the distance operator directly lets pgvector use its index).

3. **Augmented prompt**: Inject the top 3 matching documents into the system prompt: "Answer using only the following context: [docs]. If the answer isn't in the context, say you don't know."

Result: the AI answers from your documentation alone, which dramatically reduces (though does not fully eliminate) hallucination about things you haven't documented; the model can still misread the supplied context.
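The query-time steps above can be sketched as a single handler you would wire into `Deno.serve`. In the real Edge Function you would `import OpenAI from "npm:openai"` and `import { createClient } from "npm:@supabase/supabase-js"`; the clients are typed loosely here so the sketch stays self-contained, and `match_documents` is an assumed Postgres function wrapping the similarity SELECT from step 2:

```typescript
// Step 3 helper: inject retrieved docs into the system prompt
function buildAugmentedPrompt(docs: string[]): string {
  return (
    "Answer using only the following context:\n\n" +
    docs.map((d, i) => `[${i + 1}] ${d}`).join("\n\n") +
    "\n\nIf the answer isn't in the context, say you don't know."
  )
}

async function answerWithRag(
  openai: any,   // OpenAI client
  supabase: any, // supabase-js client
  question: string
): Promise<string> {
  // Step 2a: embed the question with the same model used at ingestion
  const emb = await openai.embeddings.create({
    model: "text-embedding-3-small", // 1536 dimensions, matching the stored vectors
    input: question,
  })

  // Step 2b: similarity search; match_documents is an assumed SQL function
  // wrapping SELECT content ... ORDER BY embedding <=> query_embedding LIMIT 3
  const { data: docs } = await supabase.rpc("match_documents", {
    query_embedding: emb.data[0].embedding,
    match_count: 3,
  })

  // Step 3: answer only from the retrieved context
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: buildAugmentedPrompt(docs.map((d: { content: string }) => d.content)) },
      { role: "user", content: question },
    ],
  })
  return chat.choices[0].message.content
}
```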