What is the difference between a rule-based chatbot and an AI chatbot?

A rule-based chatbot follows a fixed decision tree and only handles questions it was explicitly programmed for. An AI chatbot powered by GPT-4 handles free-form natural language, can synthesise information from multiple sources, and can answer questions it was never explicitly trained for. AI chatbots are more versatile but can occasionally produce incorrect answers, which is why a clear escalation path is essential.

How do I make the chatbot know about my product?

Use RAG (Retrieval Augmented Generation). Store your product documentation as text chunks in Supabase with pgvector embeddings. At query time, find the 3–5 most relevant chunks for the user's question and inject them into the system prompt. This approach scales to thousands of documentation pages, and the chatbot picks up changes as soon as you re-embed the content you added or edited.

How much does an AI chatbot cost to run per month?

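For 500 active users sending 10 messages per day on GPT-4o, at roughly 50 input and 100 output tokens per message and $2.50/M input plus $10/M output pricing: approximately $170/month, before counting the system prompt and conversation history resent with each request. Running the same traffic on GPT-4o mini costs roughly $10/month, so use the cheaper model where quality allows. Implement a token budget per session, and log your actual usage weekly for the first month to dial in your cost model.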

Can the chatbot hand off to a human agent?

Yes. Design escalation triggers in the system prompt: specific phrases or topics that cause the bot to output an ESCALATE signal. Your backend detects this, creates a support ticket, and notifies your team with the full conversation transcript. The user sees a smooth message about being connected to the support team.

How do I know if my chatbot is working?

Measure three metrics: resolution rate (conversations that did not escalate), post-chat thumbs up/down rating, and the list of queries with no knowledge base match. Review weekly. Add answers to the top 20 unanswered queries each week. A well-maintained chatbot should reach 75–85% resolution within 6 weeks of launch.

How to Build an AI Chatbot for Your App

Chatbots built with GPT-4 can answer customer questions, guide onboarding, handle support tickets, and provide personalised recommendations, all without a human agent. Here's exactly how to build one inside a no-code app.

The Architecture

A production AI chatbot has four components:

1. **Chat UI**: Input field, message history display, loading state, error handling
2. **Conversation state**: An array of messages (role + content) stored in frontend state
3. **Backend proxy**: A Supabase Edge Function or Xano endpoint that calls OpenAI
4. **System prompt**: The instruction set that defines your chatbot's persona, knowledge, and constraints

The conversation state is the most important concept. OpenAI's API is stateless: it remembers nothing between calls, so every request must include the full conversation history. Your frontend maintains this history and sends it with each message.
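To make the statelessness concrete, here is a minimal sketch of the conversation state and the payload built from it. The `Message` type and the `buildPayload` helper are illustrative names, not part of any SDK; the only point is that the whole history travels with every request.

```typescript
// Illustrative types: role + content, as in the component list above.
type Role = "system" | "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Hypothetical helper: builds the message list for one request.
// Nothing is remembered server-side, so the full history is included each time.
function buildPayload(systemPrompt: string, earlierTurns: Message[], userText: string): Message[] {
  return [
    { role: "system", content: systemPrompt },
    ...earlierTurns,
    { role: "user", content: userText },
  ];
}

const earlierTurns: Message[] = [
  { role: "user", content: "How do I invite a teammate?" },
  { role: "assistant", content: "Open Settings, then Team, then Invite." },
];

const payload = buildPayload("You are a support agent.", earlierTurns, "Can they be read-only?");
console.log(payload.length); // 4: system prompt + 2 earlier turns + the new question
```

Because the payload grows with every turn, the input token count per request grows too, which is why a token budget matters later on.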

Chatbot Architecture Choices: Retrieval vs Generation

Before building, decide whether your chatbot is primarily retrieval-based or generation-based. This choice determines the architecture, cost, and quality characteristics of the finished product.

A retrieval-based chatbot finds the best matching answer from a pre-defined knowledge base. The user asks a question, your system finds the most similar Q&A in the database, and presents it, optionally reformatted by the LLM. This is fast, cheap (few tokens consumed), and highly accurate within the knowledge domain. It fails when users ask questions not covered by the knowledge base, producing unhelpful "I don't have information on that" responses.

A generation-based chatbot sends the user's question to an LLM with context and lets the model compose a novel answer. This handles questions that were never explicitly written, can synthesise across multiple sources, and produces natural conversational responses. The cost is higher (more tokens per message), and the model can occasionally generate plausible-sounding but incorrect information.

For most customer-facing support chatbots, the right architecture is hybrid: start with retrieval (pull the top 3 most relevant knowledge base articles) and use generation to compose a coherent answer from those sources, with a hard constraint that the model should not answer from outside those sources. This is RAG (Retrieval Augmented Generation), and it gives you accuracy with flexibility.
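The hybrid step can be sketched as a small prompt-composition function. This is one possible shape, assuming the retrieval step has already returned the articles; the `Article` type, the `buildGroundedPrompt` name, and the exact wording of the constraint are all illustrative. A prompt instruction like this reduces out-of-source answers but does not guarantee them away.

```typescript
// Hypothetical shape for a retrieved knowledge base article.
interface Article {
  title: string;
  body: string;
}

// Composes a system prompt that restricts the model to the retrieved sources.
function buildGroundedPrompt(basePrompt: string, articles: Article[]): string {
  const sources = articles
    .map((a, i) => `[Source ${i + 1}] ${a.title}\n${a.body}`)
    .join("\n\n");
  return [
    basePrompt,
    "Answer using only the sources below.",
    "If the sources do not contain the answer, say you do not know.",
    "",
    sources,
  ].join("\n");
}

const groundedPrompt = buildGroundedPrompt("You are a support agent for Acme SaaS.", [
  { title: "Inviting team members", body: "Go to Settings > Team and click Invite." },
  { title: "Billing cycles", body: "Plans renew monthly on the signup date." },
]);
console.log(groundedPrompt);
```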

Building the Chat UI in WeWeb

In WeWeb, create a page-level variable `messages` (array, default empty). Add two components:

**Message list**: A Repeating Group bound to `messages`. Each item has a conditional style: user messages are right-aligned with a primary colour background, assistant messages left-aligned with a neutral background. Bind the text to `item.content`.

**Input area**: A text input bound to a `userInput` variable, plus a "Send" button. On button click: (1) append `{role: "user", content: userInput}` to `messages`, (2) clear `userInput`, (3) call the API action, (4) append the response as `{role: "assistant", content: response}`.

Add a loading spinner that shows while the API call is in progress.
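In WeWeb these steps are configured as a workflow rather than written by hand, but the logic is easier to check when spelled out. The sketch below mirrors the four steps plus the loading flag; `ChatState`, `send`, and the `callApi` signature are assumptions standing in for the page variables and the API action.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Stand-ins for the page-level variables described above.
interface ChatState {
  messages: Message[];
  userInput: string;
  isLoading: boolean;
}

// `callApi` stands in for the WeWeb API action; its signature is an assumption.
type CallApi = (messages: Message[]) => Promise<string>;

async function send(state: ChatState, callApi: CallApi): Promise<void> {
  const text = state.userInput.trim();
  if (text === "" || state.isLoading) return; // ignore empty input and double clicks

  state.messages.push({ role: "user", content: text }); // step 1: append the user message
  state.userInput = "";                                  // step 2: clear the input
  state.isLoading = true;                                // show the spinner
  try {
    const reply = await callApi(state.messages);         // step 3: call the API
    state.messages.push({ role: "assistant", content: reply }); // step 4: append the reply
  } catch {
    // Surface failures in the chat instead of leaving the spinner running.
    state.messages.push({ role: "assistant", content: "Sorry, something went wrong. Please try again." });
  } finally {
    state.isLoading = false;
  }
}

// Usage with a stub in place of the real backend call.
const demoState: ChatState = { messages: [], userInput: "Hello", isLoading: false };
send(demoState, async () => "Hi! How can I help?").then(() => {
  console.log(demoState.messages.length); // 2: the user message and the reply
});
```

Guarding on `isLoading` prevents a second request from being sent while the first is still in flight, which would otherwise interleave replies.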

Training the Chatbot on Your Documentation

A chatbot that only knows what GPT-4o was trained on will not know your product's specific features, pricing, policies, or procedures. To make the chatbot genuinely useful for your users, you need to inject your knowledge base into the conversation. There are two approaches: static injection (paste documentation directly into the system prompt) and dynamic injection via RAG.

Static injection works for small knowledge bases under 5,000 words. Write your documentation as structured text, add it to the system prompt, and the model uses it as its primary reference. The disadvantages: it is expensive at scale, because the entire knowledge base is sent with every message, and keeping the system prompt updated as documentation changes is manual work.

For larger knowledge bases, RAG is the right approach. Process your documentation into chunks of 200–500 words, embed each chunk using OpenAI's text-embedding-3-small model, and store the embeddings in Supabase with the pgvector extension. At query time, embed the user's message, find the top 3–5 most similar chunks via cosine similarity search, and inject only those chunks into the system prompt. This costs dramatically less per message and scales to thousands of documentation pages. When you update documentation, re-embed the changed chunks and update the database; the chatbot then picks up the changes on the next query.
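The chunking and similarity steps can be illustrated in memory. This is a sketch only: in the setup described above, pgvector runs the similarity search inside the database and the embeddings come from the embedding model, whereas here the vectors are made-up three-dimensional toys and all function names are illustrative.

```typescript
// Splits text into chunks of at most `maxWords` words (200-500 in practice).
function chunkByWords(text: string, maxWords: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedChunk {
  text: string;
  embedding: number[];
}

// Returns the k chunks most similar to the query embedding, best first.
function topK(query: number[], chunks: EmbeddedChunk[], k: number): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.chunk);
}

// Toy data: real embeddings have hundreds or thousands of dimensions.
const store: EmbeddedChunk[] = [
  { text: "billing", embedding: [1, 0, 0] },
  { text: "onboarding", embedding: [0, 1, 0] },
  { text: "integrations", embedding: [0.7, 0.7, 0] },
];
console.log(topK([0.9, 0.1, 0], store, 2).map((c) => c.text)); // ["billing", "integrations"]
console.log(chunkByWords("one two three four five", 2)); // ["one two", "three four", "five"]
```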

The Supabase Edge Function

Your Edge Function receives the message array and system prompt, calls OpenAI, and returns the response:

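```typescript
// Imports assume the Deno runtime used by Supabase Edge Functions;
// the version pins are assumptions and may need updating.
import { serve } from "https://deno.land/std@0.168.0/http/server.ts"
import OpenAI from "npm:openai@4"

// The API key is read from the function's environment (variable name is an assumption).
const openai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") })

serve(async (req) => {
  // Expected body: { messages: [{ role, content }, ...], systemPrompt: string }
  const { messages, systemPrompt } = await req.json()

  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: systemPrompt }, // instructions always come first
      ...messages                                // full history: the API keeps no state
    ],
    max_tokens: 500,  // caps reply length and therefore cost
    temperature: 0.7
  })

  // Return only the assistant's text to the frontend
  return new Response(
    JSON.stringify({ content: completion.choices[0].message.content }),
    { headers: { "Content-Type": "application/json" } }
  )
})
```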

The `systemPrompt` can be passed from the frontend (useful for multi-persona apps) or hardcoded in the function (more secure, because anything sent from the browser can be altered by the user).

Writing an Effective System Prompt

The system prompt determines everything about your chatbot's behaviour. A good production system prompt includes:

- **Role**: "You are a customer support agent for Acme SaaS, a project management tool."
- **Knowledge**: "You help users with: creating projects, inviting team members, setting up integrations, and billing questions."
- **Constraints**: "Only answer questions about Acme SaaS. For unrelated questions, politely redirect. Never discuss competitor products. Never make up features that don't exist."
- **Format**: "Keep responses under 100 words. Use bullet points for steps. Always end support answers with: 'Let me know if this helps!'"
- **Escalation**: "If the user expresses frustration or mentions a billing error, say: 'I'll connect you with our team' and trigger the escalation flow."
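One way to keep these five parts maintainable is to store them separately and join them when the function runs. The sketch below assumes that approach; `PromptSections` and `buildSystemPrompt` are illustrative names, and the section labels are a formatting choice rather than anything the model requires.

```typescript
// The five parts of the system prompt listed above, kept as separate fields.
interface PromptSections {
  role: string;
  knowledge: string;
  constraints: string;
  format: string;
  escalation: string;
}

// Joins the sections into one prompt, with a blank line between sections.
function buildSystemPrompt(s: PromptSections): string {
  return [
    s.role,
    `Knowledge: ${s.knowledge}`,
    `Constraints: ${s.constraints}`,
    `Format: ${s.format}`,
    `Escalation: ${s.escalation}`,
  ].join("\n\n");
}

const acmePrompt = buildSystemPrompt({
  role: "You are a customer support agent for Acme SaaS, a project management tool.",
  knowledge: "You help users with creating projects, inviting team members, integrations, and billing questions.",
  constraints: "Only answer questions about Acme SaaS. Never make up features that don't exist.",
  format: "Keep responses under 100 words. Use bullet points for steps.",
  escalation: "If the user mentions a billing error, say: 'I'll connect you with our team'.",
});
console.log(acmePrompt);
```

Keeping the sections separate makes it easier to edit one rule (for example the escalation wording) without touching the rest.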

Handling Escalation to Human Support

Every production chatbot needs a clear escalation path for questions the AI cannot handle confidently, emotionally charged conversations, and situations that require account-level actions a bot should not take (issuing refunds, deleting accounts, making billing exceptions). Design this into the system prompt and the UI from the beginning, not as an afterthought.

In the system prompt, define escalation triggers explicitly: "If the user mentions a billing dispute, payment failure, or account suspension, do not attempt to resolve it. Instead, respond exactly with: ESCALATE: [brief reason], and nothing else." Your Edge Function detects the ESCALATE prefix and creates a support ticket in your helpdesk (Intercom, Zendesk, or even a Supabase table) rather than displaying the message to the user.
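The detection step in the Edge Function can be as small as a prefix check. The following is a minimal sketch, assuming the model follows the instruction and emits `ESCALATE:` at the start of its reply; `BotReply` and `parseReply` are illustrative names.

```typescript
// A reply is either a normal answer or an escalation signal with a reason.
type BotReply =
  | { kind: "escalate"; reason: string }
  | { kind: "answer"; text: string };

const ESCALATE_PREFIX = "ESCALATE:";

// Classifies the raw model output by checking for the ESCALATE prefix.
function parseReply(raw: string): BotReply {
  const trimmed = raw.trim();
  if (trimmed.startsWith(ESCALATE_PREFIX)) {
    return { kind: "escalate", reason: trimmed.slice(ESCALATE_PREFIX.length).trim() };
  }
  return { kind: "answer", text: trimmed };
}

const escalated = parseReply("ESCALATE: billing dispute");
const normal = parseReply("You can invite teammates from Settings.");
console.log(escalated.kind, normal.kind); // "escalate" "answer"
```

On an `escalate` result the backend would create the ticket and return a flag to the frontend instead of the raw text, so the user never sees the signal itself.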

In the UI, when an escalation is detected, replace the chatbot interface with a message: "I'm connecting you with our support team. They'll follow up within [SLA]. You can also email support@yourcompany.com." Email the support team with the full conversation transcript. This approach keeps the handoff smooth for the user while giving the support agent full context. Measure escalation rate as a core chatbot metric: a rate above 20% suggests the knowledge base needs expansion.

Adding Persistent Context

The basic chatbot forgets everything when the page refreshes. To make it smarter:

**User context injection**: When the chatbot session starts, fetch the user's account data (plan, usage, recent activity) and append it to the system prompt: "The user's current plan is Pro. Their last activity was 3 days ago. They have 2 active projects."

**Conversation persistence**: Save messages to a Supabase table (chatbot_sessions) with user_id and session_id. On page load, fetch the last N messages and pre-populate the messages array.

**Knowledge base**: For product documentation, store articles in Supabase with embeddings (using pgvector). Before calling GPT-4o, run a similarity search and inject the most relevant articles into the system prompt. This is called RAG (Retrieval Augmented Generation) and dramatically improves answer accuracy.
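User context injection, the first of the three techniques, amounts to appending a few generated sentences to the system prompt. A sketch under assumed field names (`plan`, `daysSinceLastActivity`, `activeProjects` are hypothetical; use whatever your account table actually stores):

```typescript
// Hypothetical account fields fetched when the chat session starts.
interface UserContext {
  plan: string;
  daysSinceLastActivity: number;
  activeProjects: number;
}

// Appends a short, factual description of the user to the system prompt.
function withUserContext(systemPrompt: string, user: UserContext): string {
  const lines = [
    `The user's current plan is ${user.plan}.`,
    `Their last activity was ${user.daysSinceLastActivity} days ago.`,
    `They have ${user.activeProjects} active projects.`,
  ];
  return `${systemPrompt}\n\nUser context:\n${lines.join("\n")}`;
}

const personalised = withUserContext("You are a support agent for Acme SaaS.", {
  plan: "Pro",
  daysSinceLastActivity: 3,
  activeProjects: 2,
});
console.log(personalised);
```

Only include fields the bot needs to answer well; everything added here is sent to the model on every request.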

Measuring Chatbot Success

A chatbot without measurement is a feature with no feedback loop. The metrics that matter most for a support or product chatbot: resolution rate (percentage of conversations where the user did not escalate or submit a separate support ticket), session length (average messages per conversation; too short suggests the bot is failing early, too long suggests it is not resolving efficiently), and post-chat rating (a simple thumbs up/down after each conversation, stored in Supabase and reviewable in a dashboard).
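These three metrics are simple aggregates over stored sessions. The sketch below assumes a hypothetical `SessionRecord` shape and, as a simplification, treats "not escalated" as resolved; tracking separately submitted support tickets would need a join against your helpdesk data.

```typescript
// Hypothetical per-session record, e.g. one row per chat session.
interface SessionRecord {
  messageCount: number;
  escalated: boolean;
  rating: "up" | "down" | null; // null when the user left no rating
}

interface Metrics {
  resolutionRate: number;      // share of sessions that did not escalate
  avgMessages: number;         // average messages per conversation
  thumbsUpRate: number | null; // share of rated sessions with a thumbs up
}

function summarise(sessions: SessionRecord[]): Metrics {
  if (sessions.length === 0) {
    return { resolutionRate: 0, avgMessages: 0, thumbsUpRate: null };
  }
  const resolved = sessions.filter((s) => !s.escalated).length;
  const totalMessages = sessions.reduce((sum, s) => sum + s.messageCount, 0);
  const rated = sessions.filter((s) => s.rating !== null);
  const up = rated.filter((s) => s.rating === "up").length;
  return {
    resolutionRate: resolved / sessions.length,
    avgMessages: totalMessages / sessions.length,
    thumbsUpRate: rated.length === 0 ? null : up / rated.length,
  };
}

const weekly = summarise([
  { messageCount: 4, escalated: false, rating: "up" },
  { messageCount: 6, escalated: false, rating: "up" },
  { messageCount: 2, escalated: true, rating: "down" },
  { messageCount: 8, escalated: false, rating: null },
]);
console.log(weekly.resolutionRate, weekly.avgMessages); // 0.75 5
```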

Beyond these user-facing metrics, instrument your Edge Function to log: which knowledge base articles were retrieved most often (tells you what users ask about most), which queries had no strong match in the vector database (tells you knowledge base gaps), and model latency and token counts per session (tells you your cost per conversation). Review these metrics weekly for the first month after launch.

The most actionable metric is the list of queries with no knowledge base match. Export it weekly, write answers for the top 20 unanswered queries, add them to the knowledge base, and re-embed. A chatbot that improves its resolution rate by 5% per week for the first month will often reach 80%+ resolution within 6 weeks of launch, significantly better than most human-first support workflows at early-stage SaaS companies.
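Producing the weekly top-20 list is a counting exercise over the logged queries. A rough sketch, assuming the log is available as an array of strings; it groups queries only by lower-cased exact text, so paraphrases of the same question are counted separately and still need a manual pass.

```typescript
// Counts logged no-match queries and returns the n most frequent, most common first.
function topUnanswered(queries: string[], n: number): Array<{ query: string; count: number }> {
  const counts = new Map<string, number>();
  for (const q of queries) {
    const key = q.trim().toLowerCase(); // naive normalisation
    if (key === "") continue;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts, ([query, count]) => ({ query, count }))
    .sort((a, b) => b.count - a.count)
    .slice(0, n);
}

const gaps = topUnanswered(
  ["How do I export to CSV?", "how do i export to csv?", "Do you support SSO?"],
  20,
);
console.log(gaps[0]); // { query: "how do i export to csv?", count: 2 }
```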

Costs and Performance in Production

For a SaaS with 500 active users each sending 10 messages/day:

- Average message: 50 tokens input + 100 tokens output
- GPT-4o pricing: $2.50/M input + $10/M output
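- Daily cost: 500 × 10 = 5,000 messages, so 250,000 input tokens (~$0.63) + 500,000 output tokens ($5.00) = ~$5.63/day = ~$170/month, before counting the system prompt and resent history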
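Pricing input and output tokens separately, the same assumptions can be worked through in a few lines. The function and type names are illustrative, and the figures exclude the system prompt and any conversation history resent with each request, which raise the input side in practice.

```typescript
interface Pricing {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Estimated daily spend, with input and output tokens priced separately.
function dailyCost(
  users: number,
  messagesPerUser: number,
  inputTokens: number,
  outputTokens: number,
  pricing: Pricing,
): number {
  const messages = users * messagesPerUser;
  const inputCost = ((messages * inputTokens) / 1e6) * pricing.inputPerMillion;
  const outputCost = ((messages * outputTokens) / 1e6) * pricing.outputPerMillion;
  return inputCost + outputCost;
}

const gpt4o: Pricing = { inputPerMillion: 2.5, outputPerMillion: 10 };
const perDay = dailyCost(500, 10, 50, 100, gpt4o);
console.log(perDay);      // 5.625
console.log(perDay * 30); // 168.75
```

Output tokens dominate here: they are two thirds of the volume and cost four times as much per token, so capping reply length has more effect than trimming the question.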

To manage costs, implement a session token budget: once the conversation exceeds 2,000 tokens, stop sending older messages verbatim and summarise them instead. Use GPT-4o mini for simple queries ($0.15/M input) and reserve GPT-4o for complex ones.
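A session budget can be enforced by keeping only the most recent messages that fit. The sketch below uses a rough assumption of about four characters per token for English text rather than a real tokenizer, so treat the counts as estimates; `trimToBudget` is an illustrative name.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walks backwards from the newest message, keeping messages until the budget is used up.
// `dropped` holds the older messages, which are the candidates for summarising.
function trimToBudget(messages: Message[], budget: number): { kept: Message[]; dropped: Message[] } {
  let used = 0;
  let start = messages.length;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > budget) break;
    used += cost;
    start = i;
  }
  return { kept: messages.slice(start), dropped: messages.slice(0, start) };
}

const longChat: Message[] = [
  { role: "user", content: "a".repeat(40) },      // ~10 tokens
  { role: "assistant", content: "b".repeat(40) }, // ~10 tokens
  { role: "user", content: "c".repeat(40) },      // ~10 tokens
];
const trimmed = trimToBudget(longChat, 25);
console.log(trimmed.kept.length, trimmed.dropped.length); // 2 1
```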

Response time: GPT-4o returns in 1–3 seconds. Add a typing indicator to set expectations. For a response that feels sub-second, implement streaming so the first words appear while the rest is still being generated.
