
Building Chattr: Open-Source Support Chatbot

May 1, 2026 · 12 min read
AI · TypeScript · Open Source · Frontend Development

Most AI chatbot demos look impressive for five minutes, then fall apart the moment you ask a practical product question. They answer too broadly, cannot cite your docs, leak across environments, and are awkward to embed into a real website. I wanted to build something closer to an actual support surface, not another fullscreen playground.

That became Chattr, an open-source, MIT-licensed, self-hosted support chatbot that you can drop into any site with a single script tag. Under the hood it combines a Hono API, a framework-agnostic vanilla TypeScript widget, SQLite plus sqlite-vec for retrieval, configurable guardrails, and a multi-tenant architecture that lets one deployment power multiple branded assistants safely.

It is also intentionally flexible about where intelligence comes from. Teams can run Chattr with OpenAI, Anthropic, Azure OpenAI, or fully local providers like Ollama and other OpenAI-compatible runtimes. That means you can optimize for privacy, cost, portability, or simplicity instead of getting locked into one vendor shape.

This post is about the engineering decisions that made Chattr feel product-ready: why the embed model mattered as much as the LLM, how I kept retrieval small and portable, why guardrails had to be first-class instead of an afterthought, and what tradeoffs come with running multiple assistants from one codebase.

Why Chattr Needed a Different Shape

The starting point was simple: most companies do not need a giant AI platform. They need a support chatbot that can answer questions about pricing, onboarding, docs, policies, and setup without forcing the team into a new helpdesk workflow.

That sounds modest, but the product requirements stack up quickly:

  • It has to embed on any site with almost no integration work
  • It has to answer from the company's own content, not generic model knowledge
  • It has to be safe enough for customer-facing use
  • It has to support different model providers, because every team has different privacy and cost constraints
  • It has to work for more than one brand without tenant data leaking across boundaries
  • It has to stay inspectable and adaptable, because customer-facing AI is much easier to trust when it is open source

Those constraints pushed the architecture in a very different direction from a typical single-tenant chat app. The model call is only one slice of the problem. The harder parts are tenant resolution, origin validation, ingestion, retrieval quality, escalation, and making the widget feel native inside someone else's site.

I also wanted Chattr to stay operationally small. If a support chatbot requires a vector database cluster, a queue system, a crawler farm, and three orchestration services before you can test it, it is already too heavy for the teams I had in mind. That is why I kept coming back to SQLite, local-first ingestion flows, config-driven behavior, and a deploy story that works just as well for a Docker container as it does for a small VPS.

Start with the Embed, Not the Model

The biggest product insight was that the assistant experience starts before the first prompt. If the widget is painful to install, visually disconnected from the host site, or difficult to scope per tenant, the best retrieval pipeline in the world will not save it.

So I treated the widget bootstrap as core architecture, not just frontend glue. Each tenant owns branding, starter questions, escalation settings, prompt behavior, allowed origins, and provider configuration. When the widget loads, the server resolves the tenant and returns the branded widget defaults for that session.

type TenantBootstrap = {
  tenantId: string;
  name: string;
  widget: {
    theme?: {
      primaryColor?: string;
      title?: string;
      subtitle?: string;
      avatarUrl?: string;
    };
    welcomeMessage?: string;
    starterQuestions?: string[];
    escalation?: {
      email?: string;
      url?: string;
      phone?: string;
      phoneHours?: string;
    };
  };
};
 
export function buildTenantBootstrap(
  tenantId: string,
  tenant: TenantConfig,
): TenantBootstrap {
  return {
    tenantId,
    name: tenant.name,
    widget: buildTenantWidgetDefaults(tenant),
  };
}

That design pays off in a few ways. First, the widget script stays tiny because it does not need hardcoded customer configuration. Second, every tenant can have a different prompt, model provider, and escalation policy without forking the frontend. Third, tenant resolution and origin validation happen before normal chat traffic starts, which closes off a whole class of accidental cross-site use.
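
As a sketch of that flow, here is roughly what a bootstrap endpoint with origin validation can look like in Hono. The route path, tenant shape, and loadTenantConfig helper are illustrative assumptions, not Chattr's actual API.

import { Hono } from "hono";

// Assumed tenant shape and lookup; Chattr's real config is richer.
type Tenant = { name: string; allowedOrigins: string[] };
declare function loadTenantConfig(tenantId: string): Tenant | undefined;

const app = new Hono();

app.get("/api/:tenantId/bootstrap", (c) => {
  const tenantId = c.req.param("tenantId");
  const tenant = loadTenantConfig(tenantId);
  if (!tenant) return c.json({ error: "unknown tenant" }, 404);

  // Reject requests from origins the tenant has not allow-listed.
  const origin = c.req.header("Origin") ?? "";
  if (!tenant.allowedOrigins.includes(origin)) {
    return c.json({ error: "origin not allowed" }, 403);
  }

  return c.json({ tenantId, name: tenant.name });
});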

It also keeps integration dead simple. From the host site's perspective, the happy path is still one script tag. That matters a lot. Support tooling loses momentum fast when setup needs a custom SDK wrapper, a framework-specific package, or a dedicated frontend team. Chattr had to fit on a marketing site, a docs portal, or a product dashboard with the same mental model.
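
For illustration, the whole client-side contract amounts to something like the loader below; the widget URL and data attribute are placeholders, not Chattr's real embed snippet.

// Hypothetical equivalent of the one-script-tag embed.
// The widget URL and data attribute are placeholders.
const widgetScript = document.createElement("script");
widgetScript.src = "https://chattr.example.com/widget.js";
widgetScript.dataset.tenant = "acme";
widgetScript.async = true;
document.head.appendChild(widgetScript);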

The framework-agnostic part mattered too. I did not want teams to adopt React, Next.js, or a specific component library just to add support chat. A vanilla TypeScript widget makes the product much easier to drop into real-world sites without turning the integration into a platform decision.

Retrieval That Stays Small and Portable

Once the embed story was clear, the next challenge was grounding. A support chatbot that cannot answer from real docs is just autocomplete with a nicer face.

I wanted retrieval without dragging in a separate vector service, so Chattr stores embeddings in SQLite with sqlite-vec. Every tenant gets its own database, which makes isolation easy to reason about and keeps local self-hosting lightweight. Content comes from two paths: local file ingestion and sitemap-driven scraping. That gives teams a fast way to bootstrap knowledge from the content they already maintain.

The retrieval pipeline is intentionally straightforward:

  1. Ingest pages or documents
  2. Normalize and chunk the content
  3. Generate embeddings with the tenant's configured provider
  4. Store chunks, vectors, metadata, and source URLs in the tenant database
  5. At chat time, retrieve the best matches, rerank them, dedupe sources, and compute confidence (steps 4 and 5 are sketched after this list)
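
To make steps 4 and 5 concrete, here is a minimal sketch of per-tenant storage and lookup with better-sqlite3 and sqlite-vec. The table name, embedding dimension, and file layout are assumptions, not Chattr's actual schema.

import Database from "better-sqlite3";
import * as sqliteVec from "sqlite-vec";

// One SQLite file per tenant keeps isolation easy to reason about.
const db = new Database("data/acme/knowledge.db");
sqliteVec.load(db);

// vec0 virtual table holds the embeddings; the dimension is illustrative.
db.exec(
  "CREATE VIRTUAL TABLE IF NOT EXISTS vec_chunks USING vec0(embedding float[1536])",
);

function nearestChunks(queryEmbedding: number[]) {
  // KNN query; sqlite-vec accepts the query vector as a JSON array string.
  return db
    .prepare(
      "SELECT rowid, distance FROM vec_chunks WHERE embedding MATCH ? AND k = 5 ORDER BY distance",
    )
    .all(JSON.stringify(queryEmbedding));
}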

The confidence layer matters more than people think. In support, a mediocre answer is often worse than a short uncertain one, because it sounds authoritative. Chattr uses retrieval scoring to decide whether it should answer directly, hedge, or push the user toward a handoff flow.

// Rerank raw vector hits with lexical and metadata signals, then keep topK.
const ranked = results
  .map((result) => {
    const meta = JSON.parse(result.metadata || "{}") as {
      source?: string;
      title?: string;
    };

    return {
      ...result,
      meta,
      score: scoreResult(
        expandedQuery,
        result.content,
        meta.source,
        meta.title,
        result.distance,
        intent,
      ),
    };
  })
  .sort((a, b) => b.score - a.score)
  .slice(0, topK);

// Collapse duplicate source URLs so citations stay readable.
const sources = dedupeSources(
  ranked
    .map((result) => ({
      title: result.meta.title || result.meta.source || "Source",
      url: result.meta.source,
    }))
    .filter((source): source is RetrievedSource => Boolean(source.url)),
);

// Hand back the stitched context, the citations, and a confidence signal.
return {
  context: ranked
    .map((result) => {
      const source = result.meta.source ? `\nSource URL: ${result.meta.source}` : "";
      return `${result.content}${source}`;
    })
    .join("\n\n---\n\n"),
  sources,
  confidence: classifyConfidence(ranked[0]?.score ?? null, sources),
  topScore: ranked[0]?.score ?? null,
};

This is one of those places where simple beats clever. I did not need a huge retrieval orchestration layer to get useful behavior. I needed good chunking, predictable per-tenant storage, reranking, source deduplication, and a confidence signal I could feed back into the UX.

That signal unlocks better support flows too. A high-confidence answer can show source links and suggested next steps. A low-confidence answer can steer toward a contact action, a human handoff, or a clarifying question instead of bluffing. That is the difference between a chatbot demo and a support product.
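
The classifier behind that signal can stay small. Here is a sketch of the shape, with made-up thresholds rather than values tuned against a real corpus.

type Confidence = "high" | "medium" | "low";
type RetrievedSource = { title: string; url: string };

// Thresholds here are illustrative; real values need tuning per corpus.
function classifyConfidence(
  topScore: number | null,
  sources: RetrievedSource[],
): Confidence {
  if (topScore === null || sources.length === 0) return "low";
  if (topScore >= 0.75) return "high";
  if (topScore >= 0.5) return "medium";
  return "low";
}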

Guardrails Are a Product Feature

If you put an AI widget on a public site, people will test it immediately. They will try prompt injection, ask unrelated questions, request private data, and see whether they can extract the system prompt. That makes guardrails a product feature, not a compliance checkbox.

Chattr uses input and output guardrails around the model call. On the way in, it checks rate limits, allowed topics, denied topics, message length, and prompt-injection patterns. On the way out, it can detect system-prompt leakage, filter unsafe content, and replace the answer with a safer fallback when needed.

// Input guardrails: rate limits, topic rules, message length, and
// prompt-injection patterns run before the model sees anything.
const inputResult = runInputGuardrails({
  userMessage: lastUserMessage.content,
  messageHistory,
  clientKey,
  config,
  language,
});

if (!inputResult.allowed) {
  return {
    blocked: true,
    reason: inputResult.reason,
    message: inputResult.cannedResponse,
  };
}

// Ground the answer in tenant content before generating anything.
const retrieval = await retrieveContext(sanitizedMessage, 5, tenant.dbPath, { intent });

// Low retrieval confidence short-circuits into a fallback instead of a bluff.
if (retrieval.confidence === "low") {
  return buildFallbackResponse(...);
}

const result = streamText({
  model: getModel(),
  system: systemPrompt,
  messages: sanitizedHistory,
});

// Output guardrails: system-prompt leak detection and unsafe-content
// filtering on the generated text.
const outputResult = runOutputGuardrails({
  generatedText,
  systemPrompt,
  config,
  language,
});

There are two benefits to this setup. The obvious one is safety. The less obvious one is product control. Once guardrails are config-driven, each tenant can define what "safe enough" means for their support surface. A docs site can allow broader technical questions. A pricing assistant can stay tightly scoped. A regulated team can force a handoff when confidence drops below a threshold.
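
To show what those knobs can look like, here is a hypothetical shape for a per-tenant guardrail config; the field names are illustrative, not Chattr's actual schema.

// Illustrative per-tenant guardrail config, not Chattr's actual schema.
type GuardrailConfig = {
  allowedTopics?: string[];
  deniedTopics?: string[];
  maxMessageLength: number;
  rateLimit: { windowSeconds: number; maxRequests: number };
  onLowConfidence: "hedge" | "handoff";
  injectionPatterns: string[]; // compiled to RegExp at load time
};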

It also makes the assistant easier to trust internally. Teams are much more comfortable adopting an AI widget when they can see the knobs: allowed origins, topic rules, rate limits, prompt leak protection, escalation paths, and fallback messaging. Black-box behavior is hard to ship to customers. Configurable behavior is much easier to defend.

Multi-Tenant Without Cross-Tenant Leakage

Multi-tenant AI products have one terrifying failure mode: one customer's data or behavior leaking into another customer's session. I wanted the architecture to make that hard by default.

That is why Chattr isolates the important pieces per tenant: database, knowledge base, scrape source, prompt, branding, allowed origins, provider context, and guardrails. One deployment can power multiple assistants, but the runtime always resolves a tenant before it reaches ingestion or chat logic. In practice, that means tenant context is not a convenience object. It is the backbone of every request.

I also preferred explicit isolation over magic abstractions. Separate SQLite databases per tenant are not glamorous, but they are easy to understand, easy to back up, and easy to inspect during debugging. If one tenant has bad chunks, prompt issues, or provider misconfiguration, I can diagnose that boundary directly instead of spelunking through a shared multi-tenant vector table with complex row-level logic.
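
Even the storage layout can stay boring. A sketch of how per-tenant database resolution might look, with the directory convention as an assumption:

import path from "node:path";

// One database file per tenant under a single data root (assumed layout).
function tenantDbPath(dataRoot: string, tenantId: string): string {
  // Reject anything that could traverse out of the tenant's directory.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(tenantId)) {
    throw new Error(`invalid tenant id: ${tenantId}`);
  }
  return path.join(dataRoot, tenantId, "knowledge.db");
}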

This choice also keeps self-hosting approachable. A team can run Chattr in Docker, mount persistent storage, and know exactly where content and config live. If I were deploying a client instance to Azure, I would keep the same mental model: one app service or container workload, isolated tenant storage, and clear secret boundaries. If I were running it on Vercel, I would still want ingestion and secrets handled with the same tenant-first design, even if the serving layer becomes more serverless.

From a CI/CD perspective, multi-tenant support changes what you need to validate. It is not enough to check whether the widget renders. You also want smoke tests for origin validation, provider configuration, ingestion status, and fallback behavior when a tenant has no matching knowledge. Those checks are the difference between "the app deploys" and "every tenant still behaves safely after the deploy."
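
A post-deploy smoke check along those lines is cheap to run for every tenant. This sketch assumes the hypothetical bootstrap route and 403-on-bad-origin behavior from earlier; both are illustrative.

// Hypothetical smoke check: a disallowed origin must be rejected.
// Runs server-side, so setting the Origin header is permitted.
async function smokeCheckOriginValidation(baseUrl: string, tenantId: string) {
  const res = await fetch(`${baseUrl}/api/${tenantId}/bootstrap`, {
    headers: { Origin: "https://not-on-the-allowlist.example" },
  });
  if (res.status !== 403) {
    throw new Error(
      `tenant ${tenantId}: expected 403 for disallowed origin, got ${res.status}`,
    );
  }
}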

Open Source and Fully Local Were Core Requirements

Open source mattered from the start. If a support chatbot is going to sit on a real production site, teams should be able to inspect it, self-host it, adapt the guardrails, and swap providers instead of treating it like a black box. That is a big part of why Chattr is open source and MIT-licensed rather than a closed hosted-only product.

That choice also shaped the architecture. A project that aims to be forkable and self-hostable has to stay understandable. SQLite is easier to reason about than a separate vector database. Config-driven tenants are easier to customize than hidden admin state. A one-script embed is easier to adopt than a bespoke integration layer. The more practical the stack is, the more real the open-source promise becomes.

The fully local runtime path pushed the design in a useful direction too. Chattr can run against Ollama and other OpenAI-compatible local servers, which means teams can test the whole system without API keys, recurring model costs, or sending data off box. Even if a production deployment ends up using OpenAI or Azure OpenAI, building for a local-first path is a good forcing function: it keeps the boundaries clean and the operational footprint honest.
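
Since the pipeline above uses streamText, the provider switch can be sketched with the AI SDK's provider factories; the config shape and local URL are assumptions, not Chattr's actual wiring.

import { createOpenAI } from "@ai-sdk/openai";
import { createAnthropic } from "@ai-sdk/anthropic";

// Illustrative provider selection; Chattr's real config is richer.
type ProviderConfig = {
  provider: "openai" | "anthropic" | "ollama";
  modelId: string;
};

function getModel({ provider, modelId }: ProviderConfig) {
  switch (provider) {
    case "anthropic":
      return createAnthropic()(modelId);
    case "ollama":
      // Ollama exposes an OpenAI-compatible endpoint on port 11434.
      return createOpenAI({ baseURL: "http://localhost:11434/v1" })(modelId);
    default:
      return createOpenAI()(modelId);
  }
}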

That provider flexibility is also part of the product spirit. Some teams care most about convenience. Others care about data residency, cost ceilings, or vendor independence. Chattr works better as a self-hosted support layer when model choice is an operator decision instead of a product constraint.

The Widget UX Matters as Much as the Backend

The assistant can have perfect architecture and still feel bad if the widget experience is clumsy. I wanted the UI to feel fast, grounded, and helpful without pretending it was a full support desk.

That led to a few decisions I like a lot:

  • Starter questions reduce blank-state friction
  • Source links make grounded answers inspectable
  • Follow-up suggestions keep the conversation moving
  • Thumbs up and thumbs down feedback create a lightweight quality signal
  • English and Dutch support make the widget usable in more realistic deployments
  • Handoff actions give the user a path forward when AI is not the right tool

This is also where the embeddable widget earns its keep. Because it is framework-agnostic, it can sit inside lots of environments without asking the host team to adopt a specific frontend stack just to add support chat. That keeps the integration story aligned with the product goal: one script, fast setup, branded behavior.

I also wanted the widget to inherit tenant identity instead of feeling like a generic chatbot dropped on top of the page. Colors, logo, copy tone, starter prompts, and escalation label all come from tenant configuration. That may sound cosmetic, but it changes perceived trust immediately. Users are more willing to ask product questions when the assistant looks like part of the product.
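
One way that inheritance can work without a framework is CSS custom properties scoped to the widget root. A sketch, with the variable and attribute names invented for illustration:

// Apply tenant branding to the widget root via CSS custom properties.
// Variable and attribute names are illustrative, not Chattr's theme contract.
type Theme = { primaryColor?: string; title?: string };

function applyTheme(root: HTMLElement, theme: Theme) {
  if (theme.primaryColor) {
    root.style.setProperty("--chat-primary", theme.primaryColor);
  }
  const titleEl = root.querySelector<HTMLElement>("[data-chat-title]");
  if (titleEl && theme.title) titleEl.textContent = theme.title;
}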

Keeping It Operationally Small

Another thing I cared about was day-two usability. It is easy to ship an AI demo that works once. It is much harder to make a support chatbot straightforward to set up, redeploy, inspect, and tune.

So Chattr tries to stay practical operationally too:

  • There is an onboarding flow that helps configure a provider, brand defaults, and initial scrape settings
  • Docker is the fastest path to a production-like deployment
  • Health and retrieval checks make it easier to validate behavior after changes (see the sketch after this list)
  • Feedback signals give you a lightweight way to see where answers are helping or failing
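
The health check in particular can stay tiny. This sketch reuses the hypothetical route style, tenant lookup, and vector table from the earlier snippets; none of the names are Chattr's real ones.

// Hypothetical health route: confirms the tenant database opens and the
// vector table is populated. Reuses app, loadTenantConfig, Database,
// sqliteVec, and tenantDbPath from the sketches above.
app.get("/api/:tenantId/health", (c) => {
  const tenantId = c.req.param("tenantId");
  if (!loadTenantConfig(tenantId)) return c.json({ ok: false }, 404);

  const db = new Database(tenantDbPath("data", tenantId), { readonly: true });
  sqliteVec.load(db); // the vec0 module must be loaded per connection
  const { chunks } = db
    .prepare("SELECT COUNT(*) AS chunks FROM vec_chunks")
    .get() as { chunks: number };
  db.close();

  return c.json({ ok: chunks > 0, chunks });
});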

That practicality is part of the product, not separate from it. If teams are going to self-host something customer-facing, they need a path from "clone the repo" to "I can trust this in front of users" without standing up an entire platform team.

Tradeoffs and Limitations

There are real tradeoffs in this architecture, and I would rather be explicit about them.

Per-tenant SQLite keeps things simple, but it will not scale indefinitely. For the target use case, support chatbots for docs and websites, I think that is the right trade. If I needed massive shared ingestion throughput across hundreds of large tenants, I would probably revisit the storage layout.

Scraped content quality is only as good as the source site. If the docs are outdated, duplicated, or structurally messy, retrieval quality drops with them. Chattr can chunk and rank content well, but it cannot invent a clean information architecture for the customer.

Guardrails are never finished. Prompt injection patterns evolve, off-topic behavior changes by model, and different providers fail in different ways. The right mindset is continuous tuning, not one perfect ruleset.

Provider flexibility increases testing cost. Supporting OpenAI, Anthropic, Azure OpenAI, and Ollama is great for adoption, but it also means response shape, latency, and tool behavior can vary. You need stronger regression testing when provider choice becomes a feature.

The widget is intentionally narrow. Chattr is built for grounded support flows, not deep agentic workflows or back-office automation. I think that focus is a strength, but it does mean there are product classes where a broader AI agent platform makes more sense.

The next step I would take is better evaluation. I want a repeatable dataset of support questions per tenant, with expected citations and acceptable fallback behavior, so retrieval and guardrail changes can be tested like any other product surface. That kind of evaluation matters even more when AI is public-facing.

Conclusion

Building Chattr clarified something for me: the hard part of a good support chatbot is not calling the model. It is shaping the system around the model so the answers stay grounded, scoped, safe, portable, and easy to deploy.

The combination that made this work was fairly pragmatic: a Hono API, a framework-agnostic widget, SQLite plus sqlite-vec, tenant-first configuration, provider flexibility, and guardrails that are treated like product behavior instead of backend plumbing. None of that is flashy, but together it makes the assistant much easier to trust.

The open-source part is not incidental either. Chattr works because the stack is inspectable, self-hostable, and adaptable enough for real teams with different privacy needs, brands, and support workflows. That is what makes it feel closer to a usable product than a polished AI demo.

Chattr is one of the best examples in my projects of the kind of practical AI product work I enjoy building. If you are exploring a customer-facing assistant and want help balancing UX, retrieval, safety, and deployment reality, what I offer is a good place to start. You can also explore Chattr on GitHub.
