Hearth documentation

Hearth brings local GGUF inference to the standard .NET AI abstractions, so you can register a model once and keep the rest of your app on IChatClient and IEmbeddingGenerator.

Start here

Need	Page
Install the package and run your first prompt	Getting started
Understand `HearthOptions` and model selection	Configuration
Expose OpenAI-compatible endpoints in ASP.NET Core	ASP.NET Core integration
Orchestrate the inference server with .NET Aspire	Aspire integration
Add retrieval-augmented generation to your app	RAG pipeline
Drop a streaming chat UI into a Blazor app	Blazor chat component
Pick the right GPU backend package	GPU backends
Explore the sample apps in this repo	Samples
Browse the public API surface	API reference
Swap an existing OpenAI app to local inference	Replacing OpenAI with Hearth
Pick the right model and quantization for your hardware	Choosing the right model
Define and invoke tools from .NET methods	Tool calling
Deploy with Docker or Kubernetes and tune performance	Deploying to production
Understand the trade-offs vs Ollama	Hearth vs Ollama

What Hearth includes

Single-line registration with AddHearth(...)
Local chat and embeddings over Microsoft.Extensions.AI
OpenAI-compatible endpoints via MapHearth() (Hearth.AI.AspNetCore)
Aspire orchestration: container resource and client wiring (Hearth.AI.Aspire.Hosting and Hearth.AI.Aspire)
RAG pipeline: chunking, document loaders, in-memory and SQLite vector stores (Hearth.AI.Rag)
Blazor chat component: streaming UI with themes and Markdown rendering (Hearth.AI.Blazor)
Optional GPU backends for CUDA, Metal, and Vulkan

Core registration example

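builder.Services.AddHearth(options =>
{
    // Path to a local GGUF model file.
    options.Model = "./models/qwen2.5-7b-q4_k_m.gguf";
    // Context window size, in tokens.
    options.ContextSize = 8192;
    // Number of model layers to offload to the GPU.
    options.GpuLayers = 35;
});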

Once registered, inject IChatClient anywhere in your app and keep the rest of your code independent from the model host.
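
As a rough illustration of that injection pattern, the sketch below shows a consumer class that depends only on the Microsoft.Extensions.AI abstraction. SummaryService and SummarizeAsync are hypothetical names invented for this example, and it assumes AddHearth(...) has registered an IChatClient in the service container as described above. GetResponseAsync is the method name in current Microsoft.Extensions.AI releases; earlier previews used different names, so adjust to the version you reference.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Hypothetical consumer class: it references only the IChatClient abstraction,
// not any Hearth type, so the model host can change without touching this code.
public sealed class SummaryService
{
    private readonly IChatClient _chat;

    // IChatClient is resolved from the dependency injection container.
    public SummaryService(IChatClient chat) => _chat = chat;

    public async Task<string> SummarizeAsync(string text, CancellationToken ct = default)
    {
        // The string overload of GetResponseAsync is a Microsoft.Extensions.AI
        // extension method that sends the prompt as a single user message.
        ChatResponse response = await _chat.GetResponseAsync(
            $"Summarize the following text in one sentence:\n{text}",
            cancellationToken: ct);

        // Text holds the text content of the response.
        return response.Text;
    }
}
```

Registering the class with builder.Services.AddScoped<SummaryService>() next to the AddHearth(...) call should be enough for constructor injection to supply the chat client.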