Building Reliable LLM Agents with Pydantic AI: A Coffee Shop Tutorial
Exploring Pydantic AI with a worked example
Have you ever wondered how to make your LLM applications more reliable and type-safe? If you're building AI applications that process structured data from sources like databases or CSV files, and need consistent responses, you're in the right place. In this tutorial, we'll create a coffee shop analysis system using Pydantic AI, demonstrating how to build robust LLM agents with type safety and validation at their core, ensuring seamless interaction with large datasets.
What We'll Build
We'll develop an AI-powered system capable of:
Processing coffee shop order data
Answering questions about ordering patterns
Making predictions about future orders
Maintaining type safety and minimizing hallucinations
Prerequisites
Before diving in, ensure you have the following:
Python 3.9+ installed
Basic understanding of Python and FastAPI
A Google Cloud account for Vertex AI (we'll be using Gemini)
Sample coffee shop data (CSV or database)
Setting Up the Project
First, clone the repository and set up your environment:
git clone <repository-url>
cd pydantic_coffee
# Install uv for faster dependency management
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate
# Install dependencies
uv pip install -e ".[dev]"
Understanding the Building Blocks
1. Modeling Coffee Shop Orders with Pydantic
Pydantic is a powerful data validation and settings management library that ensures our data is structured, reliable, and type-safe. Originally designed to validate JSON and API inputs, it has become a staple for developers who want strong guarantees around their data integrity. In fact, most LLM providers, including OpenAI and Google Vertex AI, already utilize Pydantic under the hood for structured data handling and response validation.
Unlike traditional Python data classes, Pydantic models enforce data validation at runtime. This means that anytime data is passed through a Pydantic model, it is checked against predefined constraints, reducing the chances of errors and unexpected behavior in our AI-powered applications.
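To see the difference concretely, here is a minimal sketch contrasting a plain dataclass with a Pydantic model (the class names `PlainOrder` and `CheckedOrder` are illustrative, not from the project):

```python
from dataclasses import dataclass
from pydantic import BaseModel, ValidationError

@dataclass
class PlainOrder:
    order_id: int
    cost: float

class CheckedOrder(BaseModel):
    order_id: int
    cost: float

# A plain dataclass happily stores the wrong types -- no error is raised
bad = PlainOrder(order_id="oops", cost="free")

# Pydantic validates (and, in lax mode, coerces) at runtime
ok = CheckedOrder(order_id="42", cost="3.5")  # coerced to int / float

rejected = False
try:
    CheckedOrder(order_id="oops", cost="free")
except ValidationError:
    rejected = True  # non-numeric strings are refused
print("rejected at runtime:", rejected)
```

The dataclass silently carries bad data downstream; the Pydantic model stops it at the boundary.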
Here's how we define our coffee order model using Pydantic:
from pydantic import BaseModel, Field
from enum import Enum
from datetime import datetime
class CoffeeType(Enum):
    AMERICANO = "AMERICANO"
    LATTE = "LATTE"
    CORTADO = "CORTADO"

class CoffeeOrder(BaseModel):
    order_id: int = Field(..., gt=0, description="Unique order ID")
    coffee_type: CoffeeType
    cost: float = Field(..., ge=0, description="Cost of the coffee in USD")
    time: datetime = Field(default_factory=datetime.utcnow, description="Timestamp of the order")
Why Use Pydantic?
✅ Automatic Data Validation – Ensures that inputs follow the expected structure and type constraints.
✅ Type Safety – Enforces strict typing, preventing common bugs related to incorrect data formats.
✅ Declarative Constraints – Easily define constraints such as gt=0 for positive values or default_factory=datetime.utcnow for automatic timestamps.
✅ Seamless Serialization – Converts models to and from JSON, dictionaries, and database entries effortlessly.
✅ Integration with FastAPI – Pydantic is the backbone of FastAPI, making it the ideal choice for building APIs.
✅ Widely Adopted in LLM Ecosystems – Many LLM providers already use Pydantic to ensure structured responses, making it an industry standard.
Example: Validation in Action
If a user tries to create an order with an invalid cost, Pydantic will immediately raise an error:
from pydantic import ValidationError
try:
    invalid_order = CoffeeOrder(order_id=101, coffee_type="LATTE", cost=-5)
except ValidationError as e:
    print(e)
Output (Pydantic v2):
1 validation error for CoffeeOrder
cost
  Input should be greater than or equal to 0 [type=greater_than_equal, input_value=-5, input_type=int]
With this structured approach, we prevent incorrect data from entering our system, ensuring that all AI-powered insights and predictions are based on clean, valid data.
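Serialization is just as effortless: a valid order round-trips to JSON and back with all types preserved. A minimal sketch using the Pydantic v2 API:

```python
from datetime import datetime
from enum import Enum
from pydantic import BaseModel, Field

class CoffeeType(Enum):
    AMERICANO = "AMERICANO"
    LATTE = "LATTE"
    CORTADO = "CORTADO"

class CoffeeOrder(BaseModel):
    order_id: int = Field(..., gt=0)
    coffee_type: CoffeeType
    cost: float = Field(..., ge=0)
    time: datetime = Field(default_factory=datetime.utcnow)

# Serialize a valid order to JSON, then parse it back
order = CoffeeOrder(order_id=1, coffee_type="LATTE", cost=4.5)
payload = order.model_dump_json()
restored = CoffeeOrder.model_validate_json(payload)
print(restored.coffee_type)  # CoffeeType.LATTE
```

Note that the string "LATTE" is validated against the enum's values, and the datetime is emitted as an ISO 8601 string in the JSON payload.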
2. Introducing Pydantic AI
While Pydantic ensures structured, validated data at the application level, Pydantic AI extends these guarantees to LLM interactions. With Pydantic AI, we can enforce strict schemas on LLM inputs and outputs, reducing the risk of hallucinations and ensuring that responses conform to expected structures.
Pydantic AI integrates seamlessly with Google Vertex AI, OpenAI models, and other LLM providers, enabling us to define response models with type safety while leveraging AI’s natural language capabilities.
Let’s initialize an LLM model using Pydantic AI:
from pydantic_ai.models.vertexai import VertexAIModel
from pydantic_ai import Agent
# Set up Google Cloud credentials
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/credentials.json"
# Initialize the model
model = VertexAIModel("gemini-1.5-flash-002")
Why Use Pydantic AI?
✅ Structured LLM Outputs – Ensures responses follow predefined schemas.
✅ Type-Safe Responses – Eliminates ambiguity in AI-generated content.
✅ Confidence Scoring – Helps assess the reliability of model outputs.
✅ Hallucination Prevention – Reduces AI-generated errors by enforcing structured responses.
✅ Seamless Integration with Pydantic – Uses the same validation principles as regular Pydantic models.
With Pydantic AI, we treat LLMs as structured data providers rather than unstructured text generators, making them more predictable and reliable for real-world applications.
3. Structuring LLM Responses
To prevent hallucinations and ensure reliable AI-driven insights, we define a strict response model using Pydantic AI. This enforces structured outputs and mitigates the risks of incorrect or misleading responses from the LLM.
Here’s how we define our expected response structure:
from pydantic import BaseModel, Field
from typing import List, Dict, Any
class PatternResponse(BaseModel):
    pattern: str = Field(..., description="Identified ordering pattern")
    confidence: float = Field(ge=0, le=1, description="Confidence score of the pattern detection")
    reasoning: str = Field(..., description="Explanation of how the pattern was identified")
    supporting_data: List[Dict[str, Any]] = Field(..., description="Raw data supporting the pattern analysis")
How This Prevents Hallucinations
✅ Strict Formatting – The LLM must adhere to predefined response structures, reducing ambiguity.
✅ Confidence Scoring – Helps assess the reliability of model-generated insights.
✅ Reasoning & Justification – Forces the model to explain its conclusions, improving transparency.
✅ Supporting Evidence – Ensures AI-driven decisions are based on actual data rather than arbitrary assumptions.
By enforcing these constraints, we make LLM outputs predictable, auditable, and less prone to errors. This approach transforms LLMs from freeform text generators into structured, reliable AI agents that can be trusted in business-critical applications.
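In practice, any response the agent hands back must survive this validation. A quick sketch (the field values are illustrative) showing a well-formed response passing and an out-of-range confidence score being rejected:

```python
from typing import Any, Dict, List
from pydantic import BaseModel, Field, ValidationError

class PatternResponse(BaseModel):
    pattern: str
    confidence: float = Field(ge=0, le=1)
    reasoning: str
    supporting_data: List[Dict[str, Any]]

# A well-formed response validates cleanly
resp = PatternResponse(
    pattern="Morning latte rush",
    confidence=0.87,
    reasoning="Latte orders cluster between 8am and 10am.",
    supporting_data=[{"hour": 8, "latte_orders": 42}],
)

# A confidence outside [0, 1] is rejected before it reaches our application
rejected = False
try:
    PatternResponse(
        pattern="x", confidence=1.5, reasoning="y", supporting_data=[]
    )
except ValidationError:
    rejected = True
print("out-of-range confidence rejected:", rejected)
```

If the LLM produces output that cannot be coerced into this schema, validation fails loudly instead of letting a malformed answer through.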
4. Building the Coffee Order Analysis Agent
Now, let’s define an LLM-powered agent that can analyze ordering trends:
class OrderPatternAgent:
    def __init__(self, model: VertexAIModel):
        self.agent = Agent(
            model=model,
            deps_type=Orders,  # Pydantic container model wrapping the order records (defined elsewhere in the project)
            result_type=PatternResponse,
            system_prompt=(
                "You are an AI analyzing coffee ordering patterns. "
                "Base your analysis ONLY on the provided data."
            ),
        )

    async def analyze(self, orders: Orders):
        return await self.agent.run(
            user_prompt="What are the dominant patterns?",
            deps=orders,
        )
5. Adding Observability with Logging
Monitoring LLM interactions is crucial. Let’s integrate logfire for logging:
import logfire
from fastapi import FastAPI
logfire.configure(service_name="pydantic_coffee")
app = FastAPI()
@app.get("/analyze")
async def analyze_orders():
    logfire.info("Starting order analysis")
    try:
        orders = order_service.get_orders()
        agent = OrderPatternAgent(model)
        response = await agent.analyze(orders)
        logfire.info("Analysis complete", confidence=response.confidence)
        return response
    except Exception as e:
        logfire.error("Analysis failed", error=str(e))
        raise
6. Handling Large Datasets Efficiently
To process large volumes of orders, we use batch processing:
async def analyze_large_dataset(self, orders: Orders):
    chunk_size = 1000
    patterns = []
    # _chunk_orders yields fixed-size slices of the order list;
    # _aggregate_patterns merges the per-chunk results (both are helpers on this class)
    for chunk in self._chunk_orders(orders, chunk_size):
        response = await self.agent.run(
            user_prompt="Analyze this data segment",
            deps=chunk,
        )
        patterns.append(response)
    return self._aggregate_patterns(patterns)
This ensures scalability when working with large data sets.
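The chunking helper itself is straightforward. A minimal sketch, assuming `Orders` is a Pydantic model wrapping a list of order records (the tutorial's actual `Orders` definition lives in the repository):

```python
from typing import Any, Iterator, List
from pydantic import BaseModel

class Orders(BaseModel):
    # Assumed shape: a thin wrapper around the order records
    orders: List[Any]

def chunk_orders(orders: Orders, chunk_size: int) -> Iterator[Orders]:
    """Yield successive fixed-size slices of the order list."""
    for start in range(0, len(orders.orders), chunk_size):
        yield Orders(orders=orders.orders[start:start + chunk_size])

# Usage: 2,500 records split into chunks of at most 1,000
data = Orders(orders=list(range(2500)))
sizes = [len(chunk.orders) for chunk in chunk_orders(data, 1000)]
print(sizes)  # [1000, 1000, 500]
```

Each chunk stays within the model's context budget, and the per-chunk `PatternResponse` objects can then be merged by an aggregation step.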
Testing Our System
Let’s put everything to the test:
import asyncio

async def main():
    # Initialize components
    model = VertexAIModel("gemini-1.5-flash-002")
    agent = OrderPatternAgent(model)
    # Analyze orders (loaded from your data source)
    response = await agent.analyze(orders)
    print(f"Pattern: {response.pattern}")
    print(f"Confidence: {response.confidence}")
    print(f"Supporting Data: {response.supporting_data}")

asyncio.run(main())
Best Practices for LLM Development
✅ Define Response Models – Ensures predictable LLM output and minimizes hallucinations
✅ Use Type Safety Throughout – Prevents runtime errors and improves maintainability
✅ Monitor Everything – Log LLM interactions, confidence scores, and response times
Conclusion
We’ve built a robust and scalable LLM system that:
Processes structured data safely
Generates reliable AI-driven insights
Scales efficiently for large datasets
Provides observability with logging
By combining Pydantic AI with LLM capabilities, we achieve both the flexibility of AI and the reliability of strong typing.
Next Steps 🚀
Try extending this system to:
Handle more complex queries
Add advanced analytics
Integrate real-time data sources
The complete code for this tutorial is available on GitHub. Have questions or suggestions? Open an issue or contribute to the project! 🎯