
Laravel AI SDK Tutorial Part 2: Build a RAG-Powered Support Bot with Tools and Memory

Go beyond basic agents. Build a support bot that remembers users, searches your docs, and queries your database.


In Part 1 of this series, we built a document analyzer. It took text, returned structured output, and streamed the response. Useful, but honestly? Pretty basic. The agent couldn't remember who it was talking to. It couldn't look things up. It couldn't search your data for answers.

This time we're building something you'd actually ship to production.

We're creating a customer support agent that remembers conversation history, searches your help documentation using vector embeddings, looks up order details from your database, and falls back to web search when it can't find the answer internally. Plus full test coverage. All with the Laravel AI SDK.

If you haven't read Part 1, go do that first. We're picking up where we left off.

What We're Building

Here's what our support bot will do:

It remembers who it's talking to. A customer asks about their order, follows up three messages later with "what about the shipping?", and the bot knows exactly which order they mean. That's the RemembersConversations trait handling automatic conversation persistence.

It searches your knowledge base. You upload your help docs, FAQ pages, and product guides. The agent searches through them using vector embeddings to find relevant answers. This is RAG (Retrieval-Augmented Generation), and the SDK makes it surprisingly painless.

It queries your database. When a customer asks "where's my order?", the agent calls a custom tool that hits your orders table and returns the actual status. No hallucination. Real data.

It falls back to web search. If your knowledge base doesn't have the answer, the agent can search the web using the built-in WebSearch provider tool.

Let's build each piece.

Step 1: Set Up the Database for Embeddings

RAG needs vector embeddings, and vector embeddings need PostgreSQL with pgvector. If you're not familiar with how different search approaches compare in Laravel, I wrote a deep dive on full-text, semantic, and vector search that covers the foundations. For this tutorial, we'll assume PostgreSQL.

First, create the migration for your knowledge base:

php artisan make:model Article -m

// database/migrations/xxxx_create_articles_table.php
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        Schema::ensureVectorExtensionExists();

        Schema::create('articles', function (Blueprint $table) {
            $table->id();
            $table->string('title');
            $table->text('content');
            $table->string('category')->nullable();
            $table->vector('embedding', dimensions: 1536)->index();
            $table->timestamps();
        });
    }
};

The vector column type is native to Laravel on PostgreSQL via pgvector. The ->index() call creates an HNSW index for fast similarity searches. 1536 dimensions matches OpenAI's text-embedding-3-small model, which is the default.

Now the model:

// app/Models/Article.php
namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class Article extends Model
{
    protected $fillable = ['title', 'content', 'category', 'embedding'];

    protected function casts(): array
    {
        return [
            'embedding' => 'array',
        ];
    }
}

Cast the embedding column to array. Laravel handles the conversion between PHP arrays and PostgreSQL vector types automatically.

Step 2: Build the Embedding Pipeline

You need a way to generate embeddings when articles are created or updated. Here's an artisan command that seeds your knowledge base:

// app/Console/Commands/EmbedArticles.php
namespace App\Console\Commands;

use App\Models\Article;
use Illuminate\Console\Command;
use Illuminate\Support\Str;

class EmbedArticles extends Command
{
    protected $signature = 'articles:embed';
    protected $description = 'Generate embeddings for articles missing them';

    public function handle(): void
    {
        $articles = Article::whereNull('embedding')->get();

        $this->info("Embedding {$articles->count()} articles...");

        foreach ($articles as $article) {
            $text = "{$article->title}\n\n{$article->content}";

            $article->update([
                'embedding' => Str::of($text)->toEmbeddings(),
            ]);

            $this->line("  Embedded: {$article->title}");
        }

        $this->info('Done.');
    }
}

That Str::of($text)->toEmbeddings() call is one of my favorite things about this SDK. One line. No HTTP client setup, no JSON parsing, no error handling boilerplate.

For production, enable caching to avoid redundant API calls for identical content by setting ai.caching.embeddings.cache to true in your config/ai.php. The SDK also supports batching multiple inputs via Embeddings::for([$text1, $text2])->generate().
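If you're embedding thousands of articles, batching is worth the small refactor: one API round trip per batch instead of one per row. The chunking logic itself is plain PHP. In this sketch the `$embedBatch` closure stands in for the SDK's `Embeddings::for($batch)->generate()` call mentioned above, so the example runs without the framework:

```php
// Batch texts into groups before embedding. $embedBatch stands in for the
// SDK's batch embedding call; here it's a plain callable so the sketch runs
// standalone.
function batchTexts(array $texts, int $batchSize, callable $embedBatch): array
{
    $embeddings = [];

    foreach (array_chunk($texts, $batchSize) as $batch) {
        // One round trip per batch instead of one per text.
        foreach ($embedBatch($batch) as $vector) {
            $embeddings[] = $vector;
        }
    }

    return $embeddings;
}

// Fake embedder for demonstration: one dummy vector per input text.
$fakeEmbed = fn (array $batch) => array_map(fn ($t) => [strlen($t), 0.0], $batch);

$result = batchTexts(['hello', 'world', 'foo'], 2, $fakeEmbed);

echo count($result); // 3
```

Swap the fake embedder for the real SDK call and you cut API round trips roughly by a factor of your batch size.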

Step 3: Create the Custom Tools

This is where it gets interesting. Tools let your agent do things beyond generating text. It can query your database, call APIs, or run any PHP code you want.

The Order Lookup Tool

php artisan make:tool LookupOrder

// app/Ai/Tools/LookupOrder.php
namespace App\Ai\Tools;

use App\Models\Order;
use Illuminate\Contracts\JsonSchema\JsonSchema;
use Laravel\Ai\Contracts\Tool;
use Laravel\Ai\Tools\Request;
use Stringable;

class LookupOrder implements Tool
{
    public function description(): Stringable|string
    {
        return 'Look up a customer order by order number or email address. '
             . 'Returns order status, items, and tracking information.';
    }

    public function handle(Request $request): Stringable|string
    {
        $order = Order::with('items')
            ->where(function ($query) use ($request) {
                $query->where('order_number', $request['order_number'] ?? '')
                    ->when($request['customer_email'] ?? null, fn ($q, $email) =>
                        $q->orWhere('customer_email', $email));
            })
            ->first();

        if (! $order) {
            return 'No order found with that information.';
        }

        return json_encode([
            'order_number' => $order->order_number,
            'status' => $order->status,
            'placed_at' => $order->created_at->format('M j, Y'),
            'items' => $order->items->map(fn ($item) => [
                'name' => $item->product_name,
                'quantity' => $item->quantity,
                'price' => $item->price,
            ])->toArray(),
            'tracking_number' => $order->tracking_number,
            'estimated_delivery' => $order->estimated_delivery?->format('M j, Y'),
        ]);
    }

    public function schema(JsonSchema $schema): array
    {
        return [
            'order_number' => $schema->string()
                ->description('The order number (e.g., ORD-12345)')
                ->required(),
            'customer_email' => $schema->string()
                ->description('Customer email as fallback identifier'),
        ];
    }
}

The schema tells the AI model what parameters the tool accepts. The description tells it when to use the tool. Be specific here. Vague descriptions lead to the agent calling tools at wrong times.

One thing that tripped me up: the handle method must return a string. If you return an object or array, you'll get a type error. Always json_encode your structured data.
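To see the string requirement in isolation, here's plain PHP with no SDK involved. One extra tip: json_encode() itself can return false on an encoding failure, which would also trip the string return type, so passing JSON_THROW_ON_ERROR inside a tool surfaces the real problem:

```php
// Structured data must be encoded before being returned from a tool.
$payload = [
    'order_number' => 'ORD-12345',
    'status' => 'shipped',
];

// JSON_THROW_ON_ERROR turns silent failures (e.g. invalid UTF-8 in the data)
// into a JsonException instead of returning false, which would otherwise fail
// the string return type in a confusing way.
$result = json_encode($payload, JSON_THROW_ON_ERROR);

var_dump(is_string($result)); // bool(true)
```

The model receives the JSON text and reads the fields out of it when formulating its answer.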

Step 4: Build the Support Agent

Now we combine everything. Create the agent:

php artisan make:agent SupportBot

// app/Ai/Agents/SupportBot.php
namespace App\Ai\Agents;

use App\Ai\Tools\LookupOrder;
use App\Models\Article;
use Laravel\Ai\Attributes\MaxSteps;
use Laravel\Ai\Attributes\Provider;
use Laravel\Ai\Attributes\Temperature;
use Laravel\Ai\Concerns\RemembersConversations;
use Laravel\Ai\Contracts\Agent;
use Laravel\Ai\Contracts\Conversational;
use Laravel\Ai\Contracts\HasTools;
use Laravel\Ai\Enums\Lab;
use Laravel\Ai\Promptable;
use Laravel\Ai\Providers\Tools\WebSearch;
use Laravel\Ai\Tools\SimilaritySearch;
use Stringable;

#[Provider(Lab::Anthropic)]
#[Temperature(0.3)]
#[MaxSteps(5)]
class SupportBot implements Agent, Conversational, HasTools
{
    use Promptable, RemembersConversations;

    public function instructions(): Stringable|string
    {
        return <<<'PROMPT'
        You are a helpful customer support agent for our e-commerce store.

        Rules:
        - Always search the knowledge base first before answering product or policy questions.
        - Use the order lookup tool when customers ask about order status, shipping, or tracking.
        - If the knowledge base doesn't have the answer, use web search as a last resort.
        - Be concise and friendly. Don't repeat information the customer already provided.
        - If you genuinely can't help, suggest they contact support@example.com.
        PROMPT;
    }

    public function tools(): iterable
    {
        return [
            SimilaritySearch::usingModel(
                model: Article::class,
                column: 'embedding',
                minSimilarity: 0.5,
                limit: 5,
            )->withDescription(
                'Search the knowledge base for help articles, FAQs, and product guides.'
            ),
            new LookupOrder,
            (new WebSearch)->max(3),
        ];
    }
}

Let's break down what's happening here.

RemembersConversations is doing the heavy lifting for memory. It automatically stores every message (user and assistant) to the agent_conversations and agent_conversation_messages tables that shipped with the SDK migrations. No manual database work.

SimilaritySearch::usingModel() creates a RAG tool. When the agent decides it needs to search your knowledge base, it generates an embedding for the search query, runs a vector similarity search against your articles table, and returns the most relevant matches. The minSimilarity: 0.5 threshold filters out weak matches so the agent doesn't get confused by irrelevant content.
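To make that minSimilarity: 0.5 threshold concrete: cosine similarity ranges from -1 (opposite direction) through 0 (unrelated) to 1 (identical direction), and a score of 0.5 or above means the query and article embeddings point in broadly the same direction. The SDK and pgvector compute this for you in SQL, but the underlying math is simple enough to sketch in plain PHP:

```php
// Cosine similarity between two equal-length vectors: dot product divided by
// the product of the vectors' magnitudes.
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

echo cosineSimilarity([1.0, 0.0], [1.0, 0.0]); // 1
echo cosineSimilarity([1.0, 0.0], [0.0, 1.0]); // 0
```

Real embeddings have 1536 dimensions instead of 2, but the comparison works exactly the same way.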

MaxSteps(5) limits how many tool calls the agent can make per prompt. Without this, a confused agent could loop through tools forever (and burn through your API budget).

Temperature(0.3) keeps responses focused and factual. For a support bot, you want consistency, not creativity.

Step 5: Wire Up the Routes

// routes/web.php
use App\Ai\Agents\SupportBot;
use Illuminate\Http\Request;

// Start a new conversation
Route::post('/support/start', function (Request $request) {
    $response = (new SupportBot)
        ->forUser($request->user())
        ->prompt($request->input('message'));

    return response()->json([
        'conversation_id' => $response->conversationId,
        'message' => (string) $response,
    ]);
});

// Continue an existing conversation
Route::post('/support/reply', function (Request $request) {
    $response = (new SupportBot)
        ->continue($request->input('conversation_id'), as: $request->user())
        ->prompt($request->input('message'));

    return response()->json([
        'message' => (string) $response,
    ]);
});

The forUser() method starts a new conversation tied to the authenticated user. The continue() method picks up where the last message left off. All previous messages are loaded automatically.

Want streaming? Swap ->prompt() for ->stream() and the SDK returns a proper SSE response you can consume with any JavaScript EventSource client or the Vercel AI SDK on the frontend. You can also call ->usingVercelDataProtocol() on the stream for direct compatibility.

Step 6: Add Agent Middleware

Middleware lets you intercept prompts before they hit the provider. This is incredibly useful for logging, cost tracking, rate limiting, or injecting dynamic context.

php artisan make:agent-middleware TrackUsage

// app/Ai/Middleware/TrackUsage.php
namespace App\Ai\Middleware;

use Closure;
use Illuminate\Support\Facades\Log;
use Laravel\Ai\Prompts\AgentPrompt;
use Laravel\Ai\Responses\AgentResponse;

class TrackUsage
{
    public function handle(AgentPrompt $prompt, Closure $next)
    {
        $startTime = microtime(true);

        return $next($prompt)->then(function (AgentResponse $response) use ($startTime) {
            $duration = round(microtime(true) - $startTime, 2);

            Log::channel('ai')->info('Agent response', [
                'tokens_in' => $response->usage->inputTokens ?? null,
                'tokens_out' => $response->usage->outputTokens ?? null,
                'duration' => $duration,
            ]);
        });
    }
}

Then add it to your agent by implementing HasMiddleware:

use App\Ai\Middleware\TrackUsage;
use Laravel\Ai\Contracts\HasMiddleware;

class SupportBot implements Agent, Conversational, HasTools, HasMiddleware
{
    // ... existing code ...

    public function middleware(): array
    {
        return [
            new TrackUsage,
        ];
    }
}

The then callback runs after the response is complete (works for both sync and streaming). You get access to the full response including token usage, which is exactly what you need for tracking costs per conversation.

Step 7: Add Provider Failover

Production apps can't afford downtime because OpenAI had a bad day. Failover takes one line:

// On the agent class
#[Provider([Lab::Anthropic, Lab::OpenAI])]
class SupportBot implements Agent, Conversational, HasTools
{
    // ...
}

If Anthropic returns an error or hits a rate limit, the SDK automatically retries with OpenAI. No try/catch, no retry logic, no circuit breakers. Your users never see the failure.
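Conceptually, failover is just "try each provider in order until one succeeds." Here's a simplified pure-PHP sketch of the idea; it's not the SDK's actual implementation, and a production version would also distinguish retryable errors (rate limits, 5xx) from permanent ones:

```php
// Simplified failover: try each provider callable in order; rethrow the last
// failure only if every provider fails.
function withFailover(array $providers, string $prompt): string
{
    $lastError = null;

    foreach ($providers as $provider) {
        try {
            return $provider($prompt);
        } catch (RuntimeException $e) {
            $lastError = $e; // Remember the failure and fall through.
        }
    }

    throw $lastError ?? new RuntimeException('No providers configured.');
}

$anthropic = fn (string $p) => throw new RuntimeException('rate limited');
$openai = fn (string $p) => "Answer to: {$p}";

echo withFailover([$anthropic, $openai], 'Hi'); // Answer to: Hi
```

The attribute syntax hides all of this behind one line, which is the whole point.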

Step 8: Test Everything

This is where the SDK really shines compared to rolling your own. Every single component can be faked.

// tests/Feature/SupportBotTest.php
namespace Tests\Feature;

use App\Ai\Agents\SupportBot;
use App\Models\Article;
use App\Models\Order;
use App\Models\User;
use Laravel\Ai\Embeddings;
use Laravel\Ai\Prompts\AgentPrompt;
use Tests\TestCase;

class SupportBotTest extends TestCase
{
    public function test_bot_responds_to_general_questions(): void
    {
        SupportBot::fake(['Happy to help! Our return policy allows returns within 30 days.']);
        Embeddings::fake();

        $user = User::factory()->create();

        $response = (new SupportBot)
            ->forUser($user)
            ->prompt('What is your return policy?');

        $this->assertStringContainsString('return policy', (string) $response);

        SupportBot::assertPrompted(function (AgentPrompt $prompt) {
            return $prompt->contains('return policy');
        });
    }

    public function test_bot_looks_up_orders(): void
    {
        SupportBot::fake(function (AgentPrompt $prompt) {
            return 'Your order ORD-12345 is currently being shipped. '
                 . 'Tracking number: TRK-ABC123.';
        });

        $user = User::factory()->create();
        $order = Order::factory()->create([
            'order_number' => 'ORD-12345',
            'status' => 'shipped',
            'tracking_number' => 'TRK-ABC123',
        ]);

        $response = (new SupportBot)
            ->forUser($user)
            ->prompt('Where is my order ORD-12345?');

        $this->assertStringContainsString('ORD-12345', (string) $response);
    }

    public function test_bot_continues_conversations(): void
    {
        SupportBot::fake([
            'I see your order ORD-12345 is shipped!',
            'The estimated delivery is March 5, 2026.',
        ]);

        $user = User::factory()->create();

        $first = (new SupportBot)
            ->forUser($user)
            ->prompt('Check order ORD-12345');

        $second = (new SupportBot)
            ->continue($first->conversationId, as: $user)
            ->prompt('When will it arrive?');

        $this->assertStringContainsString('March 5', (string) $second);
    }

    public function test_no_stray_prompts(): void
    {
        SupportBot::fake()->preventStrayPrompts();

        // If the bot is prompted without a matching fake, the test fails.
        // This catches unintended AI calls in other tests.
    }
}

SupportBot::fake() intercepts all calls to the AI provider. No actual API requests are made. Your tests run fast, cost nothing, and are deterministic.

The preventStrayPrompts() method is something I'd recommend adding to your test suite's base class. It catches accidental AI calls that would slow down your CI pipeline and cost money.

You should also fake embeddings in any test that triggers the embedding pipeline. Embeddings::fake() prevents real API calls and keeps your CI fast and free.

How It All Flows Together

When a customer sends "where's my order ORD-12345?", here's what happens: the SDK loads conversation history from the database, the agent reads the instructions and recognizes this as an order query, it calls LookupOrder with the order number, the tool queries your database and returns JSON, the agent formulates a friendly response, and the middleware logs everything. Both messages get persisted for future context.

If they follow up with "what about the refund on that?", the agent already knows which order they mean. No session hacks. No passing IDs around.

Trade-Offs and What to Watch For

I'm bullish on this setup, but let me be honest about the rough edges.

PostgreSQL is required for embeddings. The whereVectorSimilarTo method and vector columns only work with pgvector on PostgreSQL. If you're running MySQL, you can still use the FileSearch provider tool with OpenAI's vector stores instead of local embeddings. It's a different approach (files stored with the provider rather than in your database), but it works without PostgreSQL.

Conversation history grows. RemembersConversations loads all previous messages for a conversation. For long support threads, this can eat your context window. You'll want to implement a rolling window or summarization strategy for conversations that go past 20-30 messages.
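A rolling window can be as simple as keeping any system messages plus the most recent N turns. Here's a plain-PHP sketch of the trimming logic, assuming messages are arrays with a 'role' key (a simplification of however the SDK actually stores them):

```php
// Keep system messages plus the last $window non-system messages.
function trimHistory(array $messages, int $window): array
{
    $system = array_values(array_filter(
        $messages,
        fn (array $m) => $m['role'] === 'system'
    ));

    $rest = array_values(array_filter(
        $messages,
        fn (array $m) => $m['role'] !== 'system'
    ));

    return array_merge($system, array_slice($rest, -$window));
}

$history = [
    ['role' => 'system', 'content' => 'You are a support bot.'],
    ['role' => 'user', 'content' => 'Message 1'],
    ['role' => 'assistant', 'content' => 'Reply 1'],
    ['role' => 'user', 'content' => 'Message 2'],
];

echo count(trimHistory($history, 2)); // 3 (system + last 2)
```

Summarization is the fancier alternative: condense everything outside the window into one synthetic message instead of dropping it.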

Tool descriptions matter more than you think. If the agent keeps calling the wrong tool (or not calling tools when it should), the problem is almost always the description. Be specific about when the tool should be used and what it returns.

Cost awareness. Every tool call is a round trip to the AI provider. An agent with 5 tools and MaxSteps(10) could theoretically make 10 API calls per user message. The middleware we built tracks this, so use it. Watch your logs.
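The token counts that middleware logs convert to dollars with simple arithmetic: tokens divided by a million, times the per-million-token rate. A sketch with placeholder rates (these are illustration numbers, not real prices; check your provider's current pricing):

```php
// Estimate cost from token usage. Rates are per million tokens and are
// placeholder values for illustration only.
function estimateCost(int $inputTokens, int $outputTokens, float $inRate, float $outRate): float
{
    return ($inputTokens / 1_000_000) * $inRate
         + ($outputTokens / 1_000_000) * $outRate;
}

// Example: 1,200 input and 400 output tokens at $1.00 / $5.00 per million.
echo estimateCost(1200, 400, 1.00, 5.00); // 0.0032
```

Wire this into the TrackUsage middleware and you get a per-conversation cost figure in your logs for free.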

My recommendation: start with Anthropic as your primary provider with OpenAI failover. Claude handles tool use and long conversations really well. Set MaxSteps conservatively (3-5 for most use cases), and always test with preventStrayPrompts() in your test suite.

Frequently Asked Questions

Can I use MySQL instead of PostgreSQL for RAG?

Not for local vector embeddings. The whereVectorSimilarTo method requires pgvector on PostgreSQL. But you can use the FileSearch provider tool with OpenAI's hosted vector stores, which works regardless of your database. You upload files to OpenAI, they handle the embeddings and searching.

How do I limit conversation history to avoid hitting the context window?

The RemembersConversations trait loads all messages by default. For production, implement the Conversational interface manually and add a ->limit(30) to your messages query. You could also build a summarization step that condenses older messages into a single summary message.

What happens when the SimilaritySearch tool returns no results?

The agent receives "no results found" and falls back to its other tools (like WebSearch) or answers from its training knowledge. That's why the tool priority in your system instructions matters. Tell the agent: search knowledge base first, then web search, then general knowledge.

Can I use this with Livewire instead of API routes?

Yes. The agent doesn't care how it's called. You can prompt it from a Livewire component, a controller, a queue job, or an Artisan command. For streaming with Livewire, use the broadcasting approach instead of SSE.

How much does this cost to run per conversation?

It depends on the provider and conversation length. A typical 5-message support conversation with Anthropic (Claude Haiku) and 2-3 tool calls runs roughly $0.01-0.03. Using the #[UseCheapestModel] attribute on your agent class is an easy way to minimize costs for straightforward support queries.

Wrapping Up

In Part 1, we proved the SDK works. In this tutorial, we built something you'd actually put in front of customers. Memory, RAG, custom tools, middleware, failover, and proper tests. That's a production-ready support bot.

The Laravel AI SDK is at v0.2.1 right now and moving fast. Features like agent orchestration (agents delegating to other agents) are on the roadmap. But what's already there is solid enough to build real products with.

If you're building a SaaS with AI features and want help getting this kind of setup running, get in touch. I build MVPs and AI-powered features for Laravel apps. Fixed price, real code, deployed and ready.



Hafiz Riaz

About Hafiz

Full Stack Developer from Italy. I help founders turn ideas into working products fast. 9+ years of experience building web apps, mobile apps, and SaaS platforms.
