
Building an AI Chatbot in Laravel Using RAG, Ollama, and Llama 3

By Aditya Nursyahbani · 5 min read

Building the AI Chat Endpoint

Now we are ready to connect the pieces: document retrieval, prompt construction, and the Llama 3 model served by Ollama.

This transforms Laravel into a real AI application.


Creating the Chat Service

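```php
<?php

namespace App\Services;

use Illuminate\Support\Facades\Http;

class ChatService
{
    public function ask(string $question): string
    {
        // Retrieve the documents most relevant to the question.
        $documents = app(RetrievalService::class)->search($question);

        // Join the retrieved passages into a single context block.
        $context = collect($documents)
            ->pluck('content')
            ->join("\n\n");

        $prompt = <<<PROMPT
        You are a helpful AI assistant.
        Use only the provided context.

        Context:
        {$context}

        Question:
        {$question}
        PROMPT;

        // Local generation can be slow; 120 seconds is an arbitrary timeout.
        $response = Http::timeout(120)
            ->post('http://localhost:11434/api/generate', [
                'model' => 'llama3',
                'prompt' => $prompt,
                'stream' => false,
            ])
            ->throw(); // surface HTTP errors instead of returning an empty answer

        return (string) $response->json('response');
    }
}
```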
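The prompt assembly in the service can also be pulled into a small pure function, which makes it easy to test without calling the model. The following is a minimal sketch, not part of the original service: `buildPrompt` is a hypothetical helper name, and both the extra "say that you do not know" instruction and the placeholder text for an empty context are assumptions added for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Build a grounded prompt from retrieved passages.
 * Hypothetical helper for illustration.
 *
 * @param list<string> $passages
 */
function buildPrompt(array $passages, string $question): string
{
    // Drop blank passages so they do not waste context space.
    $passages = array_values(array_filter(
        array_map('trim', $passages),
        fn (string $p): bool => $p !== ''
    ));

    // Assumption: tell the model explicitly when retrieval found nothing.
    $context = $passages === []
        ? '(no relevant documents were found)'
        : implode("\n\n", $passages);

    return implode("\n", [
        'You are a helpful AI assistant.',
        'Use only the provided context.',
        'If the context does not contain the answer, say that you do not know.',
        '',
        'Context:',
        $context,
        '',
        'Question:',
        trim($question),
    ]);
}

echo buildPrompt(
    ['Queued jobs are processed by workers.', ''],
    'How does Laravel queue work?'
), PHP_EOL;
```

Keeping the prompt in one function also gives a single place to adjust the instructions when answers drift away from the retrieved context.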

Creating the Chat API

Routes:

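```php
<?php

use App\Http\Controllers\ChatController;
use Illuminate\Support\Facades\Route;

// ChatController is invokable, so the class itself is the route action.
Route::post('/chat', ChatController::class);
```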

Controller:

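```php
<?php

namespace App\Http\Controllers;

use App\Services\ChatService;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class ChatController extends Controller
{
    public function __invoke(Request $request, ChatService $chat): JsonResponse
    {
        // Reject requests without a usable message before calling the model.
        $validated = $request->validate([
            'message' => ['required', 'string'],
        ]);

        return response()->json([
            'answer' => $chat->ask($validated['message']),
        ]);
    }
}
```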

Testing the Chatbot

Request:

```json
{
    "message": "How does Laravel queue work?"
}
```

Response:

```json
{
    "answer": "Laravel queues allow background job processing..."
}
```

Streaming Responses

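Streaming makes the chatbot feel significantly faster: the user sees the first tokens as soon as they are generated instead of waiting for the complete answer. The result is a ChatGPT-style typing effect.

Ollama supports streaming. When 'stream' is set to true in the request payload, it returns the answer as a series of newline-delimited JSON objects instead of a single response.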

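```php
<?php

use Illuminate\Support\Facades\Http;

// Guzzle's "stream" option hands back the body as it arrives instead of buffering it.
$client = Http::withOptions([
    'stream' => true,
]);
```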
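When streaming is enabled, Ollama sends one JSON object per line. Each object carries a text fragment in its `response` field, and the final object has `done` set to `true`. Network reads do not necessarily end on line boundaries, so the consumer has to buffer partial lines. The sketch below shows one way to do that in plain PHP. `extractStreamTokens` is a hypothetical helper name, and the chunks are simulated rather than read from a live server.

```php
<?php

declare(strict_types=1);

/**
 * Extract the text fragments from a buffer of Ollama stream output.
 *
 * Returns the decoded fragments plus any trailing partial line, which the
 * caller should prepend to the next network chunk.
 *
 * @return array{0: list<string>, 1: string}
 */
function extractStreamTokens(string $buffer): array
{
    $lines = explode("\n", $buffer);

    // The last element is either '' (buffer ended on a newline) or an
    // incomplete JSON object that must wait for more data.
    $remainder = array_pop($lines);

    $tokens = [];
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }

        $event = json_decode($line, true);
        if (is_array($event) && isset($event['response'])) {
            $tokens[] = (string) $event['response'];
        }
    }

    return [$tokens, $remainder];
}

// Simulated network chunks; the second object is split across two reads.
$chunks = [
    '{"response":"Laravel","done":false}' . "\n" . '{"response":" que',
    'ues","done":false}' . "\n" . '{"response":"","done":true}' . "\n",
];

$buffer = '';
$answer = '';
foreach ($chunks as $chunk) {
    [$tokens, $buffer] = extractStreamTokens($buffer . $chunk);
    $answer .= implode('', $tokens);
}

echo $answer, PHP_EOL; // prints "Laravel queues"
```

In a real endpoint, each batch of tokens would be flushed to the browser as it arrives instead of being concatenated into `$answer`.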

Queueing Embedding Jobs

Embedding generation can be expensive, so run it on a queue instead of inside the web request.

Create the job:

```bash
php artisan make:job GenerateEmbeddingJob
```

Example:

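```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

class GenerateEmbeddingJob implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function handle(): void
    {
        // Generate embeddings
    }
}
```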

Recommended Production Architecture

Production-ready architecture:

```text
Nginx
  ↓
Laravel API
  ↓
Redis Queue
  ↓
Embedding Workers
  ↓
PostgreSQL pgvector
  ↓
Ollama GPU Server
```

Recommended Open Source Models

| Purpose | Model |
| --- | --- |
| Chat | Llama 3 |
| Embeddings | nomic-embed-text |
| Fast chat | Phi-3 |
| Coding | DeepSeek Coder |
| Multilingual | Qwen |

Scaling Strategies

As your dataset grows:

  • Use chunking
  • Add caching
  • Use Redis
  • Use background workers
  • Optimize vector indexes
  • Separate AI servers
  • Add GPU acceleration
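Chunking, the first item in the list above, can be sketched in a few lines of plain PHP. This version splits on words and repeats a few words between neighbouring chunks so that a sentence cut at a boundary still appears whole in one of them. `chunkText` is a hypothetical helper name, and the default sizes are arbitrary assumptions. Real pipelines often chunk by tokens, sentences, or headings instead.

```php
<?php

declare(strict_types=1);

/**
 * Split text into chunks of at most $size words, where each chunk repeats
 * the last $overlap words of the previous one.
 *
 * @return list<string>
 */
function chunkText(string $text, int $size = 200, int $overlap = 40): array
{
    if ($size < 1 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require size >= 1 and 0 <= overlap < size.');
    }

    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $chunks = [];
    $step = $size - $overlap;

    for ($start = 0; $start < count($words); $start += $step) {
        $chunks[] = implode(' ', array_slice($words, $start, $size));

        // Stop once a chunk has reached the end of the text.
        if ($start + $size >= count($words)) {
            break;
        }
    }

    return $chunks;
}

print_r(chunkText('one two three four five six seven', 4, 1));
// Two chunks: "one two three four" and "four five six seven"
```

Each chunk would then be embedded and stored separately, so retrieval returns focused passages instead of whole documents.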

Common RAG Problems

| Problem | Solution |
| --- | --- |
| Hallucinations | Better prompts |
| Slow search | Vector indexes |
| Poor retrieval | Better chunking |
| High latency | Caching |
| Expensive inference | Quantized models |
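To illustrate the caching row, here is a minimal in-memory sketch that skips the model call when the same question was already answered. `AnswerCache` is a hypothetical class written for this example, not a Laravel or Ollama API. The normalization rule (lowercase, collapsed whitespace) is an assumption, and it only helps with repeated or near-identical questions.

```php
<?php

declare(strict_types=1);

/**
 * Minimal in-memory answer cache, keyed by a normalized form of the question.
 * Hypothetical helper for illustration.
 */
final class AnswerCache
{
    /** @var array<string, string> */
    private array $answers = [];

    /**
     * Return the cached answer, or compute and store it on a miss.
     *
     * @param callable(string): string $generate
     */
    public function remember(string $question, callable $generate): string
    {
        $key = $this->keyFor($question);

        if (!array_key_exists($key, $this->answers)) {
            $this->answers[$key] = $generate($question);
        }

        return $this->answers[$key];
    }

    private function keyFor(string $question): string
    {
        // Collapse whitespace and case so trivially different phrasings share a key.
        $normalized = strtolower(trim(preg_replace('/\s+/', ' ', $question) ?? $question));

        return 'chat:' . hash('sha256', $normalized);
    }
}

$calls = 0;
$slowModel = function (string $question) use (&$calls): string {
    $calls++; // stands in for the expensive LLM request

    return "Answer to: {$question}";
};

$cache = new AnswerCache();
$first = $cache->remember('How do queues work?', $slowModel);
$second = $cache->remember('  how do  QUEUES work? ', $slowModel);

echo $calls, PHP_EOL; // prints 1: the second call was a cache hit
```

Inside a Laravel application, `Cache::remember()` backed by a Redis store would play the same role, with a time-to-live so that answers are refreshed when the underlying documents change.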

Final Thoughts

Laravel is fully capable of powering modern AI applications.

With:

  • RAG
  • Vector search
  • Open source LLMs
  • Semantic retrieval
  • Local AI infrastructure

You can build:

  • AI assistants
  • Internal company chatbots
  • Knowledge bases
  • AI customer support
  • AI search engines
  • Document intelligence systems

without relying entirely on expensive external AI APIs.

This architecture also gives you:

  • Better privacy
  • Lower costs
  • Full control
  • Offline capability
  • Custom domain knowledge

The future of Laravel applications is no longer only CRUD.

It is AI-powered software.