
In the previous article, we discussed how to build a RAG pipeline in Laravel.
Now that embeddings are stored in PostgreSQL, the next step is implementing semantic search.
Semantic search lets the AI system match on meaning instead of exact keywords.
Traditional search:
```text
"authentication"
```
Semantic search:
```text
"how users login securely"
```
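With semantic search, both queries can return similar results, because their embeddings are close even though they share no keywords.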
Understanding Vector Similarity
Vector similarity measures how close two embeddings are.
Common metrics:
| Metric | Description |
|---|---|
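| Cosine Similarity | Measures the angle between two vectors, ignoring their length |
| Euclidean Distance | Measures the straight-line distance between two vectors |
| Inner Product | Measures the projection of one vector onto the other |

pgvector supports all of them.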
Most RAG systems use cosine similarity.
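To make the three metrics concrete, here is a minimal plain-PHP sketch. The function names are illustrative, not part of Laravel or pgvector, and in practice the database computes these values for you. Vectors are assumed to be zero-indexed lists of numbers with the same length.

```php
<?php

/** Inner (dot) product of two equal-length, zero-indexed vectors. */
function innerProduct(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same dimension.');
    }

    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += $value * $b[$i];
    }

    return $sum;
}

/** Straight-line (L2) distance between two vectors. */
function euclideanDistance(array $a, array $b): float
{
    if (count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must have the same dimension.');
    }

    $sum = 0.0;
    foreach ($a as $i => $value) {
        $sum += ($value - $b[$i]) ** 2;
    }

    return sqrt($sum);
}

/** Cosine of the angle between two vectors: 1.0 means the same direction. */
function cosineSimilarity(array $a, array $b): float
{
    $norms = sqrt(innerProduct($a, $a)) * sqrt(innerProduct($b, $b));

    if ($norms === 0.0) {
        throw new InvalidArgumentException('Cosine similarity is undefined for zero vectors.');
    }

    return innerProduct($a, $b) / $norms;
}
```

Because cosine similarity divides by both lengths, `[1, 0]` and `[2, 0]` count as identical, while Euclidean distance and inner product would treat them differently.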
Creating a Semantic Search Query
Generate query embedding:
```php
<?php

// Embed the query with the same service that embedded the stored documents.
$queryEmbedding = app(EmbeddingService::class)->embed($query);
```
Search similar documents:
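```php
<?php

use Illuminate\Support\Facades\DB;

// pgvector accepts a vector as a text literal such as "[0.1,0.2,0.3]".
$embeddingString = '[' . implode(',', $queryEmbedding) . ']';

// Bind the literal as a parameter and cast it to the vector type.
$documents = DB::select(
    "
    SELECT
        id,
        title,
        content,
        embedding <=> ?::vector AS distance
    FROM documents
    ORDER BY distance ASC
    LIMIT 5
    ",
    [$embeddingString]
);
```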
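The `implode` call assumes every element of the embedding is a plain number. A small guard can catch malformed embeddings before they reach the database. The helper below is a sketch; `toVectorLiteral` is a hypothetical name, not something Laravel or pgvector provides.

```php
<?php

/**
 * Convert a list of numbers into pgvector's text format, e.g. "[0.1,0.2,0.3]".
 * Hypothetical helper: rejects empty, non-numeric, and non-finite input.
 */
function toVectorLiteral(array $embedding): string
{
    if ($embedding === []) {
        throw new InvalidArgumentException('Embedding must not be empty.');
    }

    $parts = [];
    foreach ($embedding as $value) {
        if (!is_int($value) && !is_float($value)) {
            throw new InvalidArgumentException('Embedding values must be numeric.');
        }

        if (!is_finite((float) $value)) {
            throw new InvalidArgumentException('Embedding values must be finite.');
        }

        // Normalize every element to a float before casting it to a string.
        $parts[] = (string) (float) $value;
    }

    return '[' . implode(',', $parts) . ']';
}
```

The result can be bound to the query exactly like `$embeddingString` above.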
What Is the <=> Operator?
pgvector provides a distance operator for each metric.
| Operator | Meaning |
|---|---|
| `<->` | Euclidean distance |
| `<#>` | Negative inner product |
| `<=>` | Cosine distance |
Lower distance means more similar.
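Cosine distance is defined as one minus cosine similarity, so the `distance` column can be turned back into a similarity score or used as a cutoff. The sketch below assumes rows shaped like the objects `DB::select` returns. Both function names are hypothetical, and the `0.5` default is an arbitrary illustrative threshold that should be tuned against your own data.

```php
<?php

/** Convert a pgvector cosine distance (0 to 2) into a cosine similarity (1 to -1). */
function cosineDistanceToSimilarity(float $distance): float
{
    return 1.0 - $distance;
}

/**
 * Keep only rows whose cosine distance is at or below $maxDistance.
 * The default cutoff is illustrative, not a recommended value.
 */
function filterByDistance(array $rows, float $maxDistance = 0.5): array
{
    return array_values(array_filter(
        $rows,
        function ($row) use ($maxDistance) {
            return (float) $row->distance <= $maxDistance;
        }
    ));
}
```

Dropping weak matches this way keeps unrelated documents out of the prompt when nothing in the table is close to the query.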
Creating a Search API Endpoint
Routes:
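```php
<?php

use App\Http\Controllers\SearchController;
use Illuminate\Support\Facades\Route;

// SearchController is a single-action (invokable) controller, so no method name is needed.
Route::post('/search', SearchController::class);
```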
Controller:
```php
<?php

namespace App\Http\Controllers;

use App\Services\EmbeddingService;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\DB;

class SearchController extends Controller
{
    public function __invoke(Request $request)
    {
        // Read the "query" field explicitly; $request->query is the
        // query-string parameter bag, not the submitted field.
        $query = $request->input('query');

        $embedding = app(EmbeddingService::class)->embed($query);

        // Format the embedding as a pgvector literal such as "[0.1,0.2,0.3]".
        $embeddingString = '[' . implode(',', $embedding) . ']';

        $documents = DB::select(
            "
            SELECT
                id,
                title,
                content,
                embedding <=> ?::vector AS distance
            FROM documents
            ORDER BY distance ASC
            LIMIT 5
            ",
            [$embeddingString]
        );

        return response()->json($documents);
    }
}
```
Optimizing Vector Search Performance
Without an index, every query compares the search vector against every row, so vector search becomes slow as the table grows.
Create vector index:
```sql
CREATE INDEX documents_embedding_idx
ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
```
Analyze table:
```sql
ANALYZE documents;
```
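The `lists = 100` value depends on table size. The pgvector documentation suggests starting from rows / 1000 for tables of up to one million rows and sqrt(rows) above that, with the query-time `ivfflat.probes` setting starting around sqrt(lists). The sketch below encodes those starting points. The function names are hypothetical, and the results are only first guesses to benchmark against your own data.

```php
<?php

/** Starting point for `lists`: rows / 1000 up to 1M rows, sqrt(rows) above that. */
function suggestedLists(int $rowCount): int
{
    $lists = $rowCount <= 1000000
        ? intdiv($rowCount, 1000)
        : (int) round(sqrt($rowCount));

    // An ivfflat index needs at least one list.
    return max(1, $lists);
}

/** Starting point for the query-time `ivfflat.probes` setting: sqrt(lists). */
function suggestedProbes(int $lists): int
{
    return max(1, (int) round(sqrt($lists)));
}
```

More probes improve recall at the cost of speed, so treat both numbers as a baseline rather than a final setting.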
Understanding Chunking
Large documents should be split into smaller chunks.
Bad approach:
```text
1 huge PDF → 1 embedding
```
Good approach:
```text
1 PDF → multiple chunks → multiple embeddings
```
Benefits:
- Better retrieval
- More accurate context
- Faster search
- Reduced hallucination
Example Chunking Strategy
```php
<?php

$text = file_get_contents($path);

// Passing an integer to split() cuts the text into 1000-character pieces.
$chunks = str($text)->split(1000);
```
Store each chunk separately.
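Fixed-size splitting can cut a sentence in half at a chunk boundary. One common mitigation is to let neighbouring chunks overlap. The sketch below is one way to do that in plain PHP. `chunkWithOverlap` is a hypothetical helper, the default sizes are illustrative, and it assumes the `mbstring` extension is available.

```php
<?php

/**
 * Split text into chunks of at most $size characters, where each chunk
 * starts with the last $overlap characters of the previous one.
 */
function chunkWithOverlap(string $text, int $size = 1000, int $overlap = 200): array
{
    if ($size <= 0 || $overlap < 0 || $overlap >= $size) {
        throw new InvalidArgumentException('Require size > 0 and 0 <= overlap < size.');
    }

    $chunks = [];
    $length = mb_strlen($text);
    $step = $size - $overlap;

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($text, $start, $size);

        // Stop once a chunk reaches the end of the text, so no trailing
        // chunk is produced that contains only already-covered overlap.
        if ($start + $size >= $length) {
            break;
        }
    }

    return $chunks;
}
```

Each returned chunk would then be embedded and stored as its own row.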
Building a Context Builder
After retrieving documents, combine them into context.
```php
<?php

// Join the content of every retrieved document, separated by blank lines.
$context = collect($documents)
    ->pluck('content')
    ->join("\n\n");
```
Example prompt:
```text
You are an AI assistant.

Use the following context:

[DOCUMENTS]

Question:
How does Laravel queue work?
```
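Retrieved chunks can add up to more text than the model's context window allows, so a context builder usually needs a size budget. The sketch below is plain PHP under a few assumptions: rows are objects with a `content` property, ordered from most to least similar, and the budget is counted in characters rather than tokens. `buildContext`, `buildPrompt`, and the `4000` default are illustrative.

```php
<?php

/**
 * Join document contents into one context string, stopping before the
 * character budget would be exceeded.
 */
function buildContext(array $documents, int $maxChars = 4000): string
{
    $parts = [];
    $used = 0;

    foreach ($documents as $document) {
        $content = trim((string) $document->content);

        if ($content === '') {
            continue;
        }

        $length = mb_strlen($content);

        // Rows are ordered by relevance, so stop at the first one that does not fit.
        if ($used + $length > $maxChars) {
            break;
        }

        $parts[] = $content;
        $used += $length;
    }

    return implode("\n\n", $parts);
}

/** Fill the prompt template with the retrieved context and the user's question. */
function buildPrompt(string $context, string $question): string
{
    return "You are an AI assistant.\n\n"
        . "Use the following context:\n\n"
        . $context . "\n\n"
        . "Question:\n"
        . $question;
}
```

A token-based budget would be more precise, but it depends on the tokenizer of the model in use.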
Preventing Hallucinations
One major advantage of RAG is reducing hallucinations.
Good prompt:
```text
Only answer using the provided context.
If the answer does not exist, say you do not know.
```
Restricting the model to the retrieved context makes its answers more reliable, because it is told to decline instead of guessing.
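The grounding instruction can be combined with a guard that skips the model call when retrieval found nothing. The sketch below assumes the role/content message shape used by many chat APIs; adapt it to whatever your model client expects. `buildGroundedMessages` and `GROUNDING_INSTRUCTION` are hypothetical names.

```php
<?php

const GROUNDING_INSTRUCTION = "Only answer using the provided context.\n"
    . "If the answer does not exist, say you do not know.";

/**
 * Build chat messages that pin the model to the retrieved context.
 * Returns null when the context is empty, so the caller can answer
 * "I do not know" without calling the model at all.
 */
function buildGroundedMessages(string $context, string $question): ?array
{
    if (trim($context) === '') {
        return null;
    }

    return [
        // Assumed message shape: a list of role/content pairs.
        ['role' => 'system', 'content' => GROUNDING_INSTRUCTION],
        ['role' => 'user', 'content' => "Context:\n{$context}\n\nQuestion:\n{$question}"],
    ];
}
```

Keeping the instruction in the system message separates it from user input, which makes it harder for a question to override it by accident.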
Building a Retrieval Service
```php
<?php

namespace App\Services;

use Illuminate\Support\Facades\DB;

class RetrievalService
{
    /**
     * Return the five documents closest to the query by cosine distance.
     */
    public function search(string $query): array
    {
        $embedding = app(EmbeddingService::class)->embed($query);

        // Format the embedding as a pgvector literal such as "[0.1,0.2,0.3]".
        $embeddingString = '[' . implode(',', $embedding) . ']';

        return DB::select(
            "
            SELECT
                title,
                content,
                embedding <=> ?::vector AS distance
            FROM documents
            ORDER BY distance ASC
            LIMIT 5
            ",
            [$embeddingString]
        );
    }
}
```
Next Part
In the next article, we will build:
- Full AI chat endpoint
- Streaming responses
- Chat memory
- Conversation history
- Production optimization
- Queue workers
- Background embedding jobs
- Multi-model architecture