When you start adding semantic search to an application, the obvious options are often Pinecone, Weaviate, Qdrant, Milvus, or another dedicated vector database.
That can be the right choice.
But many applications already have a PostgreSQL database running. And for a large class of semantic search use cases, that database can do the job directly.
The key is pgvector, an open-source PostgreSQL extension that adds vector types and vector similarity search to Postgres. It lets you store embeddings next to your relational data and query them with SQL.
The advantage is not only fewer moving parts. It is also architectural:
No separate sync pipeline, no second source of truth, and no extra infrastructure until you actually need it.
Semantic Search in One Sentence
Classic search compares words. Semantic search compares meaning.
The query "How do I get over the pass?" can find a text about a mountain pass rather than a school hallway because both the query and the documents are represented as numerical vectors, also called embeddings.
The basic flow is:
- Convert text into an embedding vector.
- Store that vector in PostgreSQL.
- Convert the search query into the same vector space.
- Find the stored vectors closest to the query vector.
The phrase "same vector space" matters. You cannot freely mix embeddings from different models.
The Most Important Design Decision: The Embedding Model
Before creating the database schema, choose the embedding model.
That choice determines:
- vector dimension
- embedding quality
- cost
- latency
- privacy profile
- whether text leaves your environment
For example, if you use OpenAI text-embedding-3-small, a vector(1536) column is a common fit. If you use another model, including a local model through Ollama, the dimension may be different.
This is not a detail. PostgreSQL will reject vectors with the wrong number of dimensions.
If you later change the embedding model, plan to:
- add a new vector column or table
- re-embed all documents
- rebuild the vector index
- avoid comparing old and new embeddings directly
Embedding model changes are data migrations.
Installing pgvector
Enable the extension once in the database:
CREATE EXTENSION IF NOT EXISTS vector;
With managed PostgreSQL providers, check whether pgvector is available on your plan and PostgreSQL version. Many providers support it, but you should verify this before designing around it.
Schema: Store Chunks, Not Just Documents
For small records, one vector per row is fine.
For real documents, it is usually better to embed chunks. A long document may cover several topics. One vector for the entire document often becomes too blurry.
A practical schema:
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
title TEXT NOT NULL,
source TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE document_chunks (
id BIGSERIAL PRIMARY KEY,
document_id BIGINT NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
embedding vector(1536) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE (document_id, chunk_index)
);
Use vector(1536) only if your embedding model actually returns 1536-dimensional vectors. Otherwise, change the dimension to match the model.
Integration in .NET with Npgsql
Install the packages:
dotnet add package Npgsql
dotnet add package Pgvector
If you use Dapper, also add:
dotnet add package Pgvector.Dapper
For raw Npgsql, configure the data source with UseVector():
using Npgsql;
using Pgvector;
var dataSourceBuilder = new NpgsqlDataSourceBuilder(connectionString);
dataSourceBuilder.UseVector();
await using var dataSource = dataSourceBuilder.Build();
In production, create the extension and tables through migrations or database provisioning. If you create the extension at runtime, reload PostgreSQL types on the connection before using the new type.
Generating Embeddings with OpenAI
With the OpenAI .NET SDK:
using OpenAI.Embeddings;
var embeddingClient = new EmbeddingClient("text-embedding-3-small", apiKey);
async Task<float[]> GetEmbeddingAsync(string text)
{
var result = await embeddingClient.GenerateEmbeddingAsync(text);
return result.Value.ToFloats().ToArray();
}
For local embeddings, use a local embedding model through your preferred provider. The important rule is the same:
The model used for indexing and the model used for querying must be the same, or at least intentionally compatible.
Do not index documents with one model and query them with another.
Storing a Document Chunk
async Task InsertChunkAsync(
long documentId,
int chunkIndex,
string content,
CancellationToken cancellationToken = default)
{
var embedding = await GetEmbeddingAsync(content);
var vector = new Vector(embedding);
await using var conn = await dataSource.OpenConnectionAsync(cancellationToken);
await using var cmd = new NpgsqlCommand("""
INSERT INTO document_chunks (document_id, chunk_index, content, embedding)
VALUES (@documentId, @chunkIndex, @content, @embedding)
ON CONFLICT (document_id, chunk_index)
DO UPDATE SET
content = EXCLUDED.content,
embedding = EXCLUDED.embedding
""", conn);
cmd.Parameters.AddWithValue("documentId", documentId);
cmd.Parameters.AddWithValue("chunkIndex", chunkIndex);
cmd.Parameters.AddWithValue("content", content);
cmd.Parameters.AddWithValue("embedding", vector);
await cmd.ExecuteNonQueryAsync(cancellationToken);
}
The important part is the update path. If the content changes, the embedding must change too.
Semantic Search
The <=> operator calculates cosine distance. A smaller value means higher similarity. ORDER BY ... ASC returns the nearest vectors first.
public sealed record SearchResult(
long DocumentId,
long ChunkId,
string Title,
string Content,
double Distance);
async Task<List<SearchResult>> SearchAsync(
string query,
int limit = 5,
CancellationToken cancellationToken = default)
{
var queryEmbedding = await GetEmbeddingAsync(query);
var queryVector = new Vector(queryEmbedding);
await using var conn = await dataSource.OpenConnectionAsync(cancellationToken);
await using var cmd = new NpgsqlCommand("""
SELECT d.id,
c.id,
d.title,
c.content,
c.embedding <=> @queryVector AS distance
FROM document_chunks c
JOIN documents d ON d.id = c.document_id
ORDER BY c.embedding <=> @queryVector
LIMIT @limit
""", conn);
cmd.Parameters.AddWithValue("queryVector", queryVector);
cmd.Parameters.AddWithValue("limit", limit);
var results = new List<SearchResult>();
await using var reader = await cmd.ExecuteReaderAsync(cancellationToken);
while (await reader.ReadAsync(cancellationToken))
{
results.Add(new SearchResult(
DocumentId: reader.GetInt64(0),
ChunkId: reader.GetInt64(1),
Title: reader.GetString(2),
Content: reader.GetString(3),
Distance: reader.GetDouble(4)));
}
return results;
}
Notice that the query orders by the distance expression directly. With pgvector indexes, the ORDER BY embedding <=> @query LIMIT n shape is important.
Distance Operators
pgvector supports several distance operators:
| Operator | Meaning | Typical use |
|---|---|---|
<=> |
cosine distance | text embeddings and semantic search |
<-> |
L2 / Euclidean distance | general vector distance, images, spatial-like embeddings |
<#> |
negative inner product | inner-product search; multiply by -1 for the actual value |
<+> |
L1 distance | absolute-distance use cases |
<~> |
Hamming distance | binary vectors |
<%> |
Jaccard distance | binary vectors |
For most text embedding use cases, cosine distance is a good default.
Creating an Index with HNSW
Without an index, PostgreSQL scans the table for every vector search. That is exact and simple, but it becomes slow as the number of vectors grows.
HNSW is often the best starting index for application search:
CREATE INDEX document_chunks_embedding_hnsw_idx
ON document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
The parameters:
-
m: maximum number of connections per layer; default is 16 -
ef_construction: candidate list size during index construction; default is 64
Higher values can improve recall, but increase memory usage and build time.
You can tune search recall per query:
SET hnsw.ef_search = 100;
or for one transaction:
BEGIN;
SET LOCAL hnsw.ef_search = 100;
SELECT ...
COMMIT;
Important: HNSW is approximate. It trades perfect recall for speed.
IVFFlat as an Alternative
IVFFlat can use less memory and build faster than HNSW, but usually has a weaker speed-recall trade-off.
CREATE INDEX document_chunks_embedding_ivfflat_idx
ON document_chunks
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
Do not create an IVFFlat index on an empty table. It needs data to form the lists.
A practical starting point for lists:
-
rows / 1000for up to 1M rows -
sqrt(rows)for over 1M rows
For 10,000 vectors, start around 10 lists, not 100.
At query time, tune probes:
SET ivfflat.probes = 10;
Higher probes improve recall and reduce speed.
Filtering and Hybrid Search
One of the strongest reasons to use pgvector is that vectors live next to relational data.
You can filter before searching:
SELECT d.id, d.title, c.content, c.embedding <=> @queryVector AS distance
FROM document_chunks c
JOIN documents d ON d.id = c.document_id
WHERE d.created_at > NOW() - INTERVAL '30 days'
AND d.source = 'documentation'
ORDER BY c.embedding <=> @queryVector
LIMIT 10;
For true hybrid search, do not just calculate keyword rank and then ignore it. Combine semantic and keyword ranks.
One practical pattern is Reciprocal Rank Fusion-style scoring:
WITH semantic AS (
SELECT c.id,
row_number() OVER (ORDER BY c.embedding <=> @queryVector) AS semantic_rank
FROM document_chunks c
LIMIT 100
),
keyword AS (
SELECT c.id,
row_number() OVER (
ORDER BY ts_rank_cd(
to_tsvector('english', c.content),
plainto_tsquery('english', @query)
) DESC
) AS keyword_rank
FROM document_chunks c
WHERE to_tsvector('english', c.content) @@ plainto_tsquery('english', @query)
LIMIT 100
)
SELECT d.id AS document_id,
c.id AS chunk_id,
d.title,
c.content,
COALESCE(1.0 / (60 + semantic.semantic_rank), 0) +
COALESCE(1.0 / (60 + keyword.keyword_rank), 0) AS score
FROM semantic
FULL OUTER JOIN keyword ON keyword.id = semantic.id
JOIN document_chunks c ON c.id = COALESCE(semantic.id, keyword.id)
JOIN documents d ON d.id = c.document_id
ORDER BY score DESC
LIMIT 10;
This is not the only way to do hybrid search, but it makes the ranking logic explicit.
When pgvector Is the Right Choice
pgvector is a strong choice when:
- PostgreSQL is already your primary database
- vectors need to join naturally with relational data
- transactional consistency matters
- your team already knows PostgreSQL operations
- the dataset is small to medium sized
- you want to avoid another service and sync pipeline
It is especially good for internal search, product search, support tools, documentation search, and RAG systems where the relational database is already the source of truth.
When a Dedicated Vector Store May Be Better
A dedicated vector database is worth considering when:
- vector count grows into tens of millions or more
- query volume is very high and latency budgets are strict
- you need specialized managed operations around vector search
- you need advanced built-in hybrid search, reranking, or filtering behavior
- your team does not operate PostgreSQL deeply
- multi-tenant isolation and scaling are central requirements
The decision is not ideology. It is operational fit.
Complete ASP.NET Core Example
app.MapGet("/search", async (
string q,
NpgsqlDataSource db,
CancellationToken cancellationToken) =>
{
var queryEmbedding = await GetEmbeddingAsync(q);
var queryVector = new Vector(queryEmbedding);
await using var conn = await db.OpenConnectionAsync(cancellationToken);
await using var cmd = new NpgsqlCommand("""
SELECT d.id,
c.id,
d.title,
c.content,
c.embedding <=> @v AS distance
FROM document_chunks c
JOIN documents d ON d.id = c.document_id
ORDER BY c.embedding <=> @v
LIMIT 5
""", conn);
cmd.Parameters.AddWithValue("v", queryVector);
var results = new List<object>();
await using var reader = await cmd.ExecuteReaderAsync(cancellationToken);
while (await reader.ReadAsync(cancellationToken))
{
results.Add(new
{
documentId = reader.GetInt64(0),
chunkId = reader.GetInt64(1),
title = reader.GetString(2),
content = reader.GetString(3),
distance = reader.GetDouble(4)
});
}
return Results.Ok(results);
});
Conclusion
Semantic search does not always require a separate vector database.
If PostgreSQL is already your source of truth, pgvector lets you store embeddings next to relational data, query them with SQL, filter with normal PostgreSQL conditions, and keep transactions in one system.
But the clean version has a few rules:
- choose the embedding model before choosing the vector dimension
- do not mix embeddings from different models
- chunk documents when they are long or multi-topic
- index only when the table is large enough to need it
- remember that HNSW and IVFFlat are approximate
- treat model changes as data migrations
For many production applications, pgvector is enough for a long time. Not because dedicated vector databases are unnecessary, but because the simplest reliable architecture is often the one that keeps the data where it already lives.
References
- pgvector README: https://github.com/pgvector/pgvector
- pgvector .NET support: https://github.com/pgvector/pgvector-dotnet
- OpenAI .NET SDK: https://github.com/openai/openai-dotnet
- OpenAI embeddings guide: https://platform.openai.com/docs/guides/embeddings
- Microsoft.Extensions.AI: https://learn.microsoft.com/en-us/dotnet/ai/microsoft-extensions-ai













