This content originally appeared on DEV Community and was authored by Talemul Islam
🔍 Fuzzy String Matching in PostgreSQL with pg_trgm
When working with user data — names, addresses, search queries — exact string matching isn’t always enough. People make typos, use alternate spellings, or abbreviations.
PostgreSQL provides a powerful extension called pg_trgm (trigram search) that allows you to perform fuzzy text matching and find “close enough” results.
đź§© What is a Trigram?
A trigram is simply a group of three consecutive characters from a string.
For example, the word:
Talemul
Would be split into trigrams (with start/end markers):
" T", "Tal", "ale", "lem", "emu", "mul", "ul "
👉 Why the spaces?
- PostgreSQL pads the string at the beginning and end with spaces to catch edge cases.
Now, when comparing two strings (Talemul vs Talimul), PostgreSQL compares how many trigrams they share:
-
Talemul:[" T","Tal","ale","lem","emu","mul","ul "] -
Talimul:[" T","Tal","ali","lim","imu","mul","ul "]
They share trigrams like [" T","Tal","mul","ul "].
More shared trigrams = higher similarity score.
This makes trigram search excellent for catching typos or small spelling differences.
âś… Setup
Enable the extension:
CREATE EXTENSION IF NOT EXISTS pg_trgm;
Check installed extensions:
\dx
📊 Example: Finding Similar Names
Let’s say we have a table of names:
CREATE TABLE people (
id SERIAL PRIMARY KEY,
name TEXT
);
INSERT INTO people (name) VALUES
('Talemul'),
('Talimul'),
('Talimul Islam'),
('Tamim'),
('Talim');
Now, search for names similar to "Talemul":
SELECT name, similarity(name, 'Talemul') AS score
FROM people
WHERE name % 'Talemul' -- % means "is similar to"
ORDER BY score DESC;
đź–Ą Actual Output:
name | score
----------------+-------------
Talemul | 1.00000000
Talimul | 0.45454547
Talimul Islam | 0.29411766
Talim | 0.27272728
👉 Notice how "Talimul Islam" and "Talim" get lower scores, and may even be excluded depending on the similarity threshold.
⚡ Boost Performance with Indexes
For large datasets, use a GIN index to speed up similarity searches:
CREATE INDEX idx_people_name_trgm ON people USING gin (name gin_trgm_ops);
Now your searches with % and similarity() will scale much better.
🎯 Real-World Use Cases
-
User search: Find users by name despite typos (
JonvsJohn). - Autocomplete: Suggest terms even when slightly misspelled.
- Deduplication: Detect near-duplicate entries.
- Search engines: Power “Did you mean?” features.
📝 Wrap Up
Trigrams are a simple but powerful idea: break words into three-character chunks, compare overlaps, and compute similarity.
PostgreSQL’s pg_trgm extension brings this technique directly into SQL, letting you build typo-tolerant search features without needing an external search engine.
đź”— References:
This content originally appeared on DEV Community and was authored by Talemul Islam
Talemul Islam | Sciencx (2025-10-01T20:41:56+00:00) Fuzzy String Matching in PostgreSQL with pg_trgm (Trigram Search Tutorial). Retrieved from https://www.scien.cx/2025/10/01/fuzzy-string-matching-in-postgresql-with-pg_trgm-trigram-search-tutorial/
Please log in to upload a file.
There are no updates yet.
Click the Upload button above to add an update.