Postgres Full Text Search
Postgres has built-in functions to handle Full Text Search queries. This is like a "search engine" within Postgres.
to_tsvector()
Converts your data into searchable "tokens". to_tsvector() stands for "to text search vector". For example:
select to_tsvector("the green eggs and ham")
-- Returns 'egg':2 'green':1 'ham':4
select to_tsvector("the green egg and ham")
-- Returns 'egg':2 'green':1 'ham':4
Collectively these tokens are called a document which Postgres can use for comparisons,
where every token is a lexeme (unit of lexical meaning). The stopwords (and, or, the, ...)
are conveniently omitted. As you can see, the eggs will be normalized as a lexeme in English: egg.
to_tsquery()
to_tsquery(), which accepts a list of words that will be checked against
the normalized vector we created with to_tsvector().
The @@ operator to check if tsquery matches tsvector, it's returns true (t) if matched,
otherwise returns false (f).
Let's see some queries below:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg")
?column?
--------
t
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs")
?column?
--------
t
Use & for AND in the search query:
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs & red")
?column?
--------
f
Use | for OR in the search query:
select to_tsvector("the green eggs and ham") @@ to_tsquery("eggs | red")
?column?
--------
t
Use the proximity symbol <-> for searching for terms that are a certain "distance" apart.
For example, search the phase green egg, green is followed immediately by a match for egg.
select to_tsvector("the green eggs and ham") @@ to_tsquery("green <-> egg")
?column?
--------
t
For example, search for the phase egg [1 word] ham, find egg and ham within 2 words of each other:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg <1> ham")
?column?
--------
t
Use the negation symbol ! to find phrases which don't contain a search term.
For example, search for the phase that have egg but not ham:
select to_tsvector("the green eggs and ham") @@ to_tsquery("egg & !ham")
?column?
--------
f