Search in Postgres using ZomboDb and elasticsearch

ноября 09, 2017

At some point in the development of the project the question arose of the large number of texts. Moreover, the texts are of different length: from tweets to large articles. First, the main search engine was chosen as the built-in Postgres _tsvector. To search for simple rules it was quite enough. An array of texts grew with great speed, and search rules become more complex, so a built engine is not covered claims.

Yes, there is a sphinx, it has great integration with Postgres, but the goal was to find a solution to use elasticsearch with Postgres. Why? elasticsearch has shown good results in some case-Ah project. Yes, and were server with him to keep logs logstash. It was also the desire to find a tool that will take data synchronization.

as a result of all the vast network has been found, the project ZomboDb, which fit the requirements.

the project Page on github.

the

install the extension

This section is a paraphrase of the official instructions.

the currently Supported versions of the packages:

the the the

Package	Version
Elasticsearch	1.7.1+ (not 2.0)
Postgres	9.3, 9.4, 9.5

My configuration is: Postgres 9.4, elasticsearch 1.7.5

With page you need to download and install a package with a plugin for Postgres (deb or rpm)
In postgresql.conf to add the line:

the
```
local_preload_libraries = 'zombodb.so"
```
Restart the database and create the extension:

the
```
psql db_name -c "CREATE EXTENSION zombodb;"
```
Next, with the same page you need to download a plugin for elasticsearch and install it:

the
```
bin/plugin -i zombodb -u file:///path/to/zombodb-plugin-X.X.X.zip
```

Add elasticsearch.yml:

the

threadpool.bulk.queue_size: 1024
threadpool.bulk.size: 12
http.max_content_length: 1024mb
index.query.bool.max_clause_count: 1000000

to Restart elasticsearch.

the installation is finished.

the

Creating test index

Suppose there is a table which contains the tags:

the

CREATE TABLE public.tags (
id INTEGER PRIMARY KEY NOT NULL DEFAULT nextval('tags_id_seq'::regclass),
word CHARACTER VARYING(100) NOT NULL
);

Create index:

the

CREATE INDEX tags_zdb_search_idx
ON tags
USING zombodb (zdb('tags', tags.ctid), zdb(tags))
WITH (url='http://localhost:9200/');

the result of this query to create the index and the data go straight to elasticsearch.
A query that will find the word mom and daddy:

the

SELECT *
FROM tags
WHERE zdb('tags', ctid) ==> 'word:(mother,father)';

Where word is the name of the field that will be searched. Search is implemented using operator ==>.

Also, ZomboDb provides domains phrase and fulltext, based on the type text. Using your own domains, you can define a mapping to elasticsearch.

the

query Language

by using queries, you can search on individual fields of the indexed table, and all fields.
Queries support logical operations (and, or, not), parentheses.
It is possible to use various search operators. For example, the query

the

SELECT *
FROM texts
WHERE zdb('texts', ctid) ==> 'text:dad';

where the operator is a colon, and returns texts that contain the word dad.
Also supports more like this and fuzzy like this using the operators@ and :@~, respectively.
Example:

the

SELECT *
FROM texts
WHERE zdb('texts', ctid) ==> '(text:@папа and title:@мама) or text:aunt';

Also, there is support for the comparison operators:

the

SELECT *
FROM texts
WHERE zdb('texts', ctid) ==> 'comments > 10';

a Detailed description of the query language in documentation.

the

Insights

the Project is a good product that works out of the box. Well documented, updated (the latest supported version of Postgres, the last commit at the time of writing, 27-day-old). If well and consistently will show itself in production, write a wrapper for sqlalchemy.

UPD Written in haste sqlalchemy extension

Article based on information from habrahabr.ru

Поиск по этому блогу

computer express