A plain table is the basic element for non-percolate searching. It can be defined only in a configuration file, using the Plain mode; the RT mode does not support it. It is typically used together with a source to process data from external storage, and it can later be attached to a real-time table.

👍 What you can do with a plain table:

⛔ What you cannot do with a plain table:

Numeric attributes, including MVAs, are the only elements that can be updated in a plain table. All other data in the table is immutable. If updates or new records are required, the table must be rebuilt. During the rebuilding process, the existing table remains available to serve requests, and a process called rotation is performed when the new version is ready, bringing it online and discarding the old version.

How to create a plain table

To create a plain table, you’ll need to define it in a configuration file. It’s not supported by the CREATE TABLE command.

Here’s an example of a plain table configuration and a source for fetching data from a MySQL database:

source source {
  type             = mysql
  sql_host         = localhost
  sql_user         = myuser
  sql_pass         = mypass
  sql_db           = mydb
  sql_query        = SELECT id, title, description, category_id FROM mytable
  sql_attr_uint    = category_id
  sql_field_string = title
}

table tbl {
  type   = plain
  source = source
  path   = /path/to/table
}

Plain table building performance

The speed at which a plain table is indexed depends on several factors, including:
* Data source retrieval speed
* Tokenization settings
* The hardware specifications (such as CPU, RAM, and disk performance)
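As one illustration of the hardware factor, the amount of RAM that the indexer tool may use during a build is capped by the mem_limit setting in the indexer section of the configuration file. The fragment below is only a sketch: the value is an arbitrary example, not a recommendation, and the right figure depends on the memory actually available on the machine.

```ini
indexer {
  # Arbitrary illustrative value (an assumption); tune it to the available RAM
  mem_limit = 512M
}
```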

Plain table building scenarios

Rebuild fully when needed

For small data sets, the simplest option is to have a single plain table that is fully rebuilt as needed. This approach is acceptable if you can live with two trade-offs:
* The data in the table is not as fresh as the data in the source
* The time it takes to build the table increases as the data set grows

Main+delta scenario

For larger data sets, a plain table can be used instead of a real-time table. The main+delta scenario involves:
* Creating a smaller table (the delta) for incremental indexing alongside the larger main table
* Combining the two tables using a distributed table

This approach allows for infrequent rebuilding of the larger table and more frequent processing of updates from the source. The smaller table can be rebuilt more often (e.g. every minute or even every few seconds).

However, as time goes on, the indexing duration for the smaller table will become too long, requiring a rebuild of the larger table and the emptying of the smaller one.

The kill list mechanism, configured with the killlist_target directive, ensures that documents from the current table take precedence over those from the other table.
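The main+delta scenario described above could be configured roughly as follows. This is a minimal sketch, not a definitive setup. It assumes that mytable has an updated_at column and that a hypothetical helper table build_marker exists in MySQL to remember when the main table was last built. The names main_src, delta_src, main, delta, and main_delta are likewise illustrative.

```ini
# Assumed helper table in MySQL (hypothetical, not part of the original example):
#   CREATE TABLE build_marker (table_name VARCHAR(50) PRIMARY KEY, built_at DATETIME NOT NULL);

source main_src {
  type             = mysql
  sql_host         = localhost
  sql_user         = myuser
  sql_pass         = mypass
  sql_db           = mydb
  # Remember the moment this main build started
  sql_query_pre    = REPLACE INTO build_marker SET table_name = 'main', built_at = NOW()
  # The main table takes rows last changed before that moment
  sql_query        = SELECT id, title, description, category_id FROM mytable WHERE updated_at < (SELECT built_at FROM build_marker WHERE table_name = 'main')
  sql_attr_uint    = category_id
  sql_field_string = title
}

# Inherits connection settings and attribute declarations from main_src
source delta_src : main_src {
  # An empty value clears the inherited pre-query, so delta builds do not move the marker
  sql_query_pre      =
  # The delta table takes rows changed since the last main build
  sql_query          = SELECT id, title, description, category_id FROM mytable WHERE updated_at >= (SELECT built_at FROM build_marker WHERE table_name = 'main')
  # IDs whose older copies in the main table must be suppressed
  sql_query_killlist = SELECT id FROM mytable WHERE updated_at >= (SELECT built_at FROM build_marker WHERE table_name = 'main')
}

table main {
  type   = plain
  source = main_src
  path   = /path/to/main
}

table delta {
  type            = plain
  source          = delta_src
  path            = /path/to/delta
  # Apply this table's kill list to the main table
  killlist_target = main:kl
}

# Searches go to this table, which combines both parts
table main_delta {
  type  = distributed
  local = main
  local = delta
}
```

In this sketch the delta table would be rebuilt frequently and the main table only occasionally. Rebuilding main moves the marker forward, so the next delta build starts out nearly empty. Rows deleted from mytable are not covered here and would need separate handling, for example a soft-delete flag included in the kill list query.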

Plain table files structure