What is declarative partitioning in MariaDB? How declarative partitioning benefit MariaDB performance? Monitoring MariaDB partitioning and disk I/O operations

Declarative partitioning in MariaDB is a feature that allows you to divide a table into smaller, more manageable partitions based on a partitioning key. This can help improve query performance by allowing the database to quickly filter rows based on the partitioning key, rather than scanning the entire table.

Declarative partitioning in MariaDB benefits performance in several ways:

  1. Reduced disk I/O: With partitioning, the database can quickly skip over partitions that don’t contain the data being searched for, reducing the number of disk reads and writes required.
  2. Improved query performance: By filtering out partitions that don’t contain relevant data, the database can perform queries more quickly.
  3. Easier data management: Partitioning can make it easier to manage large datasets, as you can easily add and remove partitions as needed.

Declarative partitioning is a feature introduced in MariaDB 10.3 that allows you to create and manage partitions in a more simplified way, using SQL statements rather than writing custom code. Here are some examples of how to use declarative partitioning in MariaDB:

Range partitioning

MariaDB Range Partitioning is a partitioning method that allows you to distribute data across multiple partitions based on a range of values in a column. Each partition contains a range of values for that column, and each row is assigned to the partition whose range it falls within.

Here’s an example of how to create a range-partitioned table in MariaDB:

CREATE TABLE my_table (
id INT NOT NULL,
name VARCHAR(50),
age INT,
PRIMARY KEY (id)
)
PARTITION BY RANGE(age)
(
PARTITION p1 VALUES LESS THAN (20),
PARTITION p2 VALUES LESS THAN (30),
PARTITION p3 VALUES LESS THAN (40),
PARTITION p4 VALUES LESS THAN MAXVALUE);

In this example, we’re creating a table called my_table with three columns (id, name, and age). The age column is our partitioning key, and we’re using the RANGE partitioning method to partition the table into 4 partitions, each containing a range of age values.

When a new row is inserted into my_table, MariaDB will compare the value of the age column with the partitioning ranges and store the row in the partition whose range it falls within. For example, if the age value is 25, MariaDB will store the row in partition p2.

You can query a range-partitioned table like any other table in MariaDB. However, because the data is distributed across multiple partitions, you may need to use the PARTITION keyword to specify which partition to query. For example:

SELECT * FROM my_table PARTITION (p2);

This query selects all rows from the my_table table that are stored in partition p2.

Range partitioning is useful when you want to distribute data across partitions based on a specific column value range. However, it can result in uneven data distribution if the data is not evenly distributed across the ranges. In such cases, hash partitioning or list partitioning may be a better choice.

List partitioning

MariaDB List Partitioning is a partitioning method that allows you to group data into partitions based on specific values of a column. Each partition contains a list of values for that column, and each row is assigned to the partition whose value list it matches.

Here’s an example of how to create a list-partitioned table in MariaDB:

CREATE TABLE my_table (
id INT NOT NULL,
name VARCHAR(50),
age INT,
PRIMARY KEY (id)
)
PARTITION BY LIST(age)
(
PARTITION p1 VALUES IN (18, 19, 20),
PARTITION p2 VALUES IN (21, 22, 23),
PARTITION p3 VALUES IN (24, 25, 26),
PARTITION p4 VALUES IN (27, 28, 29),
PARTITION p5 VALUES IN (30, 31, 32));

In this example, we’re creating a table called my_table with three columns (id, name, and age). The age column is our partitioning key, and we’re using the LIST partitioning method to partition the table into 5 partitions, each containing a list of specific age values.

When a new row is inserted into my_table, MariaDB will compare the value of the age column with the value lists in each partition and store the row in the partition whose value list it matches. For example, if the age value is 25, MariaDB will store the row in partition p3.

You can query a list-partitioned table like any other table in MariaDB. However, because the data is distributed across multiple partitions, you may need to use the PARTITION keyword to specify which partition to query. For example:

SELECT * FROM my_table PARTITION (p3);

This query selects all rows from the my_table table that are stored in partition p3.

List partitioning is useful when you want to group data into specific categories based on a specific column value. However, it can result in an excessive number of partitions if there are too many possible values for the partitioning column. In such cases, range partitioning or hash partitioning may be a better choice.

Hash partitioning

MariaDB Hash Partitioning is a partitioning method that allows you to distribute data across multiple partitions based on a hash function. The hash function takes a partitioning key as input and returns a partition number, which determines which partition the row should be stored in.

Here’s an example of how to create a hash-partitioned table in MariaDB:

CREATE TABLE my_table (
id INT NOT NULL,
name VARCHAR(50),
age INT,
PRIMARY KEY (id)
)
PARTITION BY HASH(id)
PARTITIONS 4;

In this example, we’re creating a table called my_table with three columns (id, name, and age). The id column is our partitioning key, and we’re using the HASH partitioning method with 4 partitions.

When a new row is inserted into my_table, MariaDB will compute the hash of the id value and use the result to determine which partition the row should be stored in. For example, if the hash of id is 123456 and we have 4 partitions, MariaDB might store the row in partition number 2.

This script will show you if a table is partitioned, list all partitions for the table, show the number of rows in each partition, and display disk I/O statistics for the partitioned table. You can use this information to optimize your partitioning strategy and improve query performance.

You can query a hash-partitioned table like any other table in MariaDB. However, because the data is distributed across multiple partitions, you may need to use the PARTITION keyword to specify which partition to query. For example:

SELECT * FROM my_table WHERE id = 1234 PARTITION (p2);

This query selects all rows from the my_table table where id is 1234, but it only looks in partition p2.

Hash partitioning is useful when you want to evenly distribute data across multiple partitions without any specific ordering. However, it can result in uneven data distribution if the partitioning key is not evenly distributed. In such cases, range partitioning or list partitioning may be a better choice.

How declarative partitioning benefits in MariaDB performance?

Declarative partitioning is a feature in MariaDB that allows you to split a large table into smaller, more manageable pieces called partitions. Each partition acts as a separate table, but together they form a logical whole. Declarative partitioning can provide several benefits in terms of performance, including:

  1. Improved query performance: With declarative partitioning, queries can be targeted to specific partitions, allowing them to be processed more quickly. For example, if you have a table that is partitioned by date, and you only need data from a specific date range, MariaDB can skip scanning the partitions that do not contain the relevant data, resulting in faster query performance.
  2. Faster data loading: When loading large amounts of data into a partitioned table, the data can be distributed across multiple partitions, allowing for faster loading times.
  3. Better management of large tables: Declarative partitioning allows you to manage large tables more easily, by breaking them down into smaller, more manageable pieces. This can make it easier to perform maintenance tasks, such as backing up or restoring data.
  4. Improved scalability: Partitioning can improve the scalability of your database, by allowing you to distribute data across multiple servers or disks.

Overall, declarative partitioning can be a powerful tool for improving the performance and manageability of large MariaDB databases. However, it is important to design your partitioning strategy carefully, taking into account the specific needs of your application and data.

Monitoring MariaDB Disk I/O Operations

— Show the partitioning configuration for a specific table
SHOW CREATE TABLE table_name;
— Show the partitions used by a specific query
EXPLAIN PARTITIONS SELECT * FROM table_name WHERE partition_key = ‘partition_value’;
— Show partitioning information for all partitioned tables
SELECT table_schema, table_name, partition_expression, partition_description
FROM information_schema.partitions
WHERE partition_method = ‘RANGE’;
— Show disk I/O statistics for the MariaDB server
SHOW GLOBAL STATUS LIKE ‘Innodb_data_read%’;
SHOW GLOBAL STATUS LIKE ‘Innodb_data_written%’;
SHOW GLOBAL STATUS LIKE ‘Innodb_buffer_pool_read%’;
SHOW GLOBAL STATUS LIKE ‘Innodb_buffer_pool_write%’;
— Show operating system level disk I/O statistics
SELECT * FROM INFORMATION_SCHEMA.SYSTEM_IO_GLOBAL_BY_WAIT_BY_BYTES;

This script includes commands to show the partitioning configuration, partitioning information for all partitioned tables, and partition usage by a specific query. It also includes commands to show various disk I/O statistics for both MariaDB and the operating system.

You can run this script periodically and analyze the output to identify any potential issues or bottlenecks in your database. You can also modify the script to include additional monitoring commands as needed.

About Shiv Iyer 157 Articles
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.