Choosing Aria storage engine as a crash-safe replacement for MyISAM ( Cross-post from Roger Eisentrager, MariaDB Corporation)
MariaDB Storage Engines
Current storage engine recommendations based on workload:
- Read-heavy workloads: Aria
- General purpose: InnoDB
- ACID: InnoDB
- Write-heavy workloads: MyRocks
- Compression: MyRocks
- Sharded: Spider
- Analytical workloads: MariaDB ColumnStore
Why Use Aria for MariaDB
While InnoDB is the default storage engine for databases created with MariaDB, Aria is used for most of the internal system tables from MariaDB Enterprise Server 10.4. Aria has a small footprint and allows for easy copying between systems and is particularly suited for read-heavy workloads. Aria has an advanced page-based storage format that allows for superior caching performance and is optimized for concurrency.
In real-world environments, Aria produces better results for aggregated constructs (such as GROUP BY or ORDER BY) that are prevalent in analytics and transaction processing. Analytics and transactional processing find simple things like totals, maximums, and statistics. Most applications heavily rely on queries that use these aggregate functions.
MariaDB uses the Aria storage engine for all internal on-disk temporary tables. Since these tables are often created internally to execute GROUP BY and DISTINCT queries, these types of queries can benefit from Aria’s performance, even if the queried table uses a different engine.
The Aria storage engine is compiled-in by default in MariaDB. Aria is included with MariaDB Server (Enterprise and Community) with no additional installation or setup.
Aria Considerations
How best to use Aria will depend on the use case, table design, access patterns (SQL queries), performance targets, and recovery goals.
Advantages of Aria | Disadvantages of Aria |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Consider using Aria:
- If your application does a lot of FULL-TEXT searches.
- If your application does a lot of GROUP BY queries.
- If your application does not need ACID compliance or foreign keys.
- If your application currently uses MyISAM and requires a more advanced engine to allow better speed, backups, and automated crash recovery.
Performance Comparisons
It’s one thing to compare isolated queries and quite another to look at real-world examples. We did some tests comparing Aria, InnoDB, and MyISAM.
Performance Results
InnoDB | Aria | MyISAM | |
SQL 1 | 2.389 | 0.580 | 0.634 |
SQL 2 | 2.169 | 0.530 | 0.598 |
Simple Benchmark: Aria vs MyISAM vs InnoDB
In this simple benchmark, a table was created and populated with 2 million rows. GROUP BY and ORDER BY queries were done and copied into all three alternative storage engines: InnoDB, Aria, and MyISAM. Each test ran 10 times, with the average time (seconds) recorded for all tests.
Setup and commands used for testing:
1 2 3 4 5 |
INSERT INTO fact VALUES (1,1,'Justin',md5(''), .1); INSERT INTO fact SELECT FLOOR(1+ rand()*9999), FLOOR(1 + rand()*499), (select name from names where id = 1 + rand() * 4), MD5(1+rand()*9999), rand() FROM fact; |
Keep running the above insert until about 2 million rows are in the table.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
MariaDB [test2]> SELECT count(*) from fact; +----------+ | count(*) | +----------+ | 2097152 | +----------+ MariaDB [test2]> SHOW CREATE TABLE fact; +-------+------------------------------- | Table | Create Table +-------+------------------------------- | fact | CREATE TABLE `fact` ( `dim1` int(11) DEFAULT NULL, `dim2` int(11) DEFAULT NULL, `name` varchar(20) DEFAULT NULL, `hash` varchar(32) DEFAULT NULL, `measure1` double DEFAULT NULL) |
ENGINE=InnoDB
1 |
DEFAULT CHARSET=utf8 | |
Confirm that a good ratio of distinct vs total rows are present:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
MariaDB [test2]> SELECT count(distinct (dim1)) from fact; +------------------------+ | count(distinct (dim1)) | +------------------------+ | 9999 | +------------------------+ MariaDB [test2]> SELECT count(distinct (dim2)) from fact; +------------------------+ | count(distinct (dim2)) | +------------------------+ | 499 | +------------------------+ MariaDB [test2]> SELECT count(distinct (name)) from fact; +------------------------+ | count(distinct (name)) | +------------------------+ | 1 | +------------------------+ |
Aria
1 2 3 4 5 6 |
CREATE TABLE `test2`.`fact2` ( `dim1` INT(11) NULL DEFAULT NULL, `dim2` INT(11) NULL DEFAULT NULL, `name` VARCHAR(20) NULL DEFAULT NULL, `hash` VARCHAR(32) NULL DEFAULT NULL, `measure1` DOUBLE NULL DEFAULT NULL ) |
ENGINE=ARIA
1 2 3 |
TRANSACTIONAL=1; INSERT INTO `test2`.`fact2` (`dim1`, `dim2`, `name`, `hash`, `measure1`) SELECT `dim1`, `dim2`, `name`, `hash`, `measure1` FROM `fact`; |
MyISAM
1 2 3 4 5 6 7 |
CREATE TABLE `test2`.`fact3` ( `dim1` INT(11) NULL DEFAULT NULL, `dim2` INT(11) NULL DEFAULT NULL, `name` VARCHAR(20) NULL DEFAULT NULL, `hash` VARCHAR(32) NULL DEFAULT NULL, `measure1` DOUBLE NULL DEFAULT NULL ) COLLATE='latin1_swedish_ci' |
ENGINE=MyISAM
1 2 3 |
; INSERT INTO `test2`.`fact3` (`dim1`, `dim2`, `name`, `hash`, `measure1`) SELECT `dim1`, `dim2`, `name`, `hash`, `measure1` FROM `fact`; |
Test 2 different sqls across InnoDB, Aria, and MyISAM:
— sql 1:
1 2 |
SELECT dim1, dim2 from fact group by dim1 order by dim1; -- 9999 rows in set |
— sql 2:
1 2 |
SELECT dim1, dim2 from fact group by dim2 order by dim2; -- 499 rows in set |
Benchmark Recap
Testing shows Aria is much faster at the GROUP BY level than either InnoDB or MyISAM. Aria is slightly faster than MyISAM and ensures a crash-safe environment. The performance tests revealed Aria is four times faster than InnoDB. Because of the differences in use cases, each one should undergo testing in the MariaDB Enterprise Server using both Aria and InnoDB (or another, depending on need).
Summary
Depending on your workload, Aria could provide performance advantages. Because it is included with MariaDB Server and can be applied per database or per table (ENGINE=ARIA), you can test performance for your specific workload with no additional cost and very little additional effort–just plug it in and give it a try.
For more on storage engines and storage architecture, visit workload-optimized storage engines.