Troubleshooting InnoDB History Length with Hung MySQL Transaction
In this post we have explained in detail how a hung transaction had lead InnoDB history length to grow uncontrolled and affected MySQL performance negatively. Sometimes we get emergency support calls in MinervaDB that SELECT queries are running unacceptably slower and slower, Eventually customer should restart the MySQL server again and that’s not a solution for troubleshooting MySQL performance, The symptoms of this ticket were so misleading first (we even spend initially sometime on query performance analysis and index operation forensics to reach nowhere) and finally we got some clarity on the output of the query “show engine innodb status”, We have copied the same below:
1 |
Trx read view will not see trx with id >= 75163496241, sees < 53179273519 ---TRANSACTION 75392513195, ACTIVE 915248 sec 5 lock struct(s), heap size 81751, 3 row lock(s), undo log entries 2 ... |
Simulation
We decided to simulate the entire issue with sysbench (used Sysbench script from Percona for this )
1 |
conn=" --db-driver=mysql --mysql-host=localhost --mysql-user=user --mysql-password=password --mysql-db=sbtest " sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-table-engine=InnoDB --oltp-table-size=20000000 $conn prepare sysbench --num-threads=64 --max-requests=0 --max-time=240 --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --oltp-table-size=20000000$conn --oltp-test-mode=complex --oltp-point-selects=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-index-updates=1 --oltp-non-index-updates=0 run |
The above output led us to check InnoDB_history_list_length:
1 2 3 4 5 6 7 |
mdbperflab> show status like 'Innodb_history_list_%'; +----------------------------+-------+ | Variable_name | Value | +----------------------------+-------+ | Innodb_history_list_length | 7625 | +----------------------------+-------+ 1 row in set (0.00 sec) |
When innodb transaction history grows continuously, The SELECTs need to scan more and more for the previous version of the rows, This leads to a major performance bottleneck. But, Why InnoDB_history_list_length is growing continuously ? There are some 1278 transactions in this state and active for 915248 sec. The output of MySQL processlist showed they were in “Sleep” state, Which clearly says those transaction were either “lost” or “hung”. We could also see each of these transactions were holding two lock structures and one undo record, They were not committed and not rolled-back. These transactions were doing absolutely nothing. This happens due to the default transaction isolation level implementation REPEATABLE-READ in the InnoDB. InnoDB is an MVCC storage engine, which means you can start a transaction and continue to see a consistent snapshot even as the data changes. This is implemented by keeping old versions of rows as they are modified. So InnoDB history list is the undo logs which are used to store these modifications. They are a fundamental part of InnoDB’s transaction processing architecture.
How do we measure the the InnoDB History List or Do we have some units for that ?
What does it really means if InnoDB history list length is 7625 ?
This topic is super confusing because there are several claims made on units of InnoDB history list length:
- unpurged old row versions
- unpurged transactions in the undo space
- modifications to the database
- undo segments
- unflushed segments in the undo log
- undo pages, or number of pages in the undo log
- undo logs
So what is it, really?
Let’s go to the source:
- Here’s where it’s printed, from trx_sys->rseg_history_len
- Which is defined here
- And modified in places like this
- Using a parameter to a function named trx_purge_add_update_undo_to_history that’s only called from this location
- Which is called from a couple of places, such as this
Actually this will encourage you to learn more about InnoDb internals.
The best name for the unit is undo logs, which are “update undo logs for committed transactions”, to quote the source. But an “undo log” is a special term with specific meaning in InnoDB. An undo log is a single set of atomic changes a transaction makes, which may actually modify multiple records.
Troubleshooting InnoDB transaction history issue
We have copied below InnoDB transaction isolation levels and how they impact on InnoDB History Length:
MySQL Version | Transaction isolation | InnoDB History Length |
---|---|---|
MySQL 5.6 | repeatable read | InnoDB history is not purged until “hung” transactions are completed |
MySQL 5.6 | repeatable read | InnoDB history is purged (this solution worked for most of our customers) |
Aurora | repeatable read | InnoDB history is not purged until “hung” transactions are completed |
Aurora | repeatable read | InnoDB history is not purged until “hung” transactions are completed |
In the InnoDB transaction isolation level READ COMMITTED, InnoDB does not need to maintain history length when other transactions have committed changes, This setting works with troubleshooting InnoDB History Length in MySQL 5.6 and later. This recommendation (READ COMMITTED isolation level) doesn’t work with Amazon Aurora, the history length still continues to grow.
Forensics on the hung transactions and list the queries
There are several methods to audit performance bottleneck on the transactions and queries related to the same in MySQL:
- MySQL slow query log
- Percona Server Audit Log Plugin (If you are using Percona Server for MySQL)
- MySQL General Query Log
- Percona Monitoring and Management
- Performance Schema
Finding queries from the hung transactions with Performance Schema
To identify the hung queries in MySQL with Performance Schema, Follow the steps below:
- Enable Performance Schema
- Enable events_statement_history
1 2 3 |
mysql> update performance_schema.setup_consumers set ENABLED = 'YES' where NAME='events_statements_history'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 |
- To find all transactions started in 60 seconds ago, Please run the query below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
SELECT ps.id as processlist_id, trx_started, trx_isolation_level, esh.EVENT_ID, esh.TIMER_WAIT, esh.event_name as EVENT_NAME, esh.sql_text as SQL, esh.RETURNED_SQLSTATE, esh.MYSQL_ERRNO, esh.MESSAGE_TEXT, esh.ERRORS, esh.WARNINGS FROM information_schema.innodb_trx trx JOIN information_schema.processlist ps ON trx.trx_mysql_thread_id = ps.id LEFT JOIN performance_schema.threads th ON th.processlist_id = trx.trx_mysql_thread_id LEFT JOIN performance_schema.events_statements_history esh ON esh.thread_id = th.thread_id WHERE trx.trx_started < CURRENT_TIME - INTERVAL 60 SECOND AND ps.USER != 'SYSTEM_USER' ORDER BY esh.EVENT_ID\G *************************** 1. row *************************** processlist_id: 5420618 trx_started: 2016-05-18 18:41:30 trx_isolation_level: READ COMMITTED EVENT_ID: 1 TIMER_WAIT: 66057000 EVENT_NAME: statement/sql/select SSQL: select @@version_comment limit 1 RETURNED_SQLSTATE: NULL MYSQL_ERRNO: 0 MESSAGE_TEXT: NULL ERRORS: 0 WARNINGS: 0 *************************** 2. row *************************** processlist_id: 5420618 trx_started: 2016-05-18 18:41:56 trx_isolation_level: READ COMMITTED EVENT_ID: 2 TIMER_WAIT: 135117000 EVENT_NAME: statement/sql/show_processlist SSQL: show processlist RETURNED_SQLSTATE: NULL MYSQL_ERRNO: 0 MESSAGE_TEXT: NULL ERRORS: 0 WARNINGS: 0 2 rows in set (0.01 sec) |
Note: Please change the number of seconds to match your workload.
Conclusions
The hung transactions in InnoDB will lead to uncontrolled growth of InnoDB History Length, This will directly impact SELECT query performance in MySQL negatively. We have addressed this problem for several customers by changing the TRANSACTION ISOLATION LEVEL to READ-COMMITTED. Technically, InnoDB History Length are the undo logs which can be used for recreating the history for MVCC purposes or can be rolled back. Let us know your comments, Thank you !
References:
- https://planet.mysql.com/
- InnoDB Transaction Isolation Levels – https://minervadb.com/index.php/2018/02/21/innodb-transaction-isolation-level/
- https://www.xaprb.com
- http://blog.jcole.us/2014/04/16/the-basics-of-the-innodb-undo-logging-and-history-system/
- http://vividcortex.com/
- https://www.percona.com/blog/2014/10/17/innodb-transaction-history-often-hides-dangerous-debt/
☛ MinervaDB contacts – Sales & General Inquiries
Business Function | Contact |
---|---|
☎ CONTACT GLOBAL SALES (24*7) | 📞 (844) 588-7287 (USA) 📞 (415) 212-6625 (USA) 📞 (778) 770-5251 (Canada) |
☎ TOLL FREE PHONE (24*7) | 📞 (844) 588-7287 |
🚩 MINERVADB FAX | +1 (209) 314-2364 |
📨 MinervaDB Email - General / Sales / Consulting | contact@minervadb.com |
📨 MinervaDB Email - Support | support@minervadb.com |
📨 MinervaDB Email -Remote DBA | remotedba@minervadb.com |
📨 Shiv Iyer Email - Founder and Principal | shiv@minervadb.com |
🏠 CORPORATE ADDRESS: CALIFORNIA | MinervaDB Inc. 440 N BARRANCA AVE #9718 COVINA, CA 91723 |
🏠 CORPORATE ADDRESS: DELAWARE | MinervaDB Inc., PO Box 2093 PHILADELPHIA PIKE #3339 CLAYMONT, DE 19703 |
🏠 CORPORATE ADDRESS: HOUSTON | MinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston, TX, 77043, US |