Files
my-docs/Databases/MySQL/04-Innodb.md
2026-04-29 00:34:46 +03:30

9.1 KiB
Raw Permalink Blame History

InnoDB Storage Engine:

This document provides an in-depth explanation of the InnoDB storage engine, its on-disk structures, memory management mechanisms (buffer pool), and change buffering. The target audience is database administrators and DevOps engineers who need to understand and tune InnoDB for performance and reliability.

Table of Contents

  1. What is InnoDB?
  2. MySQL Data Directories Related to InnoDB
  3. Pages in InnoDB
  4. Index Pages
  5. Tablespaces
  6. Buffer and Buffer Pool
    • Buffer Pool Metrics
    • Configuration Example
  7. Change Buffering
    • Configuration Parameters

1. What is InnoDB?

InnoDB is a storage engine for MySQL that provides:

  • ACID compliance (Atomicity, Consistency, Isolation, Durability)
  • Row-level locking
  • Foreign key constraints
  • Crash recovery
  • Multi-version concurrency control (MVCC)

It is the default storage engine for MySQL since version 5.5. InnoDB stores data in tablespaces, which are composed of pages.


In a typical MySQL installation, several directories are used to store InnoDBrelated files. Understanding their purpose helps with backup, recovery, and capacity planning.

Directory Description
innodb_redo Contains redo log files. Redo logs record changes made to InnoDB data to ensure durability.
innodb_temp Stores temporary tablespaces used for internal temporary tables and ondisk temporary objects.
mysql The system schema that holds metadata (database names, tables, privileges, etc.).

Even though mysql is not exclusively InnoDB, many system tables now use InnoDB by default.


3. Pages in InnoDB

A page is the smallest unit of storage in InnoDB. All data (table rows, indexes, etc.) is stored in pages.

  • Default page size: 16 KB (can be configured to 4 KB, 8 KB, 32 KB, or 64 KB via innodb_page_size).
  • Structure: Each page contains a header, a trailer (checksum), and the actual data.
  • When a page is full, InnoDB allocates a new page to hold more data.

Pages are read from disk into memory (the buffer pool) and written back to disk when modified.


4. Index Page in InnoDB

An index page is a special type of page that stores index entries. InnoDB uses a Btree data structure for both primary and secondary indexes.

  • Primary key index (clustered index): The leaf pages contain the actual row data for the table. The entire table is organised as a Btree based on the primary key.
  • Secondary index: Leaf pages contain the indexed column value and the primary key value (which is used to look up the full row in the clustered index).

Index pages are also 16 KB by default. Each index page contains pointers to child pages (for nonleaf levels) or row pointers (for leaf levels).


5. Tablespace

A tablespace is a logical storage container that holds InnoDB data. There are several types of tablespaces:

Tablespace Type Description
System tablespace Contains the data dictionary, doublewrite buffer, change buffer, and undo logs.
Filepertable tablespace Each table has its own .ibd file (controlled by innodb_file_per_table).
General tablespaces Usercreated tablespaces that can hold multiple tables.
Undo tablespace Stores undo logs for MVCC and transaction rollback.
Temporary tablespace Stores temporary tables created during queries or sessions (nonpersistent).

Each tablespace is divided into pages. The system tablespace (usually ibdata1) starts at 12 MB and grows as needed.


6. Buffer and Buffer Pool

What is a Buffer?

A buffer is a memory area that temporarily holds data read from disk to reduce the number of direct disk I/O operations. In InnoDB, the main buffer is called the buffer pool.

Buffer Pool

When a query requests data, InnoDB first checks whether the required pages are already present in the buffer pool:

  • If yes (cache hit): The data is returned directly from memory (extremely fast).
  • If no (cache miss): InnoDB reads the relevant pages from disk into the buffer pool, then serves the data from memory.

A common best practice is to set the buffer pool size to approximately 75% of the available system memory on a dedicated database server. For shared servers, reduce the percentage accordingly.

Configuration Example

In MySQL configuration file (my.cnf or my.ini):

[mysqld]
innodb_buffer_pool_size = 1G

Alternatively, change it dynamically at runtime (MySQL 8.0+):

SET PERSIST innodb_buffer_pool_size = 1073741824;   -- value in bytes

Buffer Pool Metrics

These status variables help monitor buffer pool efficiency. Query them with:

SHOW GLOBAL STATUS LIKE 'innodb_buffer_pool%';
Metric Description
Innodb_buffer_pool_reads Number of times InnoDB had to read a page from disk because it was not available in the buffer pool. High values indicate a shortage of buffer pool memory.
Innodb_buffer_pool_read_requests Total number of logical read requests (page accesses) made to the buffer pool.
Innodb_buffer_pool_wait_free Count of times a thread had to wait for a clean page to become available. Nonzero values suggest the buffer pool is under pressure (e.g., dirty page flushing is slow).
Innodb_buffer_pool_pages_free Number of free pages currently in the buffer pool. Low values mean the buffer pool is nearly full.

Interpreting Metrics

  • Cache hit ratio = (Innodb_buffer_pool_read_requests - Innodb_buffer_pool_reads) / Innodb_buffer_pool_read_requests. Aim for >99%.
  • If Innodb_buffer_pool_wait_free keeps increasing, consider increasing the buffer pool size or tuning flushing behaviour (innodb_io_capacity, innodb_max_dirty_pages_pct).
  • Low Innodb_buffer_pool_pages_free alone is not a problem; it just shows the buffer pool is actively used.

7. Change Buffering

Change buffering is a feature that delays writing changes to secondary index pages. Instead of immediately updating the index pages on disk when a nonunique secondary index is modified, InnoDB records the change in a special area called the change buffer (which is part of the system tablespace). Later, when the index pages are read into the buffer pool by other queries, the buffered changes are merged (applied) to the pages.

This reduces random disk I/O and improves performance for workloads with many Data Manipulation Language (DML) operations (INSERT, UPDATE, DELETE) that affect secondary indexes.

Configuration Parameters

Both parameters are set in the MySQL configuration file.

innodb_change_buffering

Controls which operations are buffered. Possible values:

Value Description
none Do not buffer any changes.
inserts Buffer only insert operations.
deletes Buffer only delete operations (including purge operations).
changes Buffer inserts and deletemarking operations (but not actual purges).
purges Buffer only the physical deletion of rows that occur during background purge.
all Buffer inserts, deletemarking, and purges (default value).

Example configuration:

[mysqld]
innodb_change_buffering = all

innodb_change_buffer_max_size

Specifies the maximum size of the change buffer as a percentage of the total buffer pool size. The default is 25 (meaning 25% of the buffer pool). Valid range is 0 to 50.

Increasing this value allows more space for buffered changes, which can help workloads with heavy DML on secondary indexes, but it reduces the space available for cached data pages.

Example:

[mysqld]
innodb_change_buffer_max_size = 30

When to Tune Change Buffering

  • Writeheavy OLTP: Keep innodb_change_buffering = all and possibly increase innodb_change_buffer_max_size to 3040.
  • Readonly or mostly reads: Set innodb_change_buffering = none to avoid wasting buffer pool memory.
  • Unique indexes: Change buffering does not apply to unique secondary indexes because uniqueness checks require immediate disk access.