Horizontally Scaling MariaDB

Understanding MariaDB Horizontal Scaling



Horizontal scaling (scale-out) involves distributing database workload across multiple servers rather than upgrading a single server’s hardware. MariaDB offers several approaches to achieve horizontal scalability.

Core Infrastructure Requirements

1. Hardware and Network Infrastructure

Server Requirements:

  • Multiple commodity servers with consistent specifications
  • High-speed network connectivity (10GbE or higher recommended)
  • Low-latency network between nodes (< 1ms for synchronous replication)
  • Redundant network paths for fault tolerance
  • Shared or distributed storage depending on the scaling approach

Storage Considerations:

  • SSD storage for better I/O performance
  • Network-attached storage (NAS) or Storage Area Network (SAN) for shared storage scenarios
  • Local storage with replication for distributed architectures

2. MariaDB Clustering Technologies

MariaDB Galera Cluster:

-- Galera cluster configuration
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_address = "gcomm://node1,node2,node3"
wsrep_node_address = "192.168.1.10"
wsrep_node_name = "node1"
wsrep_sst_method = rsync

MariaDB MaxScale (Database Proxy):

  • Load balancing across multiple database servers
  • Read/write splitting
  • Connection pooling and multiplexing
  • Automatic failover capabilities

3. Replication Architecture

Master-Slave Replication:

-- Master configuration
server-id = 1
log-bin = mysql-bin
binlog-format = ROW

-- Slave configuration
server-id = 2
relay-log = relay-bin
read-only = 1

Master-Master Replication:

-- Node 1 configuration
server-id = 1
auto-increment-increment = 2
auto-increment-offset = 1
log-bin = mysql-bin

-- Node 2 configuration
server-id = 2
auto-increment-increment = 2
auto-increment-offset = 2
log-bin = mysql-bin

Scaling Strategies and Requirements

4. Read Scaling (Read Replicas)

Implementation Requirements:

  • Multiple read-only replicas for distributing read workload
  • Application-level read/write splitting or proxy-based routing
  • Monitoring replication lag to ensure data consistency
-- Create read-only user for replicas
CREATE USER 'readonly'@'%' IDENTIFIED BY 'password';
GRANT SELECT ON *.* TO 'readonly'@'%';

5. Write Scaling (Sharding)

Horizontal Partitioning Requirements:

  • Sharding key selection based on data distribution patterns
  • Application-level sharding logic or middleware
  • Cross-shard query handling mechanisms

Spider Storage Engine:

-- Create Spider table for sharding
CREATE TABLE users (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100)
) ENGINE=Spider
PARTITION BY HASH(id) PARTITIONS 4
COMMENT='wrapper "mysql", table "users"'
CONNECTION='server "shard1", server "shard2", server "shard3", server "shard4"';

6. High Availability Components

Load Balancers:

  • HAProxy or NGINX for connection distribution
  • Health checks and automatic failover
  • SSL termination and connection pooling

Monitoring and Management:

  • MariaDB Enterprise Monitor or open-source alternatives
  • Automated backup and recovery systems
  • Performance monitoring and alerting

Implementation Architecture Patterns

7. Multi-Master Setup (Galera Cluster)

Requirements:

  • Minimum 3 nodes for quorum-based decisions
  • Synchronous replication for data consistency
  • Application awareness of multi-master capabilities
# Galera cluster bootstrap
systemctl start mariadb --wsrep-new-cluster

# Join additional nodes
systemctl start mariadb

8. Federated Architecture

MariaDB Federation Requirements:

  • FederatedX storage engine configuration
  • Network connectivity between federated servers
  • Schema synchronization across federated instances
-- Create federated table
CREATE TABLE federated_users (
    id INT,
    name VARCHAR(100)
) ENGINE=FederatedX
CONNECTION='mysql://user:password@remote_host:3306/database/users';

Operational Requirements

9. Backup and Recovery Strategy

Distributed Backup Requirements:

  • Consistent backup across multiple nodes
  • Point-in-time recovery capabilities
  • Cross-datacenter backup replication
# Galera cluster backup
mariabackup --backup --target-dir=/backup/full --user=backup --password=password

10. Security and Access Control

Multi-Node Security:

  • SSL/TLS encryption for inter-node communication
  • Firewall configuration for cluster communication
  • Centralized user management across nodes
-- SSL configuration for Galera
wsrep_provider_options = "socket.ssl_key=/path/to/key.pem;socket.ssl_cert=/path/to/cert.pem"

Performance and Monitoring Requirements

11. Connection Management

Connection Pooling:

  • MaxScale connection pooling or application-level pooling
  • Connection limits per node
  • Connection routing based on query type

12. Query Optimization

Distributed Query Requirements:

  • Query routing based on data location
  • Cross-shard join optimization
  • Caching strategies for frequently accessed data

Best Practices for Implementation

13. Capacity Planning

  • Baseline performance measurement before scaling
  • Growth projection and scaling triggers
  • Resource allocation per node and workload type

14. Testing and Validation

  • Load testing across scaled infrastructure
  • Failover testing and disaster recovery drills
  • Performance benchmarking at each scaling level

15. Application Design Considerations

  • Database-agnostic application design
  • Graceful degradation during node failures
  • Eventual consistency handling in distributed scenarios

Conclusion

Implementing horizontally scalable MariaDB infrastructure requires careful planning of hardware resources, network architecture, replication strategies, and operational procedures. The choice between Galera clustering, traditional replication, or sharding depends on specific workload requirements, consistency needs, and operational complexity tolerance.

Key success factors include:

  • Proper network infrastructure with low latency and high bandwidth
  • Appropriate clustering technology selection based on use case
  • Comprehensive monitoring and automated management tools
  • Application-level awareness of the distributed architecture
  • Robust backup and recovery strategies for distributed data

Regular testing, monitoring, and capacity planning ensure the scaled infrastructure meets performance and availability requirements as the system grows.



Understanding InnoDB Cluster: Internal Mechanics for MySQL Horizontal Scaling

 

Expert Guide to MySQL Performance Troubleshooting: Best Practices and Optimization Techniques

 

Troubleshooting Galera Cluster: Tips and Tricks for a Healthy Database Replication Environment

About MinervaDB Corporation 102 Articles
Full-stack Database Infrastructure Architecture, Engineering and Operations Consultative Support(24*7) Provider for PostgreSQL, MySQL, MariaDB, MongoDB, ClickHouse, Trino, SQL Server, Cassandra, CockroachDB, Yugabyte, Couchbase, Redis, Valkey, NoSQL, NewSQL, Databricks, Amazon Resdhift, Amazon Aurora, CloudSQL, Snowflake and AzureSQL with core expertize in Performance, Scalability, High Availability, Database Reliability Engineering, Database Upgrades/Migration, and Data Security.