Configuring RocksDB for Enhanced WRITE Throughput on Multicore Servers

Introduction

Configuring RocksDB, a persistent key-value store, for optimal WRITE throughput on a multicore server infrastructure involves tuning various parameters. Below are key settings with recommended values and descriptions.

Tuning RocksDB for Enhanced WRITE Throughput

  1. Hardware Selection
    • SSDs over HDDs: SSDs offer faster write speeds and lower latency compared to HDDs.
    • High-Speed Networking: Essential for reducing data transfer bottlenecks.
  2. CPU Utilization
    • Concurrent Background Jobs (max_background_jobs):
      • Recommended Value: Start near the number of CPU cores available for background work; the default of 2 is usually too low on multicore servers.
      • Description: Controls the number of concurrent background jobs like compaction and flushing, utilizing multiple cores effectively.
    • Thread Pool Size:
      • Recommended Method: Env::SetBackgroundThreads.
      • Description: Sets the number of threads for flush and compaction, balancing between CPU usage and throughput.
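As an illustrative C++ sketch, the two CPU-side knobs above might be set as follows; the thread counts assume an 8-core server and are placeholders, not recommendations:

```cpp
#include <rocksdb/options.h>
#include <rocksdb/env.h>

// Sketch for an 8-core server; derive the numbers from your own core count.
void ConfigureBackgroundWork(rocksdb::Options& options) {
  // Total concurrent background compactions + flushes (the default is only 2).
  options.max_background_jobs = 8;

  // Size the shared Env thread pools: the LOW-priority pool serves
  // compactions, the HIGH-priority pool serves memtable flushes.
  options.env->SetBackgroundThreads(6, rocksdb::Env::LOW);
  options.env->SetBackgroundThreads(2, rocksdb::Env::HIGH);
}
```

Keeping a couple of HIGH-priority threads free for flushes prevents a burst of compactions from stalling writes while memtables fill up.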
  3. Write Buffers
    • Write Buffer Size (write_buffer_size):
      • Recommended Value: 64MB to 128MB.
      • Description: Determines the size of the memory buffer before data is flushed to disk.
    • Max Write Buffers (max_write_buffer_number):
      • Recommended Value: 3 to 4.
      • Description: Controls the number of write buffers, affecting memory usage and write throughput.
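A minimal write-buffer configuration sketch, using the recommended ranges above (the exact sizes are starting points to benchmark, not absolutes):

```cpp
#include <rocksdb/options.h>

void ConfigureWriteBuffers(rocksdb::Options& options) {
  options.write_buffer_size = 128 << 20;        // 128 MB memtable before flush
  options.max_write_buffer_number = 4;          // up to 4 memtables in memory
  options.min_write_buffer_number_to_merge = 2; // merge two memtables per flush
}
```

Note the worst-case memory cost per column family is roughly write_buffer_size x max_write_buffer_number, so these two knobs should be sized together.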
  4. Compaction
    • Compaction Style:
      • Recommended: Level-style for balanced read and space behavior; universal for write-heavy workloads, since it lowers write amplification at the cost of space amplification.
      • Description: Influences how data is organized and compacted on disk.
    • Compaction Priority (compaction_pri):
      • Recommended Value: kMinOverlappingRatio (the default in recent RocksDB versions).
      • Description: Determines the priority of compaction tasks, optimizing CPU and I/O resources.
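A sketch of the compaction choices above; the write_heavy flag is a hypothetical parameter standing in for your workload analysis:

```cpp
#include <rocksdb/options.h>
#include <rocksdb/advanced_options.h>

void ConfigureCompaction(rocksdb::Options& options, bool write_heavy) {
  if (write_heavy) {
    // Universal compaction: lower write amplification,
    // higher space amplification.
    options.compaction_style = rocksdb::kCompactionStyleUniversal;
  } else {
    options.compaction_style = rocksdb::kCompactionStyleLevel;
    // Prefer compacting files whose key range overlaps
    // the next level the least.
    options.compaction_pri = rocksdb::kMinOverlappingRatio;
  }
}
```

compaction_pri only applies to level-style compaction, which is why it is set in that branch only.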
  5. Write-Ahead Log (WAL)
    • WAL Configuration:
      • Recommended: Keep the WAL enabled; choose per write between synced writes (durable) and unsynced writes (faster, but the most recent writes can be lost if the machine crashes).
      • Description: Affects the durability and throughput of writes.
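The durability/throughput trade-off is made per write via WriteOptions, as in this sketch:

```cpp
#include <rocksdb/options.h>

// sync = true fsyncs the WAL on every write (durable but slow).
// sync = false lets the OS buffer WAL writes (fast; survives a process
// crash, but the latest writes can be lost on a machine crash).
rocksdb::WriteOptions MakeWriteOptions(bool durable) {
  rocksdb::WriteOptions wo;
  wo.sync = durable;
  return wo;
}
```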
  6. Column Families
    • Configuration: Adjust each column family based on specific workload requirements.
  7. Memory Management
    • Total Write Buffer Memory (db_write_buffer_size):
      • Recommended Value: Depends on available system memory.
      • Description: Caps the combined memtable memory across all column families; a WriteBufferManager can extend the cap across multiple databases.
    • Block Cache Size (BlockBasedTableOptions::block_cache, sized via NewLRUCache):
      • Recommended Value: 25-50% of available memory.
      • Description: Affects read performance, indirectly influencing write throughput due to read-modify-write cycles in compactions.
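A memory-sizing sketch tying the two settings together; the 2 GB and 8 GB figures are illustrative for a machine with tens of GB of RAM:

```cpp
#include <rocksdb/options.h>
#include <rocksdb/cache.h>
#include <rocksdb/table.h>

void ConfigureMemory(rocksdb::Options& options) {
  // Cap combined memtable memory across all column families.
  options.db_write_buffer_size = 2ull << 30;  // 2 GB

  // Shared LRU block cache for uncompressed data blocks.
  auto cache = rocksdb::NewLRUCache(8ull << 30);  // 8 GB
  rocksdb::BlockBasedTableOptions table_options;
  table_options.block_cache = cache;
  options.table_factory.reset(
      rocksdb::NewBlockBasedTableFactory(table_options));
}
```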
  8. Compression
    • Compression Algorithms: Snappy or LZ4.
    • Description: Reduces the bytes written to disk, which can improve write throughput, at the cost of extra CPU per operation.
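One common pattern, sketched below, is a cheap codec for the frequently rewritten upper levels and a stronger one for the bottommost level:

```cpp
#include <rocksdb/options.h>

void ConfigureCompression(rocksdb::Options& options) {
  // Cheap compression on the upper levels, which are rewritten often...
  options.compression = rocksdb::kLZ4Compression;
  // ...and a stronger codec on the rarely rewritten bottommost level.
  options.bottommost_compression = rocksdb::kZSTD;
}
```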
  9. Batch Writes
    • Utilization: Combine multiple write operations into a single atomic operation.
    • Description: Amortizes WAL-sync and locking overhead across many operations, improving overall throughput.
  10. Monitoring and Profiling
    • Approach: Regular performance monitoring to identify bottlenecks and adjust configurations.
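RocksDB ships built-in statistics that support this kind of monitoring; a sketch of enabling and reading them:

```cpp
#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/statistics.h>
#include <iostream>

// Enable built-in statistics before opening the database.
void EnableStats(rocksdb::Options& options) {
  options.statistics = rocksdb::CreateDBStatistics();
}

// Poll a human-readable summary plus a specific ticker.
void DumpStats(rocksdb::DB* db, const rocksdb::Options& options) {
  std::string stats;
  if (db->GetProperty("rocksdb.stats", &stats)) {
    std::cout << stats << "\n";
  }
  std::cout << "WAL bytes written: "
            << options.statistics->getTickerCount(rocksdb::WAL_FILE_BYTES)
            << "\n";
}
```

Statistics add some overhead, so many deployments enable them during tuning and benchmarking rather than permanently.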
  11. RocksDB Version
    • Recommendation: Use the latest version for performance improvements.
  12. Custom Environment (Env):
    • Recommendation: Customize to integrate with specific hardware or filesystem.

Conclusion

The recommended values are starting points and should be adjusted based on specific workload, hardware characteristics, and performance metrics observed during benchmarking and monitoring. It's crucial to test these settings under realistic conditions and continuously monitor the system to ensure optimal performance.

About Shiv Iyer
Open Source Database Systems Engineer with a deep understanding of Optimizer Internals, Performance Engineering, Scalability and Data SRE. Shiv currently is the Founder, Investor, Board Member and CEO of multiple Database Systems Infrastructure Operations companies in the Transaction Processing Computing and ColumnStores ecosystem. He is also a frequent speaker in open source software conferences globally.