MongoDB Indexing Strategy and Execution Plans: Mastering Query Performance Optimization



Introduction

Mastering MongoDB Indexing is crucial for developers looking to enhance their database efficiency and speed.

MongoDB’s query performance hinges on effective indexing strategies and understanding how the query planner operates. While MongoDB’s query planner is sophisticated, it’s not infallible. Database administrators and developers who master indexing concepts, execution plan analysis, and query planner behavior can dramatically improve application performance and avoid common pitfalls that lead to slow queries and inefficient resource utilization.

Understanding MongoDB’s Query Planner Architecture

How the Query Planner Works

MongoDB’s query planner is a sophisticated component that evaluates multiple execution strategies for each query. When a query is executed, the planner:

  1. Analyzes Available Indexes: Examines all indexes that could potentially satisfy the query
  2. Generates Candidate Plans: Creates multiple execution plans using different index combinations
  3. Executes Trial Runs: Performs trial executions of candidate plans
  4. Selects Optimal Plan: Chooses the plan with the best performance characteristics
  5. Caches the Plan: Stores the winning plan for future use

Query Plan Caching Mechanism

The query planner caches execution plans to avoid repeated plan selection overhead:

// View cached plans for a collection
db.collection.getPlanCache().list()

// Clear plan cache
db.collection.getPlanCache().clear()

// Clear specific cached plan
db.collection.getPlanCache().clearPlansByQuery({field: "value"})

Plan Cache Invalidation

Cached plans are invalidated when:

  • Indexes are created or dropped
  • A cached plan proves far more expensive on a later run than when it was cached, which triggers replanning
  • Server restarts occur
  • Manual cache clearing is performed

Index Fundamentals and Types

Single Field Indexes

Single field indexes are the foundation of MongoDB indexing:

// Create ascending index
db.users.createIndex({email: 1})

// Create descending index
db.users.createIndex({createdAt: -1})

// Create text index for full-text search (a separate index type; this one spans two fields)
db.articles.createIndex({title: "text", content: "text"})

Compound Indexes

Compound indexes support queries on multiple fields and follow the ESR (Equality, Sort, Range) rule:

// Optimal compound index following ESR rule
db.orders.createIndex({
    status: 1,        // Equality
    createdAt: -1,    // Sort
    amount: 1         // Range
})

// Query that benefits from the above index
db.orders.find({status: "completed", amount: {$gte: 100}})
         .sort({createdAt: -1})

Multikey Indexes

MongoDB automatically creates multikey indexes when indexing array fields:

// Document with array field
db.products.insertOne({
    name: "Laptop",
    tags: ["electronics", "computers", "portable"],
    categories: ["tech", "office"]
})

// Index on array field becomes multikey
db.products.createIndex({tags: 1})
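
To make the multikey behavior concrete, here is a minimal sketch in plain JavaScript of the idea that an array field contributes one index entry per element. It is an illustration only, not MongoDB's key-generation code; the helper name multikeyEntries is hypothetical, and empty or nested arrays are deliberately ignored.

```javascript
// Hypothetical helper: lists the index keys that an index on `field`
// would hold for a single document (simplified illustration).
function multikeyEntries(doc, field) {
    const value = doc[field]
    if (Array.isArray(value)) {
        // Array field: one index entry per array element
        return value.slice()
    }
    // Scalar field: a single index entry
    return [value]
}

const laptop = {
    name: "Laptop",
    tags: ["electronics", "computers", "portable"],
    categories: ["tech", "office"]
}

// Three entries point at the same document, one per tag
console.log(multikeyEntries(laptop, "tags"))
// One entry for the scalar field
console.log(multikeyEntries(laptop, "name"))
```

Because every element becomes its own entry, an index on a large array field grows faster than the document count suggests.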

Advanced Indexing Strategies

Index Intersection

MongoDB can use multiple indexes simultaneously through index intersection, although the query planner rarely selects an intersection plan in practice:

// Two separate indexes
db.inventory.createIndex({category: 1})
db.inventory.createIndex({price: 1})

// Query that can use both indexes
db.inventory.find({category: "electronics", price: {$lt: 500}})

However, compound indexes are generally more efficient than index intersection:

// More efficient compound index
db.inventory.createIndex({category: 1, price: 1})

Partial Indexes

Partial indexes only index documents that meet specified criteria:

// Index only active users
db.users.createIndex(
    {email: 1},
    {partialFilterExpression: {status: "active"}}
)

// Index only documents with specific field values
db.orders.createIndex(
    {customerId: 1, orderDate: -1},
    {partialFilterExpression: {amount: {$gte: 100}}}
)
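
The following sketch shows which documents would end up in a partial index. It is a simplified stand-in for the server's filter matching, written in plain JavaScript: the helper name matchesPartialFilter is hypothetical, and only equality and a handful of comparison operators are handled.

```javascript
// Hypothetical helper: returns true when `doc` satisfies a
// partialFilterExpression and would therefore be indexed.
// Supports equality plus $gt, $gte, $lt, $lte and $exists only.
function matchesPartialFilter(doc, filter) {
    return Object.entries(filter).every(([field, condition]) => {
        const value = doc[field]
        if (condition !== null && typeof condition === "object") {
            return Object.entries(condition).every(([op, operand]) => {
                switch (op) {
                    case "$gte": return value >= operand
                    case "$gt": return value > operand
                    case "$lte": return value <= operand
                    case "$lt": return value < operand
                    case "$exists": return (value !== undefined) === operand
                    default: throw new Error("Unsupported operator: " + op)
                }
            })
        }
        // Plain value: equality match
        return value === condition
    })
}

const orderFilter = {amount: {$gte: 100}}
console.log(matchesPartialFilter({customerId: 1, amount: 250}, orderFilter)) // true
console.log(matchesPartialFilter({customerId: 2, amount: 40}, orderFilter))  // false
```

The planner uses a partial index only when the query itself includes the filter condition (or a stricter one). For the users index above, that means a query such as find({email: "user@example.com", status: "active"}) rather than a query on email alone.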

Sparse Indexes

Sparse indexes exclude documents that don’t contain the indexed field:

// Sparse index excludes documents without phone field
db.users.createIndex({phone: 1}, {sparse: true})

TTL Indexes

Time-to-Live indexes automatically delete documents after a specified time:

// Delete documents 30 days after creation
db.sessions.createIndex(
    {createdAt: 1},
    {expireAfterSeconds: 2592000}
)
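
The expireAfterSeconds value is easy to get wrong by hand, so a small sketch of the arithmetic may help. The helper names daysToSeconds and earliestExpiry are hypothetical and the code is plain JavaScript, independent of the server.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60

// Hypothetical helper: converts a retention period in days to seconds
function daysToSeconds(days) {
    return days * SECONDS_PER_DAY
}

// Hypothetical helper: earliest moment a document becomes eligible for
// deletion, given the value of its indexed date field
function earliestExpiry(indexedDate, expireAfterSeconds) {
    return new Date(indexedDate.getTime() + expireAfterSeconds * 1000)
}

const ttlSeconds = daysToSeconds(30)
console.log(ttlSeconds) // 2592000

const createdAt = new Date("2024-01-01T00:00:00Z")
console.log(earliestExpiry(createdAt, ttlSeconds).toISOString())
// 2024-01-31T00:00:00.000Z
```

Deletion is performed by a background task that runs periodically, so documents can remain in the collection for a short time after this moment. The indexed field must hold a date (or an array of dates) for the TTL index to apply.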

Compound Index Field Order Optimization

The ESR Rule Explained

The ESR (Equality, Sort, Range) rule provides guidance for compound index field ordering:

  1. Equality: Fields used in equality matches should come first
  2. Sort: Fields used for sorting should come next
  3. Range: Fields used in range queries should come last

// Suboptimal index order
db.sales.createIndex({amount: 1, region: 1, date: -1})

// Optimal index order following ESR
db.sales.createIndex({region: 1, date: -1, amount: 1})

// Query that benefits from optimal ordering
db.sales.find({region: "North", amount: {$gte: 1000}})
         .sort({date: -1})
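
One way to apply the rule consistently is to derive the key specification from the three groups of fields. The sketch below does that in plain JavaScript; buildEsrIndex is a hypothetical helper, not a MongoDB API, and it assumes ascending order for equality and range fields.

```javascript
// Hypothetical helper: assembles a compound index key spec in
// Equality, Sort, Range order. Fields listed twice keep their first position.
function buildEsrIndex({equality = [], sort = {}, range = []}) {
    const spec = {}
    for (const field of equality) {
        spec[field] = 1
    }
    for (const [field, direction] of Object.entries(sort)) {
        if (!(field in spec)) spec[field] = direction
    }
    for (const field of range) {
        if (!(field in spec)) spec[field] = 1
    }
    return spec
}

const salesIndex = buildEsrIndex({
    equality: ["region"],
    sort: {date: -1},
    range: ["amount"]
})
console.log(JSON.stringify(salesIndex)) // {"region":1,"date":-1,"amount":1}
```

The result can be passed to db.sales.createIndex(). The approach relies on JavaScript preserving insertion order for string keys, which holds as long as field names are not integer-like strings.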

Index Prefix Utilization

Compound indexes can satisfy queries using any prefix of the index:

// Compound index
db.products.createIndex({category: 1, brand: 1, price: 1})

// These queries can use the index:
db.products.find({category: "electronics"})
db.products.find({category: "electronics", brand: "Apple"})
db.products.find({category: "electronics", brand: "Apple", price: {$lt: 1000}})

// This query cannot use the index efficiently:
db.products.find({brand: "Apple", price: {$lt: 1000}})
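
The prefix rule can be expressed as a short check: count how many leading index fields the query constrains. This is a simplified sketch in plain JavaScript with a hypothetical helper name, usablePrefixLength; the real planner also considers sorts, operators, and other factors.

```javascript
// Hypothetical helper: number of leading index fields that appear in the
// query. 0 means the query does not match any prefix of the index.
function usablePrefixLength(indexSpec, queryFields) {
    const queried = new Set(queryFields)
    let length = 0
    for (const field of Object.keys(indexSpec)) {
        if (!queried.has(field)) break
        length++
    }
    return length
}

const productIndex = {category: 1, brand: 1, price: 1}

console.log(usablePrefixLength(productIndex, ["category"]))                   // 1
console.log(usablePrefixLength(productIndex, ["category", "brand"]))          // 2
console.log(usablePrefixLength(productIndex, ["category", "brand", "price"])) // 3
console.log(usablePrefixLength(productIndex, ["brand", "price"]))             // 0
```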

Execution Plan Analysis

Using explain() Method

The explain() method provides detailed information about query execution:

// Basic explain
db.users.find({email: "user@example.com"}).explain()

// Detailed execution statistics
db.users.find({email: "user@example.com"}).explain("executionStats")

// All available information
db.users.find({email: "user@example.com"}).explain("allPlansExecution")

Key Execution Plan Metrics

Understanding critical metrics in execution plans:

// Example execution plan analysis
const plan = db.orders.find({status: "pending"}).explain("executionStats")

// Key metrics to examine:
// - executionStats.totalDocsExamined: Documents scanned
// - executionStats.totalKeysExamined: Index keys scanned
// - executionStats.nReturned: Documents returned
// - executionStats.executionTimeMillis: Query execution time
// - queryPlanner.winningPlan.stage: Execution stage type
// - indexName on the plan's IXSCAN stage: Index used (if any)
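
As a rough sketch, these metrics can be condensed into a single summary object. The helper summarizeExecutionStats is hypothetical, and the sample object below is hand-written for illustration rather than captured from a server; in explain output the count of returned documents is reported as nReturned.

```javascript
// Hypothetical helper: condenses explain("executionStats") output into
// the handful of numbers that matter most for index tuning.
function summarizeExecutionStats(explainOutput) {
    const stats = explainOutput.executionStats
    const returned = stats.nReturned
    return {
        docsExamined: stats.totalDocsExamined,
        keysExamined: stats.totalKeysExamined,
        returned: returned,
        timeMillis: stats.executionTimeMillis,
        // Documents examined per document returned; values near 1 are ideal
        docsPerResult: returned === 0 ? null : stats.totalDocsExamined / returned
    }
}

// Illustrative values only
const sampleExplain = {
    executionStats: {
        nReturned: 10,
        totalKeysExamined: 10000,
        totalDocsExamined: 10000,
        executionTimeMillis: 42
    }
}

console.log(summarizeExecutionStats(sampleExplain).docsPerResult) // 1000
```

In mongosh the same function could be applied to the plan variable from the example above.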

Identifying Performance Issues

Common execution plan indicators of performance problems:

// COLLSCAN indicates full collection scan
{
    "stage": "COLLSCAN",
    "direction": "forward"
}

// High ratio of examined to returned documents
{
    "totalDocsExamined": 10000,
    "nReturned": 10
}

// Several rejected plans mean multiple indexes competed for the query; overlapping indexes may be worth consolidating
{
    "rejectedPlans": [
        // Multiple alternative plans
    ]
}
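
Spotting a COLLSCAN by eye gets tedious once plans are nested, so the check can be automated by walking the plan tree. The sketch below assumes the plan shape in which each stage has a stage name and optional inputStage or inputStages children; some newer server versions nest the winning plan differently, so treat the helper names collectStages and hasCollectionScan, and the sample plans, as illustrative assumptions.

```javascript
// Hypothetical helper: collects every stage name in a plan tree
function collectStages(plan, stages = []) {
    if (!plan) return stages
    stages.push(plan.stage)
    if (plan.inputStage) collectStages(plan.inputStage, stages)
    for (const child of plan.inputStages || []) {
        collectStages(child, stages)
    }
    return stages
}

// Hypothetical helper: true when the winning plan scans the whole collection
function hasCollectionScan(explainOutput) {
    const winningPlan = explainOutput.queryPlanner.winningPlan
    return collectStages(winningPlan).includes("COLLSCAN")
}

// Hand-written sample plans for illustration
const scanPlan = {
    queryPlanner: {winningPlan: {stage: "COLLSCAN", direction: "forward"}, rejectedPlans: []}
}
const indexedPlan = {
    queryPlanner: {
        winningPlan: {stage: "FETCH", inputStage: {stage: "IXSCAN", indexName: "status_1"}},
        rejectedPlans: []
    }
}

console.log(hasCollectionScan(scanPlan))    // true
console.log(hasCollectionScan(indexedPlan)) // false
```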

Query Planner Statistics and Staleness

Understanding Plan Statistics

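MongoDB's planner relies on trial runs and cached plans rather than on data-distribution statistics of the kind many relational databases maintain. The commands below expose collection sizes and per-index usage counters that help when reviewing planning decisions: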

// View collection statistics
db.collection.stats()

// View index statistics
db.collection.aggregate([{$indexStats: {}}])

Stale Statistics Problem

A cached plan reflects the data as it looked when the plan was selected, so it can become stale as the data changes, leading to suboptimal plan selection:

Common causes:

  • Significant data distribution changes
  • Large bulk operations
  • Index modifications
  • Long-running deployments whose plan cache has not been refreshed

Solutions:

// Force plan cache refresh
db.collection.getPlanCache().clear()

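// Rebuild indexes (this does not refresh planner state; reIndex() is deprecated
// since MongoDB 6.0 and runs only on standalone instances)
db.collection.reIndex()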

// Use hint() to force specific index usage
db.collection.find({field: "value"}).hint({field: 1})

Advanced Query Optimization Techniques

Index Hints

Force MongoDB to use specific indexes when the planner makes suboptimal choices:

// Force use of specific index
db.orders.find({customerId: 123, status: "pending"})
         .hint({customerId: 1, status: 1})

// Force collection scan (for testing)
db.orders.find({customerId: 123}).hint({$natural: 1})

Covered Queries

Optimize queries to return data directly from indexes:

// Index covers all query fields
db.users.createIndex({email: 1, name: 1, status: 1})

// Covered query - no document access needed
db.users.find({email: "user@example.com"}, {name: 1, status: 1, _id: 0})
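
Whether a query is covered can be estimated before running it by comparing the filter and projection against the index keys. The following is a simplified sketch with a hypothetical helper, isCoveredBy; it ignores the rules for multikey and embedded fields, so explain() remains the authoritative check (a covered query reports totalDocsExamined of 0).

```javascript
// Hypothetical helper: true when every filtered and projected field is
// part of the index, so no document fetch should be needed.
function isCoveredBy(indexSpec, filter, projection) {
    const indexed = new Set(Object.keys(indexSpec))
    const filterCovered = Object.keys(filter).every(f => indexed.has(f))
    // Fields explicitly included in the projection
    const included = Object.keys(projection).filter(f => projection[f] === 1)
    const projectionCovered = included.length > 0 && included.every(f => indexed.has(f))
    // _id is returned by default, so it must be excluded unless it is indexed
    const idHandled = projection._id === 0 || indexed.has("_id")
    return filterCovered && projectionCovered && idHandled
}

const userIndex = {email: 1, name: 1, status: 1}

console.log(isCoveredBy(userIndex, {email: "user@example.com"}, {name: 1, status: 1, _id: 0})) // true
console.log(isCoveredBy(userIndex, {email: "user@example.com"}, {name: 1, status: 1}))         // false
console.log(isCoveredBy(userIndex, {email: "user@example.com"}, {name: 1, phone: 1, _id: 0}))  // false
```

The second call fails only because _id is returned by default, which is the most common reason an otherwise covered query still fetches documents.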

Query Shape Optimization

Optimize queries based on their shape and access patterns:

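// Group in-progress find operations by filter
// ($currentOp must run as a database-level aggregation on the admin database)
db.getSiblingDB("admin").aggregate([
    {$currentOp: {}},
    {$match: {"command.find": {$exists: true}}},
    {$group: {
        _id: "$command.filter",
        count: {$sum: 1},
        avgDuration: {$avg: "$secs_running"}
    }}
])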

Index Maintenance and Monitoring

Index Usage Monitoring

Monitor index usage to identify unused or inefficient indexes:

// Check index usage statistics
db.collection.aggregate([{$indexStats: {}}])

// Identify rarely used indexes (counters reset when mongod restarts)
db.collection.aggregate([
    {$indexStats: {}},
    {$match: {"accesses.ops": {$lt: 10}}}
])
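
The output of $indexStats can also be post-processed in JavaScript, which makes it easier to skip the mandatory _id index and adjust the threshold. The helper rarelyUsedIndexes is hypothetical and the sample array is hand-written; it mirrors the name and accesses.ops fields that $indexStats returns.

```javascript
// Hypothetical helper: names of indexes with fewer than `minOps` accesses.
// The _id index is skipped because it cannot be dropped.
function rarelyUsedIndexes(indexStats, minOps = 10) {
    return indexStats
        .filter(entry => entry.name !== "_id_")
        // accesses.ops may be a 64-bit Long in mongosh; go through a string
        // so both plain numbers and Long values are handled
        .filter(entry => Number(String(entry.accesses.ops)) < minOps)
        .map(entry => entry.name)
}

// Illustrative sample, shaped like $indexStats output
const sampleIndexStats = [
    {name: "_id_", accesses: {ops: 0}},
    {name: "email_1", accesses: {ops: 48213}},
    {name: "phone_1", accesses: {ops: 3}},
    {name: "legacyField_1", accesses: {ops: 0}}
]

console.log(rarelyUsedIndexes(sampleIndexStats)) // [ 'phone_1', 'legacyField_1' ]
```

In mongosh the input could come from db.collection.aggregate([{$indexStats: {}}]).toArray(). On a replica set or sharded cluster the counters are tracked per node, so check every member before dropping an index.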

Index Maintenance Best Practices

// Create the index; since MongoDB 4.2 the background option is ignored and
// index builds hold an exclusive lock only briefly at the start and end
db.collection.createIndex({field: 1})

// Monitor index build progress
db.currentOp({"command.createIndexes": {$exists: true}})

// Drop unused indexes
db.collection.dropIndex({unusedField: 1})

Performance Optimization Strategies

Index Strategy for Different Query Patterns

Point Queries:

// Single field index for exact matches
db.users.createIndex({userId: 1})
db.users.find({userId: "12345"})

Range Queries:

// Compound index with range field last
db.events.createIndex({type: 1, timestamp: -1})
db.events.find({type: "login", timestamp: {$gte: ISODate("2024-01-01")}})

Sort Queries:

// Index matches sort order
db.posts.createIndex({publishedAt: -1, views: -1})
db.posts.find().sort({publishedAt: -1, views: -1})
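
An index can serve a sort when the sort keys form a prefix of the index keys and the directions either all match or are all reversed. The sketch below encodes that check in plain JavaScript; indexSupportsSort is a hypothetical helper and it ignores the case where equality predicates on leading fields let a later field provide the sort.

```javascript
// Hypothetical helper: true when the index can return documents in the
// requested order without an in-memory sort (simplified rule).
function indexSupportsSort(indexSpec, sortSpec) {
    const indexFields = Object.entries(indexSpec)
    const sortFields = Object.entries(sortSpec)
    if (sortFields.length === 0 || sortFields.length > indexFields.length) {
        return false
    }
    // Sort keys must be a prefix of the index keys, in the same order
    const sameFields = sortFields.every(([field], i) => indexFields[i][0] === field)
    if (!sameFields) return false
    // Directions must match exactly, or be exactly reversed (backward scan)
    const forward = sortFields.every(([, dir], i) => dir === indexFields[i][1])
    const backward = sortFields.every(([, dir], i) => dir === -indexFields[i][1])
    return forward || backward
}

const postsIndex = {publishedAt: -1, views: -1}

console.log(indexSupportsSort(postsIndex, {publishedAt: -1, views: -1})) // true
console.log(indexSupportsSort(postsIndex, {publishedAt: 1, views: 1}))   // true
console.log(indexSupportsSort(postsIndex, {publishedAt: -1, views: 1}))  // false
console.log(indexSupportsSort(postsIndex, {views: -1}))                  // false
```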

Memory and Resource Optimization

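// Monitor cache usage (indexCounters was removed in MongoDB 3.0;
// WiredTiger reports cache statistics instead)
db.serverStatus().wiredTiger.cache

// Compare per-index sizes against available memory
db.collection.stats().indexSizes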

Common Indexing Pitfalls and Solutions

Over-Indexing

Problem: Too many indexes slow down write operations
Solution: Regular index auditing and removal of unused indexes

// Audit index usage (per-index access counts since the last restart)
db.collection.aggregate([{$indexStats: {}}])

Under-Indexing

Problem: Missing indexes cause slow queries
Solution: Systematic query analysis and index planning

// Profile operations slower than 100 ms (level 1; level 2 would record every operation)
db.setProfilingLevel(1, {slowms: 100})
db.system.profile.find().sort({ts: -1}).limit(5)

Incorrect Index Order

Problem: Compound indexes with suboptimal field ordering
Solution: Apply ESR rule and analyze query patterns

// Before: Suboptimal ordering (range field first)
db.orders.createIndex({amount: 1, customerId: 1, status: 1})

// After: Optimal ordering (equality fields first, range field last)
db.orders.createIndex({customerId: 1, status: 1, amount: 1})

Monitoring and Alerting

Key Performance Indicators

Monitor these metrics for index performance:

// Query execution time
db.collection.find().explain("executionStats").executionStats.executionTimeMillis

// Index keys scanned versus documents scanned (a rough proxy for index effectiveness)
db.serverStatus().metrics.queryExecutor

// Lock contention
db.serverStatus().locks

Automated Monitoring Setup

// Custom monitoring script (requires the database profiler to be enabled)
function monitorSlowQueries() {
    return db.system.profile.aggregate([
        {$match: {ts: {$gte: new Date(Date.now() - 3600000)}}},
        {$group: {
            _id: "$command.filter",
            avgDuration: {$avg: "$millis"},
            count: {$sum: 1}
        }},
        {$sort: {avgDuration: -1}},
        {$limit: 10}
    ])
}

Best Practices and Recommendations

Index Design Guidelines

  1. Start with Query Patterns: Design indexes based on actual query patterns
  2. Follow ESR Rule: Order compound index fields appropriately
  3. Use Covered Queries: Design indexes to cover frequently accessed fields
  4. Monitor Regularly: Continuously monitor index usage and performance
  5. Test Thoroughly: Test index changes in staging environments

Performance Testing

// Benchmark query performance
function benchmarkQuery(query, iterations = 100) {
    const start = new Date()
    for (let i = 0; i < iterations; i++) {
        db.collection.find(query).toArray()
    }
    const end = new Date()
    return (end - start) / iterations
}
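
An average over a cold first run can be misleading, so a variant with warm-up iterations and a median is sketched below. It is plain JavaScript with a hypothetical helper name, benchmark, and it takes the workload as a function so it runs anywhere; timings use Date.now(), which has millisecond resolution.

```javascript
// Hypothetical helper: times `run` repeatedly and reports the median,
// minimum and maximum duration in milliseconds.
function benchmark(run, iterations = 100, warmup = 5) {
    // Warm-up runs populate caches and are not timed
    for (let i = 0; i < warmup; i++) {
        run()
    }
    const samples = []
    for (let i = 0; i < iterations; i++) {
        const start = Date.now()
        run()
        samples.push(Date.now() - start)
    }
    samples.sort((a, b) => a - b)
    const mid = Math.floor(samples.length / 2)
    const median = samples.length % 2 === 1
        ? samples[mid]
        : (samples[mid - 1] + samples[mid]) / 2
    return {median: median, min: samples[0], max: samples[samples.length - 1]}
}

// Stand-in workload so the sketch runs outside the shell; in mongosh,
// pass () => db.collection.find(query).toArray() instead
const demoTimings = benchmark(() => JSON.stringify({status: "completed"}), 50)
console.log(demoTimings)
```

The median is less sensitive than the mean to a single slow iteration, which makes before-and-after comparisons of an index change more stable.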

Conclusion

Mastering MongoDB’s indexing strategy and execution plans requires deep understanding of query planner behavior, index types, and performance optimization techniques. By applying the concepts covered in this guide—from compound index field ordering to execution plan analysis—you can significantly improve query performance and avoid common pitfalls.

Remember that indexing is an iterative process. Regularly monitor query performance, analyze execution plans, and adjust your indexing strategy based on changing application requirements and data patterns. With proper indexing strategies and continuous optimization, you can ensure your MongoDB deployment delivers optimal performance for your applications.

The key to success lies in understanding that the query planner, while sophisticated, requires guidance through well-designed indexes and regular maintenance. By combining theoretical knowledge with practical monitoring and optimization techniques, you can build robust, high-performance MongoDB applications that scale with your business needs.

About MinervaDB Corporation
Full-stack Database Infrastructure Architecture, Engineering and Operations Consultative Support (24x7) Provider for PostgreSQL, MySQL, MariaDB, MongoDB, ClickHouse, Trino, SQL Server, Cassandra, CockroachDB, Yugabyte, Couchbase, Redis, Valkey, NoSQL, NewSQL, Databricks, Amazon Redshift, Amazon Aurora, CloudSQL, Snowflake and AzureSQL with core expertise in Performance, Scalability, High Availability, Database Reliability Engineering, Database Upgrades/Migration, and Data Security.
