Index Management
Index Management in Elasticsearch
Learn how to create, configure, and manage indices in Elasticsearch. This guide covers index settings, mappings, aliases, and best practices for index management.
Understanding Indices
An index in Elasticsearch is similar to a database in traditional systems. It's a collection of documents that share similar characteristics and are stored together.
Creating Indices
Basic Index Creation
PUT /my_index
Index with Settings
PUT /products
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 1,
"refresh_interval": "30s"
}
}
Index with Mappings
PUT /users
{
"mappings": {
"properties": {
"username": {
"type": "keyword"
},
"email": {
"type": "keyword",
"index": true
},
"age": {
"type": "integer"
},
"bio": {
"type": "text",
"analyzer": "standard"
},
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
}
Index Settings
Static Settings (Set at Creation)
PUT /logs
{
"settings": {
"number_of_shards": 5,
"codec": "best_compression",
"routing_partition_size": 3
}
}
Dynamic Settings (Can be Updated)
PUT /logs/_settings
{
"index": {
"number_of_replicas": 2,
"refresh_interval": "5s",
"max_result_window": 50000,
"blocks": {
"read_only": false,
"write": false
}
}
}
Mappings
Dynamic Mapping
Elasticsearch automatically detects and adds new fields:
PUT /dynamic_index/_doc/1
{
"title": "Auto-detected as text",
"count": 42, // Auto-detected as long
"price": 19.99, // Auto-detected as float
"is_active": true, // Auto-detected as boolean
"created": "2024-01-15" // Auto-detected as date
}
Explicit Mapping
Define field types explicitly:
PUT /products/_mapping
{
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"description": {
"type": "text",
"analyzer": "english"
},
"price": {
"type": "scaled_float",
"scaling_factor": 100
},
"tags": {
"type": "keyword"
},
"location": {
"type": "geo_point"
}
}
}
Nested Objects
PUT /orders
{
"mappings": {
"properties": {
"order_id": {
"type": "keyword"
},
"items": {
"type": "nested",
"properties": {
"product_id": {
"type": "keyword"
},
"quantity": {
"type": "integer"
},
"price": {
"type": "float"
}
}
}
}
}
}
Index Templates
Create Index Template
PUT /_index_template/logs_template
{
"index_patterns": ["logs-*"],
"priority": 1,
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0,
"refresh_interval": "5s"
},
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"level": {
"type": "keyword"
},
"message": {
"type": "text"
}
}
}
}
}
Component Templates
// Create reusable component
PUT /_component_template/common_settings
{
"template": {
"settings": {
"number_of_replicas": 1
}
}
}
// Use in index template
PUT /_index_template/app_logs
{
"index_patterns": ["app-logs-*"],
"composed_of": ["common_settings"],
"template": {
"mappings": {
"properties": {
"app_name": {
"type": "keyword"
}
}
}
}
}
Index Aliases
Create Alias
POST /_aliases
{
"actions": [
{
"add": {
"index": "products_v1",
"alias": "products"
}
}
]
}
Filtered Alias
POST /_aliases
{
"actions": [
{
"add": {
"index": "orders",
"alias": "recent_orders",
"filter": {
"range": {
"created_at": {
"gte": "now-7d"
}
}
}
}
}
]
}
Alias for Zero-Downtime Reindexing
// Step 1: Create new index
PUT /products_v2
{
"mappings": { /* new mapping */ }
}
// Step 2: Reindex data
POST /_reindex
{
"source": {
"index": "products_v1"
},
"dest": {
"index": "products_v2"
}
}
// Step 3: Switch alias atomically
POST /_aliases
{
"actions": [
{ "remove": { "index": "products_v1", "alias": "products" } },
{ "add": { "index": "products_v2", "alias": "products" } }
]
}
Index Lifecycle Management (ILM)
Create ILM Policy
PUT /_ilm/policy/logs_policy
{
"policy": {
"phases": {
"hot": {
"actions": {
"rollover": {
"max_age": "7d",
"max_size": "50gb"
}
}
},
"warm": {
"min_age": "7d",
"actions": {
"shrink": {
"number_of_shards": 1
},
"forcemerge": {
"max_num_segments": 1
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}
Apply ILM Policy
PUT /logs-000001
{
"settings": {
"index.lifecycle.name": "logs_policy",
"index.lifecycle.rollover_alias": "logs"
}
}
Index Operations
Get Index Information
// Get settings
GET /products/_settings
// Get mappings
GET /products/_mapping
// Get both
GET /products
// Get stats
GET /products/_stats
Close and Open Index
// Close index (makes it read-only)
POST /products/_close
// Open index
POST /products/_open
Clone Index
// Put source index in read-only mode
PUT /products/_settings
{
"settings": {
"index.blocks.write": true
}
}
// Clone the index
POST /products/_clone/products_copy
{
"settings": {
"index.number_of_replicas": 0
}
}
Shrink Index
// Prepare for shrink
PUT /logs/_settings
{
"settings": {
"index.blocks.write": true
}
}
// Shrink index
POST /logs/_shrink/logs_shrinked
{
"settings": {
"index.number_of_replicas": 1,
"index.number_of_shards": 1
}
}
Best Practices
1. Naming Conventions
// Use lowercase and descriptive names
PUT /user_profiles // Good
PUT /UserProfiles // Bad
// Use versioning for schema changes
PUT /products_v1
PUT /products_v2
// Use date patterns for time-series
PUT /logs-2024.01.15
2. Shard Sizing
// Calculate shards based on data volume
// Target: 20-40GB per shard
PUT /large_dataset
{
"settings": {
"number_of_shards": 10, // For ~300GB of data
"number_of_replicas": 1
}
}
3. Mapping Optimization
PUT /optimized_index
{
"mappings": {
"properties": {
"exact_match_field": {
"type": "keyword" // Don't analyze
},
"full_text_field": {
"type": "text",
"fields": {
"keyword": { // Multi-field for aggregations
"type": "keyword"
}
}
},
"numeric_field": {
"type": "integer",
"index": false // Don't index if not searched
}
}
}
}
4. Index Settings for Performance
PUT /performance_optimized
{
"settings": {
"refresh_interval": "30s", // Reduce refresh frequency
"number_of_shards": 3,
"number_of_replicas": 0, // Add replicas after bulk load
"index.translog.durability": "async",
"index.translog.sync_interval": "5s"
}
}
Monitoring Indices
Check Index Health
GET /_cat/indices?v&health=yellow&s=index
GET /_cat/shards/products?v
Index Statistics
GET /products/_stats/store,docs,indexing
Segment Information
GET /products/_segments
Common Issues and Solutions
1. Too Many Shards
// Check shard count
GET /_cat/shards?v
// Solution: Use ILM with shrink action
2. Mapping Explosion
// Limit dynamic fields
PUT /controlled_index
{
"mappings": {
"dynamic": "strict", // Reject unknown fields
"properties": {
"known_field": {
"type": "text"
}
}
}
}
3. Large Documents
// Set limits
PUT /limited_index
{
"settings": {
"index.max_result_window": 10000,
"index.max_inner_result_window": 100,
"index.max_terms_count": 65536
}
}
Next Steps
- Learn about replication strategies
- Understand sharding in detail
- Explore advanced mapping techniques
- Study index performance optimization