Wazuh Indexer Cluster - OpenSearch and Data Storage
The Wazuh indexer cluster is built on OpenSearch and provides storage, indexing, and full-text search of security events. The indexer receives data from Wazuh servers via Filebeat, organizes it into indices with configurable shards and replicas, and exposes an API for search and data lifecycle management.
Cluster Architecture
Node Roles
Each node in an OpenSearch cluster performs one or more roles. Proper role distribution is critical for performance and fault tolerance.
Cluster-manager
A node with the cluster_manager role manages cluster state:
- Creating and deleting indices
- Distributing shards across nodes
- Tracking node health
- Managing index templates and ISM policies
In production clusters, dedicate 3 nodes with the cluster_manager role to maintain quorum. These nodes should not store data (do not assign the data role to them).
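The three-node recommendation follows from majority-based leader election. As an illustrative sketch (not Wazuh or OpenSearch code), the quorum arithmetic looks like this:

```python
def quorum(manager_eligible: int) -> int:
    """Votes needed to elect a cluster manager: a strict majority
    of manager-eligible nodes."""
    return manager_eligible // 2 + 1

# 3 manager-eligible nodes -> quorum of 2, so the cluster tolerates 1 failure.
# With only 2 such nodes, quorum is also 2, so losing either node halts elections.
print(quorum(3))  # 2
print(quorum(5))  # 3
```

This is why dedicated cluster-manager nodes come in odd counts: an even count raises the quorum without improving fault tolerance.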
# opensearch.yml - dedicated cluster-manager node
node.name: indexer-manager-01
node.roles:
- cluster_manager
Data node
Nodes with the data role store index shards and execute search and aggregation operations:
- Storing primary shards and replicas
- Executing search queries
- Running aggregations
- Indexing new documents
Data nodes consume the most resources (CPU, RAM, disk).
# opensearch.yml - dedicated data node
node.name: indexer-data-01
node.roles:
- data
Ingest node
Nodes with the ingest role perform pre-processing on documents before indexing:
- Field transformations
- Data enrichment
- Format conversions
In a typical Wazuh deployment, the ingest role is combined with the data role since Filebeat delivers data in a ready-to-index format.
Coordinating node
A node with node.roles set to an empty list acts as a coordinating-only node:
- Routing search requests to the appropriate data nodes
- Merging results from multiple shards
- Reducing load on data nodes during heavy concurrent query traffic
# opensearch.yml - coordinating-only node
node.name: indexer-coord-01
node.roles: []
Typical Cluster Configurations
Minimal cluster (3 nodes)
Node 1: cluster_manager + data
Node 2: cluster_manager + data
Node 3: cluster_manager + data
Production cluster (5+ nodes)
Nodes 1-3: cluster_manager (dedicated, no data)
Nodes 4-6: data
Node 7: coordinating (optional)
Discovery and Cluster Formation
Discovery Configuration
Cluster nodes find each other through the discovery mechanism. Configuration is set in opensearch.yml:
# Seed node list for initial discovery
discovery.seed_hosts:
- 192.168.1.20
- 192.168.1.21
- 192.168.1.22
# Nodes participating in initial cluster bootstrap
cluster.initial_cluster_manager_nodes:
- indexer-01
- indexer-02
- indexer-03
discovery.seed_hosts contains a list of IP addresses or FQDNs of nodes that a new node contacts to join the cluster. Include all nodes with the cluster_manager role.
cluster.initial_cluster_manager_nodes is used only during the first cluster startup (bootstrap). After successful cluster formation, this parameter can be removed. It contains node names (node.name), not IP addresses.
Complete opensearch.yml Example
cluster.name: wazuh-cluster
node.name: indexer-01
node.roles:
- cluster_manager
- data
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
discovery.seed_hosts:
- 192.168.1.20
- 192.168.1.21
- 192.168.1.22
cluster.initial_cluster_manager_nodes:
- indexer-01
- indexer-02
- indexer-03
plugins.security.ssl.transport.enabled: true
plugins.security.ssl.http.enabled: true
plugins.security.ssl.transport.pemcert_filepath: /etc/wazuh-indexer/certs/indexer.pem
plugins.security.ssl.transport.pemkey_filepath: /etc/wazuh-indexer/certs/indexer-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
plugins.security.ssl.http.pemcert_filepath: /etc/wazuh-indexer/certs/indexer.pem
plugins.security.ssl.http.pemkey_filepath: /etc/wazuh-indexer/certs/indexer-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
compatibility.override_main_response_version: true
Shard Allocation Strategy
Sharding Principles
Shards are the fundamental units of data storage and distribution in OpenSearch. Each index consists of one or more primary shards, each of which can have replicas.
Shard count recommendations:
- As a starting point, set the number of primary shards equal to the number of data nodes so shards distribute evenly across the cluster
- Optimal single shard size: 10-50 GB
- Too many small shards create excessive overhead on the cluster-manager
- Too few large shards limit search parallelism
The number of shards cannot be changed after index creation - modification requires reindexing.
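The sizing guidance above can be combined into a rough starting-point estimate. The helper below is a hypothetical illustration, not part of Wazuh or OpenSearch:

```python
def suggest_primary_shards(index_size_gb: float, data_nodes: int,
                           target_shard_gb: float = 30.0) -> int:
    """Estimate a primary shard count for an index of the given size.

    Illustrative heuristic only: aim for shards in the 10-50 GB range
    (30 GB target here) and no more primary shards than data nodes,
    per the recommendations above.
    """
    by_size = max(1, round(index_size_gb / target_shard_gb))
    return max(1, min(by_size, data_nodes))

# A ~90 GB daily index on a 3-data-node cluster -> 3 primary shards
print(suggest_primary_shards(90, 3))  # 3
# A small 5 GB index should not be over-sharded
print(suggest_primary_shards(5, 6))   # 1
```

Because the shard count is fixed at index creation, running an estimate like this before defining the index template avoids a later reindex.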
Replica Configuration
Replicas provide fault tolerance and increase read throughput:
- For single-node clusters: number_of_replicas: 0
- For clusters with 3+ nodes: number_of_replicas: 1
- For mission-critical data: number_of_replicas: 2
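These rules can be sketched as a small decision helper (hypothetical, for illustration only):

```python
def suggest_replicas(data_nodes: int, mission_critical: bool = False) -> int:
    """Pick a replica count following the guidance above.

    Illustrative only: a replica is useless on a single node (it can never
    be allocated), and extra replicas need enough nodes to live on.
    """
    if data_nodes == 1:
        return 0  # nowhere to place a replica; it would stay unassigned
    if mission_critical and data_nodes >= 3:
        return 2
    return 1

print(suggest_replicas(1))                          # 0
print(suggest_replicas(3))                          # 1
print(suggest_replicas(5, mission_critical=True))   # 2
```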
Replica count can be changed at any time without reindexing:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"number_of_replicas": 1}}'Index Template Configuration
Wazuh uses index templates to automatically apply settings to new indices. Configuration is performed through the indexer API:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_template/wazuh-custom" \
-H "Content-Type: application/json" \
-d '{
"order": 1,
"index_patterns": ["wazuh-alerts-*"],
"settings": {
"index.number_of_shards": 3,
"index.number_of_replicas": 1,
"index.refresh_interval": "5s"
}
}'
The order: 1 parameter ensures the custom template is applied on top of the default Filebeat template.
Wazuh Index Templates
Wazuh creates several index types for different data categories:
| Index Pattern | Description | Data Volume |
|---|---|---|
| wazuh-alerts-* | Security alerts (rule match results) | Primary data stream |
| wazuh-archives-* | All events (including those that match no rule) | Very large (when enabled) |
| wazuh-statistics-* | Manager performance statistics | Small |
| wazuh-monitoring-* | Agent monitoring status data | Medium |
The wazuh-archives-* indices are disabled by default. Enabling them significantly increases disk space consumption and requires appropriate storage scaling.
Index Lifecycle Management (ISM)
ISM Policy Overview
Index State Management (ISM) automates index management based on age, size, or document count. A typical lifecycle includes:
- Hot - active reads and writes, maximum performance
- Warm - read-only, reduced resources
- Cold - archive storage, minimal resources
- Delete - automatic removal
Creating an ISM Policy
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_plugins/_ism/policies/wazuh-alerts-policy" \
-H "Content-Type: application/json" \
-d '{
"policy": {
"description": "Wazuh alerts lifecycle policy",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_size": "25gb",
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "warm",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "warm",
"actions": [
{
"replica_count": {
"number_of_replicas": 0
}
},
{
"force_merge": {
"max_num_segments": 1
}
}
],
"transitions": [
{
"state_name": "cold",
"conditions": {
"min_index_age": "30d"
}
}
]
},
{
"name": "cold",
"actions": [
{
"read_only": {}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "90d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
],
"transitions": []
}
],
"ism_template": [
{
"index_patterns": ["wazuh-alerts-*"],
"priority": 100
}
]
}
}'
Rollover - Index Rotation
Rollover creates a new index when specified conditions are met:
- min_size - triggers rollover once the index reaches this size (e.g., 25gb)
- min_index_age - triggers rollover once the index reaches this age (e.g., 1d)
- min_doc_count - triggers rollover once the index reaches this document count
Conditions are combined using OR logic - rotation occurs when any condition is met.
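The OR semantics can be made concrete with a short sketch (a hypothetical helper mirroring the policy fields above, not ISM internals):

```python
def should_rollover(size_gb, age_days, doc_count,
                    min_size_gb=25.0, min_index_age_days=1.0,
                    min_doc_count=None):
    """Rollover conditions combine with OR: any single satisfied
    condition triggers rotation."""
    checks = [size_gb >= min_size_gb, age_days >= min_index_age_days]
    if min_doc_count is not None:
        checks.append(doc_count >= min_doc_count)
    return any(checks)

# Index hit 30 GB after half a day: size condition alone triggers rollover
print(should_rollover(size_gb=30, age_days=0.5, doc_count=0))  # True
# Small and young: no condition met yet
print(should_rollover(size_gb=5, age_days=0.5, doc_count=0))   # False
```

This is why the hot-state policy above pairs min_size with min_index_age: quiet days still roll over daily, while bursty days roll over early on size.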
Performance Tuning
JVM Heap Size
JVM heap size is the most critical indexer performance parameter.
Guidelines:
- Set -Xms and -Xmx to the same value to prevent runtime heap resizing
- Allocate no more than 50% of system RAM (the remaining memory is used for OS file caches)
- Maximum value: 32 GB (exceeding this disables Compressed OOPs, reducing memory efficiency)
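The two constraints reduce to a one-line rule, sketched here as an illustrative helper (not an official sizing tool):

```python
def recommended_heap_gb(system_ram_gb: float) -> int:
    """Heap sizing per the guidelines above: half of system RAM,
    capped at 32 GB so Compressed OOPs stay enabled."""
    return int(min(system_ram_gb / 2, 32))

for ram in (8, 16, 32, 64, 128):
    print(f"{ram} GB RAM -> {recommended_heap_gb(ram)} GB heap")
```

Note the cap: on a 128 GB host the heap stays at 32 GB, and the surplus RAM still benefits the indexer through OS file caching.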
Configuration in /etc/wazuh-indexer/jvm.options:
-Xms4g
-Xmx4g
Sizing recommendations:
| System RAM | JVM Heap | Notes |
|---|---|---|
| 8 GB | 4 GB | Minimum for production |
| 16 GB | 8 GB | Medium workload |
| 32 GB | 16 GB | Heavy workload |
| 64 GB | 32 GB | Maximum value |
Memory Locking
To prevent JVM from being swapped, enable memory locking:
# opensearch.yml
bootstrap.memory_lock: true
In systemd configuration:
# /etc/systemd/system/wazuh-indexer.service.d/override.conf
[Service]
LimitMEMLOCK=infinity
Refresh Interval
The refresh_interval parameter controls how frequently data becomes searchable. The default value is 1 second. Increasing the interval improves write performance:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"refresh_interval": "5s"}}'For bulk data loading, you can temporarily disable refresh:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"refresh_interval": "-1"}}'Merge Settings
Reducing segment count through force merge improves search performance on immutable indices:
curl -sk -u admin:password \
-X POST "https://localhost:9200/wazuh-alerts-2024.01.01/_forcemerge?max_num_segments=1"Apply force merge only to indices that are no longer receiving writes (warm/cold state).
Shards per Node
Limit the maximum number of shards per node to prevent degradation:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d '{"persistent": {"cluster.max_shards_per_node": 1000}}'Monitoring Cluster Health
Essential Monitoring Queries
# Cluster status (green/yellow/red)
curl -sk -u admin:password \
"https://localhost:9200/_cluster/health?pretty"
# Node list with resource usage
curl -sk -u admin:password \
"https://localhost:9200/_cat/nodes?v&h=name,role,heap.percent,disk.used_percent,cpu"
# Index list with sizes
curl -sk -u admin:password \
"https://localhost:9200/_cat/indices/wazuh-*?v&h=index,health,status,pri,rep,docs.count,store.size&s=index:desc"
# Shard distribution
curl -sk -u admin:password \
"https://localhost:9200/_cat/shards/wazuh-*?v&h=index,shard,prirep,state,docs,store,node"
# Unassigned shards
curl -sk -u admin:password \
"https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state:desc"
# Disk usage
curl -sk -u admin:password \
"https://localhost:9200/_cat/allocation?v"Interpreting Cluster Status
| Status | Description | Action |
|---|---|---|
| Green | All primary and replica shards are allocated | No action needed |
| Yellow | All primary shards are allocated, some replicas are not | Check node count and replica settings |
| Red | Some primary shards are unallocated | Immediate investigation and recovery required |
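For automated monitoring, the table above can be applied to the response of GET /_cluster/health. The JSON below is a hypothetical response body (field names match the API); the mapping of status to action follows the table:

```python
import json

# Hypothetical /_cluster/health response, truncated to relevant fields
health = json.loads(
    '{"cluster_name": "wazuh-cluster", "status": "yellow",'
    ' "unassigned_shards": 2, "active_shards_percent_as_number": 85.7}'
)

# Status -> operator action, per the interpretation table above
actions = {
    "green": "no action needed",
    "yellow": "check node count and replica settings",
    "red": "investigate and recover primary shards immediately",
}

print(f"{health['cluster_name']}: {health['status']} "
      f"-> {actions[health['status']]}")
```

In practice a monitoring agent would fetch this JSON over HTTPS with the same credentials as the curl examples above and alert on any non-green status.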
Backup and Recovery (Snapshot/Restore)
Repository Registration
Filesystem (NFS):
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups" \
-H "Content-Type: application/json" \
-d '{
"type": "fs",
"settings": {
"location": "/mnt/snapshots/wazuh",
"compress": true
}
}'
Amazon S3:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-s3-backups" \
-H "Content-Type: application/json" \
-d '{
"type": "s3",
"settings": {
"bucket": "wazuh-snapshots",
"region": "us-east-1",
"base_path": "indexer-snapshots",
"compress": true
}
}'
The filesystem repository path must be specified in opensearch.yml:
path.repo: ["/mnt/snapshots/wazuh"]Creating a Snapshot
# Full snapshot
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups/snapshot-$(date +%Y%m%d)?wait_for_completion=false" \
-H "Content-Type: application/json" \
-d '{
"indices": "wazuh-alerts-*,wazuh-archives-*",
"ignore_unavailable": true,
"include_global_state": false
}'
# Check snapshot status
curl -sk -u admin:password \
"https://localhost:9200/_snapshot/wazuh-backups/_status?pretty"
# List all snapshots
curl -sk -u admin:password \
"https://localhost:9200/_snapshot/wazuh-backups/_all?pretty"Restoring from a Snapshot
curl -sk -u admin:password \
-X POST "https://localhost:9200/_snapshot/wazuh-backups/snapshot-20240101/_restore" \
-H "Content-Type: application/json" \
-d '{
"indices": "wazuh-alerts-2024.01.01",
"ignore_unavailable": true,
"include_global_state": false,
"rename_pattern": "(.+)",
"rename_replacement": "restored-$1"
}'
Automating Snapshots
To automate snapshot creation, add a cron job:
# /etc/cron.d/wazuh-snapshot
0 2 * * * root curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups/snapshot-$(date +\%Y\%m\%d)" \
-H "Content-Type: application/json" \
-d '{"indices":"wazuh-alerts-*","ignore_unavailable":true,"include_global_state":false}'Comparison with Other Platforms
Elasticsearch Cluster
| Feature | Wazuh Indexer (OpenSearch) | Elasticsearch |
|---|---|---|
| Foundation | OpenSearch 2.x (Elasticsearch 7.10 fork) | Elasticsearch 8.x |
| License | Apache 2.0 | Elastic License / SSPL |
| Security | OpenSearch Security (built-in) | X-Pack Security (formerly paid) |
| ISM | Index State Management | Index Lifecycle Management (ILM) |
| ML | ML Commons plugin | ML features (X-Pack) |
| Cost | Free | Free (Basic) / Commercial |
Splunk Indexer Cluster
| Feature | Wazuh Indexer | Splunk Indexer |
|---|---|---|
| Storage format | JSON documents in Lucene indices | Proprietary (tsidx + journal) |
| Sharding | Configurable shards and replicas | Replication via search factor / replication factor |
| Lifecycle | ISM policies (hot/warm/cold/delete) | SmartStore, frozen tier |
| Search | DSL queries, PPL | SPL (Search Processing Language) |
| Scaling | Horizontal (add data nodes) | Horizontal (indexer peers) |
| Cost | Free | By data volume |
Troubleshooting
Cluster Status Yellow
Cause: replica shards are unallocated. Typical for single-node clusters or when the replica count exceeds the number of data nodes minus one.
Resolution:
# Check unallocated shards
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty"
# Set replicas to 0 for single-node cluster
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"number_of_replicas": 0}}'Cluster Status Red
Cause: primary shards are unallocated. Possible reasons include insufficient disk space, node failure, or data corruption.
Resolution:
- Check the reason for non-allocation:
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty"- Check disk space:
curl -sk -u admin:password \
"https://localhost:9200/_cat/allocation?v"- If necessary, force shard reallocation:
curl -sk -u admin:password \
-X POST "https://localhost:9200/_cluster/reroute?retry_failed=true"Disk Watermarks
OpenSearch stops writing when disk usage thresholds are reached:
| Threshold | Default Value | Behavior |
|---|---|---|
| Low watermark | 85% | New shards are not placed on the node |
| High watermark | 90% | Shards are relocated away from the node |
| Flood stage | 95% | Indices are switched to read-only mode |
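The threshold tiers above can be expressed as a simple check, useful in monitoring scripts. This is an illustrative sketch using the default thresholds, not OpenSearch's internal allocator logic:

```python
def watermark_state(disk_used_pct: float,
                    low: float = 85.0, high: float = 90.0,
                    flood: float = 95.0) -> str:
    """Report which disk watermark tier a node's usage has crossed
    (defaults match the table above)."""
    if disk_used_pct >= flood:
        return "flood_stage: indices forced read-only"
    if disk_used_pct >= high:
        return "high: shards relocated away from this node"
    if disk_used_pct >= low:
        return "low: no new shards placed on this node"
    return "ok"

print(watermark_state(92.5))  # high watermark crossed
print(watermark_state(50.0))  # ok
```

A node hovering just below flood stage is the case to catch early: once 95% is crossed, writes to affected indices fail until space is freed and the read-only block is cleared.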
Adjusting thresholds:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d '{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%"
}
}'
To remove the read-only block after freeing space:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index.blocks.read_only_allow_delete": null}'JVM Out of Memory (OOM)
Symptoms: node crashes or becomes unresponsive, logs contain java.lang.OutOfMemoryError.
Resolution:
- Increase JVM heap in /etc/wazuh-indexer/jvm.options (no more than 50% of RAM and no more than 32 GB).
- Check heap consumption:
curl -sk -u admin:password \
"https://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max"Reduce the number of shards or indices (consider an ISM policy to delete old data).
- Add data nodes to distribute the workload.
Unassigned Shards
Diagnosing the cause:
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty" \
-H "Content-Type: application/json" \
-d '{"index":"wazuh-alerts-2024.01.01","shard":0,"primary":true}'Common reasons:
- NODE_LEFT - node left the cluster (restore the node or wait for it to rejoin)
- ALLOCATION_FAILED - allocation error (check node logs)
- INDEX_CREATED - index was just created, shards are being allocated
- CLUSTER_RECOVERED - cluster is recovering after a restart
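A triage script can map these reasons to the remediation steps described in this section. The helper below is hypothetical, built only from the reasons and actions listed above:

```python
# Hypothetical mapping of allocation-explain reasons to suggested actions,
# taken from the troubleshooting guidance in this section.
TRIAGE = {
    "NODE_LEFT": "restore the node or wait for it to rejoin",
    "ALLOCATION_FAILED": "check node logs, then retry with "
                         "POST /_cluster/reroute?retry_failed=true",
    "INDEX_CREATED": "normal: shards are being allocated",
    "CLUSTER_RECOVERED": "normal: cluster is recovering after a restart",
}

def triage(reason: str) -> str:
    """Suggest a next step for an unassigned-shard reason."""
    return TRIAGE.get(reason, "run _cluster/allocation/explain for details")

print(triage("NODE_LEFT"))
```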
For a general infrastructure overview, see the Wazuh infrastructure section. Data flows into the indexer from the server cluster, and management is performed through the REST API.