Wazuh Indexer Cluster - OpenSearch and Data Storage
The Wazuh indexer cluster is built on OpenSearch and provides storage, indexing, and full-text search of security events. The indexer receives data from Wazuh servers via Filebeat, organizes it into indices with configurable shards and replicas, and exposes an API for search and data lifecycle management.
Cluster Architecture
Node Roles
Each node in an OpenSearch cluster performs one or more roles. Proper role distribution is critical for performance and fault tolerance.
Cluster-manager
A node with the cluster_manager role manages cluster state:
- Creating and deleting indices
- Distributing shards across nodes
- Tracking node health
- Managing index templates and ISM policies
In production clusters, dedicate 3 nodes with the cluster_manager role to maintain quorum. These nodes should not store data (do not assign the data role to them).
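The three-node recommendation follows from majority-based leader election. As an illustrative sketch (not Wazuh or OpenSearch code), the quorum arithmetic looks like this:

```python
def quorum(manager_eligible: int) -> int:
    """Votes needed to elect a cluster manager: a strict majority
    of manager-eligible nodes."""
    return manager_eligible // 2 + 1

# 3 manager-eligible nodes -> quorum of 2, so the cluster tolerates 1 failure.
# With only 2 such nodes, quorum is also 2, so losing either node halts elections.
print(quorum(3))  # 2
print(quorum(5))  # 3
```

This is why dedicated cluster-manager nodes come in odd counts: an even count raises the quorum without improving fault tolerance.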
# opensearch.yml - dedicated cluster-manager node
node.name: indexer-manager-01
node.roles:
- cluster_manager
Data node
Nodes with the data role store index shards and execute search and aggregation operations:
- Storing primary shards and replicas
- Executing search queries
- Running aggregations
- Indexing new documents
Data nodes consume the most resources (CPU, RAM, disk).
# opensearch.yml - dedicated data node
node.name: indexer-data-01
node.roles:
- data
Ingest node
Nodes with the ingest role perform pre-processing on documents before indexing:
- Field transformations
- Data enrichment
- Format conversions
In a typical Wazuh deployment, the ingest role is combined with the data role since Filebeat delivers data in a ready-to-index format.
Coordinating node
A node with node.roles set to an empty list acts as a coordinating-only node:
- Routing search requests to the appropriate data nodes
- Merging results from multiple shards
- Reducing load on data nodes during heavy concurrent query traffic
# opensearch.yml - coordinating-only node
node.name: indexer-coord-01
node.roles: []
Typical Cluster Configurations
Minimal cluster (3 nodes)
Node 1: cluster_manager + data
Node 2: cluster_manager + data
Node 3: cluster_manager + data
Production cluster (5+ nodes)
Nodes 1-3: cluster_manager (dedicated, no data)
Nodes 4-6: data
Node 7: coordinating (optional)
Discovery and Cluster Formation
Discovery Configuration
Cluster nodes find each other through the discovery mechanism. Configuration is set in opensearch.yml:
# Seed node list for initial discovery
discovery.seed_hosts:
- 192.168.1.20
- 192.168.1.21
- 192.168.1.22
# Nodes participating in initial cluster bootstrap
cluster.initial_cluster_manager_nodes:
- indexer-01
- indexer-02
- indexer-03
discovery.seed_hosts contains a list of IP addresses or FQDNs of nodes that a new node contacts to join the cluster. Include all nodes with the cluster_manager role.
cluster.initial_cluster_manager_nodes is used only during the first cluster startup (bootstrap). After successful cluster formation, this parameter can be removed. It contains node names (node.name), not IP addresses.
Complete opensearch.yml Example
cluster.name: wazuh-cluster
node.name: indexer-01
node.roles:
- cluster_manager
- data
network.host: 0.0.0.0
http.port: 9200
transport.port: 9300
discovery.seed_hosts:
- 192.168.1.20
- 192.168.1.21
- 192.168.1.22
cluster.initial_cluster_manager_nodes:
- indexer-01
- indexer-02
- indexer-03
plugins.security.ssl.transport.enabled: true
plugins.security.ssl.http.enabled: true
plugins.security.ssl.transport.pemcert_filepath: /etc/wazuh-indexer/certs/indexer.pem
plugins.security.ssl.transport.pemkey_filepath: /etc/wazuh-indexer/certs/indexer-key.pem
plugins.security.ssl.transport.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
plugins.security.ssl.http.pemcert_filepath: /etc/wazuh-indexer/certs/indexer.pem
plugins.security.ssl.http.pemkey_filepath: /etc/wazuh-indexer/certs/indexer-key.pem
plugins.security.ssl.http.pemtrustedcas_filepath: /etc/wazuh-indexer/certs/root-ca.pem
compatibility.override_main_response_version: true
Shard Allocation Strategy
Sharding Principles
Shards are the fundamental units of data storage and distribution in OpenSearch. Each index consists of one or more primary shards, each of which can have replicas.
Shard count recommendations:
- As a starting point, set the number of primary shards equal to the number of data nodes so shards distribute evenly across the cluster
- Optimal single shard size: 10-50 GB
- Too many small shards create excessive overhead on the cluster-manager
- Too few large shards limit search parallelism
The number of shards cannot be changed after index creation - modification requires reindexing.
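The sizing guidance above can be combined into a rough starting-point estimate. The helper below is a hypothetical illustration, not part of Wazuh or OpenSearch:

```python
def suggest_primary_shards(index_size_gb: float, data_nodes: int,
                           target_shard_gb: float = 30.0) -> int:
    """Estimate a primary shard count for an index of the given size.

    Illustrative heuristic only: aim for shards in the 10-50 GB range
    (30 GB target here) and no more primary shards than data nodes,
    per the recommendations above.
    """
    by_size = max(1, round(index_size_gb / target_shard_gb))
    return max(1, min(by_size, data_nodes))

# A ~90 GB daily index on a 3-data-node cluster -> 3 primary shards
print(suggest_primary_shards(90, 3))  # 3
# A small 5 GB index should not be over-sharded
print(suggest_primary_shards(5, 6))   # 1
```

Because the shard count is fixed at index creation, running an estimate like this before defining the index template avoids a later reindex.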
Replica Configuration
Replicas provide fault tolerance and increase read throughput:
- For single-node clusters: number_of_replicas: 0
- For clusters with 3+ nodes: number_of_replicas: 1
- For mission-critical data: number_of_replicas: 2
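These rules can be sketched as a small decision helper (hypothetical, for illustration only):

```python
def suggest_replicas(data_nodes: int, mission_critical: bool = False) -> int:
    """Pick a replica count following the guidance above.

    Illustrative only: a replica is useless on a single node (it can never
    be allocated), and extra replicas need enough nodes to live on.
    """
    if data_nodes == 1:
        return 0  # nowhere to place a replica; it would stay unassigned
    if mission_critical and data_nodes >= 3:
        return 2
    return 1

print(suggest_replicas(1))                          # 0
print(suggest_replicas(3))                          # 1
print(suggest_replicas(5, mission_critical=True))   # 2
```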
Replica count can be changed at any time without reindexing:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"number_of_replicas": 1}}'Index Template Configuration
Wazuh uses index templates to automatically apply settings to new indices. Configuration is performed through the indexer API:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_template/wazuh-custom" \
-H "Content-Type: application/json" \
-d '{
"order": 1,
"index_patterns": ["wazuh-alerts-*"],
"settings": {
"index.number_of_shards": 3,
"index.number_of_replicas": 1,
"index.refresh_interval": "5s"
}
}'
The order: 1 parameter ensures the custom template is applied on top of the default Filebeat template.
Wazuh Index Templates
Wazuh creates several index types for different data categories:
| Index Pattern | Description | Data Volume |
|---|---|---|
| wazuh-alerts-* | Security alerts (rule match results) | Primary data stream |
| wazuh-archives-* | All events (including those that match no rule) | Very large (when enabled) |
| wazuh-statistics-* | Manager performance statistics | Small |
| wazuh-monitoring-* | Agent monitoring status data | Medium |
The wazuh-archives-* indices are disabled by default. Enabling them significantly increases disk space consumption and requires appropriate storage scaling.
Index Lifecycle Management (ISM)
ISM Policy Overview
Index State Management (ISM) automates index management based on age, size, or document count. A typical lifecycle includes:
- Hot - active reads and writes, maximum performance
- Warm - read-only, reduced resources
- Cold - archive storage, minimal resources
- Delete - automatic removal
Creating an ISM Policy
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_plugins/_ism/policies/wazuh-alerts-policy" \
-H "Content-Type: application/json" \
-d '{
"policy": {
"description": "Wazuh alerts lifecycle policy",
"default_state": "hot",
"states": [
{
"name": "hot",
"actions": [
{
"rollover": {
"min_size": "25gb",
"min_index_age": "1d"
}
}
],
"transitions": [
{
"state_name": "warm",
"conditions": {
"min_index_age": "7d"
}
}
]
},
{
"name": "warm",
"actions": [
{
"replica_count": {
"number_of_replicas": 0
}
},
{
"force_merge": {
"max_num_segments": 1
}
}
],
"transitions": [
{
"state_name": "cold",
"conditions": {
"min_index_age": "30d"
}
}
]
},
{
"name": "cold",
"actions": [
{
"read_only": {}
}
],
"transitions": [
{
"state_name": "delete",
"conditions": {
"min_index_age": "90d"
}
}
]
},
{
"name": "delete",
"actions": [
{
"delete": {}
}
],
"transitions": []
}
],
"ism_template": [
{
"index_patterns": ["wazuh-alerts-*"],
"priority": 100
}
]
}
}'
Rollover - Index Rotation
Rollover creates a new index when specified conditions are met:
- min_size - triggers rollover once the index reaches this size (e.g., 25gb)
- min_index_age - triggers rollover once the index reaches this age (e.g., 1d)
- min_doc_count - triggers rollover once the index reaches this document count
Conditions are combined using OR logic - rotation occurs when any condition is met.
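The OR semantics can be made concrete with a short sketch (a hypothetical helper mirroring the policy fields above, not ISM internals):

```python
def should_rollover(size_gb, age_days, doc_count,
                    min_size_gb=25.0, min_index_age_days=1.0,
                    min_doc_count=None):
    """Rollover conditions combine with OR: any single satisfied
    condition triggers rotation."""
    checks = [size_gb >= min_size_gb, age_days >= min_index_age_days]
    if min_doc_count is not None:
        checks.append(doc_count >= min_doc_count)
    return any(checks)

# Index hit 30 GB after half a day: size condition alone triggers rollover
print(should_rollover(size_gb=30, age_days=0.5, doc_count=0))  # True
# Small and young: no condition met yet
print(should_rollover(size_gb=5, age_days=0.5, doc_count=0))   # False
```

This is why the hot-state policy above pairs min_size with min_index_age: quiet days still roll over daily, while bursty days roll over early on size.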
Performance Tuning
JVM Heap Size
JVM heap size is the most critical indexer performance parameter.
Guidelines:
- Set -Xms and -Xmx to the same value to prevent runtime heap resizing
- Allocate no more than 50% of system RAM (the remaining memory is used for OS file caches)
- Maximum value: 32 GB (exceeding this disables Compressed OOPs, reducing memory efficiency)
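The two constraints reduce to a one-line rule, sketched here as an illustrative helper (not an official sizing tool):

```python
def recommended_heap_gb(system_ram_gb: float) -> int:
    """Heap sizing per the guidelines above: half of system RAM,
    capped at 32 GB so Compressed OOPs stay enabled."""
    return int(min(system_ram_gb / 2, 32))

for ram in (8, 16, 32, 64, 128):
    print(f"{ram} GB RAM -> {recommended_heap_gb(ram)} GB heap")
```

Note the cap: on a 128 GB host the heap stays at 32 GB, and the surplus RAM still benefits the indexer through OS file caching.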
Configuration in /etc/wazuh-indexer/jvm.options:
-Xms4g
-Xmx4g
Sizing recommendations:
| System RAM | JVM Heap | Notes |
|---|---|---|
| 8 GB | 4 GB | Minimum for production |
| 16 GB | 8 GB | Medium workload |
| 32 GB | 16 GB | Heavy workload |
| 64 GB | 32 GB | Maximum value |
Memory Locking
To prevent JVM from being swapped, enable memory locking:
# opensearch.yml
bootstrap.memory_lock: true
In systemd configuration:
# /etc/systemd/system/wazuh-indexer.service.d/override.conf
[Service]
LimitMEMLOCK=infinity
Refresh Interval
The refresh_interval parameter controls how frequently data becomes searchable. The default value is 1 second. Increasing the interval improves write performance:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"refresh_interval": "5s"}}'For bulk data loading, you can temporarily disable refresh:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"refresh_interval": "-1"}}'Merge Settings
Reducing segment count through force merge improves search performance on immutable indices:
curl -sk -u admin:password \
-X POST "https://localhost:9200/wazuh-alerts-2024.01.01/_forcemerge?max_num_segments=1"Apply force merge only to indices that are no longer receiving writes (warm/cold state).
Shards per Node
Limit the maximum number of shards per node to prevent degradation:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d '{"persistent": {"cluster.max_shards_per_node": 1000}}'Monitoring Cluster Health
Essential Monitoring Queries
# Cluster status (green/yellow/red)
curl -sk -u admin:password \
"https://localhost:9200/_cluster/health?pretty"
# Node list with resource usage
curl -sk -u admin:password \
"https://localhost:9200/_cat/nodes?v&h=name,role,heap.percent,disk.used_percent,cpu"
# Index list with sizes
curl -sk -u admin:password \
"https://localhost:9200/_cat/indices/wazuh-*?v&h=index,health,status,pri,rep,docs.count,store.size&s=index:desc"
# Shard distribution
curl -sk -u admin:password \
"https://localhost:9200/_cat/shards/wazuh-*?v&h=index,shard,prirep,state,docs,store,node"
# Unassigned shards
curl -sk -u admin:password \
"https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state:desc"
# Disk usage
curl -sk -u admin:password \
"https://localhost:9200/_cat/allocation?v"Interpreting Cluster Status
| Status | Description | Action |
|---|---|---|
| Green | All primary and replica shards are allocated | No action needed |
| Yellow | All primary shards are allocated, some replicas are not | Check node count and replica settings |
| Red | Some primary shards are unallocated | Immediate investigation and recovery required |
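For automated monitoring, the table above can be applied to the response of GET /_cluster/health. The JSON below is a hypothetical response body (field names match the API); the mapping of status to action follows the table:

```python
import json

# Hypothetical /_cluster/health response, truncated to relevant fields
health = json.loads(
    '{"cluster_name": "wazuh-cluster", "status": "yellow",'
    ' "unassigned_shards": 2, "active_shards_percent_as_number": 85.7}'
)

# Status -> operator action, per the interpretation table above
actions = {
    "green": "no action needed",
    "yellow": "check node count and replica settings",
    "red": "investigate and recover primary shards immediately",
}

print(f"{health['cluster_name']}: {health['status']} "
      f"-> {actions[health['status']]}")
```

In practice a monitoring agent would fetch this JSON over HTTPS with the same credentials as the curl examples above and alert on any non-green status.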
Backup and Recovery (Snapshot/Restore)
Repository Registration
Filesystem (NFS):
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups" \
-H "Content-Type: application/json" \
-d '{
"type": "fs",
"settings": {
"location": "/mnt/snapshots/wazuh",
"compress": true
}
}'
Amazon S3:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-s3-backups" \
-H "Content-Type: application/json" \
-d '{
"type": "s3",
"settings": {
"bucket": "wazuh-snapshots",
"region": "us-east-1",
"base_path": "indexer-snapshots",
"compress": true
}
}'
The filesystem repository path must be specified in opensearch.yml:
path.repo: ["/mnt/snapshots/wazuh"]Creating a Snapshot
# Full snapshot
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups/snapshot-$(date +%Y%m%d)?wait_for_completion=false" \
-H "Content-Type: application/json" \
-d '{
"indices": "wazuh-alerts-*,wazuh-archives-*",
"ignore_unavailable": true,
"include_global_state": false
}'
# Check snapshot status
curl -sk -u admin:password \
"https://localhost:9200/_snapshot/wazuh-backups/_status?pretty"
# List all snapshots
curl -sk -u admin:password \
"https://localhost:9200/_snapshot/wazuh-backups/_all?pretty"Restoring from a Snapshot
curl -sk -u admin:password \
-X POST "https://localhost:9200/_snapshot/wazuh-backups/snapshot-20240101/_restore" \
-H "Content-Type: application/json" \
-d '{
"indices": "wazuh-alerts-2024.01.01",
"ignore_unavailable": true,
"include_global_state": false,
"rename_pattern": "(.+)",
"rename_replacement": "restored-$1"
}'
Automating Snapshots
To automate snapshot creation, add a cron job:
# /etc/cron.d/wazuh-snapshot
0 2 * * * root curl -sk -u admin:password \
-X PUT "https://localhost:9200/_snapshot/wazuh-backups/snapshot-$(date +\%Y\%m\%d)" \
-H "Content-Type: application/json" \
-d '{"indices":"wazuh-alerts-*","ignore_unavailable":true,"include_global_state":false}'Comparison with Other Platforms
Elasticsearch Cluster
| Feature | Wazuh Indexer (OpenSearch) | Elasticsearch |
|---|---|---|
| Foundation | OpenSearch 2.x (Elasticsearch 7.10 fork) | Elasticsearch 8.x |
| License | Apache 2.0 | Elastic License / SSPL |
| Security | OpenSearch Security (built-in) | X-Pack Security (formerly paid) |
| ISM | Index State Management | Index Lifecycle Management (ILM) |
| ML | ML Commons plugin | ML features (X-Pack) |
| Cost | Free | Free (Basic) / Commercial |
Splunk Indexer Cluster
| Feature | Wazuh Indexer | Splunk Indexer |
|---|---|---|
| Storage format | JSON documents in Lucene indices | Proprietary (tsidx + journal) |
| Sharding | Configurable shards and replicas | Replication via search factor / replication factor |
| Lifecycle | ISM policies (hot/warm/cold/delete) | SmartStore, frozen tier |
| Search | DSL queries, PPL | SPL (Search Processing Language) |
| Scaling | Horizontal (add data nodes) | Horizontal (indexer peers) |
| Cost | Free | By data volume |
Troubleshooting
Cluster Status Yellow
Cause: replica shards are unallocated. Typical for single-node clusters or when the replica count exceeds the number of data nodes minus one.
Resolution:
# Check unallocated shards
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty"
# Set replicas to 0 for single-node cluster
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index": {"number_of_replicas": 0}}'Cluster Status Red
Cause: primary shards are unallocated. Possible reasons include insufficient disk space, node failure, or data corruption.
Resolution:
- Check the reason for non-allocation:
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty"- Check disk space:
curl -sk -u admin:password \
"https://localhost:9200/_cat/allocation?v"- If necessary, force shard reallocation:
curl -sk -u admin:password \
-X POST "https://localhost:9200/_cluster/reroute?retry_failed=true"Disk Watermarks
OpenSearch stops writing when disk usage thresholds are reached:
| Threshold | Default Value | Behavior |
|---|---|---|
| Low watermark | 85% | New shards are not placed on the node |
| High watermark | 90% | Shards are relocated away from the node |
| Flood stage | 95% | Indices are switched to read-only mode |
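The threshold tiers above can be expressed as a simple check, useful in monitoring scripts. This is an illustrative sketch using the default thresholds, not OpenSearch's internal allocator logic:

```python
def watermark_state(disk_used_pct: float,
                    low: float = 85.0, high: float = 90.0,
                    flood: float = 95.0) -> str:
    """Report which disk watermark tier a node's usage has crossed
    (defaults match the table above)."""
    if disk_used_pct >= flood:
        return "flood_stage: indices forced read-only"
    if disk_used_pct >= high:
        return "high: shards relocated away from this node"
    if disk_used_pct >= low:
        return "low: no new shards placed on this node"
    return "ok"

print(watermark_state(92.5))  # high watermark crossed
print(watermark_state(50.0))  # ok
```

A node hovering just below flood stage is the case to catch early: once 95% is crossed, writes to affected indices fail until space is freed and the read-only block is cleared.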
Adjusting thresholds:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/_cluster/settings" \
-H "Content-Type: application/json" \
-d '{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "85%",
"cluster.routing.allocation.disk.watermark.high": "90%",
"cluster.routing.allocation.disk.watermark.flood_stage": "95%"
}
}'
To remove the read-only block after freeing space:
curl -sk -u admin:password \
-X PUT "https://localhost:9200/wazuh-alerts-*/_settings" \
-H "Content-Type: application/json" \
-d '{"index.blocks.read_only_allow_delete": null}'JVM Out of Memory (OOM)
Symptoms: node crashes or becomes unresponsive, logs contain java.lang.OutOfMemoryError.
Resolution:
- Increase JVM heap in /etc/wazuh-indexer/jvm.options (no more than 50% of RAM and no more than 32 GB).
- Check heap consumption:
curl -sk -u admin:password \
"https://localhost:9200/_cat/nodes?v&h=name,heap.percent,heap.max"Reduce the number of shards or indices (consider an ISM policy to delete old data).
- Add data nodes to distribute the workload.
Unassigned Shards
Diagnosing the cause:
curl -sk -u admin:password \
"https://localhost:9200/_cluster/allocation/explain?pretty" \
-H "Content-Type: application/json" \
-d '{"index":"wazuh-alerts-2024.01.01","shard":0,"primary":true}'Common reasons:
- NODE_LEFT - node left the cluster (restore the node or wait for it to rejoin)
- ALLOCATION_FAILED - allocation error (check node logs)
- INDEX_CREATED - index was just created, shards are being allocated
- CLUSTER_RECOVERED - cluster is recovering after a restart
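A triage script can map these reasons to the remediation steps described in this section. The helper below is hypothetical, built only from the reasons and actions listed above:

```python
# Hypothetical mapping of allocation-explain reasons to suggested actions,
# taken from the troubleshooting guidance in this section.
TRIAGE = {
    "NODE_LEFT": "restore the node or wait for it to rejoin",
    "ALLOCATION_FAILED": "check node logs, then retry with "
                         "POST /_cluster/reroute?retry_failed=true",
    "INDEX_CREATED": "normal: shards are being allocated",
    "CLUSTER_RECOVERED": "normal: cluster is recovering after a restart",
}

def triage(reason: str) -> str:
    """Suggest a next step for an unassigned-shard reason."""
    return TRIAGE.get(reason, "run _cluster/allocation/explain for details")

print(triage("NODE_LEFT"))
```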
For a general infrastructure overview, see the Wazuh infrastructure section. Data flows into the indexer from the server cluster, and management is performed through the REST API.