Wazuh Server Cluster - Architecture and Configuration
The Wazuh server cluster connects multiple managers to provide horizontal scaling and fault tolerance. The master/worker architecture distributes event analysis workloads across nodes while maintaining a unified configuration of rules, decoders, and security policies.
Cluster Architecture
Node Roles
The Wazuh cluster operates with two distinct node types:
Master (primary node) - the central node that performs the following functions:
- Stores the authoritative configuration (rules, decoders, CDB lists)
- Distributes configuration changes to all worker nodes
- Manages agent registration and group assignments
- Accepts connections from worker nodes for synchronization
- Processes cluster-related REST API requests
Worker (processing node) - a node that receives and analyzes agent events:
- Receives events from assigned agents over port 1514/TCP
- Performs log decoding and rule matching
- Forwards analysis results to the indexer via Filebeat
- Receives configuration updates from the master node
- Sends agent status information back to the master
Only one master node is allowed per cluster. The number of worker nodes is limited only by available infrastructure resources.
Inter-Node Communication
Cluster nodes exchange data over port 1516/TCP with encryption. The master node listens for incoming connections from worker nodes. Each worker maintains a persistent connection to the master for receiving configuration updates and transmitting status information.
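Connectivity on this port can be verified from a prospective worker before it joins the cluster. A minimal Python sketch (the master address used here is the example IP from this document's configurations):

```python
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# From a worker, confirm the master's cluster port is open before joining:
# port_reachable("192.168.1.10", 1516)
```

The same check can be done with `nc -zv`, as shown in the Troubleshooting section.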
               ┌─────────────────┐
               │   Master Node   │
               │  (manager_01)   │
               │    Port 1516    │
               └────┬───────┬────┘
          1516/TCP  │       │  1516/TCP
         ┌──────────┘       └──────────┐
         ▼                             ▼
┌─────────────────┐           ┌─────────────────┐
│  Worker Node 1  │           │  Worker Node 2  │
│  (manager_02)   │           │  (manager_03)   │
└────┬───────┬────┘           └────┬───────┬────┘
     │       │                     │       │
 1514/TCP 1514/TCP             1514/TCP 1514/TCP
     │       │                     │       │
  ┌──┘       └──┐               ┌──┘       └──┐
  ▼             ▼               ▼             ▼
Agent 1      Agent 2        Agent 3       Agent 4

Cluster Configuration in ossec.conf
Cluster settings are defined in /var/ossec/etc/ossec.conf within the <cluster> block. This configuration must be present on every cluster node.
Complete Configuration Block
<cluster>
  <name>wazuh</name>
  <node_name>manager_01</node_name>
  <node_type>master</node_type>
  <key>ugdtAnd7Pi9myP7CVts4qZaZQEQcRYZa</key>
  <port>1516</port>
  <bind_addr>0.0.0.0</bind_addr>
  <nodes>
    <node>192.168.1.10</node>
  </nodes>
  <hidden>no</hidden>
  <disabled>no</disabled>
</cluster>
Parameter Reference
| Parameter | Description | Default | Allowed Values |
|---|---|---|---|
| name | Cluster name. Must be identical on all nodes | wazuh | Alphanumeric string |
| node_name | Unique identifier for this node within the cluster | node01 | Any unique string |
| node_type | Role of this node in the cluster | master | master, worker |
| key | 32-character encryption key for inter-node communication. Must be identical on all nodes | - | 32 alphanumeric characters |
| port | TCP port for cluster communication | 1516 | 1-65535 |
| bind_addr | IP address on which the node listens for cluster connections | 0.0.0.0 | Valid IPv4/IPv6 address |
| nodes | IP address or DNS name of the master node | - | IP addresses or FQDNs |
| hidden | Suppresses cluster source information in alerts | no | yes, no |
| disabled | Disables cluster functionality | no | yes, no |
Generating the Cluster Key
The key must contain exactly 32 alphanumeric characters. Generate one with:
openssl rand -hex 16
This produces 32 hexadecimal characters suitable for the key parameter.
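The same key can be generated and validated programmatically. A small sketch using the Python standard library (`secrets.token_hex(16)` is equivalent to the openssl command above):

```python
import re
import secrets

def generate_cluster_key() -> str:
    """Generate a 32-character hex key, like `openssl rand -hex 16`."""
    return secrets.token_hex(16)  # 16 random bytes -> 32 hex characters

def is_valid_cluster_key(key: str) -> bool:
    """The <key> value must be exactly 32 alphanumeric characters."""
    return re.fullmatch(r"[A-Za-z0-9]{32}", key) is not None

key = generate_cluster_key()
print(key, is_valid_cluster_key(key))
```

Validating the key before deployment avoids a hard-to-diagnose join failure caused by a truncated or malformed value.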
Master Node Configuration
<cluster>
  <name>production-cluster</name>
  <node_name>master-node</node_name>
  <node_type>master</node_type>
  <key>a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6</key>
  <port>1516</port>
  <bind_addr>0.0.0.0</bind_addr>
  <nodes>
    <node>192.168.1.10</node>
  </nodes>
  <hidden>no</hidden>
  <disabled>no</disabled>
</cluster>
Worker Node Configuration
On worker nodes, change node_name and node_type. The <nodes> block must point to the master node IP address:
<cluster>
  <name>production-cluster</name>
  <node_name>worker-01</node_name>
  <node_type>worker</node_type>
  <key>a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6</key>
  <port>1516</port>
  <bind_addr>0.0.0.0</bind_addr>
  <nodes>
    <node>192.168.1.10</node>
  </nodes>
  <hidden>no</hidden>
  <disabled>no</disabled>
</cluster>
Restart the manager after modifying the configuration:
systemctl restart wazuh-manager
Node Synchronization
The master node automatically distributes specific files and data to all worker nodes. Synchronization ensures uniform event analysis configuration across the entire cluster.
Synchronized Data
Master to worker:
| Category | Files/Data | Description |
|---|---|---|
| Rules | /var/ossec/etc/rules/local_rules.xml and custom rules | Threat detection rules |
| Decoders | /var/ossec/etc/decoders/local_decoder.xml and custom decoders | Log parsers |
| CDB lists | /var/ossec/etc/lists/ | Lookup lists for additional classification |
| Group configuration | /var/ossec/etc/shared/ | agent.conf configurations per agent group |
| Custom files | Files added to rules and decoders directories | All contents of etc/rules and etc/decoders |
Worker to master:
| Category | Data | Description |
|---|---|---|
| Agent information | Status, version, OS of agents | Details about agents connected to the worker |
| Agent status | Last connection time, keepalive | Real-time agent state information |
| Agent groups | Agent group membership | Synchronization of group assignments |
Synchronization Mechanism
Synchronization follows a push model from the master node. When files change on the master, checksums are computed and modified files are transmitted to all workers. Worker nodes cannot initiate configuration distribution - rule and decoder changes must be made exclusively on the master node.
The default synchronization check interval is 10 seconds. During each interval, the master compares file checksums and determines whether updates are needed.
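This checksum comparison can be illustrated with a simplified model. The sketch below is not Wazuh's actual implementation, only the push-model decision logic in miniature (file names and contents are illustrative):

```python
import hashlib

def checksums(files: dict[str, bytes]) -> dict[str, str]:
    """Map each file path to the hex digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}

def files_to_push(master: dict[str, bytes], worker: dict[str, bytes]) -> list[str]:
    """Paths whose checksum differs on the worker (or is missing there)."""
    m, w = checksums(master), checksums(worker)
    return sorted(path for path, digest in m.items() if w.get(path) != digest)

master = {"etc/rules/local_rules.xml": b"<group>...</group>",
          "etc/lists/blocked-ips": b"10.0.0.1:bad"}
worker = {"etc/rules/local_rules.xml": b"<group>old</group>"}
print(files_to_push(master, worker))
# -> ['etc/lists/blocked-ips', 'etc/rules/local_rules.xml']
```

Because only files whose digests differ are transmitted, an unchanged rule set costs little beyond the periodic checksum comparison.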
Adding and Removing Nodes
Adding a Worker Node
Install the Wazuh manager on the new server.
Configure the <cluster> block in /var/ossec/etc/ossec.conf:
<cluster>
  <name>production-cluster</name>
  <node_name>worker-02</node_name>
  <node_type>worker</node_type>
  <key>a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6</key>
  <port>1516</port>
  <bind_addr>0.0.0.0</bind_addr>
  <nodes>
    <node>192.168.1.10</node>
  </nodes>
  <hidden>no</hidden>
  <disabled>no</disabled>
</cluster>
Ensure port 1516/TCP is open between the new node and the master.
Start the manager:
systemctl start wazuh-manager
Verify the node has joined:
/var/ossec/bin/cluster_control -l
Removing a Worker Node
Reassign agents from the node being removed to other nodes.
Stop the manager on the node being removed:
systemctl stop wazuh-manager
Confirm the node is no longer listed:
/var/ossec/bin/cluster_control -l
The master node automatically detects worker disconnections and updates the active node list.
Agent Load Balancing Across Workers
Agent Assignment
Agents can connect to any cluster node (master or worker). For load balancing, use an external load balancer (HAProxy, Nginx, AWS NLB) to distribute incoming agent connections across worker nodes.
DNS Round-Robin Configuration
The simplest balancing approach uses a DNS record with multiple A entries:
wazuh-cluster.example.com A 192.168.1.11 ; worker-01
wazuh-cluster.example.com A 192.168.1.12 ; worker-02
wazuh-cluster.example.com A 192.168.1.13 ; worker-03
In the agent configuration (ossec.conf), specify the DNS name:
<client>
  <server>
    <address>wazuh-cluster.example.com</address>
    <port>1514</port>
    <protocol>tcp</protocol>
  </server>
</client>
HAProxy Configuration
For more controlled balancing, use HAProxy:
frontend wazuh_agents
    bind *:1514
    mode tcp
    default_backend wazuh_workers

backend wazuh_workers
    mode tcp
    balance roundrobin
    server worker-01 192.168.1.11:1514 check
    server worker-02 192.168.1.12:1514 check
    server worker-03 192.168.1.13:1514 check
Distribution Recommendations
- Direct agents primarily to worker nodes rather than the master
- The master node handles cluster synchronization and API requests, so assigning a large number of agents to it degrades performance
- Monitor even distribution across workers using cluster_control
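Distribution can also be tallied programmatically from an /agents API response (the REST API endpoints are covered in the Monitoring via REST API section). A sketch, where the embedded sample mimics the API's affected_items response shape and uses illustrative agent IDs and node names:

```python
import json
from collections import Counter

# Sample response shaped like GET /agents?select=node_name
# (agent IDs and node names below are illustrative)
response = json.loads("""
{"data": {"affected_items": [
    {"id": "001", "node_name": "worker-01"},
    {"id": "002", "node_name": "worker-01"},
    {"id": "003", "node_name": "worker-02"},
    {"id": "004", "node_name": "master-node"}
], "total_affected_items": 4}}
""")

def agents_per_node(api_response: dict) -> Counter:
    """Count connected agents per cluster node from an /agents response."""
    return Counter(item["node_name"] for item in api_response["data"]["affected_items"])

print(agents_per_node(response))
```

A large skew between workers, or a high count on the master, is a signal to adjust the load balancer configuration.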
The cluster_control Utility
The /var/ossec/bin/cluster_control utility provides a command-line interface for monitoring and managing the cluster.
Viewing Cluster Status
# List all cluster nodes
/var/ossec/bin/cluster_control -l
Example output:
NAME         TYPE    VERSION  ADDRESS
master-node  master  4.14.3   192.168.1.10
worker-01    worker  4.14.3   192.168.1.11
worker-02    worker  4.14.3   192.168.1.12
Viewing Agent Distribution
# Show agent count per node
/var/ossec/bin/cluster_control -a
Example output:
NAME         AGENTS
master-node  15
worker-01    245
worker-02    240
Additional Commands
# Detailed information about a specific node
/var/ossec/bin/cluster_control -i worker-01
# Check synchronization status
/var/ossec/bin/cluster_control -l -fn
# Display help
/var/ossec/bin/cluster_control -h
Monitoring via REST API
Cluster status is also available through the REST API:
# Obtain JWT token
TOKEN=$(curl -sk -u wazuh-wui:password \
-X POST "https://localhost:55000/security/user/authenticate?raw=true")
# Cluster node information
curl -sk -H "Authorization: Bearer $TOKEN" \
"https://localhost:55000/cluster/nodes?pretty=true"
# Cluster status
curl -sk -H "Authorization: Bearer $TOKEN" \
"https://localhost:55000/cluster/status?pretty=true"
# Agent distribution across nodes
curl -sk -H "Authorization: Bearer $TOKEN" \
"https://localhost:55000/agents?pretty=true&select=node_name&limit=500"Performance Tuning
Agent Count Recommendations
The number of agents a single node can handle depends on hardware resources and event generation intensity (EPS - Events Per Second).
| Node Resources (CPU/RAM) | Recommended Agents | Approximate EPS |
|---|---|---|
| 2 CPU / 4 GB | Up to 50 | Up to 100 |
| 4 CPU / 8 GB | Up to 200 | Up to 500 |
| 8 CPU / 16 GB | Up to 1000 | Up to 2500 |
| 16 CPU / 32 GB | Up to 5000 | Up to 10000 |
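The table can be encoded as a small lookup for sanity-checking capacity plans. A sketch whose thresholds come directly from the table above, treated as guidance rather than hard limits:

```python
# Sizing tiers from the table above: (max_agents, max_eps, node spec)
TIERS = [
    (50, 100, "2 CPU / 4 GB"),
    (200, 500, "4 CPU / 8 GB"),
    (1000, 2500, "8 CPU / 16 GB"),
    (5000, 10000, "16 CPU / 32 GB"),
]

def recommended_node(agents: int, eps: int) -> str:
    """Smallest tier from the table covering both agent count and EPS."""
    for max_agents, max_eps, spec in TIERS:
        if agents <= max_agents and eps <= max_eps:
            return spec
    raise ValueError("Beyond single-node guidance; add worker nodes")

print(recommended_node(150, 400))  # -> 4 CPU / 8 GB
```

Note that both dimensions matter: a small fleet of very chatty agents can exceed the EPS limit of a tier long before it reaches the agent count limit.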
Master Node Optimization
The master node handles synchronization and API requests. To optimize:
- Minimize the number of agents connected directly to the master
- Allocate sufficient resources for synchronization (especially important with a large number of custom rules and decoders)
- Place master and worker nodes in the same network to minimize synchronization latency
Filebeat Configuration
Each cluster node forwards data to the indexer via Filebeat. Ensure Filebeat is configured on every node and points to the indexer cluster:
output.elasticsearch:
  hosts:
    - "192.168.1.20:9200"
    - "192.168.1.21:9200"
    - "192.168.1.22:9200"
  protocol: https
  username: admin
  password: "${INDEXER_PASSWORD}"
  ssl.certificate_authorities:
    - /etc/filebeat/certs/root-ca.pem
Comparison with Other SIEM Platforms
Splunk Indexer Clustering
| Feature | Wazuh Server Cluster | Splunk Indexer Cluster |
|---|---|---|
| Model | Master/Worker | Cluster Manager/Peers |
| Master function | Configuration synchronization | Data replication management |
| Load balancing | External balancer (HAProxy) | Built-in (forwarder affinity) |
| Synchronization | Rules, decoders, lists | Configuration bundles |
| Cost | Free (open source) | Commercial license |
ELK Coordinating Nodes
| Feature | Wazuh Server Cluster | ELK Stack |
|---|---|---|
| Event analysis | On server cluster | At Logstash level |
| Clustering | Built-in | Separate per component |
| Configuration | Single ossec.conf | Separate configs for ES, Logstash, Kibana |
| Detection | Rules and decoders | ElastAlert / Detection Rules (separate project) |
Troubleshooting
Node Fails to Join the Cluster
Symptoms: the worker node does not appear in cluster_control -l.
Checks:
Verify that the key is identical on both master and worker nodes.
Test network connectivity on port 1516/TCP:
nc -zv 192.168.1.10 1516
Confirm the cluster name matches on all nodes.
Review the cluster log:
tail -f /var/ossec/logs/cluster.log
Ensure the disabled parameter is set to no.
Synchronization Issues
Symptoms: rule changes on the master are not applied on worker nodes.
Checks:
Check synchronization status:
/var/ossec/bin/cluster_control -l -fn
Confirm that changes were made on the master node (not on a worker).
Review the cluster log for synchronization errors:
grep -i "error\|sync" /var/ossec/logs/cluster.log | tail -20
Wait for the synchronization interval (default 10 seconds) or restart the manager.
Split-Brain Scenario
Symptoms: worker nodes lose visibility of the master, agents remain connected but alerts are not generated correctly.
Causes:
- Network connectivity loss between nodes
- Master node overload
- Incorrect network interface configuration (bind_addr)
Resolution:
Restore network connectivity between nodes.
Restart worker nodes after connectivity is restored:
systemctl restart wazuh-manager
Verify cluster integrity:
/var/ossec/bin/cluster_control -l
High Resource Consumption During Synchronization
Symptoms: the master node experiences CPU/RAM spikes with a large number of custom rules.
Resolution:
- Reduce the number of custom rules and decoders by consolidating duplicates
- Increase master node resources
- Verify that cyclical synchronization is not occurring (changes should only be made on the master)
For a general overview of infrastructure components, see the Wazuh infrastructure section. Component details are available in Wazuh architecture.