Wazuh 4.14 Decoders - Log Data Extraction
Wazuh decoders extract structured data from raw log messages: IP addresses, usernames, actions, error codes, and other fields. Without correct decoding, rules cannot analyze events, and alerts lack actionable information. This guide covers decoder syntax, parent-child decoder hierarchies, built-in decoders, and the process of building custom decoders.
Event Processing Phases
Every event passes through two decoding phases:
Phase 1: Pre-decoding
Performed automatically on all events. Standard syslog fields are extracted:
- timestamp - event timestamp
- hostname - source hostname
- program_name - program name (from the syslog header)
Example input log:
Mar 5 10:15:01 web-server sshd[12345]: Failed password for root from 192.168.1.100 port 22 ssh2
Pre-decoding result:
timestamp: 'Mar 5 10:15:01'
hostname: 'web-server'
program_name: 'sshd'
The remaining message body (Failed password for root from 192.168.1.100 port 22 ssh2) is passed to the decoding phase.
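The header split can be illustrated with a short Python sketch. It is not Wazuh's implementation: the `SYSLOG_HEADER` pattern and the `predecode` helper are assumptions made for illustration, and they only handle the classic syslog header shown in the example.

```python
import re

# Hypothetical pattern for a classic syslog header:
# "<Mon> <day> <HH:MM:SS> <host> <program>[<pid>]: <message>"
SYSLOG_HEADER = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<program_name>[^\s\[:]+)(?:\[\d+\])?: "
    r"(?P<message>.*)$"
)


def predecode(line):
    """Split a syslog line into header fields and the message body."""
    match = SYSLOG_HEADER.match(line)
    if match is None:
        # No recognizable header: the whole line is treated as the body.
        return {"message": line}
    return match.groupdict()


fields = predecode(
    "Mar 5 10:15:01 web-server sshd[12345]: "
    "Failed password for root from 192.168.1.100 port 22 ssh2"
)
for name, value in fields.items():
    print(f"{name}: {value!r}")
```

The `message` entry corresponds to the body that the decoding phase receives.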
Phase 2: Decoding
In this phase, decoders analyze the message body and extract specific fields. Decoders are applied in definition order: parent decoders first, then child decoders.
XML Decoder Structure
Each decoder is defined by a <decoder> element:
<decoder name="sshd-failed">
<parent>sshd</parent>
<prematch>^Failed password</prematch>
<regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>
<order>srcuser, srcip, srcport</order>
</decoder>
Decoder Elements
Core Elements
name (attribute) - unique decoder name:
<decoder name="my-app">
parent - links to a parent decoder. The child decoder is applied only if the parent matched:
<parent>sshd</parent>
A parent decoder can have many children, but a child decoder cannot serve as a parent for others.
program_name - matches against the program name from the syslog header. The type attribute selects the expression syntax: osmatch (sregex), osregex (regex), or pcre2:
<program_name>sshd</program_name>
<program_name type="pcre2">nginx|apache2?</program_name>
prematch - a precondition that must match before the main regex is applied. Operates on the message body after syslog header extraction:
<prematch>^Failed password</prematch>
<prematch type="pcre2">^(?:Failed|Invalid) password</prematch>
regex - the regular expression for field extraction. Fields enclosed in parentheses are captured as values:
<regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>
order - defines the field names corresponding to the regex capture groups, in the same order:
<order>srcuser, srcip, srcport</order>
Available standard field names:
| Field | Description |
|---|---|
| srcip | Source IP address |
| dstip | Destination IP address |
| srcport | Source port |
| dstport | Destination port |
| srcuser | Source user |
| dstuser | Destination user |
| user | User (generic) |
| protocol | Protocol |
| action | Action |
| id | Event identifier |
| url | URL |
| data | Arbitrary data |
| extra_data | Additional data |
| status | Status |
| system_name | System name |
In addition to standard fields, arbitrary names are permitted - they become dynamic fields.
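The pairing of capture groups with field names can be sketched in Python. This is an illustration only: `extract_fields` is a hypothetical helper, and Python's `re` module stands in for the Wazuh regex engine, whose syntax differs in places.

```python
import re


def extract_fields(regex, order, message):
    """Pair each capture group with the field name at the same position."""
    names = [name.strip() for name in order.split(",")]
    match = re.search(regex, message)
    if match is None:
        return {}
    # zip() stops at the shorter sequence, so in this sketch surplus groups
    # or surplus names are silently dropped.
    return dict(zip(names, match.groups()))


fields = extract_fields(
    r"^Failed password for (\S+) from (\S+) port (\d+)",
    "srcuser, srcip, srcport",
    "Failed password for root from 192.168.1.100 port 22 ssh2",
)
print(fields)

# "reason" is not a standard field name, so it plays the role of a dynamic field.
dynamic = extract_fields(
    r"^AUTH_FAIL user=(\S+) ip=(\S+) reason=(\S+)",
    "srcuser, srcip, reason",
    "AUTH_FAIL user=admin ip=192.168.1.100 reason=invalid_password",
)
print(dynamic)
```

The first group always maps to the first name in the list, which is why a mismatch between the number of groups and the number of names leads to missing or misassigned fields.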
Additional Elements
type - log type. Defines a category for use in rules:
<type>syslog</type>
Available types: syslog (default), firewall, ids, web-log, squid, windows, host-information, ossec.
accumulate - tracks multi-line events by the id field:
<accumulate />
fts (First Time Seen) - generates an event on the first occurrence of the specified field combination:
<fts>srcuser, srcip</fts>
use_own_name - the child decoder uses its own name instead of inheriting the parent name:
<use_own_name>true</use_own_name>
plugin_decoder - invokes a built-in specialized decoder:
<plugin_decoder>JSON_Decoder</plugin_decoder>
Available plugins: JSON_Decoder, PF_Decoder, SymantecWS_Decoder, SonicWall_Decoder, OSSECAlert_Decoder.
Decoder Hierarchy (Parent-Child)
Decoders are organized into a hierarchical structure for improved efficiency and parsing accuracy.
How It Works
- Wazuh searches for a matching parent decoder by program_name or prematch
- Once the parent decoder matches, all its child decoders are evaluated
- The child decoder whose prematch or regex matches extracts the fields
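The lookup described above can be sketched in Python. The `DECODERS` structure and the `decode` function are hypothetical and only model the three steps; the decoder names mirror the sshd examples in this guide, and Python's `re` module stands in for the Wazuh regex engine.

```python
import re

# Hypothetical in-memory form of a decoder tree: each parent is selected by
# program_name, and each child carries a prematch, a regex, and an order list.
DECODERS = {
    "sshd": {
        "program_name": r"^sshd",
        "children": [
            {
                "name": "sshd-failed",
                "prematch": r"^Failed password",
                "regex": r"^Failed password for (\S+) from (\S+) port (\d+)",
                "order": ["srcuser", "srcip", "srcport"],
            },
            {
                "name": "sshd-success",
                "prematch": r"^Accepted",
                "regex": r"^Accepted \S+ for (\S+) from (\S+) port (\d+)",
                "order": ["srcuser", "srcip", "srcport"],
            },
        ],
    },
}


def decode(program_name, message):
    """Return (decoder name, fields) for the first matching child, if any."""
    for parent_name, parent in DECODERS.items():
        if not re.search(parent["program_name"], program_name):
            continue  # children are only tried when the parent matches
        for child in parent["children"]:
            if not re.search(child["prematch"], message):
                continue
            match = re.search(child["regex"], message)
            if match:
                return child["name"], dict(zip(child["order"], match.groups()))
        return parent_name, {}  # parent matched, but no child extracted fields
    return None, {}


print(decode("sshd", "Accepted publickey for deploy from 10.0.0.7 port 50022 ssh2"))
```

Checking the cheap parent condition first means the child expressions run only against logs from the relevant program.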
Hierarchy Example
Parent decoder (identifies the source - sshd):
<decoder name="sshd">
<program_name>^sshd</program_name>
</decoder>
Child decoder (extracts fields on login failure):
<decoder name="sshd-failed">
<parent>sshd</parent>
<prematch>^Failed password</prematch>
<regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>
<order>srcuser, srcip, srcport</order>
</decoder>
Child decoder (extracts fields on successful login):
<decoder name="sshd-success">
<parent>sshd</parent>
<prematch>^Accepted</prematch>
<regex>^Accepted \S+ for (\S+) from (\S+) port (\d+)</regex>
<order>srcuser, srcip, srcport</order>
</decoder>
Child decoder (disconnect event):
<decoder name="sshd-disconnect">
<parent>sshd</parent>
<prematch>^Disconnected from</prematch>
<regex>^Disconnected from user (\S+) (\S+) port (\d+)</regex>
<order>srcuser, srcip, srcport</order>
</decoder>
JSON Decoder
Wazuh includes a built-in JSON_Decoder plugin for automatic JSON log parsing. This is particularly useful for applications that log in JSON format and for cloud services.
Basic Usage
<decoder name="json-app">
<prematch>^{"</prematch>
<plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>
The JSON decoder automatically extracts all fields from the JSON object. Nested fields are represented using dot notation: data.source.ip.
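Dot notation for nested fields can be illustrated with a small Python sketch. The `flatten` helper and the sample log line are hypothetical; the sketch shows only the naming scheme and keeps values in their JSON types, which is not necessarily how the JSON_Decoder plugin stores them.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested JSON objects into dot-separated field names."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted prefix.
            fields.update(flatten(value, name))
        else:
            fields[name] = value
    return fields


# Hypothetical log line with a nested object.
line = '{"event":"login_failed","data":{"source":{"ip":"192.168.1.100","port":22}}}'
for name, value in flatten(json.loads(line)).items():
    print(f"{name}: {value!r}")
```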
JSON Log Example
Input log:
{"timestamp":"2025-03-05T10:15:01Z","event":"login_failed","user":"admin","src_ip":"192.168.1.100","port":22}
Decoding result:
timestamp: '2025-03-05T10:15:01Z'
event: 'login_failed'
user: 'admin'
src_ip: '192.168.1.100'
port: '22'
JSON Decoder Settings
json_null_field - behavior for null values:
<json_null_field>discard</json_null_field> <!-- ignore null fields -->
<json_null_field>string</json_null_field> <!-- store as the string "null" -->
json_array_structure - array handling:
<json_array_structure>csv</json_array_structure> <!-- array as CSV -->
Default Decoders
Wazuh ships with over 1500 decoders located in /var/ossec/ruleset/decoders/. Key files include:
| File | Description |
|---|---|
| 0005-sshd_decoders.xml | OpenSSH (sshd) |
| 0010-syslog_decoders.xml | Generic syslog messages |
| 0020-apache_decoders.xml | Apache HTTP Server |
| 0025-nginx_decoders.xml | Nginx |
| 0040-pam_decoders.xml | PAM authentication |
| 0050-postfix_decoders.xml | Postfix MTA |
| 0100-windows_decoders.xml | Windows Event Log |
| 0150-cisco_decoders.xml | Cisco devices |
| 0200-amazon_decoders.xml | AWS CloudTrail |
| 0350-docker_decoders.xml | Docker |
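Default decoders are overwritten during Wazuh upgrades, so do not edit them directly. File names and numeric prefixes can differ between releases; list the directory on your installation to confirm them.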
Writing Custom Decoders
Custom decoders are placed in /var/ossec/etc/decoders/local_decoder.xml. This file is preserved during upgrades.
Example: Custom Application Decoder
Suppose an application generates logs in this format:
Mar 5 10:15:01 app-server myapp[9876]: AUTH_FAIL user=admin ip=192.168.1.100 reason=invalid_password
Step 1: Create the parent decoder
<decoder name="myapp">
<program_name>^myapp</program_name>
</decoder>
Step 2: Create a child decoder to extract fields
<decoder name="myapp-auth-fail">
<parent>myapp</parent>
<prematch>^AUTH_FAIL</prematch>
<regex>^AUTH_FAIL user=(\S+) ip=(\S+) reason=(\S+)</regex>
<order>srcuser, srcip, data</order>
</decoder>
Step 3: Test the decoder
/var/ossec/bin/wazuh-logtest
Enter the log line and confirm that fields are extracted correctly.
Step 4: Create a rule for the event
In /var/ossec/etc/rules/local_rules.xml:
<group name="myapp,">
<rule id="100200" level="5">
<decoded_as>myapp</decoded_as>
<match>AUTH_FAIL</match>
<description>MyApp: authentication failure.</description>
<group>authentication_failed,</group>
</rule>
</group>
Step 5: Reload the manager
/var/ossec/bin/wazuh-control reload
Example: Web Application Decoder
Web application log:
2025-03-05 10:15:01 [ERROR] SQL injection attempt detected: query="SELECT * FROM users WHERE id=1 OR 1=1" client=192.168.1.100 path=/api/users
Decoder:
<decoder name="webapp">
<!-- pcre2 is required here: the default regex syntax has no {n} quantifier -->
<prematch type="pcre2">^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \[</prematch>
</decoder>
<decoder name="webapp-sqli">
<parent>webapp</parent>
<prematch>SQL injection attempt</prematch>
<regex>client=(\S+) path=(\S+)</regex>
<order>srcip, url</order>
</decoder>
Example: JSON Application Log Decoder
If the application logs in JSON format:
{"level":"error","message":"unauthorized access","user":"guest","ip":"10.0.0.5","path":"/admin","timestamp":"2025-03-05T10:15:01Z"}
Decoder:
<decoder name="json-webapp">
<prematch>^{"level":</prematch>
<plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>
Rules can reference JSON fields through the <field> element:
<rule id="100210" level="8">
<decoded_as>json-webapp</decoded_as>
<field name="level">error</field>
<field name="message">unauthorized access</field>
<description>Web application: unauthorized access attempt.</description>
</rule>
Testing Decoders with wazuh-logtest
Interactive Mode
/var/ossec/bin/wazuh-logtest
Enter a log and inspect the pre-decoding and decoding phases:
Type one log per line
Mar 5 10:15:01 app-server myapp[9876]: AUTH_FAIL user=admin ip=192.168.1.100 reason=invalid_password
**Phase 1: Completed pre-decoding.
full event: '...'
timestamp: 'Mar 5 10:15:01'
hostname: 'app-server'
program_name: 'myapp'
**Phase 2: Completed decoding.
name: 'myapp-auth-fail'
parent: 'myapp'
srcuser: 'admin'
srcip: '192.168.1.100'
data: 'invalid_password'
Debugging Decoders
If a decoder is not firing:
- Verify that program_name in the parent decoder matches the program name in the log
- Confirm that prematch matches the beginning of the message body (after pre-decoding)
- Check the regex for correct capture groups
- Use the -v flag for verbose output:
/var/ossec/bin/wazuh-logtest -v
Common Mistakes
| Error | Cause | Resolution |
|---|---|---|
| Decoder does not fire | program_name does not match | Check the syslog header format |
| Fields not extracted | Incorrect regex or order | Verify the number of capture groups |
| Child decoder not applied | parent does not match the parent decoder name | Compare the names |
| JSON fields not extracted | Missing plugin_decoder | Add <plugin_decoder>JSON_Decoder</plugin_decoder> |
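The capture-group check from the table can be automated before loading a decoder. The following Python sketch is an assumption-laden helper, not a Wazuh tool: `check_decoder` is hypothetical, it inspects a single `<decoder>` element, and it relies on the expression also being valid for Python's `re` module.

```python
import re
import xml.etree.ElementTree as ET

# Decoder under test, copied from the custom application example.
DECODER_XML = r"""
<decoder name="myapp-auth-fail">
  <parent>myapp</parent>
  <prematch>^AUTH_FAIL</prematch>
  <regex>^AUTH_FAIL user=(\S+) ip=(\S+) reason=(\S+)</regex>
  <order>srcuser, srcip, data</order>
</decoder>
"""


def check_decoder(xml_text):
    """Return a list of problems found in a single <decoder> element."""
    decoder = ET.fromstring(xml_text)
    problems = []
    regex = decoder.findtext("regex")
    order = decoder.findtext("order")
    if regex is None or order is None:
        return problems  # nothing to compare (for example, a parent decoder)
    names = [name.strip() for name in order.split(",") if name.strip()]
    # Assumption: the expression is also valid for Python's re module.
    groups = re.compile(regex).groups
    if groups != len(names):
        problems.append(
            f"{decoder.get('name')}: {groups} capture groups, {len(names)} names in <order>"
        )
    return problems


print(check_decoder(DECODER_XML) or "no problems found")
```

A check like this catches only count mismatches; wazuh-logtest remains the way to confirm that the values land in the intended fields.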
Troubleshooting Decoder Issues
Decoder Does Not Detect program_name
Some log sources do not follow the standard syslog format. In these cases, program_name is not extracted during pre-decoding. Use prematch instead of program_name to identify the source.
Decoder Conflicts
If two decoders match the same log, the first one loaded takes priority. Custom decoders load after default decoders. To resolve a conflict, use a more specific prematch.
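The effect of a more specific prematch can be shown with a simplified Python model. All decoder names and log lines here are hypothetical, list order stands in for load order, and the sketch ignores the parent-child hierarchy.

```python
import re


def first_match(decoders, message):
    """Return the name of the first decoder whose prematch matches."""
    for name, prematch in decoders:  # list order stands in for load order
        if re.search(prematch, message):
            return name
    return None


message = "ERROR payment gateway timeout"

# Overlapping prematches: the decoder listed first claims the log.
overlapping = [("app-error", r"^ERROR"), ("app-payment-error", r"^ERROR payment")]
print(first_match(overlapping, message))

# Narrowing the first prematch removes the overlap, so the second decoder matches.
specific = [("app-db-error", r"^ERROR database"), ("app-payment-error", r"^ERROR payment")]
print(first_match(specific, message))
```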
Decoder Performance
Avoid complex regular expressions with nested quantifiers ((.*)*, (a+)+) that can cause catastrophic backtracking. Use pcre2 with atomic groups for performance-critical decoders.
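As a rough illustration, a nested quantifier can often be replaced by a single quantifier that accepts the same input. The sketch below uses Python's `re` module as a stand-in for pcre2, and the sample strings are kept short deliberately so that the risky pattern still finishes quickly.

```python
import re

# Nested quantifier: the engine can split a run of "a" characters between the
# inner and outer "+" in exponentially many ways before a match fails.
risky = re.compile(r"^(a+)+$")

# Equivalent single quantifier: accepts the same strings with one way to match.
safe = re.compile(r"^a+$")

# Inputs are kept short on purpose so the risky pattern still finishes quickly.
samples = ["a", "aaaa", "aaaaaaaaaaaa", "aaaaaaaaaaaa!", "", "b"]
for text in samples:
    assert bool(risky.match(text)) == bool(safe.match(text))
print("both patterns agree on all samples")
```

With the nested form, each additional "a" before the failing character roughly doubles the work, which is why the rewrite matters for long log lines.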
Related Sections
- Detection Rules - rules that consume decoder output
- Log Analysis - log collection for decoding
- Troubleshooting - diagnosing decoder issues