Wazuh 4.14 Decoders - Log Data Extraction

Wazuh decoders extract structured data from raw log messages: IP addresses, usernames, actions, error codes, and other fields. Without correct decoding, rules cannot analyze events, and alerts lack actionable information. This guide covers decoder syntax, parent-child decoder hierarchies, built-in decoders, and the process of building custom decoders.

Event Processing Phases

Every event passes through two decoding phases:

Phase 1: Pre-decoding

Performed automatically on all events. Standard syslog fields are extracted:

  • timestamp - event timestamp
  • hostname - source hostname
  • program_name - program name (from the syslog header)

Example input log:

Mar  5 10:15:01 web-server sshd[12345]: Failed password for root from 192.168.1.100 port 22 ssh2

Pre-decoding result:

timestamp: 'Mar  5 10:15:01'
hostname: 'web-server'
program_name: 'sshd'

The remaining message body (Failed password for root from 192.168.1.100 port 22 ssh2) is passed to the decoding phase.
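
The split performed in this phase can be approximated in a few lines of Python. This is an illustrative sketch only, not Wazuh's implementation: the names SYSLOG_HEADER and predecode are hypothetical, and the pattern assumes the classic BSD syslog header used in the example above.

```python
import re

# Hypothetical pattern for a BSD-style syslog header:
# "<Mon> <day> <HH:MM:SS> <host> <program>[<pid>]: <message>"
SYSLOG_HEADER = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<program_name>[^\s\[:]+)(?:\[\d+\])?: "
    r"(?P<message>.*)$"
)


def predecode(line):
    """Return the header fields and message body, or None if the header is not recognized."""
    match = SYSLOG_HEADER.match(line)
    return match.groupdict() if match else None


event = predecode(
    "Mar  5 10:15:01 web-server sshd[12345]: "
    "Failed password for root from 192.168.1.100 port 22 ssh2"
)
print(event["program_name"])
print(event["message"])
```

Only the message field would be handed on to the decoding phase; the other three correspond to the pre-decoded fields listed above.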

Phase 2: Decoding

In this phase, decoders analyze the message body and extract specific fields. Decoders are applied in definition order: parent decoders first, then child decoders.

XML Decoder Structure

Each decoder is defined by a <decoder> element:

<decoder name="sshd-failed">
  <parent>sshd</parent>
  <prematch>^Failed password</prematch>
  <regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>
  <order>srcuser, srcip, srcport</order>
</decoder>

Decoder Elements

Core Elements

name (attribute) - unique decoder name:

<decoder name="my-app">

parent - links to a parent decoder. The child decoder is applied only if the parent matched:

<parent>sshd</parent>

A parent decoder can have many children, but a child decoder cannot serve as a parent for others.

program_name - matches against the program name from the syslog header. The type attribute accepts osmatch (the default), osregex, and pcre2:

<program_name>sshd</program_name>
<program_name type="pcre2">nginx|apache2?</program_name>

prematch - a precondition that must match before the main regex is applied. Operates on the message body after syslog header extraction:

<prematch>^Failed password</prematch>
<prematch type="pcre2">^(?:Failed|Invalid) password</prematch>

regex - the regular expression for field extraction. Fields enclosed in parentheses are captured as values:

<regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>

order - defines the field names corresponding to the regex capture groups:

<order>srcuser, srcip, srcport</order>

Available standard field names:

Field         Description
srcip         Source IP address
dstip         Destination IP address
srcport       Source port
dstport       Destination port
srcuser       Source user
dstuser       Destination user
user          User (generic)
protocol      Protocol
action        Action
id            Event identifier
url           URL
data          Arbitrary data
extra_data    Additional data
status        Status
system_name   System name

In addition to standard fields, arbitrary names are permitted - they become dynamic fields.
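
The pairing of regex capture groups with the names in order is positional. The Python sketch below illustrates the idea with the sshd-failed regex from this guide; the extract function is a hypothetical helper, and Python's re module only approximates Wazuh's own regex engine for simple patterns like this one.

```python
import re

# Regex and field order copied from the sshd-failed decoder in this guide.
REGEX = re.compile(r"^Failed password for (\S+) from (\S+) port (\d+)")
ORDER = "srcuser, srcip, srcport"


def extract(message):
    """Pair each capture group with the field name at the same position in ORDER."""
    match = REGEX.search(message)
    if match is None:
        return {}
    names = [name.strip() for name in ORDER.split(",")]
    return dict(zip(names, match.groups()))


fields = extract("Failed password for root from 192.168.1.100 port 22 ssh2")
print(fields)
```

Because the mapping is by position, swapping two names in order silently swaps the extracted values, which is why the number and sequence of capture groups should always be checked against order.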

Additional Elements

type - log type. Defines a category for use in rules:

<type>syslog</type>

Available types: syslog (default), firewall, ids, web-log, squid, windows, host-information, ossec.

accumulate - tracks multi-line events by the id field:

<accumulate />

fts (First Time Seen) - generates an event on the first occurrence of the specified field combination:

<fts>srcuser, srcip</fts>
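
Conceptually, first-time-seen tracking amounts to remembering which field combinations have already occurred. The sketch below is a simplified in-memory model under that assumption; make_fts_tracker is a hypothetical name, and it does not reproduce how Wazuh stores or persists this state.

```python
def make_fts_tracker(field_names):
    """Return a function that reports whether a field combination is new."""
    seen = set()

    def is_first_time(fields):
        # The combination of the listed fields forms the lookup key.
        key = tuple(fields.get(name) for name in field_names)
        if key in seen:
            return False
        seen.add(key)
        return True

    return is_first_time


tracker = make_fts_tracker(["srcuser", "srcip"])
print(tracker({"srcuser": "root", "srcip": "192.168.1.100"}))  # first occurrence
print(tracker({"srcuser": "root", "srcip": "192.168.1.100"}))  # repeat
```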

use_own_name - the child decoder uses its own name instead of inheriting the parent name:

<use_own_name>true</use_own_name>

plugin_decoder - invokes a built-in specialized decoder:

<plugin_decoder>JSON_Decoder</plugin_decoder>

Available plugins: JSON_Decoder, PF_Decoder, SymantecWS_Decoder, SonicWall_Decoder, OSSECAlert_Decoder.

Decoder Hierarchy (Parent-Child)

Decoders are organized into a hierarchical structure for improved efficiency and parsing accuracy.

How It Works

  1. Wazuh searches for a matching parent decoder by program_name or prematch
  2. Once the parent decoder matches, all its child decoders are evaluated
  3. The child decoder whose prematch or regex matches extracts the fields

Hierarchy Example

Parent decoder (identifies the source - sshd):

<decoder name="sshd">
  <program_name>^sshd</program_name>
</decoder>

Child decoder (extracts fields on login failure):

<decoder name="sshd-failed">
  <parent>sshd</parent>
  <prematch>^Failed password</prematch>
  <regex>^Failed password for (\S+) from (\S+) port (\d+)</regex>
  <order>srcuser, srcip, srcport</order>
</decoder>

Child decoder (extracts fields on successful login):

<decoder name="sshd-success">
  <parent>sshd</parent>
  <prematch>^Accepted</prematch>
  <regex>^Accepted \S+ for (\S+) from (\S+) port (\d+)</regex>
  <order>srcuser, srcip, srcport</order>
</decoder>

Child decoder (disconnect event):

<decoder name="sshd-disconnect">
  <parent>sshd</parent>
  <prematch>^Disconnected from</prematch>
  <regex>^Disconnected from user (\S+) (\S+) port (\d+)</regex>
  <order>srcuser, srcip, srcport</order>
</decoder>
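
The evaluation order described under How It Works can be modeled roughly as follows. This is a sketch, not Wazuh's implementation: the DECODERS layout and the decode function are hypothetical, and only two of the child decoders above are included to keep it short.

```python
import re

# Parent and child definitions mirror the sshd decoders above; the dict layout is hypothetical.
DECODERS = [
    {
        "name": "sshd",
        "program_name": r"^sshd",
        "children": [
            {
                "name": "sshd-failed",
                "prematch": r"^Failed password",
                "regex": r"^Failed password for (\S+) from (\S+) port (\d+)",
                "order": ["srcuser", "srcip", "srcport"],
            },
            {
                "name": "sshd-success",
                "prematch": r"^Accepted",
                "regex": r"^Accepted \S+ for (\S+) from (\S+) port (\d+)",
                "order": ["srcuser", "srcip", "srcport"],
            },
        ],
    },
]


def decode(program_name, message):
    """Find the matching parent, then let the first matching child extract fields."""
    for parent in DECODERS:
        if not re.search(parent["program_name"], program_name):
            continue
        result = {"decoder": parent["name"]}
        for child in parent["children"]:
            if not re.search(child["prematch"], message):
                continue  # cheap precondition failed, skip the full regex
            match = re.search(child["regex"], message)
            if match:
                result.update(zip(child["order"], match.groups()))
                break
        return result
    return None  # no parent decoder matched


print(decode("sshd", "Failed password for root from 192.168.1.100 port 22 ssh2"))
```

The prematch check lets most children be rejected quickly, so the more expensive extraction regex runs only for the child that is likely to match.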

JSON Decoder

Wazuh includes a built-in JSON_Decoder plugin for automatic JSON log parsing. This is particularly useful for applications that log in JSON format and for cloud services.

Basic Usage

<decoder name="json-app">
  <prematch>^{"</prematch>
  <plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>

The JSON decoder automatically extracts all fields from the JSON object. Nested fields are represented using dot notation: data.source.ip.

JSON Log Example

Input log:

{"timestamp":"2025-03-05T10:15:01Z","event":"login_failed","user":"admin","src_ip":"192.168.1.100","port":22}

Decoding result:

timestamp: '2025-03-05T10:15:01Z'
event: 'login_failed'
user: 'admin'
src_ip: '192.168.1.100'
port: '22'
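
The dot notation used for nested fields can be illustrated with a small flattening routine. This is a sketch of the naming idea only: flatten is a hypothetical helper, and rendering non-string values as JSON text is an assumption of the sketch rather than documented Wazuh behavior.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested JSON objects into a single dict keyed by dotted paths."""
    fields = {}
    if isinstance(value, dict):
        for key, item in value.items():
            path = f"{prefix}.{key}" if prefix else key
            fields.update(flatten(item, path))
    else:
        # Assumption: non-string leaves are rendered as their JSON text.
        fields[prefix] = value if isinstance(value, str) else json.dumps(value)
    return fields


log = '{"event":"login_failed","port":22,"data":{"source":{"ip":"192.168.1.100"}}}'
flat = flatten(json.loads(log))
print(flat)
```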

JSON Decoder Settings

json_null_field - behavior for null values:

<json_null_field>discard</json_null_field>  <!-- ignore null fields -->
<json_null_field>string</json_null_field>   <!-- store as the string "null" -->

json_array_structure - array handling:

<json_array_structure>csv</json_array_structure>  <!-- array as CSV -->
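
The effect of these two settings can be pictured with the following sketch. It is a simplified model under stated assumptions: apply_json_options is a hypothetical function, it handles only top-level keys, and the exact text Wazuh produces for CSV arrays may differ from the plain comma join used here.

```python
def apply_json_options(obj, null_field, array_structure):
    """Model json_null_field ('discard' or 'string') and json_array_structure ('csv')."""
    fields = {}
    for key, value in obj.items():
        if value is None:
            if null_field == "discard":
                continue  # drop the field entirely
            value = "null"  # null_field == "string": keep it as text
        elif isinstance(value, list) and array_structure == "csv":
            # Assumption: elements are joined with commas, without quoting.
            value = ",".join(str(item) for item in value)
        fields[key] = value
    return fields


event = {"user": None, "tags": ["ssh", "auth"]}
print(apply_json_options(event, "discard", "csv"))
print(apply_json_options(event, "string", "csv"))
```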

Default Decoders

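Wazuh ships with over 1500 decoders located in /var/ossec/ruleset/decoders/. Representative files are listed below; exact file names and numeric prefixes can differ between releases, so list the directory to confirm them: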

File                        Description
0005-sshd_decoders.xml      OpenSSH (sshd)
0010-syslog_decoders.xml    Generic syslog messages
0020-apache_decoders.xml    Apache HTTP Server
0025-nginx_decoders.xml     Nginx
0040-pam_decoders.xml       PAM authentication
0050-postfix_decoders.xml   Postfix MTA
0100-windows_decoders.xml   Windows Event Log
0150-cisco_decoders.xml     Cisco devices
0200-amazon_decoders.xml    AWS CloudTrail
0350-docker_decoders.xml    Docker

Default decoders are updated during Wazuh upgrades and should not be edited directly.

Writing Custom Decoders

Custom decoders are placed in /var/ossec/etc/decoders/local_decoder.xml. This file is preserved during upgrades.

Example: Custom Application Decoder

Suppose an application generates logs in this format:

Mar  5 10:15:01 app-server myapp[9876]: AUTH_FAIL user=admin ip=192.168.1.100 reason=invalid_password

Step 1: Create the parent decoder

<decoder name="myapp">
  <program_name>^myapp</program_name>
</decoder>

Step 2: Create a child decoder to extract fields

<decoder name="myapp-auth-fail">
  <parent>myapp</parent>
  <prematch>^AUTH_FAIL</prematch>
  <regex>^AUTH_FAIL user=(\S+) ip=(\S+) reason=(\S+)</regex>
  <order>srcuser, srcip, data</order>
</decoder>

Step 3: Test the decoder

/var/ossec/bin/wazuh-logtest

Enter the log line and confirm that fields are extracted correctly.

Step 4: Create a rule for the event

In /var/ossec/etc/rules/local_rules.xml:

<group name="myapp,">
  <rule id="100200" level="5">
    <decoded_as>myapp</decoded_as>
    <match>AUTH_FAIL</match>
    <description>MyApp: authentication failure.</description>
    <group>authentication_failed,</group>
  </rule>
</group>

Step 5: Reload the manager

/var/ossec/bin/wazuh-control reload

Example: Web Application Decoder

Web application log:

2025-03-05 10:15:01 [ERROR] SQL injection attempt detected: query="SELECT * FROM users WHERE id=1 OR 1=1" client=192.168.1.100 path=/api/users

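Decoder (the parent prematch sets type="pcre2" because the {n} repetition syntax is not supported by the default regex type):

<decoder name="webapp">
  <prematch type="pcre2">^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \[</prematch>
</decoder>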

<decoder name="webapp-sqli">
  <parent>webapp</parent>
  <prematch>SQL injection attempt</prematch>
  <regex>client=(\S+) path=(\S+)</regex>
  <order>srcip, url</order>
</decoder>

Example: JSON Application Log Decoder

If the application logs in JSON format:

{"level":"error","message":"unauthorized access","user":"guest","ip":"10.0.0.5","path":"/admin","timestamp":"2025-03-05T10:15:01Z"}

Decoder:

<decoder name="json-webapp">
  <prematch>^{"level":</prematch>
  <plugin_decoder>JSON_Decoder</plugin_decoder>
</decoder>

Rules can reference JSON fields through the <field> element:

<rule id="100210" level="8">
  <decoded_as>json-webapp</decoded_as>
  <field name="level">error</field>
  <field name="message">unauthorized access</field>
  <description>Web application: unauthorized access attempt.</description>
</rule>
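
The way such a rule combines its <field> conditions can be sketched as a logical AND over the decoded fields. This is only an illustration: RULE_FIELDS and rule_matches are hypothetical names, and Python's re module stands in for Wazuh's own regex matching.

```python
import re

# Conditions copied from rule 100210 above: field name -> pattern.
RULE_FIELDS = {"level": "error", "message": "unauthorized access"}


def rule_matches(decoded):
    """Return True only if every listed field exists and matches its pattern."""
    return all(
        name in decoded and re.search(pattern, str(decoded[name])) is not None
        for name, pattern in RULE_FIELDS.items()
    )


decoded = {"level": "error", "message": "unauthorized access", "user": "guest"}
print(rule_matches(decoded))
```

A field that is missing from the decoded event fails the rule in this model, which is why a rule can stay silent when the JSON key names in the log differ from those in the rule.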

Testing Decoders with wazuh-logtest

Interactive Mode

/var/ossec/bin/wazuh-logtest

Enter a log and inspect the pre-decoding and decoding phases:

Type one log per line

Mar  5 10:15:01 app-server myapp[9876]: AUTH_FAIL user=admin ip=192.168.1.100 reason=invalid_password

**Phase 1: Completed pre-decoding.
       full event: '...'
       timestamp: 'Mar  5 10:15:01'
       hostname: 'app-server'
       program_name: 'myapp'

**Phase 2: Completed decoding.
       name: 'myapp'
       parent: 'myapp'
       srcuser: 'admin'
       srcip: '192.168.1.100'
       data: 'invalid_password'

Debugging Decoders

If a decoder is not firing:

  1. Verify that program_name in the parent decoder matches the program name in the log
  2. Confirm that prematch matches the message body as it appears after pre-decoding; a pattern anchored with ^ must match at the very start of that body
  3. Check the regex for correct capture groups
  4. Use the -v flag for verbose output:
/var/ossec/bin/wazuh-logtest -v

Common Mistakes

Error                       Cause                                           Resolution
Decoder does not fire       program_name does not match                     Check the syslog header format
Fields not extracted        Incorrect regex or order                        Verify the number of capture groups
Child decoder not applied   parent does not match the parent decoder name   Compare the names
JSON fields not extracted   Missing plugin_decoder                          Add <plugin_decoder>JSON_Decoder</plugin_decoder>
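
A mismatch between capture groups and order names can be caught before loading a decoder file. The following is a rough pre-check under stated assumptions: check_decoders is a hypothetical helper, it compiles each pattern with Python's re module (which accepts simple patterns like those in this guide but is not Wazuh's regex engine), and it assumes the XML contains no unescaped < or & characters.

```python
import re
import xml.etree.ElementTree as ET


def check_decoders(xml_text):
    """Return (decoder name, group count, order count) for every mismatch found."""
    # Wrap in a dummy root because a decoder file holds several top-level elements.
    root = ET.fromstring(f"<root>{xml_text}</root>")
    problems = []
    for decoder in root.iter("decoder"):
        regex = decoder.findtext("regex")
        order = decoder.findtext("order")
        if regex is None or order is None:
            continue  # parent-only or plugin decoders have nothing to compare
        groups = re.compile(regex).groups
        names = [name.strip() for name in order.split(",") if name.strip()]
        if groups != len(names):
            problems.append((decoder.get("name"), groups, len(names)))
    return problems


# Deliberately broken sample: three capture groups but only two names in order.
SAMPLE = r"""
<decoder name="myapp-auth-fail">
  <parent>myapp</parent>
  <prematch>^AUTH_FAIL</prematch>
  <regex>^AUTH_FAIL user=(\S+) ip=(\S+) reason=(\S+)</regex>
  <order>srcuser, srcip</order>
</decoder>
"""
print(check_decoders(SAMPLE))
```

A check like this complements wazuh-logtest rather than replacing it, since only wazuh-logtest evaluates the pattern with the real engine.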

Troubleshooting Decoder Issues

Decoder Does Not Detect program_name

Some log sources do not follow the standard syslog format. In these cases, program_name is not extracted during pre-decoding. Use prematch instead of program_name to identify the source.

Decoder Conflicts

If two decoders match the same log, the first one loaded takes priority. Custom decoders load after default decoders. To resolve a conflict, use a more specific prematch.

Decoder Performance

Avoid complex regular expressions with nested quantifiers ((.*)*, (a+)+) that can cause catastrophic backtracking. Use pcre2 with atomic groups for performance-critical decoders.
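
A nested quantifier can often be flattened without changing what the pattern accepts. The sketch below uses Python's re module as a stand-in for a backtracking engine; NESTED, FLAT, and same_matches are hypothetical names, and the inputs are kept short on purpose because the nested form can take exponentially longer on long non-matching input.

```python
import re

NESTED = re.compile(r"^(a+)+$")  # nested quantifier: prone to catastrophic backtracking
FLAT = re.compile(r"^a+$")       # accepts the same strings with a single quantifier


def same_matches(samples):
    """Check that both patterns agree on every sample string."""
    return all(
        (NESTED.match(text) is not None) == (FLAT.match(text) is not None)
        for text in samples
    )


print(same_matches(["a", "aaaa", "", "aab", "b"]))
```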
