pfSense Multi-WAN Failover - Automatic Link Switchover

Failover in the Multi-WAN context is the automatic redirection of traffic from a primary internet link to a backup link when the primary is detected as down. pfSense implements failover through Gateway Groups with gateways separated into priority levels (tiers). Gateways on lower-numbered tiers are preferred, and a gateway on a higher-numbered tier activates only when every gateway on all lower-numbered tiers is unavailable.

Unlike load balancing, where gateways operate in parallel on the same tier, a failover configuration assigns gateways to different tiers, forming a priority chain. The backup link remains in hot standby and accepts traffic only upon primary link failure.

Failover vs. Load Balancing

Both mechanisms use Gateway Groups but differ in tier assignment logic:

Characteristic	Load Balancing	Failover
Tier assignment	All gateways on the same tier	Gateways on different tiers
Link utilization	Simultaneous	Sequential (by priority)
Backup link	None (all active)	Activates on primary failure
Link load	Distributed	Concentrated on primary
Switchover	Automatic (failed gateway excluded)	Automatic (backup activated)

A combined configuration merges both approaches. For example, two links on Tier 1 provide load balancing while a third link on Tier 2 serves as backup for both:

Gateway	Tier	Role
WAN1_DHCP	1	Primary, load balanced
WAN2_DHCP	1	Primary, load balanced
WAN3_DHCP	2	Backup

In this configuration, traffic is balanced between WAN1 and WAN2. If WAN1 fails, all traffic shifts to WAN2. If both primary links fail, WAN3 activates.
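The tier behavior described above can be sketched as a small filter. This is an illustrative model only, not pfSense code: the function name `active_gateways` and the three-column input format (name, tier, status) are assumptions made for the example.

```shell
# Illustrative model of Gateway Group tier selection (not pfSense code).
# Input on stdin: one gateway per line as "<name> <tier> <status>",
# where status is "online" or "down" (format assumed for this sketch).
active_gateways() {
    awk '
        $3 == "online" {
            # Track the lowest tier that still has an online member.
            if (best == "" || $2 + 0 < best) best = $2 + 0
            name[++n] = $1; tier[n] = $2 + 0
        }
        END {
            # Print every online gateway on that tier, in input order.
            for (i = 1; i <= n; i++) if (tier[i] == best) print name[i]
        }
    '
}

# Both Tier 1 links down: only the Tier 2 backup is printed.
printf '%s\n' 'WAN1_DHCP 1 down' 'WAN2_DHCP 1 down' 'WAN3_DHCP 2 online' | active_gateways
```

With all three gateways online, the same function prints WAN1_DHCP and WAN2_DHCP, which corresponds to the load-balanced state of the combined configuration.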

Gateway Monitoring for Failover

Failover speed and accuracy depend directly on gateway monitoring settings. Aggressive settings enable rapid switchover but increase the risk of false positives. Conservative settings reduce false positive risk but increase failure detection time.

Recommended Monitoring Parameters

For a typical failover configuration, the following parameters are recommended:

Parameter	Value	Rationale
Monitor IP	Public DNS (8.8.8.8, 1.1.1.1)	Tests end-to-end reachability, not just the ISP gateway
Probe Interval	1 second	Rapid failure detection
Loss Interval	2000 ms	Sufficient timeout for high-latency links
Time Period	30 seconds	Shortened averaging window for faster response
High Latency	400 ms	Warning threshold for degradation
High Loss	15%	Warning threshold for packet loss
Down	10%	Threshold for switchover to backup link
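To get a feel for how these values interact, the following rough arithmetic estimates how many probes fall into one averaging window and how many of them must be lost to reach a given loss threshold. It assumes that loss is simply the share of probes lost within the Time Period; that is a simplification for illustration, not a description of dpinger internals, and `probes_for_threshold` is a hypothetical helper name.

```shell
# Rough arithmetic only; assumes loss is the share of probes lost within
# the averaging window (Time Period). Not derived from dpinger source.
# Usage: probes_for_threshold <time_period_s> <probe_interval_ms> <threshold_pct>
probes_for_threshold() {
    window=$(( $1 * 1000 / $2 ))            # probes sent per averaging window
    lost=$(( (window * $3 + 99) / 100 ))    # lost probes needed, rounded up
    echo "$window probes per window, $lost lost probes reach ${3}%"
}

probes_for_threshold 30 1000 10
probes_for_threshold 60 1000 10
```

Under this assumption, a 30-second window with 1-second probes reaches 10% after three lost probes, while a 60-second window needs six. This is why a shorter window reacts faster but is more exposed to transient loss.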

Choosing a Monitor IP

The Monitor IP determines what the monitoring system actually tests:

Monitor IP	What Is Tested	Advantages	Disadvantages
ISP gateway IP	Nearest hop availability	Minimal latency, fast response	Does not detect upstream failures
Public DNS (8.8.8.8)	End-to-end internet reachability	Detects any failure in the chain	Additional latency
Own VPS	End-to-end reachability to target resource	Full control	Requires additional infrastructure

Warning:
Each gateway must use a unique Monitor IP. If both gateways monitor the same address, a failure of that address (rather than the link) causes both gateways to transition to Down status simultaneously, resulting in complete connectivity loss.
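When a configuration has several gateways, duplicate Monitor IPs are easy to overlook. A minimal sketch of a check, assuming the gateway-to-Monitor-IP pairs have been written out by hand as two-column text (the input format and the name `duplicate_monitor_ips` are assumptions for this example):

```shell
# Input on stdin: "<gateway> <monitor_ip>" per line (format assumed).
# Prints each Monitor IP used by more than one gateway, with its users.
duplicate_monitor_ips() {
    awk '
        { count[$2]++; users[$2] = users[$2] " " $1 }
        END { for (ip in count) if (count[ip] > 1) print ip ":" users[ip] }
    '
}

# Both gateways probe 8.8.8.8, so the address is reported.
printf '%s\n' 'WAN1_DHCP 8.8.8.8' 'WAN2_DHCP 8.8.8.8' | duplicate_monitor_ips
```

An empty result means every gateway has its own Monitor IP.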

dpinger Configuration

The dpinger daemon performs gateway monitoring. Its operation can be verified from the command line:

# Check dpinger processes
ps aux | grep dpinger

# View gateway status in real time
/usr/local/sbin/pfSsh.php playback gatewaystatus

dpinger logs are written to the system journal and accessible under Status > System Logs > Gateways.

Creating a Failover Gateway Group

Step-by-Step Configuration

Navigate to System > Routing > Gateway Groups.
Click Add.
Configure the group parameters:

Parameter	Value	Description
Group Name	`WAN_Failover`	Group name
WAN1_DHCP	Tier 1	Primary link
WAN2_DHCP	Tier 2	Backup link
Trigger Level	Packet Loss or High Latency	Switchover condition
Description	Primary WAN1, failover to WAN2	Purpose description

Click Save, then Apply Changes.

Selecting the Trigger Level for Failover

For failover configurations, the Trigger Level determines switchover sensitivity:

Trigger Level	Time to Switchover	False Positive Risk	Recommended For
Member Down	Maximum	Minimal	Non-critical services
Packet Loss	Medium	Medium	Most configurations
High Latency	Minimum	High	Latency-sensitive services
Packet Loss or High Latency	Minimum	Maximum	Critical services with reliable links

For production configurations, Packet Loss is recommended - it provides a balance between response speed and resilience to transient fluctuations.
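The difference between the four options can be expressed as a small decision function. This is a simplified model of the behavior described in this section, not pfSense code; the short trigger and state labels are stand-ins invented for the example.

```shell
# Usage: member_excluded <trigger> <state>
#   trigger: down | loss | latency | loss_or_latency  (labels assumed here,
#            standing in for the four Trigger Level options)
#   state:   online | loss | latency | down  (worst condition currently seen)
# Returns 0 (true) when the gateway would be dropped from the group.
member_excluded() {
    case "$2" in
        down)   return 0 ;;   # a member that is down is always excluded
        online) return 1 ;;   # a healthy member is never excluded
    esac
    case "$1:$2" in
        loss:loss|latency:latency)                    return 0 ;;
        loss_or_latency:loss|loss_or_latency:latency) return 0 ;;
    esac
    return 1                  # degraded, but not in a way this trigger watches
}

member_excluded down loss && echo "excluded" || echo "kept"
member_excluded loss loss && echo "excluded" || echo "kept"
```

The first call prints "kept" because Member Down ignores packet loss short of a full failure; the second prints "excluded".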

Applying the Gateway Group to Firewall Rules

After creating the failover Gateway Group, it must be assigned to a firewall rule.

Rule Configuration

Navigate to Firewall > Rules > LAN.
Create or edit a rule for outbound traffic.
Under Extra Options, click Display Advanced.
In the Gateway field, select WAN_Failover.
Click Save, then Apply Changes.

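Different traffic types can use separate rules with distinct failover groups. Because monitoring parameters are set per gateway rather than per group, the groups differ in tier order or Trigger Level. For example, critical business traffic can use a group with a sensitive Trigger Level (Packet Loss or High Latency) while general traffic uses a group that reacts only to Member Down.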

DNS with Failover

When switching to a backup link, DNS queries routed through the primary link stop receiving responses. This can cause name resolution delays until DNS switches to the backup path.

DNS Configuration for Failover

Under System > General Setup, configure DNS servers for each WAN:

DNS Server	IP Address	Gateway
DNS Server 1	8.8.8.8	WAN1_DHCP
DNS Server 2	8.8.4.4	WAN1_DHCP
DNS Server 3	1.1.1.1	WAN2_DHCP
DNS Server 4	1.0.0.1	WAN2_DHCP

Under Services > DNS Resolver, enable DNS Query Forwarding to use the configured upstream servers.
pfSense automatically stops using DNS servers bound to an unavailable gateway and switches to servers bound to the available link.

Warning:
Without DNS Query Forwarding, the DNS Resolver performs recursive queries directly to root servers. In this case, DNS queries are routed through the default gateway. During failover, the default gateway changes automatically, but the transition period may cause brief name resolution delays.

Testing Failover

Before deploying the configuration to production, the switchover must be verified.

Method 1: Physical Disconnection

Open Status > Gateways to observe gateway statuses.
Physically disconnect the primary WAN interface cable.
Observe the primary gateway status change:
- Status should transition to Warning, then Down.
- Time to status change depends on the Time Period and threshold settings.
Verify that traffic has switched to the backup link:
- Open an external IP check service (such as ifconfig.me).
- The displayed IP should match the backup WAN external address.
Reconnect the cable and confirm the return to the primary link.
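To put a number on the switchover, the external IP can be sampled from a LAN client during the test and the samples evaluated afterwards. The sketch below covers only the evaluation step and assumes the samples were saved as "timestamp IP" pairs, starting at the moment the cable is pulled; the input format, the name `switchover_seconds`, and the sample addresses (documentation ranges) are all assumptions.

```shell
# Input on stdin: "<unix_seconds> <external_ip>" per line, oldest first,
# for example collected by polling an IP check service once per second
# (the sampling itself is outside this sketch).
# Prints the seconds between the first sample and the first IP change.
switchover_seconds() {
    awk '
        NR == 1 { start = $1; first_ip = $2; next }
        $2 != first_ip { print $1 - start; found = 1; exit }
        END { if (!found) print "no change" }
    '
}

printf '%s\n' '100 198.51.100.10' '101 198.51.100.10' '135 203.0.113.20' | switchover_seconds
```

Requests that fail during the outage simply produce no sample, so the result is the time until the first successful request through the backup link.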

Method 2: Blocking the Monitor IP

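Create a temporary floating rule (Firewall > Rules > Floating, direction Out) on the primary WAN interface that blocks ICMP traffic to the primary gateway Monitor IP. A rule on the WAN tab filters only inbound traffic and does not match probes sent by the firewall itself. Existing states for the probes may need to be cleared under Diagnostics > States before the block takes effect.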
Observe the switchover under Status > Gateways.
Remove the temporary rule after verification.

This method tests failover without physical intervention.

Method 3: Threshold Adjustment

Temporarily set the Down threshold to 0% for the primary gateway.
With this setting, any packet loss triggers a switchover.
Restore original threshold values after verification.

Verification Checklist

Check	Expected Result
Switchover time	Depends on Time Period and thresholds (typically 30-60 seconds)
DNS resolution	Continues working through the backup link
Active TCP connections	Interrupted (expected behavior)
New connections	Established through the backup link
Return to primary	Automatic after primary gateway recovery
Logs	Switchover entries in Status > System Logs > Gateways

Recovery Behavior

After the primary gateway becomes reachable again, pfSense automatically returns traffic to the primary link (Tier 1). The recovery sequence:

dpinger detects that the primary gateway is responding to probe packets again.
After accumulating sufficient successful responses within the Time Period, the gateway status changes to Online.
pfSense routes new connections through the primary link.
Active connections through the backup link continue until they complete naturally.

Recovery time is governed by the Time Period parameter. At the default value (60 seconds), return to the primary link occurs approximately 60-90 seconds after connectivity is restored.

Warning:
Frequent switching between links (flapping) indicates primary link instability. If the primary link recovers and fails in rapid succession, active connections are interrupted with each switchover. In such cases, increase the Time Period or threshold values to reduce sensitivity.

VPN with Failover

IPsec

When using IPsec VPN with failover, the following considerations apply:

An IPsec tunnel is bound to a specific WAN interface. When that interface fails, the tunnel drops.
For automatic IPsec recovery through the backup WAN, create a separate Phase 1 configuration for each WAN interface.
The remote peer must also be configured to accept connections from both IP addresses.

IPsec failover configuration:

Parameter	WAN1 Phase 1	WAN2 Phase 1
Interface	WAN1	WAN2
Remote Gateway	peer-ip-address	peer-ip-address
Phase 2 Subnet	192.168.1.0/24	192.168.1.0/24

When WAN1 fails, the IPsec tunnel through WAN1 drops, and the tunnel through WAN2 establishes automatically (when DPD - Dead Peer Detection - is enabled).

OpenVPN

OpenVPN supports several failover approaches:

Gateway Group assignment to OpenVPN interface - the OpenVPN server or client binds to a specific WAN. For failover, create two OpenVPN instances on different WANs.
Floating IP - when using a CARP VIP as the OpenVPN server address, switchover occurs at the CARP level.
Client-side failover - in the OpenVPN client configuration, specify multiple remote directives with different servers. The client automatically connects to the next server when the connection drops.
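Client-side failover can be illustrated with a short client configuration fragment. The sketch below writes one to a file; the hostnames are placeholders, here assumed to resolve to the server's two WAN addresses, and the timeout values are arbitrary examples rather than recommendations.

```shell
# Writes an OpenVPN client config fragment with two remote entries.
# Hostnames and timeout values are placeholders for illustration.
cat > client-failover.conf <<'EOF'
client
dev tun
proto udp
# Remotes are tried in the order listed.
remote vpn-wan1.example.com 1194
remote vpn-wan2.example.com 1194
# Seconds to wait between connection attempts.
connect-retry 5
# Seconds to wait for a server response before trying the next remote.
server-poll-timeout 10
resolv-retry infinite
EOF
```

Certificates, keys, and cipher settings are omitted and must match the server configuration.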

Troubleshooting

Failover Not Triggering

Symptom: the primary link is down, but traffic does not switch to the backup.

Checks:

Gateway status - check Status > Gateways. If the primary gateway still shows Online, the issue is in monitoring settings:
- Verify that the Monitor IP is correct and stops responding when the primary link fails.
- Confirm that probes to the Monitor IP cannot reach it through the backup link. If they can, the primary gateway keeps showing Online.
Gateway Group - verify gateways are assigned to the correct tiers (primary on Tier 1, backup on Tier 2).
Firewall rule - confirm the Gateway Group is assigned in the firewall rule.
Trigger Level - verify the Trigger Level matches the failure type (for example, Member Down does not react to high loss, only to complete failure).

Slow Failure Detection

Symptom: switchover to the backup link takes several minutes.

Causes and solutions:

Time Period too large - reduce to 30 seconds. dpinger averages loss and latency over the Time Period, so a longer window takes longer to cross a threshold.
Thresholds too high - if the Down threshold is set to 50%, the gateway must lose 50% of packets before being marked as Down. The recommended value is 10%.
Monitor IP on ISP gateway - the ISP gateway may continue responding despite upstream connectivity loss. Replace with a public DNS server.

False Positives

Symptom: failover triggers while the primary link is operational.

Causes and solutions:

Monitor IP overloaded - if the Monitor IP (such as 8.8.8.8) temporarily stops responding due to rate limiting or congestion, dpinger records losses. Use a less congested Monitor IP or increase thresholds.
Time Period too small - transient network fluctuations cause switchover. Increase the Time Period to 60-120 seconds.
Probe Interval too aggressive - at 1-second intervals with an unstable link, losses accumulate quickly. Increase to 2-3 seconds.

Flapping (Cyclic Switching)

Symptom: traffic constantly switches between primary and backup links.

Cause: the primary link is unstable - it periodically recovers and fails.

Solution:

Increase the Time Period to 120-180 seconds for a longer stability window before returning.
Increase the Down threshold to reduce sensitivity.
If primary link instability cannot be resolved, consider transitioning to a load balancing configuration where both links are used simultaneously.
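One way to confirm flapping before changing settings is to count how often each gateway raised an alarm in the gateway log. A minimal sketch, assuming log lines shaped like the abbreviated dpinger entries shown in this guide; real syslog lines carry a timestamp and process ID prefix, so the field positions would need adjusting. `alarm_counts` is a hypothetical helper name.

```shell
# Counts "Alarm" events per gateway in dpinger log lines read from stdin.
# Line layout assumed: "dpinger: <gateway> <monitor_ip>: <Alarm|Clear> ..."
alarm_counts() {
    awk '$1 == "dpinger:" && $4 == "Alarm" { n[$2]++ }
         END { for (gw in n) print gw, n[gw] }'
}

alarm_counts <<'EOF'
dpinger: WAN1_DHCP 8.8.8.8: Alarm latency 0us stddev 0us loss 100%
dpinger: WAN1_DHCP 8.8.8.8: Clear latency 5432us stddev 312us loss 0%
dpinger: WAN1_DHCP 8.8.8.8: Alarm latency 0us stddev 0us loss 100%
EOF
```

A high count over a short period points to an unstable link rather than a single outage.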

Monitoring and Alerting

pfSense logs gateway switchover events in the system journal. For timely incident response, external monitoring is recommended.

System Logs

Failover events are recorded under Status > System Logs > Gateways. Typical entries:

dpinger: WAN1_DHCP 8.8.8.8: Alarm latency 0us stddev 0us loss 100%
dpinger: WAN1_DHCP 8.8.8.8: Clear latency 5432us stddev 312us loss 0%
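For scripted processing, entries of this shape can be reduced to the fields that matter. The sketch below assumes exactly the abbreviated layout shown above (no syslog timestamp or process ID prefix) and uses a hypothetical function name, `parse_dpinger`.

```shell
# Extracts gateway, Monitor IP, event, and loss from dpinger log lines.
# Assumes the layout "dpinger: <gateway> <monitor_ip>: <event> ... loss <n>%".
parse_dpinger() {
    awk '$1 == "dpinger:" {
        ip = $3; sub(/:$/, "", ip)      # strip the trailing colon
        print $2, ip, $4, "loss=" $NF   # the last field is the loss value
    }'
}

parse_dpinger <<'EOF'
dpinger: WAN1_DHCP 8.8.8.8: Alarm latency 0us stddev 0us loss 100%
dpinger: WAN1_DHCP 8.8.8.8: Clear latency 5432us stddev 312us loss 0%
EOF
```

An Alarm entry marks the point where a threshold was crossed, and the following Clear entry marks recovery.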

SNMP Monitoring

pfSense supports SNMP for external monitoring. Gateway status is available through SNMP OIDs. SNMP configuration is under Services > SNMP.

Syslog

To send logs to an external syslog server (Wazuh, ELK, Graylog), configure remote syslog under Status > System Logs > Settings, in the Remote Logging Options section.

Related Sections

Multi-WAN Load Balancing - configuring simultaneous use of multiple links
Outbound NAT - configuring NAT for correct failover operation
IPsec VPN - configuring IPsec tunnels with Multi-WAN failover

Last updated on 7 April 2026
