
Meraki Load Balancing For Multi-WAN Environments

John Ciarlone
12 minute read

If you run Meraki MX appliances with more than one WAN circuit, you are not just turning on an extra pipe. You are handing real control to the MX decision engine and asking it to place flows on the best available path, in real time, based on how your links behave.

That can work very well, or it can create surprises. The difference comes down to understanding that Meraki load balancing is flow-based, SLA aware, and tightly tied to circuit quality. It is not link aggregation, and it does not magically turn two links into one big one for a single stream.

This article walks through how load balancing actually works in MX appliances, how SD-WAN decisions are made, and what IT teams should watch in real deployments.

Where Meraki Load Balancing Actually Delivers Value

Meraki MX appliances support dual active WAN uplinks. When you enable load balancing, the MX spreads outbound traffic across those uplinks at the session level, not packet by packet.

The value you get is not pure throughput. It is a mix of:

  • Better utilization across circuits over time.

  • Smoother handling of bursts and busy hours.

  • Faster adaptation to link problems, without manual routing changes.

You see the benefits most in environments with many concurrent flows and mixed application types. A branch running POS, guest Wi-Fi, SaaS, and Auto VPN will usually get cleaner results than a site dominated by one backup stream.

To get realistic expectations, it helps to think of MX as a flow director that is constantly measuring links and applying your policies, not a simple round robin between WAN1 and WAN2.

Meraki’s Session-Based Decision Engine

The core behavior is simple: one flow, one uplink. When a new session starts, the MX:

  1. Checks which uplinks are available.

  2. Looks at recent latency, jitter, and loss measurements.

  3. Applies SD-WAN and flow preference policies.

  4. Assigns that flow to a WAN interface.

Both directions of that session stay on the same uplink for its life. That preserves symmetry for NAT and stateful inspection and avoids out-of-order delivery for sensitive applications.
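
A minimal Python sketch of that flow-pinning idea, purely illustrative and not Meraki's actual implementation: a new 5-tuple is scored once against current link metrics, and the choice is cached so every later packet in that session uses the same uplink.

    # Illustrative sketch of flow-based uplink selection (not Meraki's code).
    # A new 5-tuple is scored against current link metrics once, then cached
    # so both directions of the session stay on the same uplink.
    from dataclasses import dataclass

    @dataclass
    class UplinkStats:
        name: str
        latency_ms: float
        jitter_ms: float
        loss_pct: float
        up: bool

    flow_table: dict[tuple, str] = {}  # 5-tuple -> assigned uplink

    def pick_uplink(flow: tuple, uplinks: list[UplinkStats]) -> str:
        """Return the uplink already assigned to this flow, or choose one now."""
        if flow in flow_table:
            return flow_table[flow]      # existing sessions stay pinned
        candidates = [u for u in uplinks if u.up]
        # Placeholder scoring: lower latency, jitter, and loss win.
        best = min(candidates, key=lambda u: u.latency_ms + u.jitter_ms + 10 * u.loss_pct)
        flow_table[flow] = best.name
        return best.name

    wan1 = UplinkStats("wan1", latency_ms=18, jitter_ms=2, loss_pct=0.1, up=True)
    wan2 = UplinkStats("wan2", latency_ms=35, jitter_ms=8, loss_pct=0.5, up=True)
    flow = ("10.0.10.5", "52.1.2.3", 51544, 443, "tcp")
    print(pick_uplink(flow, [wan1, wan2]))  # -> wan1, and it stays there for the session's life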

Traffic Classification Rules That Influence WAN Choice

Meraki does not make all traffic equal. You can define rules that influence path choice based on:

  • Application category: Voice, collaboration, web, file sharing, etc.

  • Source and destination: VLANs, subnets, data centers, or SaaS ranges.

  • Markings and ports: DSCP values or traditional L3 and L4 matches.

These rules sit on top of the default flow-based logic. If a rule says that voice should prefer the lowest latency path and your performance classes agree, those flows will land on the cleaner circuit even if that leaves utilization less evenly split.

The result is a mix of automation and intent. The MX does the heavy lifting, but you decide which traffic deserves the best paths.
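
As a rough mental model, the rule layer is an ordered match that runs before the default balancing logic. The sketch below is illustrative only; the subnet, DSCP value, and uplink names are placeholders, and this is not how the MX classifier is implemented internally.

    # Illustrative rule layer on top of default flow balancing (not Meraki code).
    # Rules are checked in order; the first match pins the flow to its preferred
    # uplink, otherwise the default balancing logic decides.
    from dataclasses import dataclass
    from ipaddress import ip_address, ip_network
    from typing import Optional

    @dataclass
    class FlowPreference:
        name: str
        src_subnet: Optional[str] = None   # e.g. a voice VLAN
        dscp: Optional[int] = None         # e.g. 46 (EF) for voice
        dst_port: Optional[int] = None
        preferred_uplink: str = "wan1"

    def match_preference(src_ip: str, dscp: int, dst_port: int,
                         rules: list[FlowPreference]) -> Optional[str]:
        for r in rules:
            if r.src_subnet and ip_address(src_ip) not in ip_network(r.src_subnet):
                continue
            if r.dscp is not None and dscp != r.dscp:
                continue
            if r.dst_port is not None and dst_port != r.dst_port:
                continue
            return r.preferred_uplink
        return None  # no rule matched; fall back to default flow balancing

    rules = [FlowPreference("voice", src_subnet="10.0.20.0/24", dscp=46, preferred_uplink="wan1")]
    print(match_preference("10.0.20.15", dscp=46, dst_port=5061, rules=rules))  # -> wan1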

How Adaptive Path Selection Interacts With Load Balancing

Adaptive path selection turns "dual WAN" into SD-WAN.

You define performance classes that describe acceptable latency, jitter, and loss for different traffic types. The MX continuously probes each path, compares live metrics to these classes, and uses that data when placing new flows.

If an uplink crosses an SLA threshold, the MX will:

  • Keep existing sessions on that uplink if they are still usable.

  • Steer new flows for that class to the healthier uplink.

This keeps user impact lower than tearing down sessions aggressively, while still protecting new traffic.

In Auto VPN, the same logic applies to encrypted tunnels. Multiple tunnels between the same sites can remain active. Flows can be balanced across them while still respecting performance targets.
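
The eligibility idea can be sketched in a few lines. The thresholds and metric values below are invented for illustration; in practice they come from your performance classes and the MX's own probes.

    # Illustrative class-aware eligibility check (not Meraki's implementation).
    # New flows in a class only consider uplinks that currently meet the class
    # thresholds; flows already in progress stay put unless their path fails.
    from dataclasses import dataclass

    @dataclass
    class PerformanceClass:
        name: str
        max_latency_ms: float
        max_jitter_ms: float
        max_loss_pct: float

    def compliant_uplinks(pclass: PerformanceClass, uplinks: list[dict]) -> list[str]:
        """Return names of uplinks whose live metrics satisfy the class thresholds."""
        return [u["name"] for u in uplinks
                if u["up"]
                and u["latency_ms"] <= pclass.max_latency_ms
                and u["jitter_ms"] <= pclass.max_jitter_ms
                and u["loss_pct"] <= pclass.max_loss_pct]

    voice = PerformanceClass("voice", max_latency_ms=100, max_jitter_ms=20, max_loss_pct=1.0)
    uplinks = [
        {"name": "wan1", "up": True, "latency_ms": 22, "jitter_ms": 3, "loss_pct": 0.2},
        {"name": "wan2", "up": True, "latency_ms": 140, "jitter_ms": 35, "loss_pct": 2.5},
    ]
    print(compliant_uplinks(voice, uplinks))  # -> ['wan1']; new voice flows avoid wan2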

Failure Detection Timers That Matter

Failure detection is where theory and production often disagree. The MX relies on timers and probes to decide whether a path is:

  • Up and healthy.

  • Degraded, but still technically up.

  • Down and unusable.

You control how quickly the MX calls a path bad, and how long it must look good again before traffic returns. Short timers give faster failover but can cause flapping if links are marginal. Longer timers avoid churn but can extend brownout periods.

Tuning these values based on real link behavior is critical. Copying aggressive lab settings into a noisy broadband environment is a reliable way to cause instability.
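
The trade-off is easier to reason about as hold-down logic. The sketch below uses made-up probe counts rather than Meraki's actual timer values, but it shows why short thresholds flap on marginal links while long ones stretch out brownouts.

    # Illustrative hold-down logic (not Meraki's actual timers). A path must fail
    # several probes in a row before it is declared down, and must stay clean for
    # a longer run before traffic is allowed back.
    class PathHealth:
        def __init__(self, fail_threshold: int = 3, recover_threshold: int = 10):
            self.fail_threshold = fail_threshold        # consecutive bad probes -> down
            self.recover_threshold = recover_threshold  # consecutive good probes -> up
            self.bad = 0
            self.good = 0
            self.usable = True

        def record_probe(self, ok: bool) -> bool:
            if ok:
                self.good += 1
                self.bad = 0
                if not self.usable and self.good >= self.recover_threshold:
                    self.usable = True      # fail back only after sustained health
            else:
                self.bad += 1
                self.good = 0
                if self.usable and self.bad >= self.fail_threshold:
                    self.usable = False     # declare the path down
            return self.usable

    wan2 = PathHealth()
    for ok in [True, False, False, False, True, True]:
        print(wan2.record_probe(ok))  # drops to False after three misses, stays down until ten clean probes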

Architecting Multi-WAN Designs With Real Constraints in Mind

Once you understand how the MX makes decisions, you can design around real constraints instead of pretending every circuit is equal.

In production, links are rarely symmetric. You might be mixing:

  • Business fiber with best effort broadband.

  • MPLS with DIA.

  • One ISP that uses CGNAT paired with another that hands you clean public IP space.

Each mix has trade-offs that show up in load-balancing behavior.

When Asymmetric Circuits Become a Liability

Asymmetric circuits can hurt more than they help if you ignore their differences.

Typical problems include:

  • Latency gaps: MPLS with low jitter paired with high latency broadband can ruin voice and VDI if flows land poorly.

  • Inconsistent return paths: Asymmetric routing upstream can cause return traffic to hit the wrong MX uplink, breaking session state.

  • Bandwidth mismatch: A small circuit paired with a large one will not produce even-looking graphs unless your traffic mix is very granular.

If you cannot avoid asymmetry, treat circuits differently:

  • Pin latency-sensitive traffic to the cleaner link.

  • Use the cheaper or noisier link for SaaS, web, and bulk transfers.

  • Accept that "perfectly split" utilization is not the goal.

Load Balancing in Full-Tunnel vs. Split-Tunnel VPN Designs

Your Auto VPN design changes how dual WAN feels.

  • Full tunnel:

    • Branches send most traffic to a hub.

    • You are balancing encrypted tunnels and their flows on both the branch and the hub uplinks.

    • Path quality is an end-to-end story, not a single link metric.

  • Split tunnel:

    • Only selected internal routes use VPN.

    • Internet traffic breaks out locally and is balanced across local circuits.

    • The benefits of dual WAN are very visible for SaaS and web, while internal apps follow stricter policies.

In both models, you should test how tunnels behave when an uplink degrades, fails, and comes back. Many issues show up not when things go down, but when they return in a half-healthy state.

Working Within ISP Constraints and BGP-less Edge Designs

Most MX deployments do not run BGP at the edge. That simplifies the design but limits what you can influence upstream.

The MX:

  • Owns outbound path selection.

  • Relies on NAT and stable return paths to keep flows symmetric.

  • Cannot easily change how ISPs choose routes between each other.

This makes it important to:

  • Understand each ISP's NAT and shaping behavior.

  • Keep internal routing simple and consistent toward the MX.

  • Avoid clever designs that bypass the MX for some flows and not others.

The cleaner your edge, the more predictable MX load balancing becomes.

Dashboard Configuration Flow and Validation Steps

The Meraki Dashboard makes dual WAN easy to enable. Production teams still follow a sequence so they can predict and troubleshoot the results.

Preparing WAN Circuits and Probing Targets

Before you toggle anything:

  • Verify each uplink's basic stability with simple tests.

  • Confirm DNS resolution and default probe targets are reachable on both circuits.

  • Note busy hour latency, jitter, and loss, so your performance classes are grounded in reality.

If one ISP uses aggressive rate limiting, CGNAT, or odd routing, document it. The MX will still work, but your expectations should match what the carrier actually delivers.
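
A quick way to capture that baseline from a test host behind each uplink is to wrap the system ping and record latency, jitter (standard deviation as a proxy), and loss. The target, sample count, and Linux/macOS ping flags below are assumptions to adjust for your environment; repeat the run during busy hours on each circuit.

    # Quick circuit baseline from a test host: wrap the system ping (Linux/macOS
    # "-c" flag) and summarize latency, jitter (stddev as a proxy), and loss.
    import re
    import statistics
    import subprocess

    def baseline(target: str = "8.8.8.8", count: int = 50) -> dict:
        out = subprocess.run(["ping", "-c", str(count), target],
                             capture_output=True, text=True).stdout
        rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
        loss = 100.0 * (count - len(rtts)) / count
        return {
            "avg_ms": round(statistics.mean(rtts), 1) if rtts else None,
            "jitter_ms": round(statistics.stdev(rtts), 1) if len(rtts) > 1 else 0.0,
            "loss_pct": round(loss, 1),
        }

    print(baseline())  # e.g. {'avg_ms': 21.4, 'jitter_ms': 2.3, 'loss_pct': 0.0}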

Enabling Load Balancing and Assigning WAN Preferences

In Security & SD-WAN > SD-WAN & traffic shaping:

  • Enable load balancing across WAN1 and WAN2.

  • Create performance classes that match your key application types.

  • Add flow preferences for important traffic, such as:

    • Voice and collaboration.

    • Specific branch to data center routes.

    • Sensitive VLANs or departments. 

Make one change at a time. Treat configuration like code: document what you touched and what you expected to see.
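
Treating configuration like code can be literal if you manage many networks through the Dashboard API. Below is a minimal sketch using the meraki Python SDK and the v1 uplinkSelection endpoint; the network ID and voice subnet are placeholders, and the field names follow the published API reference but should be verified against current documentation.

    # Sketch: enable load balancing and pin a voice VLAN to WAN1 via the
    # Dashboard API (meraki Python SDK, v1 uplinkSelection endpoint).
    # Network ID, subnet, and filter values are placeholders; verify the exact
    # schema against the current API reference before use.
    import meraki

    dashboard = meraki.DashboardAPI()   # reads MERAKI_DASHBOARD_API_KEY from the environment
    network_id = "N_1234567890"         # placeholder

    dashboard.appliance.updateNetworkApplianceTrafficShapingUplinkSelection(
        network_id,
        loadBalancingEnabled=True,
        defaultUplink="wan1",
        wanTrafficUplinkPreferences=[
            {
                "trafficFilters": [
                    {
                        "type": "custom",
                        "value": {
                            "protocol": "udp",
                            "source": {"cidr": "10.0.20.0/24", "port": "any"},   # voice VLAN (placeholder)
                            "destination": {"cidr": "any", "port": "any"},
                        },
                    }
                ],
                "preferredUplink": "wan1",
            }
        ],
    )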

Verifying Session Distribution and Failover Logic

After configuration, validate behavior:

  • Check session tables and WAN graphs to confirm both circuits are actually used.

  • Correlate application usage with uplink utilization.

  • Trigger controlled failover and failback events, and watch how long real sessions survive.

You want to confirm that the MX is making decisions for the reasons you expect, not just that it is sending some traffic out of each link.
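
One way to check the "reasons" part is to pull the same per-uplink loss and latency data the MX reports and compare it with where flows are landing. A hedged sketch using the meraki Python SDK's uplinksLossAndLatency endpoint; the organization ID is a placeholder and the response field names should be confirmed against the current API reference.

    # Sketch: pull recent per-uplink loss and latency from the Dashboard API
    # (meraki Python SDK). Org ID is a placeholder; confirm response fields
    # against the current API reference.
    import meraki

    dashboard = meraki.DashboardAPI()
    org_id = "123456"   # placeholder

    stats = dashboard.organizations.getOrganizationDevicesUplinksLossAndLatency(
        org_id, timespan=300)   # last five minutes of probe data

    for entry in stats:
        series = entry.get("timeSeries", [])
        if not series:
            continue
        latest = series[-1]
        print(f'{entry["serial"]} {entry["uplink"]}: '
              f'{latest.get("latencyMs")} ms, {latest.get("lossPercent")}% loss')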

Common Deployment Problems and How IT Teams Resolve Them

In the field, a handful of patterns make up most dual WAN tickets. The four below account for nearly all of them.

“Traffic Isn’t Splitting The Way We Expected”

This often comes down to:

  • A few large flows dominate usage.

  • Overly broad preferences pinning too much to one link.

  • Hidden ISP shaping that skews utilization.

Resolution steps:

  • Compare per-application usage with uplink graphs.

  • Tighten or narrow flow preferences.

  • Decide whether the secondary link is there for true balancing or mainly for resiliency and offload.

“Failover Works, But Failback Breaks Sessions”

Symptoms:

  • Users are fine during an outage, then complain when the primary comes back.

  • WAN1 and WAN2 status flaps in the event log.

  • Certain apps drop at the moment of failback.

Typical fixes:

  • Increase recovery timers so the MX waits longer before returning traffic.

  • For critical traffic, delay failback until a maintenance window.

  • Stabilize the primary circuit or demote it if it is chronically marginal.

“Auto VPN Performance Drops Under Load Balancing”

You may see:

  • Higher latency for some branches after enabling balancing.

  • Voice or VDI degradation even though tunnels stay up.

  • MX CPU or encryption offload reaching higher utilization.

Mitigation:

  • Tighten performance classes for VPN traffic that carries real-time applications.

  • Pin specific VPN hubs or networks to the lower-latency uplink.

  • Confirm that MX models are sized correctly for the number of active tunnels and throughput.

“Applications See Random Disconnects”

These are usually hardest on users and most frustrating for engineers.

Root causes often include:

  • Asymmetric return paths from cloud or data center services.

  • Different NAT timeout values across providers.

  • Aggressive firewall idle timers that do not match app behavior.

Resolution pattern:

  • Capture traffic on both uplinks and compare.

  • Align NAT and firewall timeouts with application requirements.

  • Use flow preferences and routing to keep critical traffic on the path with the most stable behavior.

Hardening the Network When Running Active-Active WAN

Once both uplinks carry production traffic, your exposure and complexity grow. Hardening should evolve with deployment, not trail behind it.

Reducing Attack Surface Across Dual WANs

Inventory which services are reachable on each uplink and on which public IPs. Then:

  • Remove or lock down anything that does not need inbound access.

  • Keep inbound rules consistent across both circuits so behavior does not change if the MX shifts flows.

Monitoring for Drift and Performance Decay

Circuits age. ISPs change policies quietly. To avoid surprises:

  • Review WAN health, SD-WAN events, and performance class violations on a regular schedule.

  • Watch for slow increases in jitter and loss that never cross a hard outage threshold but still hurt users.
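
A simple scheduled comparison against the baseline you captured before enablement catches that slow decay. The thresholds and data shapes below are illustrative; the point is to alarm on relative change, not just hard outages.

    # Illustrative drift check: compare a fresh busy-hour sample to the stored
    # baseline for each uplink and flag slow decay before it trips a hard SLA.
    def drift_alerts(baseline: dict, current: dict,
                     jitter_factor: float = 2.0, loss_delta_pct: float = 1.0) -> list[str]:
        alerts = []
        for uplink, base in baseline.items():
            now = current[uplink]
            if now["jitter_ms"] > base["jitter_ms"] * jitter_factor:
                alerts.append(f"{uplink}: jitter {base['jitter_ms']} -> {now['jitter_ms']} ms")
            if now["loss_pct"] - base["loss_pct"] > loss_delta_pct:
                alerts.append(f"{uplink}: loss {base['loss_pct']} -> {now['loss_pct']} %")
        return alerts

    baseline = {"wan1": {"jitter_ms": 2.0, "loss_pct": 0.1}, "wan2": {"jitter_ms": 6.0, "loss_pct": 0.4}}
    current  = {"wan1": {"jitter_ms": 2.4, "loss_pct": 0.1}, "wan2": {"jitter_ms": 15.0, "loss_pct": 1.8}}
    print(drift_alerts(baseline, current))  # flags wan2 even though it never went "down"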

Controlling Inbound Exposure Across Dual WANs

Do not assume your "safer" ISP will always handle more security.

  • Normalize MX firewall and IDS/IPS policies across uplinks.

  • Treat the least filtered carrier as the baseline risk and harden to that standard.

Normalizing Carrier-Level Security Gaps

If one ISP drops obvious attacks and another passes them through, the MX should be configured to absorb the difference:

  • Apply equivalent IDS/IPS, content filtering, and geo-blocking.

  • Keep policy templates consistent across sites to avoid drift.

Hardening DNS and WAN Probing Behavior

Because MX uses probes and, in many cases, DNS to judge link health:

  • Protect DNS resolvers used by the MX.

  • Ensure probe targets are not rate-limited or filtered in ways that confuse health checks.

  • Treat probe traffic as part of the control plane, not as background noise.

Preventing Asymmetric Path Bypass

Make sure all outbound flows that should be tracked by MX actually pass through it.

  • Avoid parallel exits that bypass MX and cause state or NAT inconsistency.

  • Document any exception paths clearly and test them under failover conditions.

Validating NAT and Port-State Behavior Post-Hardening

After tightening policies:

  • Test long-lived sessions on both uplinks.

  • Confirm idle and hard timeouts match application needs.

  • Validate that NAT behavior is stable across failover and failback.
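
A crude but useful test for the first two points is to hold a TCP session idle past the suspected NAT or firewall timeout and then try to use it. The endpoint, port, and idle time below are placeholders, and the result is only meaningful against a test service you control that answers when poked.

    # Sketch: hold a TCP session idle past the suspected NAT/firewall timeout,
    # then try to use it. Run per uplink; host, port, and idle time are
    # placeholders for a test endpoint you control that responds when poked.
    import socket
    import time

    def idle_session_survives(host: str, port: int, idle_seconds: int) -> bool:
        s = socket.create_connection((host, port), timeout=10)
        try:
            time.sleep(idle_seconds)      # sit idle longer than the timeout under test
            s.settimeout(10)
            s.sendall(b"\r\n")            # poke the connection
            return s.recv(1) != b""       # reset, timeout, or empty read -> state was lost
        except OSError:
            return False
        finally:
            s.close()

    # Example: does a 20-minute idle session survive on this path?
    # print(idle_session_survives("test.example.com", 5000, idle_seconds=1200))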

Best Practices IT Teams Repeat Across Successful Deployments

Across many Meraki MX rollouts, the same set of habits shows up in every stable environment:

  • Symmetric circuits: When possible, choose WAN links with similar bandwidth and latency so MX decisions, performance, and troubleshooting stay predictable.

  • Baseline first: Measure each circuit’s latency, jitter, and loss during busy hours before you enable load balancing so your expectations and thresholds are grounded in reality.

  • Realistic classes: Define performance classes that match how your key apps behave (voice, VDI, SaaS, bulk) and map them so the MX can make intelligent path choices.

  • Intentional flow prefs: Use flow preferences to protect critical traffic on the cleanest path instead of forcing a 50/50 split that may hurt user experience.

  • Test failover + failback: Run planned tests with real or realistic traffic so you see how apps behave when links drop and when they come back, not just when you pull a cable.

  • Review regularly: Check SD-WAN events, path choices, and circuit health on a schedule, so you catch drift and ISP issues before they turn into user-impacting outages.

Load Balancing Isn’t a Shortcut, It’s a Design Choice

Meraki load balancing does not fix bad circuits or sloppy designs. It amplifies whatever foundation you give it.

If you feed the MX honest performance data, sensible policies, and well-understood circuits, it will give you smoother traffic distribution and better resiliency. If you treat dual WAN as a checkbox, it will give you exactly that level of reliability.

Optimize, secure, and modernize your Meraki WAN with Hummingbird’s certified guidance.

FAQs

Does Meraki support packet-based load balancing?

No. Meraki MX appliances use flow-based load balancing, not per-packet distribution. Each new session is assigned to a WAN uplink based on real-time link performance and policy logic. This ensures session integrity for applications sensitive to out-of-order delivery.

Can the MX distribute VPN traffic across multiple WAN uplinks?

Yes. Auto VPN establishes active-active tunnels across all available uplinks. Traffic is balanced per flow across these tunnels based on uplink performance and SD-WAN policies, including SLA thresholds for latency, jitter, and loss.

Does changing WAN priority force flows to move immediately?

No. Existing flows maintain their assigned uplink to preserve session integrity. New flows follow the updated uplink priority and SD-WAN policy rules. You can verify path selection in the WAN event logs and the SD-WAN performance dashboard.

Can I force specific applications to use a preferred WAN link?

Yes. Flow preferences allow you to route traffic based on application type, subnet, VPN destination, or VLAN. These rules override default flow balancing and use Meraki’s L7 DPI engine to identify application categories.

How does load balancing behave during partial uplink degradation?

If an uplink’s performance deteriorates but does not fully fail, the MX gradually shifts new flows toward healthier uplinks. Existing flows remain active on their original path unless the link becomes unusable. This prevents unnecessary churn.

Is load balancing supported when using MPLS alongside broadband?

Yes. When MPLS is connected via WAN or LAN, the MX evaluates its performance like any other path. You can assign preferred paths for critical applications (e.g., VoIP) while steering best-effort traffic over broadband circuits.

Does Meraki load balancing work in HA pairs?

Yes. When using a warm-spare pair, both appliances continuously share uplink health information. If a failover occurs, the spare retains the SD-WAN state and continues flow distribution according to established policies.

How does the MX handle asymmetric routing concerns?

Flow-based routing keeps both directions of a session on the same WAN uplink to prevent asymmetric paths. Even with multiple WAN circuits, the MX ensures bidirectional flow consistency unless a failover event forces reassignment.

Can load balancing be combined with traffic shaping?

Absolutely. Traffic shaping rules operate at the queue and bandwidth-allocation level, while load balancing makes path decisions. Together, they allow granular prioritization of real-time traffic and more efficient multi-WAN utilization.
