While software offloading works on almost any device, Hardware Flow Offloading is specific to certain chipsets (like some MediaTek or Marvell units).
If you’ve ever used nftables, you know it’s powerful and flexible. But software filtering still consumes CPU. What if your network card could do the heavy lifting? Enter hardware offload, and the kernel module that makes it work: kmod-nft-offload.
In OpenWrt, the kmod-nft-offload module is typically installed as a dependency of the firewall4 package. However, you may still need to install it manually or verify its presence.
In essence, kmod-nft-offload translates high-level nftables rules into low-level instructions that a network interface card (NIC) or switch's packet processor can understand and execute directly on hardware. By bypassing the main CPU for established connection flows, it dramatically increases throughput and reduces latency. Think of it as a dedicated express lane on a highway: the first few packets of a connection (the "slow path") are handled in software to establish state, but once a connection is established, the remaining packets are seamlessly offloaded to the hardware "fast path" for wire-speed forwarding.
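The slow-path/fast-path split described above can be sketched as an nftables flowtable. This is a minimal illustration, not the ruleset firewall4 generates: the table name `demo`, the flowtable name `ft`, the device names `eth0` and `eth1`, and the output path are all placeholders. The script only writes the file; whether `flags offload` is accepted at load time depends on the driver.

```shell
#!/bin/sh
# Sketch only: table "demo", flowtable "ft", and devices eth0/eth1 are
# hypothetical placeholders. Substitute your own interface names.
RULESET="${RULESET:-/tmp/flowtable-demo.nft}"

write_ruleset() {
    # $1: path of the nftables file to write
    cat > "$1" <<'EOF'
table inet demo {
    flowtable ft {
        hook ingress priority 0
        devices = { eth0, eth1 }
        # Request hardware offload; without this flag the flowtable
        # still works, but only as a software fast path.
        flags offload
    }

    chain forward {
        type filter hook forward priority 0; policy accept;
        # Early packets traverse this chain (slow path); flows added
        # here are handled by the flowtable afterwards (fast path).
        meta l4proto { tcp, udp } flow add @ft
    }
}
EOF
}

write_ruleset "$RULESET"
echo "wrote $RULESET; review it, then load it with: nft -f $RULESET"
```

If the driver cannot offload, loading the file with `flags offload` is expected to fail; removing that line falls back to software flow offloading.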
Most modern OpenWrt builds include it by default if they use firewall4. If yours doesn't, you can install it via the OpenWrt SSH CLI:

    opkg update
    opkg install kmod-nft-offload
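To verify the module's presence, a small check like the following may help. It is a sketch under a few assumptions: the system uses opkg as its package manager, the loaded module is named `nft_flow_offload`, and `MODULES_FILE` is a hypothetical override added here so the check can be pointed at a saved copy of `/proc/modules`.

```shell
#!/bin/sh
# Sketch: report whether the package is installed and the module is loaded.
PKG="kmod-nft-offload"
MOD="nft_flow_offload"   # assumed module name shipped by the package
MODULES_FILE="${MODULES_FILE:-/proc/modules}"   # hypothetical override

pkg_installed() {
    # opkg prints installed packages as "name - version"
    command -v opkg >/dev/null 2>&1 || return 1
    opkg list-installed | grep -q "^$1 - "
}

module_loaded() {
    # Each line of /proc/modules starts with the module name and a space
    [ -r "$MODULES_FILE" ] && grep -q "^$1 " "$MODULES_FILE"
}

if pkg_installed "$PKG"; then
    echo "$PKG: installed"
else
    echo "$PKG: not found via opkg"
fi

if module_loaded "$MOD"; then
    echo "$MOD: loaded"
else
    echo "$MOD: not loaded"
fi
```

A package that is installed but not loaded usually just means no ruleset has referenced a flowtable yet.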
Even with supported hardware, offload can fail silently. Here are common pitfalls:
A rule that matches ct state established cannot easily be offloaded, because the hardware would need to maintain stateful timers. For true offload, use stateless rules or ensure tc can offload the connection tracking (requires advanced hardware with full conntrack offload, like Mellanox ASAP²).
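Because offload can fail silently, it is worth checking whether any flows were actually offloaded. The sketch below counts conntrack entries by tag, relying on the kernel marking software-offloaded flows with `[OFFLOAD]` and hardware-offloaded flows with `[HW_OFFLOAD]` in `/proc/net/nf_conntrack`. It assumes that file is available on your build; `CT_FILE` is a hypothetical override for reading a saved copy.

```shell
#!/bin/sh
# Sketch: count offloaded flows in the conntrack table.
CT_FILE="${CT_FILE:-/proc/net/nf_conntrack}"   # hypothetical override

count_flows() {
    # $1: tag without brackets, e.g. OFFLOAD or HW_OFFLOAD.
    # -F treats the bracketed tag as a literal string, so "[OFFLOAD]"
    # does not match lines tagged "[HW_OFFLOAD]".
    n=$(grep -c -F "[$1]" "$CT_FILE" 2>/dev/null)
    # grep prints nothing if the file is missing; report 0 in that case
    echo "${n:-0}"
}

echo "hardware-offloaded flows: $(count_flows HW_OFFLOAD)"
echo "software-offloaded flows: $(count_flows OFFLOAD)"
```

If traffic is flowing and the hardware count stays at zero while the software count grows, the flowtable is active but the hardware path is not being used.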