
*This post was originally published on Medium as part of the Walmart Global Tech Blog and was written by L3AF community member Karan Dalal.

Walmart Global Tech is developing some of the most cutting-edge products in the realm of eBPF under the Linux Foundation Networking (LFN) project, L3AF. Though this blog can be read independently, we recommend reading the three-part series introducing the L3AF project.

“The Traffic Mirroring eBPF package takes advantage of recent innovations in kernel engineering to provide a scalable, efficient, and distributed solution to mirror traffic of interest from a Linux host to a destination of choice.”

With millions of people around the world shopping at Walmart, our platform engineers have developed the traffic mirroring eBPF package to optimize the customer experience and to enable real-time, reliable network security.

It is essential for Walmart to have visibility into how customers interact with our site. We also want to send traffic to out-of-band security and monitoring solutions for content inspection, threat monitoring, and network debugging.

Today, we are open sourcing the traffic mirroring eBPF package, which seamlessly integrates with the L3AF eco-system to automate the process of sending the traffic of interest to a destination of choice.

Requirements for a traffic mirroring solution

One of the most effective places to collect this traffic of interest in the public cloud is the edge proxy servers. However, the edge proxy is also a critical, performance-sensitive hop that handles all the ingress traffic to the site.

Keeping this in mind, below is a high-level summary of the solution requirements:

  • A lightweight, highly performant, and secure solution that can be implemented at the source (i.e., on the edge proxy).
  • A software-based solution that can run on any commodity Linux server, thereby eliminating an additional point of hardware failure and saving on infrastructure cost.
  • Customized filtering capability that allows us to mirror only the traffic we care about to the analytics solution(s).
  • The flexibility to enable or disable the mirroring function dynamically, and to choose on the fly which applications/domains we want visibility into, without disrupting traffic.

Challenges with commercial solutions

Most of the commercial solutions are based on a legacy networking stack and have several limitations. A few of these solutions, and their drawbacks, are listed here:

  • Running a stand-alone agent on the edge proxy that mirrors 100% of the traffic. However, mirroring all of the data would incur significant traffic expenses. We would also incur a licensing cost (charged per NIC that we decide to mirror), and the agent would be an overhead on the resources of the host.
  • Using traffic mirroring services that are offered natively by the public cloud. However, this isn’t a consistent solution as many flavors of the public cloud either do not offer this solution or do not offer the necessary capability to filter the data of interest.

Traffic Mirroring with eBPF

The traffic mirroring eBPF package encapsulates the filtering and mirroring functionalities together.

As seen in the diagram, it does the following:

  • mirrors both the inbound and outbound traffic from the attached host network interface(s).
  • filters the traffic of interest based on one or more custom 5-tuple filters (source address, destination address, source port, destination port, protocol), thereby limiting the bandwidth utilization.
  • sends the traffic of interest to the destination of choice.
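
To make the 5-tuple filtering step concrete, here is a minimal Python sketch of the matching logic. It is only an illustration of the idea and not the package's actual eBPF code: the `Packet` and `Filter` types, the field names, and the convention that an unset field matches anything are all assumptions made for this example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, already-parsed view of a packet's 5-tuple."""
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    proto: str  # e.g. "tcp" or "udp"


@dataclass(frozen=True)
class Filter:
    """One 5-tuple filter. A field left as None matches anything (assumption)."""
    src_net: Optional[str] = None
    dst_net: Optional[str] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    proto: Optional[str] = None

    def matches(self, pkt: Packet) -> bool:
        if self.src_net is not None and ip_address(pkt.src_addr) not in ip_network(self.src_net):
            return False
        if self.dst_net is not None and ip_address(pkt.dst_addr) not in ip_network(self.dst_net):
            return False
        if self.src_port is not None and pkt.src_port != self.src_port:
            return False
        if self.dst_port is not None and pkt.dst_port != self.dst_port:
            return False
        if self.proto is not None and pkt.proto != self.proto:
            return False
        return True


def should_mirror(pkt: Packet, filters: Iterable[Filter]) -> bool:
    """A packet is mirrored if at least one configured filter matches it."""
    return any(f.matches(pkt) for f in filters)


# Example: mirror only TCP packets from one source IP to destination port 8080.
filters = [Filter(src_net="10.0.0.5/32", dst_port=8080, proto="tcp")]
wanted = Packet("10.0.0.5", "192.0.2.10", 40000, 8080, "tcp")
other = Packet("10.0.0.6", "192.0.2.10", 40000, 8080, "tcp")
```

With these filters, `should_mirror(wanted, filters)` is true and `should_mirror(other, filters)` is false, so only the first packet would consume mirroring bandwidth.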

Additionally, given that eBPF is very lightweight, highly performant, and safe, this solution can be implemented at the source (i.e., on the edge proxy). On the edge proxy, we attach the mirroring function to the primary NIC that processes the actual traffic. The program examines every incoming and outgoing packet using the TC hook and matches it against the 5-tuple filters. If the match is successful, it clones the packet and redirects the clone to a secondary NIC, which forwards the traffic to the analytics systems over a GUE tunnel.
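
The clone-and-redirect flow can be modelled in a few lines of Python. This is a simplified user-space sketch, not the kernel program: `process` and `gue_header` are hypothetical helper names, the outer IP and UDP headers that the tunnel adds are omitted, and the assumption that the mirrored payload is an IPv4 packet (protocol number 4) is ours. The 4-byte base header layout (2-bit version, 1-bit control flag, 5-bit header length, 8-bit protocol, 16-bit flags) and UDP port 6080 follow the GUE specification.

```python
import struct
from typing import Callable, List, Tuple

GUE_UDP_PORT = 6080  # UDP destination port registered for GUE
IPPROTO_IPIP = 4     # assumption: the encapsulated payload is an IPv4 packet


def gue_header(proto: int = IPPROTO_IPIP) -> bytes:
    """Build the 4-byte base GUE header for a data message with no optional fields."""
    version, control, hlen = 0, 0, 0
    first_byte = (version << 6) | (control << 5) | hlen
    flags = 0
    return struct.pack("!BBH", first_byte, proto, flags)


def process(packet: bytes, matches: Callable[[bytes], bool]) -> Tuple[bytes, List[bytes]]:
    """Return the untouched original packet plus any mirrored copies.

    The original packet always continues on its normal path; only a clone
    is encapsulated and handed to the mirroring interface.
    """
    mirrored: List[bytes] = []
    if matches(packet):
        clone = bytes(packet)                  # stand-in for the kernel-side clone
        mirrored.append(gue_header() + clone)  # outer IP/UDP headers omitted here
    return packet, mirrored


# Example with a trivial stand-in filter: mirror packets whose first byte is 0x45.
original, copies = process(b"\x45payload", lambda p: p[:1] == b"\x45")
```

The key design point the sketch tries to show is that mirroring never alters or delays the original packet beyond the filter check; a non-matching packet produces no copy at all.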

Out-of-the-box integration with L3AF eco-system

The L3AF eco-system enables enterprises to manage and orchestrate eBPF packages at scale. Using out-of-the-box integration with L3AF, users can leverage simple APIs to:

  • Launch the traffic mirroring program into the running kernel and monitor the health of the program.
  • Run multiple eBPF programs in a chain (sequence) on an interface. For example, we can call an API to configure and run the flow log exporter and traffic mirroring programs in a sequence. The API also allows us to change the sequence dynamically.
  • Update the traffic mirroring filters on the fly. For example, we can call an API to update the port or protocol.
  • Publish the health status as Prometheus metrics and integrate with Alertmanager.
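
The idea of a chain whose order can be changed dynamically can be sketched as follows. This is an illustration only: the program names and the `seq_id` field are hypothetical, and the real L3AF API and its schema should be taken from the project documentation.

```python
from typing import Dict, List


def chain_order(programs: List[Dict]) -> List[str]:
    """Return program names in execution order, lowest seq_id first."""
    return [p["name"] for p in sorted(programs, key=lambda p: p["seq_id"])]


def resequence(programs: List[Dict], new_order: List[str]) -> List[Dict]:
    """Return a copy of the chain with seq_id reassigned to follow new_order."""
    by_name = {p["name"]: p for p in programs}
    if len(new_order) != len(by_name) or set(new_order) != set(by_name):
        raise ValueError("new_order must list every program exactly once")
    return [{**by_name[name], "seq_id": i} for i, name in enumerate(new_order, start=1)]


# Hypothetical chain on one interface: flow log exporter first, then mirroring.
chain = [
    {"name": "traffic-mirroring", "seq_id": 2},
    {"name": "flow-exporter", "seq_id": 1},
]
swapped = resequence(chain, ["traffic-mirroring", "flow-exporter"])
```

Here `chain_order(chain)` lists the flow exporter first, while `chain_order(swapped)` lists traffic mirroring first; the input list is left unmodified, which mirrors the idea of submitting a new desired configuration rather than editing the running one in place.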

Try it!

The L3AF dev environment is a virtual machine environment that allows users to develop, test, or just try out packages from the eBPF package repository. Detailed instructions on how to set up the L3AF development environment via Vagrant or on a standalone Linux machine/VM are available here. Once the development environment is set up, you can see that we use one host interface for the data plane and another host interface for forwarding the mirrored traffic. As mirrored packets are forwarded to the collector via GUE, we create a GUE tunnel on both the host’s “Interface 2” and the collector’s “Interface 1”. This is also explained in the diagram above. To learn more about how the GUE tunnel is set up, you can read the documentation here.

To orchestrate and compose the traffic mirroring program, we can use either the Swagger UI or an HTTP POST request. The request payload contains the configuration details of the development environment (e.g., the IP of the collector’s tunnel interface, the gateway IP, whether to mirror ingress and/or egress traffic, etc.) and the traffic mirroring program’s filter parameters (e.g., only mirror TCP packets originating from a particular source IP and whose destination port is 8080).
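
As a rough illustration of what such a request might look like, the Python sketch below builds (but does not send) an HTTP POST with a JSON payload. Everything specific in it is a placeholder: the URL, the field names, and the IP addresses are invented for this example and do not reflect the actual L3AF API schema, which is described in the project documentation and the Swagger UI.

```python
import json
import urllib.request

# Placeholder endpoint: substitute the real API URL from the L3AF documentation.
API_URL = "http://l3afd.example:8080/configs"

# Hypothetical payload: environment details plus the mirroring filter.
payload = {
    "collector_tunnel_ip": "198.51.100.20",  # IP of the collector's tunnel interface
    "gateway_ip": "192.0.2.1",
    "mirror_ingress": True,
    "mirror_egress": False,
    "filter": {
        "src_address": "192.0.2.50",  # only packets from this source IP
        "dst_port": 8080,             # ... destined to port 8080
        "protocol": "tcp",            # ... over TCP
    },
}

request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending is left commented out because the endpoint above is a placeholder:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Updating a filter on the fly would then amount to posting a new payload with, for example, a different `dst_port` or `protocol` value.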

Once the traffic mirroring program is loaded, you can observe the mirrored traffic by running a packet capture on the collector’s “Interface 1”; the captured packets should match the traffic mirroring filter parameters you specified. Alternatively, you can use the Grafana dashboard for the host’s “Interface 2” to monitor the metrics pertaining to mirrored packets.

The L3AF Project is in the early adoption phase, and we welcome your questions on our GitHub repository. Contributions are always welcome!

This blog is written with inputs from Jay Sheth and Santhosh Fernandes, who are engineers on the L3AF Project at Walmart.
