What You Will Build
Section 1: Pure L3 Data Center
Design a spine-leaf IPv6 fabric with unnumbered BGP, ECMP, and <50ms failover.
Section 2: Cloud-Native Router
Bootstrap a Talos and Kubernetes cluster and integrate it with the L3 fabric using Cilium.
Section 3: Networking with eBPF
Develop eBPF-based network functions and achieve kernel-level performance.
Section 4: Service Chains
Chain multiple eBPF network functions (CNFs) into service pipelines with full observability.
Section 5: Foundations
Reference designs, templates, and checklists to solidify production readiness.
Course Highlights
- Full Packet Path Control: Master routing from physical underlay to Kubernetes overlay, with eBPF hooking in between.
- Predictable Failover: Achieve failover within 50ms using BFD and multipath routing, minimizing downtime.
- Line-Rate Performance: Use eBPF to run network functions at NIC-native speeds, handling millions of packets per second.
- Hands-On Learning: Every module includes labs where you break and fix the network, forging real troubleshooting skills.
FAQ
What do you mean by "pure L3" networking?
"Pure L3" means the underlay uses routing exclusively (no VLANs or bridging). Every link between switches/routers is a Layer-3 interface running a routing protocol, which leads to a simpler, more stable network at scale.
Can Cilium really handle ECMP and BGP?
Yes. Cilium can run without an overlay (native/direct routing mode), so pod traffic takes full advantage of ECMP in the underlay. Cilium also includes a BGP control plane (beta) for advertising pod/service routes, or you can run an external BGP agent alongside Cilium. We demonstrate one approach in the labs.
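As one hedged example of the beta feature: in Cilium v1.12+ the BGP control plane is driven by a CiliumBGPPeeringPolicy resource. The names, labels, ASNs, and peer address below are placeholders, and newer Cilium releases replace this CRD with a different resource set, so check the documentation for your version.

```yaml
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack1-leafs            # illustrative name
spec:
  nodeSelector:
    matchLabels:
      rack: rack1              # assumed node label
  virtualRouters:
  - localASN: 65101
    exportPodCIDR: true        # advertise each node's PodCIDR into the fabric
    neighbors:
    - peerAddress: "2001:db8::1/128"   # placeholder ToR peer address
      peerASN: 65000
```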
Why use IPv6 link-local for BGP peering?
Using IPv6 link-local addresses for BGP unnumbered sessions means we don't need to allocate global IPs for inter-router links. It's convenient and aligns with RFC 5549 (since updated by RFC 8950), which allows IPv4 routes to be advertised with IPv6 next hops. It also inherently ties the peering to the specific interface.
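Sketched in FRR syntax (one common implementation of BGP unnumbered; interface name and ASN are placeholders): the neighbor is named by interface rather than by IP, and FRR discovers the peer's link-local address from router advertisements on that link.

```
! BGP unnumbered in FRR: name the interface, not a neighbor IP
router bgp 65101
 neighbor swp1 interface remote-as external
 !
 address-family ipv4 unicast
  ! IPv4 routes are carried with an IPv6 link-local next hop (RFC 5549)
  neighbor swp1 activate
```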
What BFD timers are you using?
We use 50ms send/receive intervals with a multiplier of 3 (so ~150ms detection). This aggressive setting gives fast failover. It's tested in our labs, but in production you'd verify your devices can sustain these rates, or relax the timers if needed.
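These timers can be sketched in FRR's bfdd syntax (peer address, interface, and ASN are placeholders): intervals are given in milliseconds, and tying BFD to the BGP neighbor tears the session down as soon as detection fires.

```
! FRR bfdd: 50ms tx/rx with multiplier 3 gives ~150ms detection
bfd
 peer fe80::1 interface swp1
  receive-interval 50
  transmit-interval 50
  detect-multiplier 3
 !
!
router bgp 65101
 neighbor swp1 bfd   ! drop the BGP session when BFD declares the peer down
```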
Meet the Instructor
Steven Cassamajor
Network Engineer & Instructor
Steven has over 10 years of experience designing data center networks and has been an active contributor to open-source eBPF projects. He created this course to share a practical, hands-on path to mastering modern network infrastructure by blending traditional protocols with cutting-edge eBPF technology.