Program

Monday, 23rd March 2026 (UTC)

14:00 - 14:30 - Introduction and Awards
14:30 - 15:30 - Keynote 1 (Randy Bush, Arrcus / IIJ): Mind The Gap or Confessions of an Internet Microscopist: Always Learning Lessons the Hard Way
15:30 - 15:50 - Break
15:50 - 16:45 - Routing (Session Chair: Thomas Krenc (IIJ))

Prefix Top Lists Reloaded: A Temporal Prefix Ranking Dataset short

Savvas Kastanakis (University of Twente), Rick Fontein (University of Twente), Shyam Krishna Khadka (University of Twente), Ebrima Jaw (University of Twente), Cristian Hesselman (SIDN Labs, University of Twente), Mattijs Jonker (University of Twente)

Abstract Paper

Abstract: Accurate Internet measurements depend on well-defined targets. A popular mechanism for target selection is domain-based top lists, e.g., the Tranco or Cisco Umbrella lists. Such lists have a few shortcomings such as the lack of aggregation across related domain names and high volatility over time. Prefix Top Lists (\textit{PTL}) were introduced in 2019 to address these issues, by aggregating domain names into IP prefixes and applying a Zipf-based ranking model to improve stability and representativeness, nonetheless, the original \textit{PTL} resource was discontinued, leaving a gap in publicly available prefix-level data. In this replication study, we revive and enhance the \textit{PTL} resource by incorporating a broader range of domain-based top lists. Our approach involves mapping domain names to IP prefixes using DNS resolution and BGP routing data, ranking prefixes through a Zipf-based weighting system, and conducting three use-case studies to promote the applicability of \textit{PTLs}. We release the complete \textit{PTL} toolchain as open-source software and publish weekly \textit{PTL} snapshots under~\url{https://openintel.nl/data/prefix-top-lists}, ensuring sustained, versioned and publicly accessible prefix-level rankings for the measurement community.
Detecting and Characterizing DDoS Scrubbing from Global BGP Routing: Insights from Five Leading Scrubbers long

Shyam Krishna Khadka (University of Twente), Suzan Bayhan (University of Twente), Ralph Holz (University of Munster, University of Twente), Cristian Hesselman (SIDN Labs, University of Twente)

Abstract Paper

Abstract: Many scrubbers use the Border Gateway Protocol (BGP) to route Distributed Denial of Service (DDoS) traffic to their infrastructure, allowing them to drop the DDoS traffic and forward legitimate traffic to the Autonomous Systems (ASes) the scrubber protects. Despite their importance, the prevalence and operational behaviors of BGP-based DDoS scrubbing services remain poorly understood, such as the extent to which protected ASes always have a scrubber on their path or activate a scrubber on-demand when an attack occurs. We bridge this gap by detecting scrubbing activations and deactivations in public BGP data, where they manifest themselves as a scrubber dynamically appearing as the first upstream of an origin AS or as an origin AS for a particular prefix. We use 30 days of BGP data from the RIS route collectors, focusing on the global top five scrubbing providers, such as Cloudflare and Akamai. We also characterize their behavior, including protection modes, on-demand mitigation strategies, and RPKI/IRR practices. We find that prefixes that always use a scrubber are dominant compared to those that activate a scrubber on-demand. We also observe that 48% of the prefixes that scrubbers temporarily originate during an attack are not covered by valid RPKI ROAs (12.5% Invalid and 35.5% Notfound), which highlights a potential operational gap in current scrubbing practices regarding routing security. These insights are conservative because we only consider public BGP data and AS path changes that are most likely to be scrubbing events (e.g., those observed by two or more route collector peers). We believe our work is useful for security researchers and policymakers, for instance, to better understand DDoS protection levels of ASes in a particular country or region.
Routing under siege: how traffic engineering decisions facilitate prefix hijackings long

Renan Barreto (Universidade Federal do Rio Grande), Leandro Bertholdo (Universidade Federal do Rio Grande do Sul), Pedro Marcos (Universidade Federal do Rio Grande)

Abstract Paper

Abstract: The reliability and security of Internet routing are increasingly challenged by applications with strict service requirements, where connectivity and traffic engineering play a central role. While operators apply traffic engineering to optimize performance and resilience, these decisions can inadvertently amplify routing security risks. Existing mechanisms such as BGPSec, RPKI, and ASPA remain insufficient due to limited deployment and inherent technical limitations, leaving open questions about how traffic engineering practices and connectivity affect routing security. To address this, we propose a methodology that combines measurements from both the control and data planes. We use the PEERING Testbed to announce prefixes on the Internet using different traffic engineering techniques, such as AS Path Prepend, more specific announcements, and selective route announcements, and hijack our prefixes to understand the interplay between traffic engineering and prefix hijackings. Our results show that prepending can increase the impact of a hijack from 17\% to 67\%, and that the way an AS connects to other networks---its connectivity structure---can also determine its exposure to prefix hijacks. We further demonstrate that hijacking via more specific prefixes is particularly effective, achieving up to 100% of both control and data plane targets. Based on these findings, we provide a comprehensive view of the current announced address space, showing that 61.4% of the address space may be facing a higher exposure to prefix hijackings due to ASes´ traffic engineering practices.

16:45 - 17:05 - Break
17:05 - 17:45 - Topology (Session Chair: Shuai Hao)

En Unión y Libertad: Subnational Strategies for Hosting Government Services long

Esteban Carisimo (Northwestern University), Mariano G. Beiró (Universidad de San Andres & CONICET), Lukas De Angelis Riva (Universidad de Buenos Aires), Mauricio Buzzone (Universidad de Buenos Aires), Fabián E. Bustamante (Northwestern University)

Abstract Paper

Abstract: We present the first empirical study of subnational hosting strategies, using Argentina’s 24 provinces as a case. Starting from oﬃcial landing pages, we analyze $\approx$1.2k domains (collected Oct 2023 – Apr 2024), classifying serving networks by operational control (sovereign, domestic third-party, global) and examining authoritative DNS and HTTPS deployment. We relate these choices to 31 demographic, economic, technological, and political covariates – associations only, not causal claims. We find substantial heterogeneity: some provinces operate sovereign infrastructure; others rely on domestic incumbents or outsource to global providers. Federal capacity is rarely used, with provinces favoring bespoke or repurposed networks (including utility backbones). Legacy telecom footprints remain strong predictors of hosting choice even within a shared national umbrella. We also observe frequent splits between hosting and nameservers and uneven HTTPS hygiene. Taken together, the study oﬀers a reusable measurement template and benchmarks that make sovereigntyperformance trade-oﬀs measurable below the nation level.
Unpacking Internet Ossification: A Large-Scale Study of Path-Impairing Middleboxes Across IPv4 and IPv6 long

Fahad Hilal (Max Planck Institute for Informatics), Taha Albakour (Max Planck Institute for Informatics), Oliver Gasser (IPinfo), Kevin Vermeulen (CNRS, Ecole Polytechnique)

Abstract Paper

Abstract: The end-to-end principle that limits on-path devices to simple tasks such as forwarding and routing has been one of the backbones of the Internet’s architecture. This is, however, being called into question, as Internet paths now contain devices which inspect, filter, modify or even discard packets. Some of these carry out benign and positive undertakings such as balancing resources and thwarting attacks, while others interfere with packets in unexpected ways leading to broken paths, thus inhibiting the deployment of new protocols or even extensions to existing ones. While Internet ossification has already been studied in prior work, we propose to address new research questions enabled by recent Internet-scale middlebox mapping techniques. Combining Internet-scale measurements, measurements towards popular domains, repeated measurements, and longitudinal measurements, both in IPv6 and IPv4, we provide a multi-dimensional study on path-impairing middleboxes in the Internet. Our findings reveal that six times fewer IPv6 prefixes are affected than IPv4 prefixes by path-impairing middleboxes, and that there is an opportunity to switch between IPv4 and IPv6 to evade path-impairing middleboxes. Looking into the nature of path-impairments, we find that up to 87% relate to the usage of Multipath TCP. We also present the first results about the dynamics of these middleboxes, at both short (over hours), and long (over years) time windows. We show that path-impairing middleboxes have a consistent behavior over hours and that their number has tripled since 2022 for IPv6. We complement our measurements with operator perspectives and outline a service designed to help operators uncover and address unintentional path-impairments in their networks. Finally, we highlight default configurations as one potential contributor to path-impairments.

Tuesday, 24th March 2026 (UTC)

14:00 - 14:35 - DNS (Session Chair: Orlando E. Martinez-Durive (IMDEA Networks / Net AI))

Black Holes and Prisoners: Understanding AS112 Deployment Characteristics short

Elizabeth Boswell (University of Glasgow), Xinyan Xian (University of Glasgow), Mingshu Wang (University of Glasgow), Stephen McQuistin (University of St Andrews), Colin Perkins (University of Glasgow)

Abstract Paper

Abstract: AS112 is a distributed, volunteer-run, anycast DNS service that acts as a sink for leaked DNS queries for local resources, preventing them from overloading core DNS infrastructure. AS112 helps protect important parts of the Internet infrastructure, but there has been no comprehensive study of who runs the AS112 servers, where they are located, and whether they effectively capture leaked queries. Using RIPE Atlas and 33646 open recursive resolvers, we detect 469 AS112 sites, run by 97 operators, and compare the response time and query distances of AS112 to root server queries. AS112 performs well, with 23.21% lower median response times and 36.11% lower median distances than the root. However, AS112 is largely dependent on few large operators (one operator serves 41.71% of probes in our study), limiting its resilience.
The Future of DNS Privacy: A Comparison of DNS over QUIC and DNS over HTTP/3 long

Philipp Bielefeld (University of Potsdam (Hasso Plattner Institute)), Felix Hoffmann (University of Potsdam (Hasso Plattner Institute)), Steffen Sassalla (University of Potsdam (Hasso Plattner Institute)), vasilis ververis (University of Potsdam (Hasso Plattner Institute)), Vaibhav Bajpai (University of Potsdam (Hasso Plattner Institute))

Abstract Paper

Abstract: This study presents a large-scale empirical analysis of DNS-over-Encryption (DoE) protocols, focusing on the adoption, protocol feature support, and impact on webpage loading performance. We conducted measurements across over three thousand DoE resolvers, characterizing their support for features such as session resumption and 0-Round-trip Time (RTT) in DNS-over-QUIC (DoQ) and DNS-over-HTTP/3 (DoH/3). Despite broader feature adoption by DoQ, major browsers currently favor DoH/3. Our extensive latency measurements demonstrate that both protocols perform comparably, with DoQ slightly outperforming on average. Complementary experiments with the top one million websites show negligible overall page load time penalties when using DoQ or DoH/3 compared to traditional DNS-over-UDP (Do53), even under low-latency conditions. Further, our analysis explores the relationship between webpage complexity, quantified via metrics including number of objects, queried servers, and MIME type diversity, and the performance impact of DoE. We find no statistically significant correlation, indicating that DoEs performance effects are consistent across a range of website architectures. The study also addresses limitations in current client support for key protocol enhancements and validates effective 0-RTT resumption using proxy resolvers. Our findings alleviate prevalent concerns about DoE-induced performance degradation, supporting broader adoption of encrypted Domain Name System (DNS) protocols without sacrificing user experience. We release our datasets, source code, and analysis scripts to facilitate reproducibility and foster further research into encrypted DNS ecosystems.

14:35 - 14:55 - Break
14:55 - 15:45 - Applications and congestion (Session Chair: Ricky Mok (CAIDA))

WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing short

Erik Rye (Johns Hopkins University), Dave Levin (University of Maryland)

Abstract Paper

Abstract: Due to their article editing policies, Wikimedia sites like Wikipedia have become inadvertent time capsules for IPv6 addresses. When Wikimedia users make edits without signing into an account, their IP addresses are used in lieu of a username. Wikimedia site dumps therefore provide researchers with over two decades worth of timestamped client IPv6 addresses to understand address assignments and how they have changed over time and space. In this work, we extract 19M unique IPv6 addresses from Wikimedia sites like Wikipedia that were used by editors from 2003 to 2024. We use these addresses to understand the prevalence of IPv6 in countries corresponding to Wikimedia site languages, how IPv6 adoption has grown over time, and the prevalence of EUI-64 addressing on client devices like desktops, laptops, and mobile phones.
A Microscopic View of Congestion Control Behavior in Video Conferencing Applications short

Nathaniel Cherian (Purdue University), Akhil Prasad (Purdue University), Sonia Fahmy (Purdue University)

Abstract Paper

Abstract: Video Conferencing Applications (VCAs) employ real-time congestion (rate) control algorithms on top of UDP. In this paper, we take an in-depth look at the congestion control behavior of four proprietary VCAs: Zoom, Microsoft Teams, Google Meet, and Cisco Webex. We compare their startup phases, bandwidth probing behaviors, and reactions to packet delays and drops. We uncover previously-unknown bandwidth estimation strategies, and tradeoffs in how quickly they react to available bandwidth changes. Our study is based on over 130 hours of VCA traffic data collected under diverse network conditions and two buffer sizes, and annotated with sending rate, buffer occupancy, packet drop, and several user Quality of Experience (QoE) metrics. Our dataset is publicly available to support further research in understanding VCA performance.
Measuring Low Latency at Scale: A Field Study of L4S in Residential Broadband short

Ayoub Ben Ameur (NetMicroscope), Francesco Bronzino (ENS Lyon / Institut universitaire de France/ NetMicroscope), Nick Feamster (University of Chicago / NetMicroscope), Paul Schmitt (Cal Poly / NetMicroscope)

Abstract Paper

Abstract: The Low Latency, Low Loss, Scalable Throughput (L4S) architecture promises to reduce queuing delay while sustaining high throughput. Prior work has largely evaluated L4S in synthetic environments or controlled testbeds, leaving its real-world performance underexplored. In this study, we measure L4S performance specically on Apple services delivered over Comcast residential networks. We deploy 83 Raspberry Pi devices across Comcast subscriber households and conduct over 120000 controlled experiments comparing L4S to traditional congestion control. Our results show that L4S reduces tail latency by up to 25% for interactive applications and for bulk downloads from Apple's CDN, while providing minimal gains for iCloud. Gains are most pronounced during peak usage hours when networks are congested, highlighting the situational benet of L4S in a single ISP ecosystem.

15:45 - 16:05 - Break
16:05 - 17:05 - Panel: New measurement datasets

Abstract

Abstract: High-quality, labeled datasets are essential for Internet measurement research. In his panel, we will introduce several large-scale newly available types of Internet measurement datasets, produced by recent US federally funded projects. We will also demonstrate how researchers can discover, access, and contribute new datasets through the Comunda portal. We will then invite community members to present new datasets they are willing to share, and conclude with an open discussion and audience Q&A on dataset availability, sharing, usage, and emerging research needs.
17:05 - 17:25 - Break
17:25 - 18:00 - Wireless and mobile (Session Chair: Hannah B. Pasandi (UC Berkeley))

Different Policies for Different NodeBs: Comparing Downlink Schedulers in Cellular Base Stations short

Zesen Zhang (University of California, San Diego), Jon Larrea (The University of Edinburgh), Jarrett Huddleston (Johns Hopkins University), Haoran Wan (Princeton University), Ricky K. P. Mok (CAIDA/UC San Diego), Bradley Huffaker (ucsd), kc Claffy (CAIDA), Kyle Jamieson (Princeton University), Alexander Marder (Johns Hopkins University), Aaron Schulman (University of California, San Diego)

Abstract Paper

Abstract: Cellular base stations rely on proprietary downlink scheduling algorithms that vendors independently develop to fairly and efficiently schedule traffic to competing users. Schedulers from different vendors can make different scheduling decisions depending on channel conditions, buffer status, fairness, and capability. This work is the first to show significant scheduling policy differences in a head-to-head comparison of the behavior of downlink schedulers across four base station vendors (Ericsson, Samsung, Nokia and Huawei) running on four cellular providers (AT&T, Verizon, T-Mobile and Vodafone). The evaluation is based on 500Gbytes of downlink transfers across 20 base stations in five cities during semi-controlled network and signal conditions. In particular, we observe different strategies for allocating radio resources, for rate control, and for handling users with asymmetric channel quality. These results challenge the assumptions made about downlink scheduler uniformity in prior cellular performance measurement studies.
Disentangling the Throughput Contributions of MIMO and Carrier Aggregation in 5G Networks long

Yufei Feng (Northeastern University), Phuc Dinh (Northeastern University), Moinak Ghosal (Northeastern University), Omar Basit (Purdue University), Sizhe Wang (Northeastern University), Y. Charlie Hu (Purdue University), Dimitrios Koutsonikolas (Northeastern University)

Abstract Paper

Abstract: Multiple-input multiple-output (MIMO) and carrier aggregation (CA) are two key MAC/PHY layer technologies employed by both user equipment and base stations to boost data throughput in 5G networks; yet, their respective contributions to real-world throughput improvements remain largely unexplored. Although both approaches conceptually rely on parallel transmissions, they differ fundamentally in their implementations: MIMO exploits spatial diversity through multiple data streams within a band, while CA aggregates spectrum across multiple bands. In this work, we present the first comparative study of MIMO and CA throughput gains in operational 5G networks. Using extensive measurements with commercial smartphones over all three major US cellular operators during a cross-country trip (from Los Angeles to Boston, 5700+ km), we first present the current state of deployment of both technologies in today’s 5G networks. We then disentangle their combined effects on throughput, quantifying the relative contribution of each technology to overall performance. Finally, we analyze how throughput scales across higher-order configurations of each technology, considering different MIMO transmission ranks and numbers of aggregated carriers, providing insights for future 5G deployments.

Wednesday, 25th March 2026 (UTC)

14:00 - 15:00 - Keynote 2 (Narseo Vallina-Rodriguez, IMDEA Networks): From Packets to Privacy: How Network Measurements Led Me to Privacy Research

Abstract

Abstract: Over the past decade, the digital landscape has evolved from a collection of isolated devices into a complex web of interconnected platforms, services, and applications. Modern apps routinely rely on wireless interfaces and IP-based protocols to discover and interact with neighboring processes across the same device and local networks. Understanding how data flows across these increasingly proprietary systems has become essential for uncovering emerging security and privacy risks that often remain invisible to traditional program-centric analysis methods. In this talk, I reflect on my research journey from network measurement to privacy and platform security. Techniques originally developed to empirically study Internet traffic and computer networks have proven remarkably effective for revealing structural weaknesses in modern, highly interconnected digital platforms, including browsers, mobile operating systems, and IoT ecosystems. Applying measurement-driven and black-box methodologies to these environments has uncovered new tracking mechanisms, incorrect trust assumptions embedded in network protocol standards, and structural design choices across platforms that enable large-scale privacy violations. Beyond advancing our understanding of privacy risks, these findings have led to concrete improvements in browsers and mobile platforms through responsible disclosure and collaboration with industry. At the same time, they highlight fundamental research challenges posed by the scale, complexity, and opacity of modern digital ecosystems, underscoring the need for new measurement-driven approaches to causally study the security and privacy of interconnected platforms.
15:00 - 15:20 - Break
15:20 - 16:35 - Security and privacy (Session Chair: Vasanta Chaganti (Swarthmore College))

A Measurement of Genuine Tor Traces for Realistic Website Fingerprinting short

Rob Jansen (U.S. Naval Research Laboratory), Ryan Wails (Georgetown University), Aaron Johnson (U.S. Naval Research Laboratory)

Abstract Paper

Abstract: Website fingerprinting (WF) is a dangerous attack on web privacy because it enables an adversary to predict the website a user is visiting, despite the use of encryption, VPNs, or anonymizing networks such as Tor. Previous WF work almost exclusively uses synthetic datasets to evaluate the performance and estimate the feasibility of WF attacks despite evidence that synthetic data misrepresents the real world. In this paper we present GTT23, the first WF dataset of genuine Tor traces, which we obtain through a large-scale measurement of the Tor network. GTT23 represents real Tor user behavior better than any existing WF dataset, is larger than any existing WF dataset by at least an order of magnitude, and will help ground the future study of realistic WF attacks and defenses. In a detailed evaluation, we survey 25 WF datasets published over the last 15 years and compare their characteristics to those of GTT23. We discover common deficiencies of synthetic datasets that make them inferior to GTT23 for drawing meaningful conclusions about the effectiveness of WF attacks directed at real Tor users. We have made GTT23 available to promote reproducible research and to help inspire new directions for future work.
Through a Smaller Lens: Revisiting Opportunistic Analysis using Network Telescopes long

Bernhard Degen (University of Twente), Nils Kempen (University of Münster), kc claffy (CAIDA/UC San Diego), Ricky K. P. Mok (CAIDA/UC San Diego), Ralph Holz (University of Münster & University of Twente), Roland van Rijswijk-Deij (University of Twente), Raffaele Sommese (University of Twente), Mattijs Jonker (University of Twente)

Abstract Paper

Abstract: Unsolicited network traffic observed to the addresses monitored by a network telescope enables, among other things, tracking of Internet outages, botnets and DDoS attacks. We examine how a decrease in available address space affects what we can learn about the phenomena we study with telescopes. We conduct a targeted replication of a seminal study conducted 10 years ago. Since then, IPv4 scarcity and rising operational costs have placed increased pressure on operators to maximize use of their allocated space, which has resulted in a reduction of address space available to major telescopes. As a first step, we characterize traffic to three network telescopes that differ in size, spatial distribution, and prominence. We find that most address blocks within each telescope observe a similar number of source IP addresses, and that smaller telescopes offer higher visibility per monitored address. We also find that sources target the IPv4 address space pervasively, with 37.0% of them targeting a /16 block in each of the three telescopes within an hour. As a case study, we examine the sensitivity of randomly-spoofed DoS attack inference to the size of the address space under observation and find that larger telescopes detect many attacks missed by smaller ones, although smaller telescopes observe disproportionately many relative to their address space size. Our study provides a framework to quantify the effects of reduced telescope address space and outlines future directions for telescope research.
State of Passkey Authentication in the Wild: A Census of the Top 100K sites long

Prince Bhardwaj (University of Surrey), Nishanth Sastry (University of Surrey)

Abstract Paper

Abstract: Passkeys -- discoverable WebAuthn credentials synchronized across devices are widely promoted as the future of passwordless authentication. Built on the FIDO2 standard, they eliminate shared secrets and resist phishing while offering usability through platform credential managers. Since their introduction in 2022, major vendors have integrated passkeys into operating systems and browsers, and prominent websites have announced support. Yet the true extent of adoption across the broader web remains unknown. Measuring this is challenging because websites implement passkeys in heterogeneous ways. Some expose explicit ``Sign in with passkey'' buttons, others hide options under multi-step flows or rely on conditional mediation, and many adopt external mechanisms such as JavaScript libraries or OAuth-based identity providers. There is no standardized discovery endpoint, and dynamic, JavaScript-heavy pages complicate automated detection. This paper makes two contributions. First, we present \emph{Fidentikit}, a browser-based crawler implementing 43 heuristics across five categories -- UI elements, DOM structures, WebAuthn API calls, network patterns, and library detection developed through iterative refinement over manual examination of 1{,}500 sites. Second, we apply Fidentikit to the top 100{,}000 Tranco-ranked domains, producing the first large-scale census of passkey adoption. Our results show adoption strongly correlates with site popularity and often depends on external identity providers rather than native implementations. $\textbf{Keywords:}$ passkeys, WebAuthn, FIDO2, passwordless, large-scale measurement, authentication.
Efficient System Log Analysis via Quantized On-Device Anomaly Detection and Response long

Qinxuan Shi (University of North Dakota), Zhanglong Yang (University of North Dakota), Sicong Shao (University of North Dakota)

Abstract Paper

Abstract: The rapid expansion of the Internet of Things (IoT) and the growing interconnectivity of industrial systems have created an urgent need for log anomaly detection (LAD) to be performed locally on edge devices. However, a significant gap exists between the computational resources required by advanced deep learning models and the limited processing capacity of edge hardware, often forcing a trade-off between detection accuracy and deployment feasibility. To address this challenge, this paper makes two major contributions. First, we introduce EM-AT-based LAD by designing an unsupervised LAD method that extends the Transformer-based anomaly detection model via integrating the Expectation-Maximization (EM) algorithm for fully automated threshold determination. While EM-AT-based LAD achieves high detection accuracy, its computational requirement limits its direct applicability on power-constrained edge devices. Therefore, we introduce LiteLADR, a framework that enables efficient system log analysis via quantized on-device anomaly detection and response. LiteLADR leverages TorchAO and ExecuTorch for model quantization and optimization, enabling both EM-AT and large language models (LLMs) to operate efficiently on resource-constrained edge nodes. Comprehensive evaluations on the HDFS and OpenStack datasets show that EM-AT outperforms leading methods, achieving F1-scores of 98.90% and 99.61%, respectively. LiteLADR preserves strong detection performance (F1-scores of 98.65% and 99.43%) while substantially reducing computational resource consumption.

16:35 - 16:55 - Break
16:55 - 17:30 - Virtualization (Session Chair: Aristide Akem (University of Southampton))

xPUBench: Scalable and Energy-Efficient GPU and DPU-Accelerated Network Functions long

Maxime Vanliefde (UCLouvain), Romain Van Hauwaert (UCLouvain), Nikita Tyunyayev (UCLouvain), Clément Delzotti (UCLouvain), Elena Agostini (NVIDIA), Tom Barbette (UCLouvain)

Abstract Paper

Abstract: The rapid increase in network speeds makes packet processing on general-purpose CPUs increasingly challenging. At 100 Gbps and beyond, CPUs struggle to sustain complex network functions without dedicated acceleration. This trend motivates the exploration and measurement of alternative compute platforms such as GPUs and embedded CPUs in Network Interface Cards (NICs). Modern NICs provide tighter integration with GPUs, with the ability to write received packets directly to GPU memory. SmartNICs, also known as Data Processing Units (DPUs), further feature embedded ARM or RISC-V cores capable of offloading NFV packet processing entirely. In this work, we introduce xPUBench, a benchmarking environment that systematically measures the performance and energy efficiency of packet processing across CPUs, GPUs, and DPUs. We evaluate several (co-)processing models relevant to Network Function Virtualization, including CPU+GPU hybrid, DPU-only, and GPU-only approaches. Our measurements show that, for a computation-heavy workload, current CPU-only implementations manage to handle up to 50% of the 100 Gbps NIC rate. In contrast, GPU implementations can saturate it. We also show that the DPUs’ most powerful embedded cores can replace the main CPU for some traditional packet processing, alleviating the load on the host, which can now be entirely dedicated to running applications. We finally propose a novel energy-efficiency dimension, showing that DPUs outperform traditional CPUs for low-throughput processing, requiring only 24 W to sustain 10 Gbps, and that GPUs outperform CPUs for high-throughput processing. Our findings emphasize the need to assess both performance and energy in heterogeneous packet-processing pipelines, given the growing diversity of “xPUs” in networked systems.
Characterizing SmartNIC Memory Bandwidth Bottlenecks short

Michał Podleś (NVIDIA), Boris Pismenny (NVIDIA), Jaun Jose Vegas Olmos (NVIDIA), Idelfonso Tafur Monroy (TUe Eindhoven)

Abstract Paper

Abstract: Emerging multi-hundred gigabit Ethernet speeds are outpacing improvements in CPU performance and memory bandwidth, challenging host-based I/O processing capacity. SmartNICs, such as NVIDIA Bluefield, promise to overcome this challenge by offloading network-intensive computation, freeing up host resources. We show however that SmartNICs fall short of this goal, offloading only a small portion of the host CPU and wire bandwidth for common tasks---Bluefield-2 NVMe-over-TCP storage disaggregation offloads up to 4 host cores while Bluefield-3 achieves up to 14 cores. Prior work attributes this limitation to weaker SmartNIC cores, in contrast, in this work we identify SmartNIC memory bandwidth as the key bottleneck to line-rate performance. We then leverage SmartNIC support for direct cache access to overcome this bottleneck by constraining I/O buffers to the last-level cache (LLC). Our evaluation shows the benefits of this approach by improving Bluefield-2 and Bluefield-3 throughput on the previous benchmark by up to 56\% and 20\%, respectively.

17:30 - 17:40 - Closing