Data Processing Units (DPUs) are becoming available in datacenter environments to offload and accelerate workloads from the host. However, users lack a comprehensive analysis that helps them determine how to use DPUs effectively for their workloads, given the various configurations and generations available. To fill this gap, we conduct a fair and rigorous characterization, performing 15 benchmarking tests to demonstrate the evolution of representative SoC-based DPUs, specifically NVIDIA’s BlueField-1, BlueField-2, and BlueField-3. Our work surfaces several idiosyncrasies across three key characterization dimensions: network, DMA engine, and memory. For the network, we exhaustively test the two major DPU modes: on-path (with its five submodes) and off-path. We develop DPUDMABench, a microbenchmark suite that systematically analyzes the different data exchange primitives supported by the DPU’s DMA engine. We also conduct two application case studies examining the impact of the DPU mode on the performance of TCP/IP and RDMA-based key-value stores (MICA and HERD). Based on our multi-generational DPU characterization, we identify and summarize 14 major idiosyncrasies and provide guidelines for optimal system design and future hardware design.