compute: AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with an HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on CachyOS rolling via the Phoronix Test Suite.

System configuration (identical for runs a and b):
    Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 5.19GHz (96 Cores / 192 Threads)
    Motherboard: HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS)
    Chipset: AMD Device 14a4
    Memory: 8 x 16GB DDR5-5200MT/s Hynix HMCG78AGBRA190N
    Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1
    Graphics: NVIDIA RTX A4000 16GB
    Audio: NVIDIA GA104 HD Audio
    Monitor: ASUS VP28U
    Network: Realtek RTL8111/8168/8411
    OS: CachyOS rolling
    Kernel: 6.7.2-1-cachyos (x86_64)
    Desktop: GNOME Shell 45.3
    Display Server: X Server 1.21.1.11
    Display Driver: NVIDIA 545.29.06
    OpenGL: 4.6.0
    Compiler: GCC 13.2.1 20230801 + Clang 16.0.6 + LLVM 16.0.6
    File-System: xfs
    Screen Resolution: 3840x2160

NAMD 3.0b6 - Input: ATPase with 327,506 Atoms (ns/day; higher is better)
    a: 8.48033    b: 8.91806

NAMD 3.0b6 - Input: STMV with 1,066,628 Atoms (ns/day; higher is better)
    a: 2.49536    b: 2.49445

Intel Open Image Denoise 2.2 - Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec; higher is better)
    a: 3.02    b: 2.97

Intel Open Image Denoise 2.2 - Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only (Images / Sec; higher is better)
    a: 3.03    b: 2.95

Intel Open Image Denoise 2.2 - Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only (Images / Sec; higher is better)
    a: 1.43    b: 1.39

GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day; higher is better)
    a: 11.35    b: 11.36
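The remaining results are ONNX Runtime 1.17 CPU inference, with each model measured under the runtime's two executor modes ("Parallel" and "Standard"). As a minimal sketch of what those modes correspond to in ONNX Runtime's Python API, assuming the report's "Standard" maps to the sequential execution mode and "Parallel" to the parallel one, and using a hypothetical local model file, session setup would look roughly like this:

import onnxruntime as ort

def make_session(model_path: str, parallel: bool) -> ort.InferenceSession:
    """Build a CPU-only InferenceSession in either executor mode."""
    opts = ort.SessionOptions()
    # Parallel executor: independent parts of the graph may run concurrently;
    # sequential ("Standard") executor: operators run one after another.
    opts.execution_mode = (
        ort.ExecutionMode.ORT_PARALLEL if parallel else ort.ExecutionMode.ORT_SEQUENTIAL
    )
    # 0 lets the runtime size its thread pools automatically (96C/192T on this system).
    opts.intra_op_num_threads = 0
    opts.inter_op_num_threads = 0
    return ort.InferenceSession(
        model_path, sess_options=opts, providers=["CPUExecutionProvider"]
    )

# "gpt2.onnx" is a placeholder path; the benchmark's actual model files and
# input feeds come from the ONNX model zoo and are model-specific.
session = make_session("gpt2.onnx", parallel=True)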
ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 173.07    b: 174.68

ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 5.77230    b: 5.71852

ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 130.35    b: 129.11

ONNX Runtime 1.17 - Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 7.66971    b: 7.74347

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 9.49305    b: 9.42030

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 105.34    b: 106.15

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 9.19506    b: 9.02456

ONNX Runtime 1.17 - Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 108.75    b: 110.81

ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 398.00    b: 400.87

ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 2.51140    b: 2.49335

ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 184.30    b: 189.93

ONNX Runtime 1.17 - Model: T5 Encoder - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 5.42515    b: 5.26412

ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 12.81    b: 12.85

ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 78.08    b: 77.84
ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 13.52    b: 13.57

ONNX Runtime 1.17 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 73.98    b: 73.67

ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 567.43    b: 552.90

ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 1.76072    b: 1.80697

ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 563.58    b: 562.79

ONNX Runtime 1.17 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 1.77395    b: 1.77643

ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 1.17451    b: 1.16383

ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 851.42    b: 859.23

ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 5.38636    b: 4.54438

ONNX Runtime 1.17 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 185.65    b: 220.05

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 28.35    b: 28.09

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 35.28    b: 35.60

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 37.23    b: 34.76

ONNX Runtime 1.17 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 26.86    b: 28.77
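fcn-resnet101-11 and ArcFace ResNet-100 show the largest run-to-run spread so far, and only under the Standard executor. A minimal sketch of quantifying that spread, using inferences-per-second values copied from the result blocks above:

# Percent change of run b relative to run a (Standard executor, values from this report).
results = {
    "fcn-resnet101-11": (5.38636, 4.54438),
    "ArcFace ResNet-100": (37.23, 34.76),
}

for model, (a, b) in results.items():
    delta = (b - a) / a * 100.0
    print(f"{model}: a={a}, b={b}, b vs a = {delta:+.1f}%")
# fcn-resnet101-11 comes out at roughly -15.6%, ArcFace ResNet-100 at roughly -6.6%.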
ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 183.78    b: 185.25

ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 5.43976    b: 5.39672

ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 209.29    b: 241.41

ONNX Runtime 1.17 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 4.77742    b: 4.14187

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 122.59    b: 117.69

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 8.15595    b: 8.49576

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 142.57    b: 141.27

ONNX Runtime 1.17 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 7.01381    b: 7.07856

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; higher is better)
    a: 33.47    b: 34.16

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inference Time Cost in ms; lower is better)
    a: 29.87    b: 29.27

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
    a: 42.97    b: 50.58

ONNX Runtime 1.17 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost in ms; lower is better)
    a: 23.27    b: 19.77
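To reproduce a comparison like this one, the same workloads can be driven through the Phoronix Test Suite itself. A minimal sketch follows; the test-profile names (pts/namd, pts/oidn, pts/gromacs, pts/onnx) are assumptions about the stock openbenchmarking.org identifiers and should be verified with "phoronix-test-suite list-available-tests" on the installed PTS version.

import subprocess

# Profile names below are assumptions; verify with:
#   phoronix-test-suite list-available-tests
profiles = ["pts/namd", "pts/oidn", "pts/gromacs", "pts/onnx"]

# `benchmark` downloads and installs each test if needed, then runs it,
# prompting for the per-test options (model, executor, input) shown above.
subprocess.run(["phoronix-test-suite", "benchmark", *profiles], check=True)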