tgl
Intel Core i7-1185G7 testing with a Dell XPS 13 9310 0DXP1F (3.7.0 BIOS) and Intel Xe TGL GT2 15GB on Ubuntu 23.10 via the Phoronix Test Suite.

a, b, c:

  Processor: Intel Core i7-1185G7 @ 4.80GHz (4 Cores / 8 Threads), Motherboard: Dell XPS 13 9310 0DXP1F (3.7.0 BIOS), Chipset: Intel Tiger Lake-LP, Memory: 8 x 2GB LPDDR4-4267MT/s, Disk: Micron 2300 NVMe 512GB, Graphics: Intel Xe TGL GT2 15GB (1350MHz), Audio: Realtek ALC289, Network: Intel Wi-Fi 6 AX201

  OS: Ubuntu 23.10, Kernel: 6.7.0-060700rc5-generic (x86_64), Desktop: GNOME Shell 45.1, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.0~git2312220600.68c53e~oibaf~m (git-68c53ec 2023-12-22 mantic-oibaf-ppa), OpenCL: OpenCL 3.0, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1200

VkFFT 1.3.4
Test: FFT + iFFT R2C / C2R
Benchmark Score > Higher Is Better
a . 6310 |=====================================================================
b . 6146 |===================================================================
c . 6198 |====================================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score > Higher Is Better
a . 13702 |===================================================================
b . 13923 |====================================================================
c . 13748 |===================================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score > Higher Is Better
a . 1112 |=====================================================================
b . 1116 |=====================================================================
c . 1111 |=====================================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score > Higher Is Better

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score > Higher Is Better
a . 7500 |=====================================================================
b . 7495 |=====================================================================
c . 7496 |=====================================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score > Higher Is Better
a . 5869 |====================================================================
b . 5952 |=====================================================================
c . 5843 |====================================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score > Higher Is Better

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score > Higher Is Better
a . 8510 |=====================================================================
b . 8503 |=====================================================================
c . 8506 |=====================================================================

Libplacebo 6.338.2
Test: deband_heavy
FPS > Higher Is Better
a . 78.22 |====================================================================
b . 78.16 |====================================================================
c . 78.10 |====================================================================

Libplacebo 6.338.2
Test: polar_nocompute
FPS > Higher Is Better
a . 151.38 |===================================================================
b . 151.42 |===================================================================
c . 151.27 |===================================================================

Libplacebo 6.338.2
Test: hdr_peakdetect
FPS > Higher Is Better
a . 336.77 |===================================================================
b . 254.33 |===================================================
c . 274.71 |=======================================================

Libplacebo 6.338.2
Test: hdr_lut
FPS > Higher Is Better
a . 371.63 |===================================================================
b . 371.86 |===================================================================
c . 371.35 |===================================================================

Libplacebo 6.338.2
Test: av1_grain_lap
FPS > Higher Is Better
a . 405.16 |===================================================================
b . 405.80 |===================================================================
c . 403.90 |===================================================================

Libplacebo 6.338.2
Test: gaussian
FPS > Higher Is Better
a . 370.73 |===================================================================
b . 370.69 |===================================================================
c . 370.58 |===================================================================

NAMD 3.0b6
Input: ATPase with 327,506 Atoms
ns/day > Higher Is Better
a . 0.54640 |=================================================================
b . 0.55272 |==================================================================
c . 0.53465 |================================================================

NAMD 3.0b6
Input: STMV with 1,066,628 Atoms
ns/day > Higher Is Better
a . 0.16793 |==================================================================
b . 0.16810 |==================================================================
c . 0.16812 |==================================================================

CacheBench
Test: Read
MB/s > Higher Is Better
a . 8945.28 |==================================================================
b . 8946.16 |==================================================================
c . 8946.29 |==================================================================

CacheBench
Test: Write
MB/s > Higher Is Better
a . 109630.17 |================================================================
b . 109813.15 |================================================================
c . 109693.87 |================================================================

CacheBench
Test: Read / Modify / Write
MB/s > Higher Is Better
a . 103769.80 |===============================================================
b . 104513.36 |================================================================
c . 104746.11 |================================================================

LZ4 Compression 1.9.4
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
a . 755.41 |===================================================================
b . 754.76 |===================================================================
c . 752.43 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 1 - Decompression Speed
MB/s > Higher Is Better
a . 4114.5 |===================================================================
b . 4123.5 |===================================================================
c . 4139.1 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
a . 118.71 |===================================================================
b . 119.04 |===================================================================
c . 117.87 |==================================================================

LZ4 Compression 1.9.4
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
a . 3850.6 |===================================================================
b . 3870.7 |===================================================================
c . 3857.8 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 9 - Compression Speed
MB/s > Higher Is Better
a . 41.13 |====================================================================
b . 41.16 |====================================================================
c . 40.77 |===================================================================

LZ4 Compression 1.9.4
Compression Level: 9 - Decompression Speed
MB/s > Higher Is Better
a . 4072.4 |===================================================================
b . 4077.0 |===================================================================
c . 4080.0 |===================================================================

dav1d 1.4
Video Input: Chimera 1080p
FPS > Higher Is Better
a . 403.43 |===================================================================
b . 405.66 |===================================================================
c . 401.11 |==================================================================

dav1d 1.4
Video Input: Summer Nature 4K
FPS > Higher Is Better
a . 92.92 |=============================================================
b . 102.65 |===================================================================
c . 94.69 |==============================================================

dav1d 1.4
Video Input: Summer Nature 1080p
FPS > Higher Is Better
a . 325.60 |=====================================================
b . 409.89 |===================================================================
c . 381.24 |==============================================================

dav1d 1.4
Video Input: Chimera 1080p 10-bit
FPS > Higher Is Better
a . 296.56 |===================================================================
b . 296.79 |===================================================================
c . 297.27 |===================================================================

Intel Open Image Denoise 2.2
Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only
Images / Sec > Higher Is Better
a . 0.15 |=====================================================================
b . 0.15 |=====================================================================
c . 0.15 |=====================================================================

Intel Open Image Denoise 2.2
Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only
Images / Sec > Higher Is Better
a . 0.16 |=====================================================================
b . 0.16 |=====================================================================
c . 0.16 |=====================================================================

Intel Open Image Denoise 2.2
Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only
Images / Sec > Higher Is Better
a . 0.07 |=====================================================================
b . 0.07 |=====================================================================
c . 0.07 |=====================================================================

GROMACS 2024
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
a . 0.573 |====================================================================
b . 0.570 |====================================================================
c . 0.563 |===================================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 81.61 |===================================================================
b . 77.20 |===============================================================
c . 83.35 |====================================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 12.24 |================================================================
b . 12.95 |====================================================================
c . 11.99 |===============================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 105.49 |===================================================================
b . 101.60 |=================================================================
c . 104.92 |===================================================================

ONNX Runtime 1.17
Model: GPT-2 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 9.47309 |================================================================
b . 9.83571 |==================================================================
c . 9.52371 |================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 2.89643 |===========================================================
b . 3.25429 |==================================================================
c . 3.12461 |===============================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 345.25 |===================================================================
b . 307.28 |============================================================
c . 320.04 |==============================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 5.06166 |==================================================================
b . 4.72968 |==============================================================
c . 4.91617 |================================================================

ONNX Runtime 1.17
Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 197.56 |===============================================================
b . 211.43 |===================================================================
c . 203.41 |================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 90.92 |====================================================================
b . 89.81 |===================================================================
c . 89.56 |===================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 11.00 |===================================================================
b . 11.13 |====================================================================
c . 11.16 |====================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 146.86 |===================================================================
b . 146.72 |===================================================================
c . 147.50 |===================================================================

ONNX Runtime 1.17
Model: T5 Encoder - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 6.80604 |==================================================================
b . 6.81340 |==================================================================
c . 6.77737 |==================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 4.25933 |==========================================================
b . 4.83250 |==================================================================
c . 3.97700 |======================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 234.78 |===============================================================
b . 206.93 |=======================================================
c . 251.44 |===================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 6.90025 |==================================================================
b . 6.38547 |=============================================================
c . 6.83843 |=================================================================

ONNX Runtime 1.17
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 144.92 |==============================================================
b . 156.60 |===================================================================
c . 146.23 |===============================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 239.59 |===================================================================
b . 238.16 |==================================================================
c . 240.22 |===================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 4.17219 |==================================================================
b . 4.19665 |==================================================================
c . 4.16112 |=================================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 333.92 |================================================================
b . 350.10 |===================================================================
c . 328.36 |===============================================================

ONNX Runtime 1.17
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 2.99315 |=================================================================
b . 2.85482 |==============================================================
c . 3.04357 |==================================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 0.471017 |=================================================================
b . 0.415642 |=========================================================
c . 0.462395 |================================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 2123.06 |==========================================================
b . 2405.91 |==================================================================
c . 2162.65 |===========================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 0.686147 |===========================================================
b . 0.753919 |=================================================================
c . 0.680886 |===========================================================

ONNX Runtime 1.17
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 1457.41 |=================================================================
b . 1326.40 |============================================================
c . 1468.67 |==================================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 9.98691 |================================================================
b . 8.98048 |=========================================================
c . 10.19480 |=================================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 100.13 |============================================================
b . 111.35 |===================================================================
c . 98.09 |===========================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 14.92 |==============================================================
b . 16.31 |====================================================================
c . 14.80 |==============================================================

ONNX Runtime 1.17
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 67.03 |===================================================================
b . 61.31 |==============================================================
c . 67.56 |====================================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 99.28 |====================================================================
b . 98.49 |===================================================================
c . 97.33 |===================================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 10.07 |===================================================================
b . 10.15 |===================================================================
c . 10.27 |====================================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 129.17 |===================================================================
b . 129.06 |===================================================================
c . 128.38 |===================================================================

ONNX Runtime 1.17
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 7.73997 |==================================================================
b . 7.74651 |==================================================================
c . 7.78713 |==================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 26.96 |=================================================================
b . 28.40 |====================================================================
c . 28.25 |====================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 37.08 |====================================================================
b . 35.21 |=================================================================
c . 35.39 |=================================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 45.33 |==================================================================
b . 47.04 |====================================================================
c . 42.81 |==============================================================

ONNX Runtime 1.17
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 22.06 |================================================================
b . 21.26 |==============================================================
c . 23.36 |====================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
Inferences Per Second > Higher Is Better
a . 20.92 |=============================================================
b . 21.60 |===============================================================
c . 23.19 |====================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms) < Lower Is Better
a . 47.80 |====================================================================
b . 46.29 |==================================================================
c . 43.11 |=============================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
a . 37.05 |===================================================================
b . 37.46 |====================================================================
c . 35.55 |=================================================================

ONNX Runtime 1.17
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
a . 26.99 |=================================================================
b . 26.69 |=================================================================
c . 28.13 |====================================================================

Llama.cpp b1808
Model: llama-2-7b.Q4_0.gguf
Tokens Per Second > Higher Is Better
a . 7.37 |====================================================================
b . 7.41 |=====================================================================
c . 7.43 |=====================================================================

Llama.cpp b1808
Model: llama-2-13b.Q4_0.gguf
Tokens Per Second > Higher Is Better
a . 3.84 |=====================================================================
b . 3.84 |=====================================================================
c . 3.84 |=====================================================================

Llama.cpp b1808
Model: llama-2-70b-chat.Q5_0.gguf
Tokens Per Second > Higher Is Better
a . 0.03 |=====================================================================
b . 0.03 |=====================================================================
c . 0.03 |=====================================================================

Llamafile 0.6
Test: llava-v1.5-7b-q4 - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 7.07 |=====================================================================
b . 7.09 |=====================================================================
c . 7.12 |=====================================================================

Llamafile 0.6
Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 4.26 |=====================================================================
b . 4.28 |=====================================================================
c . 4.27 |=====================================================================

Llamafile 0.6
Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU
Tokens Per Second > Higher Is Better
a . 0.05 |=====================================================================
b . 0.05 |=====================================================================
c . 0.05 |=====================================================================
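
A result file like this can be reproduced with the Phoronix Test Suite itself. A minimal sketch, assuming the stock openbenchmarking.org profile names for the tests above (e.g. pts/vkfft, pts/onnx; exact names and versions are an assumption and may differ):

  phoronix-test-suite benchmark pts/vkfft      # install the test profile, run it, and prompt to save the result
  phoronix-test-suite run pts/onnx             # run an already-installed test profile
  phoronix-test-suite show-result tgl          # redisplay a saved result file in this text format

Running phoronix-test-suite list-available-tests shows the profiles actually available on a given system.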