opencl benchmark test

AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with a HP 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401055-PTS-OPENCLBE94.

opencl benchmark testProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerDisplay DriverOpenGLOpenCLCompilerFile-SystemScreen ResolutionabcdAMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads)HP 8B24 (U65 Ver. 01.01.04 BIOS)AMD Device 14a4128GB2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1NVIDIA RTX A4000 16GBNVIDIA GA104 HD AudioASUS VP28URealtek RTL8111/8168/8411Ubuntu 23.106.5.0-14-generic (x86_64)GNOME Shell 45.0X Server 1.21.1.7NVIDIA 535.129.034.6.0OpenCL 3.0 CUDA 12.2.147GCC 13.2.0ext43840x2160OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105Graphics Details- BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 94.04.57.00.0bOpenCL Details- GPU Compute Cores: 6144Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

opencl benchmark testopencl-benchmark: FP64 Computeopencl-benchmark: FP32 Computeopencl-benchmark: INT64 Computeopencl-benchmark: INT32 Computeopencl-benchmark: INT16 Computeopencl-benchmark: INT8 Computeopencl-benchmark: Memory Bandwidth Coalesced Readopencl-benchmark: Memory Bandwidth Coalesced Writeabcd0.35522.0512.85710.3768.7038.244399.34406.120.35422.0012.83310.2538.6258.118399.33406.200.35322.0462.85910.3448.5998.218399.36406.190.35321.9952.85410.3448.5338.069399.26406.33OpenBenchmarking.org

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

OpenBenchmarking.orgTFLOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: FP64 Computeabcd0.07990.15980.23970.31960.3995SE +/- 0.002, N = 3SE +/- 0.002, N = 3SE +/- 0.001, N = 3SE +/- 0.001, N = 30.3550.3540.3530.3531. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

OpenBenchmarking.orgTFLOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: FP32 Computeabcd510152025SE +/- 0.04, N = 3SE +/- 0.08, N = 3SE +/- 0.04, N = 3SE +/- 0.07, N = 322.0522.0022.0522.001. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

OpenBenchmarking.orgTIOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: INT64 Computeabcd0.64331.28661.92992.57323.2165SE +/- 0.029, N = 3SE +/- 0.018, N = 3SE +/- 0.028, N = 3SE +/- 0.029, N = 32.8572.8332.8592.8541. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

OpenBenchmarking.orgTIOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: INT32 Computeabcd3691215SE +/- 0.03, N = 3SE +/- 0.08, N = 3SE +/- 0.08, N = 3SE +/- 0.08, N = 310.3810.2510.3410.341. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

OpenBenchmarking.orgTIOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: INT16 Computeabcd246810SE +/- 0.012, N = 3SE +/- 0.028, N = 3SE +/- 0.092, N = 3SE +/- 0.036, N = 38.7038.6258.5998.5331. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

OpenBenchmarking.orgTIOPs/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: INT8 Computeabcd246810SE +/- 0.050, N = 3SE +/- 0.043, N = 3SE +/- 0.066, N = 3SE +/- 0.025, N = 38.2448.1188.2188.0691. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

OpenBenchmarking.orgGB/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: Memory Bandwidth Coalesced Readabcd90180270360450SE +/- 0.04, N = 3SE +/- 0.04, N = 3SE +/- 0.02, N = 3SE +/- 0.04, N = 3399.34399.33399.36399.261. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

OpenBenchmarking.orgGB/s, More Is BetterProjectPhysX OpenCL-Benchmark 1.2Operation: Memory Bandwidth Coalesced Writeabcd90180270360450SE +/- 0.21, N = 3SE +/- 0.25, N = 3SE +/- 0.05, N = 3SE +/- 0.10, N = 3406.12406.20406.19406.331. (CXX) g++ options: -std=c++17 -pthread -lOpenCL


Phoronix Test Suite v10.8.4