onnx new

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Pop 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402035-NE-ONNXNEW6040.

onnx new

Runs: a, b, c, d

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 32GB DDR4-3000MT/s CMK64GX4M2D3000C16
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Pop 22.04
Kernel: 6.6.6-76060606-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
OpenGL: 4.6 Mesa 23.3.2-1pop0~1704238321~22.04~36f1d0e (LLVM 15.0.7 DRM 3.54)
Vulkan: 1.3.267
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Python Details
- Python 3.10.12

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

onnx new

Each test is listed twice: inf/s = Inferences Per Second (more is better), ms = Inference Time Cost in milliseconds (fewer is better).

Test                                               Metric a         b         c         d
onnx: GPT-2 - CPU - Parallel                       inf/s  106.626   91.0383   81.4970   79.6284
onnx: GPT-2 - CPU - Parallel                       ms     9.36878   10.9739   12.2602   12.5484
onnx: GPT-2 - CPU - Standard                       inf/s  84.988    74.7827   65.1860   63.7474
onnx: GPT-2 - CPU - Standard                       ms     11.7589   13.3654   15.3358   15.6783
onnx: yolov4 - CPU - Parallel                      inf/s  3.41805   3.30419   3.30267   3.25386
onnx: yolov4 - CPU - Parallel                      ms     292.555   302.645   302.824   307.339
onnx: yolov4 - CPU - Standard                      inf/s  5.98899   5.87005   5.82904   5.81121
onnx: yolov4 - CPU - Standard                      ms     166.968   170.353   171.549   172.083
onnx: T5 Encoder - CPU - Parallel                  inf/s  158.943   132.114   119.847   115.971
onnx: T5 Encoder - CPU - Parallel                  ms     6.2888    7.56668   8.34118   8.62179
onnx: T5 Encoder - CPU - Standard                  inf/s  101.553   86.5898   76.0924   73.8745
onnx: T5 Encoder - CPU - Standard                  ms     9.84382   11.5448   13.1389   13.5336
onnx: bertsquad-12 - CPU - Parallel                inf/s  4.56634   4.42556   4.39522   4.40892
onnx: bertsquad-12 - CPU - Parallel                ms     218.988   225.956   227.636   226.820
onnx: bertsquad-12 - CPU - Standard                inf/s  8.69295   8.63703   8.58440   8.48632
onnx: bertsquad-12 - CPU - Standard                ms     115.028   115.771   116.484   117.851
onnx: CaffeNet 12-int8 - CPU - Parallel            inf/s  199.462   195.479   197.531   198.965
onnx: CaffeNet 12-int8 - CPU - Parallel            ms     5.0096    5.11408   5.06250   5.02241
onnx: CaffeNet 12-int8 - CPU - Standard            inf/s  262.065   231.654   223.993   226.280
onnx: CaffeNet 12-int8 - CPU - Standard            ms     3.81235   4.31444   4.46068   4.41593
onnx: fcn-resnet101-11 - CPU - Parallel            inf/s  0.789445  0.798150  0.788713  0.795279
onnx: fcn-resnet101-11 - CPU - Parallel            ms     1266.7    1253.22   1267.89   1257.91
onnx: fcn-resnet101-11 - CPU - Standard            inf/s  2.98135   2.91718   2.91581   2.90364
onnx: fcn-resnet101-11 - CPU - Standard            ms     335.413   342.798   342.953   344.390
onnx: ArcFace ResNet-100 - CPU - Parallel          inf/s  7.4745    7.31859   7.21143   7.16765
onnx: ArcFace ResNet-100 - CPU - Parallel          ms     133.782   136.658   138.664   139.512
onnx: ArcFace ResNet-100 - CPU - Standard          inf/s  15.5479   15.4415   15.4128   15.4263
onnx: ArcFace ResNet-100 - CPU - Standard          ms     64.313    64.7543   64.8772   64.8208
onnx: ResNet50 v1-12-int8 - CPU - Parallel         inf/s  48.2263   47.7198   47.5911   48.1204
onnx: ResNet50 v1-12-int8 - CPU - Parallel         ms     20.7291   20.9515   21.0075   20.7765
onnx: ResNet50 v1-12-int8 - CPU - Standard         inf/s  119.152   118.037   117.048   117.504
onnx: ResNet50 v1-12-int8 - CPU - Standard         ms     8.38855   8.46928   8.54047   8.50747
onnx: super-resolution-10 - CPU - Parallel         inf/s  84.3254   84.2686   83.7961   84.0513
onnx: super-resolution-10 - CPU - Parallel         ms     11.8559   11.8641   11.9310   11.8946
onnx: super-resolution-10 - CPU - Standard         inf/s  89.5931   88.7364   89.0029   88.9273
onnx: super-resolution-10 - CPU - Standard         ms     11.1599   11.2692   11.2341   11.2437
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel  inf/s  21.3852   21.0124   20.8684   20.8534
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel  ms     46.7559   47.5860   47.9147   47.9502
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard  inf/s  24.796    25.5728   25.8677   25.4426
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard  ms     40.3241   39.0979   38.6552   39.3208
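The summary values above are easier to compare as relative changes against run a. The following is a minimal sketch, not part of the Phoronix Test Suite output: the `results` dictionary and the `percent_change` helper are hypothetical names introduced here, and only three throughput rows are copied in for illustration.

```python
# Throughput (inferences per second) copied from the summary values above.
# Only three tests are included; the dictionary layout is an assumption
# made for this sketch, not a Phoronix Test Suite data format.
results = {
    "GPT-2 - CPU - Parallel": {"a": 106.626, "b": 91.0383, "c": 81.4970, "d": 79.6284},
    "T5 Encoder - CPU - Parallel": {"a": 158.943, "b": 132.114, "c": 119.847, "d": 115.971},
    "super-resolution-10 - CPU - Parallel": {"a": 84.3254, "b": 84.2686, "c": 83.7961, "d": 84.0513},
}


def percent_change(baseline: float, value: float) -> float:
    """Return the change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


for test, runs in results.items():
    baseline = runs["a"]
    # Compare every other run against run a.
    deltas = {
        run: round(percent_change(baseline, value), 1)
        for run, value in runs.items()
        if run != "a"
    }
    print(f"{test}: {deltas}")
```

Using run a as the baseline is an arbitrary choice for this sketch; any of the four runs could serve as the reference.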

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 106.63, b: 91.04, c: 81.50, d: 79.63
SE +/- 0.26, N = 3; SE +/- 0.14, N = 3; SE +/- 0.24, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 9.36878, b: 10.97390, c: 12.26020, d: 12.54840
SE +/- 0.03121, N = 3; SE +/- 0.02097, N = 3; SE +/- 0.03777, N = 3
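Every test in this result file is reported twice, once as throughput and once as latency. As a rough consistency check, the sketch below assumes that inferences are timed one after another, so that throughput is approximately 1000 divided by the latency in milliseconds. The export does not state how the two figures are derived, and the helper name `latency_ms_to_ips` is hypothetical.

```python
def latency_ms_to_ips(latency_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to inferences per second."""
    return 1000.0 / latency_ms


# (reported inferences per second, reported inference time in ms)
# for GPT-2 - CPU - Parallel, copied from the two results above.
reported = {
    "a": (106.63, 9.36878),
    "b": (91.04, 10.97390),
    "c": (81.50, 12.26020),
    "d": (79.63, 12.54840),
}

for run, (ips, ms) in reported.items():
    derived = latency_ms_to_ips(ms)
    # Relative gap between the derived and the reported throughput.
    rel_diff = abs(derived - ips) / ips
    print(f"{run}: reported {ips:.2f}, derived {derived:.2f}, gap {rel_diff:.2%}")
```

The derived and reported figures are not expected to match exactly, since an average of per-sample throughputs is not the same as the reciprocal of an average latency.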

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 84.99, b: 74.78, c: 65.19, d: 63.75
SE +/- 0.62, N = 3; SE +/- 0.72, N = 3; SE +/- 0.21, N = 3

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 11.76, b: 13.37, c: 15.34, d: 15.68
SE +/- 0.11, N = 3; SE +/- 0.17, N = 3; SE +/- 0.05, N = 3

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 3.41805, b: 3.30419, c: 3.30267, d: 3.25386
SE +/- 0.01341, N = 3; SE +/- 0.02977, N = 3; SE +/- 0.01952, N = 3

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 292.56, b: 302.65, c: 302.82, d: 307.34
SE +/- 1.22, N = 3; SE +/- 2.72, N = 3; SE +/- 1.84, N = 3

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 5.98899, b: 5.87005, c: 5.82904, d: 5.81121
SE +/- 0.01409, N = 3; SE +/- 0.00419, N = 3; SE +/- 0.02691, N = 3

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 166.97, b: 170.35, c: 171.55, d: 172.08
SE +/- 0.41, N = 3; SE +/- 0.12, N = 3; SE +/- 0.80, N = 3

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 158.94, b: 132.11, c: 119.85, d: 115.97
SE +/- 0.84, N = 3; SE +/- 0.08, N = 3; SE +/- 1.26, N = 3

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 6.28880, b: 7.56668, c: 8.34118, d: 8.62179
SE +/- 0.04827, N = 3; SE +/- 0.00537, N = 3; SE +/- 0.09288, N = 3

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 101.55, b: 86.59, c: 76.09, d: 73.87
SE +/- 0.07, N = 3; SE +/- 0.23, N = 3; SE +/- 0.39, N = 3

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 9.84382, b: 11.54480, c: 13.13890, d: 13.53360
SE +/- 0.00918, N = 3; SE +/- 0.03970, N = 3; SE +/- 0.07065, N = 3

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 4.56634, b: 4.42556, c: 4.39522, d: 4.40892
SE +/- 0.01003, N = 3; SE +/- 0.03613, N = 9; SE +/- 0.02513, N = 3

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 218.99, b: 225.96, c: 227.64, d: 226.82
SE +/- 0.51, N = 3; SE +/- 1.88, N = 9; SE +/- 1.30, N = 3

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 8.69295, b: 8.63703, c: 8.58440, d: 8.48632
SE +/- 0.01301, N = 3; SE +/- 0.01460, N = 3; SE +/- 0.07957, N = 3

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 115.03, b: 115.77, c: 116.48, d: 117.85
SE +/- 0.17, N = 3; SE +/- 0.20, N = 3; SE +/- 1.11, N = 3

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 199.46, b: 195.48, c: 197.53, d: 198.97
SE +/- 2.06, N = 5; SE +/- 1.45, N = 15; SE +/- 0.56, N = 3

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 5.00960, b: 5.11408, c: 5.06250, d: 5.02241
SE +/- 0.05402, N = 5; SE +/- 0.03696, N = 15; SE +/- 0.01436, N = 3

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 262.07, b: 231.65, c: 223.99, d: 226.28
SE +/- 2.43, N = 4; SE +/- 0.47, N = 3; SE +/- 0.82, N = 3

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 3.81235, b: 4.31444, c: 4.46068, d: 4.41593
SE +/- 0.04592, N = 4; SE +/- 0.00933, N = 3; SE +/- 0.01606, N = 3

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 0.789445, b: 0.798150, c: 0.788713, d: 0.795279
SE +/- 0.009251, N = 3; SE +/- 0.001362, N = 3; SE +/- 0.011220, N = 3

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 1266.70, b: 1253.22, c: 1267.89, d: 1257.91
SE +/- 14.56, N = 3; SE +/- 2.19, N = 3; SE +/- 17.69, N = 3

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 2.98135, b: 2.91718, c: 2.91581, d: 2.90364
SE +/- 0.00899, N = 3; SE +/- 0.00183, N = 3; SE +/- 0.00422, N = 3

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 335.41, b: 342.80, c: 342.95, d: 344.39
SE +/- 1.06, N = 3; SE +/- 0.21, N = 3; SE +/- 0.50, N = 3

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 7.47450, b: 7.31859, c: 7.21143, d: 7.16765
SE +/- 0.07142, N = 3; SE +/- 0.01975, N = 3; SE +/- 0.02570, N = 3

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 133.78, b: 136.66, c: 138.66, d: 139.51
SE +/- 1.33, N = 3; SE +/- 0.38, N = 3; SE +/- 0.50, N = 3

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 15.55, b: 15.44, c: 15.41, d: 15.43
SE +/- 0.02, N = 3; SE +/- 0.03, N = 3; SE +/- 0.05, N = 3

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 64.31, b: 64.75, c: 64.88, d: 64.82
SE +/- 0.09, N = 3; SE +/- 0.12, N = 3; SE +/- 0.19, N = 3

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 48.23, b: 47.72, c: 47.59, d: 48.12
SE +/- 0.26, N = 3; SE +/- 0.21, N = 3; SE +/- 0.10, N = 3

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 20.73, b: 20.95, c: 21.01, d: 20.78
SE +/- 0.11, N = 3; SE +/- 0.09, N = 3; SE +/- 0.04, N = 3

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 119.15, b: 118.04, c: 117.05, d: 117.50
SE +/- 0.78, N = 3; SE +/- 0.34, N = 3; SE +/- 0.50, N = 3

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 8.38855, b: 8.46928, c: 8.54047, d: 8.50747
SE +/- 0.05653, N = 3; SE +/- 0.02479, N = 3; SE +/- 0.03634, N = 3

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 84.33, b: 84.27, c: 83.80, d: 84.05
SE +/- 0.23, N = 3; SE +/- 0.03, N = 3; SE +/- 0.11, N = 3

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 11.86, b: 11.86, c: 11.93, d: 11.89
SE +/- 0.03, N = 3; SE +/- 0.00, N = 3; SE +/- 0.02, N = 3

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 89.59, b: 88.74, c: 89.00, d: 88.93
SE +/- 0.77, N = 3; SE +/- 0.28, N = 3; SE +/- 0.31, N = 3

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 11.16, b: 11.27, c: 11.23, d: 11.24
SE +/- 0.10, N = 3; SE +/- 0.04, N = 3; SE +/- 0.04, N = 3

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 21.39, b: 21.01, c: 20.87, d: 20.85
SE +/- 0.02, N = 3; SE +/- 0.05, N = 3; SE +/- 0.08, N = 3

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 46.76, b: 47.59, c: 47.91, d: 47.95
SE +/- 0.06, N = 3; SE +/- 0.12, N = 3; SE +/- 0.20, N = 3

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

Inferences Per Second, More Is Better (ONNX Runtime 1.17)
a: 24.80, b: 25.57, c: 25.87, d: 25.44
SE +/- 0.13, N = 3; SE +/- 0.13, N = 3; SE +/- 0.21, N = 9

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

Inference Time Cost (ms), Fewer Is Better (ONNX Runtime 1.17)
a: 40.32, b: 39.10, c: 38.66, d: 39.32
SE +/- 0.19, N = 3; SE +/- 0.20, N = 3; SE +/- 0.34, N = 9


Phoronix Test Suite v10.8.4