"Centaur Technology is the first to announce an x86 processor design that integrates a specialized coprocessor to accelerate deep learning. This coprocessor delivers greater AI performance than any CPU and frees the x86 cores to focus on general purpose tasks that continue to require x86 compatibility."
- Linley Gwennap, Editor in Chief, Microprocessor Report
Meeting the industry challenge for fast inference hardware
Beyond Hyperscale Cloud or Low-Power Mobile
Centaur’s microprocessor-design technology includes both a new high-performance x86 core and the industry’s first integrated AI Coprocessor for x86 systems. The x86 cores deliver high instructions per clock (IPC) for server-class applications and support the latest x86 extensions, such as AVX-512, along with new instructions for fast transfer of AI data. The AI Coprocessor is a clean-sheet design built for high performance and efficiency on deep-learning workloads, freeing the x86 cores for general-purpose computing. Both the x86 and AI Coprocessor technologies have now been proven in silicon on a new scalable SoC platform with eight x86 cores and an AI Coprocessor able to compute 20 trillion AI operations/sec with 20 terabytes/sec of memory bandwidth. This SoC requires less than 195 mm² in TSMC 16nm and provides an extensible platform with 44 PCIe lanes and four channels of DDR4-3200.
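As a back-of-envelope sanity check on the headline figures, a minimal sketch (assuming the 2.5 GHz clock of the reference system described below, and treating the stated peaks as exact rather than rounded):

```python
# Quick arithmetic on the published headline numbers (a sketch, not
# official figures beyond those stated above: 20 TOPS, 20 TB/s, 2.5 GHz).
PEAK_OPS = 20e12   # 20 trillion AI operations per second
MEM_BW   = 20e12   # 20 terabytes per second of memory bandwidth
CLOCK    = 2.5e9   # 2.5 GHz reference-system clock (assumed shared by the coprocessor)

ops_per_cycle = PEAK_OPS / CLOCK          # ops retired every clock at peak
arithmetic_intensity = PEAK_OPS / MEM_BW  # ops available per byte moved

print(ops_per_cycle)          # 8000.0
print(arithmetic_intensity)   # 1.0
```

At these numbers the design sustains roughly 8,000 operations per cycle against one operation per byte of bandwidth, i.e., the memory system is sized so wide compute units are not starved for data.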
Centaur demonstrated its new technology in a reference system running at 2.5GHz. This system was used to submit official, audited (Preview) scores to MLPerf[1], showing that Centaur’s AI Coprocessor can classify an image in less than 330 microseconds while providing inference throughput equivalent to 23 high-end CPU cores from other x86 vendors[2]. Because Centaur’s technology is fully compatible with standard PCs and servers, the integrated AI Coprocessor can be augmented with off-chip GPUs or other AI accelerators for system-level scalability.
[1] MLPerf v0.5 Inference Closed/Preview audited submission, Sept. 2019. MLPerf name and logo are trademarks. See www.mlperf.org for more information.
[2] MLPerf Inf-0.5-23. MobileNet-V1 on 112 cores (29,203 fps) = 260.7 fps/core; 23.2 cores would be required to match Centaur (6,042 fps). ResNet-50 v1.5 on 112 cores (5,965.6 fps) = 53.3 fps/core; 22.9 cores would be required to match Centaur (1,218.5 fps).
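The per-core equivalence claim in the footnotes reduces to a two-step calculation, reproduced here with the submitted throughput numbers:

```python
# Reproduces the MLPerf footnote math: divide the other vendor's total
# throughput by its core count, then ask how many such cores would be
# needed to equal Centaur's measured throughput.
def cores_to_match(other_fps: float, other_cores: int, centaur_fps: float) -> float:
    fps_per_core = other_fps / other_cores
    return centaur_fps / fps_per_core

mobilenet = cores_to_match(29203.0, 112, 6042.0)   # MobileNet-V1
resnet50  = cores_to_match(5965.6, 112, 1218.5)    # ResNet-50 v1.5

print(round(mobilenet, 2))  # ~23.17 cores
print(round(resnet50, 2))   # ~22.88 cores
```

Both workloads land near 23 cores, which is where the "equivalent to 23 high-end CPU cores" summary above comes from.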
Attendees at the ISC East trade show in NYC saw Centaur’s new technology up close for the first time. The demo showcased video analytics using Centaur’s reference system running x86-based network-video-recorder (NVR) software from Qvis Labs. In addition to conventional real-time object detection and classification, Centaur was the only vendor at the show to highlight leading-edge applications such as semantic segmentation (pixel-level image classification) and a new technique for human pose estimation (“stick figures”). Centaur is focused on improving hardware price/performance and software productivity to support this next wave of research applications and to speed their deployment into new server-class products.
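To make "pixel-level image classification" concrete, here is a generic illustration of how a segmentation network's output becomes a mask. The logits below are random stand-ins; this is not Centaur's demo pipeline or the Qvis Labs software, just the common final step of any semantic-segmentation model:

```python
import numpy as np

# A segmentation network emits a score for every (class, pixel) pair;
# the predicted mask is the argmax over the class axis, giving one
# class label per pixel. Sizes here are hypothetical for illustration.
num_classes, height, width = 21, 4, 6
rng = np.random.default_rng(0)
logits = rng.standard_normal((num_classes, height, width))  # stand-in model output

mask = logits.argmax(axis=0)   # shape (height, width): one label per pixel
print(mask.shape)              # (4, 6)
```

Contrast this with whole-image classification, which assigns a single label per frame: segmentation makes the same decision independently at every pixel, which is what enables the scene-level analytics shown in the demo.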