"250 Servers in a box." That's how Nvidia describes the DGX-1 — the world's first commercially available supercomputer specifically built for deep learning. Packing in eight Tesla P100 GPUs that are capable of delivering up to 170 teraflops at peak performance, it is hands-down the most powerful system Nvidia has ever brought to market. We took some snapshots of this AI behemoth on the GTC showroom floor.
The DGX-1 is a pre-built supercomputer boasting eight 16GB Tesla GPUs, a 7TB SSD, dual 10GbE and quad InfiniBand 100Gb networking, an NVLink hybrid cube mesh interconnect and a pair of Xeon processors. In terms of raw computing power, Nvidia claims a 12x speed-up over its previous-generation systems. According to Nvidia, it provides the throughput of 250 CPU-based servers, networking, cables and racks included: all in a single box.
In addition to the aforementioned hardware, the unit comes pre-loaded with deep learning software and development tools for speedier deployment. This includes Nvidia's Deep Learning GPU Training System (DIGITS), CUDA Deep Neural Network library (cuDNN) version 5, Caffe, Theano, Torch and a range of cloud management tools, software updates and a repository for containerized applications.
Together, this hardware and software arms researchers and data scientists with the power required for deep learning on a massive scale, allowing AI systems to be trained far faster than ever before.
As Nvidia CEO Jen-Hsun Huang explained during the DGX-1's unveiling: "Data scientists and AI researchers today spend far too much time on home-brewed high performance computing solutions. The DGX-1 is easy to deploy and was created for one purpose: to unlock the powers of superhuman capabilities and apply them to problems that were once unsolvable."
Most of the power behind the DGX-1 comes from its eight Tesla P100 graphics cards. The first GPUs built on Nvidia's Pascal architecture, they are just as formidable as they look. Manufactured on TSMC's 16nm FinFET process, each GPU sports a core clock of 1328MHz, high-capacity HBM2 memory, 720GB/s of memory bandwidth, 3,584 FP32 CUDA cores and a die that packs in 15.3 billion transistors.
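Those per-GPU specs can be sanity-checked with some quick arithmetic. A minimal sketch, assuming the P100's full complement of 3,584 FP32 CUDA cores, its 1480MHz boost clock (higher than the base clock quoted in the article, and not stated there) and two FP32 operations per core per cycle via fused multiply-add:

```python
# Theoretical peak FP32 throughput of one Tesla P100.
# Assumptions: 1480MHz boost clock (not quoted in the article)
# and 2 FLOPs per CUDA core per cycle (one fused multiply-add).
cuda_cores = 3584          # FP32 cores per P100 (64 per SM x 56 SMs)
boost_clock_hz = 1.48e9    # assumed boost clock
flops_per_cycle = 2        # multiply + add in a single FMA

peak_tflops = cuda_cores * boost_clock_hz * flops_per_cycle / 1e12
print(f"{peak_tflops:.1f} TFLOPS")  # ~10.6 TFLOPS single precision
```

The result lines up with Nvidia's quoted single-precision figure for the card.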
Each GPU provides 10.6 teraflops of single-precision floating point performance, a roughly 3.7-teraflop boost over the enthusiast-level Titan X. But the real performance comes from NVLink interconnect support, which allows multiple GPUs to connect directly to each other for maximum application scalability. Think of it as PCI Express on steroids.
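The headline 170-teraflop figure follows from the same per-GPU numbers. A rough sketch, assuming (as is the case on the P100) that FP16 half-precision runs at twice the FP32 rate, which is how Nvidia arrives at its peak deep learning number:

```python
# Aggregate peak deep learning throughput of the DGX-1's eight P100s.
# Assumption: FP16 executes at 2x the FP32 rate on the P100.
fp32_tflops_per_gpu = 10.6   # single-precision figure quoted above
fp16_multiplier = 2          # half-precision speed-up
num_gpus = 8                 # P100s in a DGX-1

fp16_total = fp32_tflops_per_gpu * fp16_multiplier * num_gpus
print(f"{fp16_total:.0f} TFLOPS FP16")  # ~170 TFLOPS, the headline figure
```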
If you're wondering how this thing would fare as a gaming system, forget about it: Tesla GPUs are aimed strictly at enterprise customers. They don't even come with HDMI or DisplayPort outputs, which means you can't connect them to a monitor. And then there's the $129,000 price tag to worry about.
Stanford University, UC Berkeley, NYU and the University of Oxford will be among the first institutions to get DGX-1s. Nvidia will also be partnering with Massachusetts General Hospital to bring the power of the DGX-1 to medical research, specifically in the areas of radiology, pathology and genomics.
Chris Jager travelled to GTC 2016 in San Jose, California as a guest of Nvidia.