Over the past few years, there has been an explosion of interest in deep learning as a better way to train machines to do certain tasks. Because of this, GPUs, which are well suited for processing many pieces of similar data simultaneously, have also become the centerpiece technology for deep learning development.

However, even with GPUs, it still takes too much time to create a new algorithm with deep learning out of large amounts of data. One of the issues is that deep learning and other GPU-related operations don?t scale that well over multiple GPUs.

The conventional method for accelerating deep learning that would normally be done on a single GPU is to link multiple computers in parallel and share the data across them. However, as more machines are added into the mix, it becomes progressively harder to share data between the machines, and the total performance starts seeing diminishing returns.
The company then tested it on AlexNet, an image classification neural network. Fujitsu?s technology managed to achieve a performance 14.7x that of a single GPU when using 16 GPUs, and a 27x improvement when it used 64 GPUs.

No software is perfectly parallel today, which is why we?re still seeing ?only? a 27x improvement when using 64 GPUs, instead of a 64x improvement. However, the performance of Fujitsu software still showed an improvement in learning speeds of 46% for 16 GPUs and 71% for 64 GPUs, compared to more conventional software.

The more efficient software can help academics, governments and other companies significantly shorten the time it takes for them train a deep learning algorithm for various research purposes and product development.

Fujitsu aims to start commercializing this technology as part of Fujitsu?s AI technology, Human Centric AI Zinrai, by the end of 2016.