PlayStation 3 Gravity Grid

The Sony PlayStation 3 has a number of unique features that make it particularly well suited to scientific computation. First, the PS3 is an open platform, which essentially means that one can run different system software on it, for example PowerPC Linux (*AS OF MARCH 2010 THIS IS NO LONGER TRUE*). Next, it has a revolutionary processor, the Cell processor, which was developed by Sony, IBM and Toshiba. This processor has a main CPU (called the PPE) and several special compute engines (called SPEs; six are available on the PS3) for raw computation. Moreover, each SPE performs vector operations, which means that it can operate on multiple data elements in a single step (SIMD). Finally, its incredibly low cost makes it very attractive as a node in a scientific compute cluster. In fact, it is highly plausible that the raw computing power per dollar that the PS3 offers is significantly higher than anything else on the market today.

Thanks to a very generous partial donation by Sony and the Air Force Research Lab, we have a cluster of more than 400 PS3s in our department, which we call the PS3 Gravity Grid. Check out some pictures of the cluster here. For instructions on how this cluster was built, please visit our companion site:

Here is a list of research articles published using results generated on this cluster: Phys. Rev. D78 064042 (2008); Class. Quant. Grav. 26 015014 (2009); PPAM (2009); PDCS (2009); IJMSSC (2009); Phys. Rev. D81 104009 (2010); CPC (2010); HPCS (2010); Class. Quant. Grav. 28 025012 (2011); Phys. Rev. D83 124002 (2011); Preprint arXiv:1312.5210 (2013); Gen. Rel. Grav. 46, 1672 (2014); CSC’14 (2014); Phys. Rev. D93 041501R (2016); Phys. Rev. D96 024020 (2017).


Binary Black Hole Coalescence using Perturbation Theory (Khanna)

This project broadly deals with estimating properties of the gravitational waves produced by the merger of two black holes. Gravitational waves are “ripples” in space-time that travel at the speed of light. These were theoretically predicted by Einstein’s general relativity, but have never been directly observed. Currently, an extensive search for these waves is being performed by the newly constructed NSF LIGO laboratory and various other such observatories in Europe and Asia. The ESA and NASA also have a mission planned in the near future (the LISA mission) that will attempt to detect these waves. To learn more about these waves and the recent attempts to observe them, please visit the eLISA mission website.

The computer code for solving the extreme-mass-ratio limit of this problem (commonly referred to as EMRI) is essentially an inhomogeneous wave-equation solver that includes a mathematically complicated source term. The source term describes how the smaller black hole (or star) generates gravitational waves as it moves in the space-time of the larger hole. Because of its mathematical complexity, the source term is the most computationally intensive part of the whole calculation. On the PS3’s Cell processor, it is precisely this part of the computation that is “farmed out” to the six SPEs. This approach essentially eliminates the time spent on the source computation and yields a speedup of over a factor of six compared with a PPE-only computation. Note that these figures are for double-precision floating-point operations; in single precision, the speedup is significantly higher. Furthermore, we distribute the entire computational domain across the sixteen PS3s using MPI (message passing) parallelization. This enables each PS3 to work on its part of the domain and communicate the appropriate boundary data to the others as needed, on the fly. Overall, the performance of our 16-PS3 Gravity Grid compares to nearly 100 cores of high-end Intel Xeon processors or as many as 500 nodes of an IBM Blue Gene supercomputer.
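The domain-decomposition idea described above can be illustrated with a toy problem. The following is only a minimal sketch, not the actual EMRI code: it assumes a 1D wave equation u_tt = u_xx + S(x, t) with a made-up smooth source, a simple leapfrog scheme, and plain Python lists standing in for MPI ranks. The function names and the `source` term are hypothetical. Each subdomain keeps one ghost point on each side, and the ghost values are refreshed from the neighbours after every step, which is the role the MPI boundary exchange plays on the cluster.

```python
import math

def source(x, t):
    # Hypothetical smooth source term; the real EMRI source is far more complicated.
    return math.exp(-((x - 0.5) ** 2) / 0.01) * math.sin(5.0 * t)

def step(u_prev, u_curr, xs, t, dt, dx):
    """Advance interior points by one leapfrog step; end points are copied unchanged."""
    c2 = (dt / dx) ** 2
    u_next = u_curr[:]
    for i in range(1, len(u_curr) - 1):
        lap = u_curr[i - 1] - 2.0 * u_curr[i] + u_curr[i + 1]
        u_next[i] = 2.0 * u_curr[i] - u_prev[i] + c2 * lap + dt * dt * source(xs[i], t)
    return u_next

def evolve_single(n, steps, dt):
    """Reference run on one undivided grid with u = 0 held at both ends."""
    dx = 1.0 / (n - 1)
    xs = [i * dx for i in range(n)]
    u_prev, u_curr = [0.0] * n, [0.0] * n
    for k in range(steps):
        u_prev, u_curr = u_curr, step(u_prev, u_curr, xs, k * dt, dt, dx)
    return u_curr

def evolve_decomposed(n, steps, dt, parts):
    """Same evolution, with the interior points split across `parts` subdomains."""
    dx = 1.0 / (n - 1)
    interior = n - 2
    # Part p owns global indices bounds[p] .. bounds[p + 1] - 1.
    bounds = [1 + (interior * p) // parts for p in range(parts + 1)]
    doms = []
    for p in range(parts):
        lo, hi = bounds[p], bounds[p + 1]
        xs = [i * dx for i in range(lo - 1, hi + 1)]  # one ghost point on each side
        doms.append({"xs": xs, "prev": [0.0] * len(xs), "curr": [0.0] * len(xs)})
    for k in range(steps):
        for d in doms:
            d["prev"], d["curr"] = d["curr"], step(d["prev"], d["curr"], d["xs"], k * dt, dt, dx)
        # Ghost exchange between neighbours (stand-in for MPI send/receive).
        for p in range(parts - 1):
            left, right = doms[p], doms[p + 1]
            left["curr"][-1] = right["curr"][1]
            right["curr"][0] = left["curr"][-2]
    # Gather the owned points back into one global array.
    u = [0.0]
    for d in doms:
        u.extend(d["curr"][1:-1])
    u.append(0.0)
    return u
```

Calling `evolve_single(41, 100, 0.0125)` and `evolve_decomposed(41, 100, 0.0125, 4)` should give the same field, since every grid point sees the same neighbour values in both runs. In this sketch the source evaluation inside `step` is the part that would be offloaded to the SPEs.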

Kerr Black Hole Radiative “Tails” (Khanna, Field)

This research is about developing an understanding of the late-time behavior of physical fields (scalar, vector, tensor) evolving in a rotating (Kerr) black hole space-time. It is well known that at very late times such fields exhibit power-law decay, but the value of the actual index of this power-law “tail” behavior is somewhat controversial: different researchers quote different results in the published literature. The goal of this project is to perform highly accurate computations to generate quality numerical data that would help resolve this conflict. The nature of the computations is such that one requires not only high accuracy but also high floating-point precision, i.e. quadruple (128-bit) and octuple (256-bit) precision, to obtain data of the quality needed for these studies.
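One common way to read a tail index off numerical data is the local power index, n(t) = -d ln|ψ| / d ln t, which approaches the decay exponent at late times. The sketch below is an illustration under stated assumptions, not the project's analysis code: the function name is hypothetical, and the test signal is synthetic, with an assumed index of 3 and an assumed subleading correction.

```python
import math

def local_power_index(times, values):
    """Return (t_mid, n) pairs with n = -d ln|psi| / d ln t between adjacent samples."""
    out = []
    for k in range(len(times) - 1):
        t0, t1 = times[k], times[k + 1]
        v0, v1 = values[k], values[k + 1]
        n = -(math.log(abs(v1)) - math.log(abs(v0))) / (math.log(t1) - math.log(t0))
        out.append((math.sqrt(t0 * t1), n))  # geometric mean as the sample time
    return out

# Synthetic field: leading t**-3 decay plus a slower-dying correction (both assumed).
times = [100.0 * 1.1 ** k for k in range(60)]
values = [t ** -3 * (1.0 + 10.0 / t) for t in times]
indices = local_power_index(times, values)
```

For this synthetic signal the estimated index starts slightly above 3 and drifts toward 3 as t grows. Because the field itself shrinks rapidly along the way, small round-off errors become significant relative to the signal, which is one reason extended precision matters for such studies.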

We implemented high-precision floating-point arithmetic on the Cell’s SPEs by developing a scaled-down port of the LBNL QD Library. This approach yields a factor of four gain in performance over a PPE-only computation and a factor of thirteen gain over the performance of the native long double datatype on the PPE.
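The QD Library represents a high-precision number as an unevaluated sum of ordinary doubles (two for double-double, four for quad-double). The following is a minimal sketch of that idea for double-double addition only, written in Python for readability; it is not the SPE port, and the function names are illustrative. It relies on the standard error-free `two_sum` transformation, which recovers the rounding error of a floating-point addition exactly.

```python
def two_sum(a, b):
    """Return (s, err) with s = fl(a + b) and a + b = s + err exactly."""
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err

def dd_add(x, y):
    """Add two double-double numbers, each given as a (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])   # add the leading parts, keeping the rounding error
    e += x[1] + y[1]             # fold in the trailing parts
    return two_sum(s, e)         # renormalize so that |lo| is small relative to hi

# A term far below double-precision resolution survives in the low word.
one = (1.0, 0.0)
tiny = (1e-20, 0.0)
total = dd_add(one, tiny)
back = dd_add(total, (-1.0, 0.0))
```

In plain double precision, `(1.0 + 1e-20) - 1.0` evaluates to `0.0`, whereas here `back` recovers the small term in its high word. Every double-double operation costs several ordinary floating-point operations, which is why offloading this arithmetic to the SPEs pays off.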

HPL – Standard supercomputer cluster benchmark (Khanna)

This project is about performing a standard parallel LINPACK cluster benchmark on our sixteen-PS3 cluster. This is the benchmark that is used by the site that lists the most powerful supercomputers in the world. We worked with IBM to port their QS22 Cell BE blade benchmark code to our PS3 cluster. Our 16-PS3 Gravity Grid generates a total performance of 40 GFLOP/s (40 billion calculations per second). It should be noted that this benchmark was run in double precision and, because of the limited RAM on each PS3, we were only able to fit a matrix of size 10K on the entire cluster. Thus, these testing conditions are far from optimal. Even at 40 GFLOP/s, our PS3 cluster is very competitive (in terms of performance per dollar) with the low-cost compute clusters out there. The benchmark code with Cell-specific patches is available here: HPL
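To put these numbers in perspective, here is a rough back-of-the-envelope estimate. It assumes that "size 10K" means a dense 10,000 × 10,000 matrix of 8-byte doubles, uses only the leading-order operation count of LU factorization (2/3 N³, ignoring lower-order terms), and assumes the matrix is spread evenly across the nodes. The function name and its parameters are hypothetical.

```python
def hpl_estimate(n, gflops, nodes, bytes_per_entry=8):
    """Rough run time and memory footprint for an n x n dense LINPACK solve."""
    flops = (2.0 / 3.0) * n ** 3              # leading-order LU operation count
    seconds = flops / (gflops * 1e9)          # time at the quoted sustained rate
    matrix_mb = bytes_per_entry * n * n / 1e6 # storage for the matrix alone
    return {
        "flops": flops,
        "seconds": seconds,
        "matrix_mb": matrix_mb,
        "mb_per_node": matrix_mb / nodes,     # assumes an even split across nodes
    }

estimate = hpl_estimate(n=10_000, gflops=40.0, nodes=16)
```

Under these assumptions the matrix occupies about 800 MB in total, or about 50 MB per PS3, and the factorization takes on the order of 17 seconds at 40 GFLOP/s. With a problem this small, communication and start-up overheads weigh heavily, which is consistent with the remark that the testing conditions are far from optimal.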

Questions? Feel free to contact Gaurav Khanna about this research and the PS3 Gravity Grid.