Master of Science (MS)
CUDA, Distributed Computing, MPI, Parallel Programming
The mainstream acceptance of heterogeneous computing and cloud computing is prompting a future of distributed heterogeneous systems. With current software development tools, programming such complex systems is difficult and requires an extensive knowledge of network and processor architectures. Providing an abstraction of the underlying network, message-passing interface (MPI) has been the standard tool for developing distributed applications in the high performance community. The problem of MPI lies with its message-passing model, which is less expressive than the shared-memory model. Development of heterogeneous programming tools, such as OpenCL, has only begun recently. This thesis presents Phalanx, a framework that extends the virtual architecture of CUDA for distributed heterogeneous systems. Using MPI, Phalanx transparently handles intercommunication among distributed nodes. By using the shared-memory model, Phalanx simplifies the development of distributed applications without sacrificing the advantages of MPI. In one of the case studies, Phalanx achieves 28x speedup compared with serial execution on a Core-i7 processor.
Lam, Siu Kwan, "Scaling CUDA for Distributed Heterogeneous Processors" (2012). Master's Theses. Paper 4143.