Large-Scale Matrix Computation By Parallel Computing

Suppose you have two n-by-n matrices, X and Y. Each element of X and Y is a random integer between 0 and 255. Compute X*Y+X in three ways:

Method 1: a simple program.
Method 2: a parallel program of your own design that uses multiple CPU threads.
Method 3: another parallel program of your own design (e.g., GPU CUDA, or any other method you define).

Experiments: measure the running time of Method 1, Method 2, and Method 3 for each n = 2^4, 2^5, 2^6, …, 2^19, 2^20. The running time does not include the time to generate the random numbers. Make a table of the results and plot the running time of each method against log2(n).

You can use any language (C, C++, Java, Python, etc.) to implement the programs.

Final report: at least 3 pages, not including the references. In the report, explain the algorithm flow and implementation details of the three methods, the experimental results, and your findings.
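As a starting point, the following is a minimal sketch of Method 1, one possible row-partitioned Method 2, and a timing loop that excludes random-number generation. It is written in Python using only the standard library. All function names (random_matrix, rows_of_xy_plus_x, method1, method2, time_method, run_experiment) and the default of 4 workers are illustrative assumptions, not part of the assignment. Note that in standard CPython builds the global interpreter lock keeps pure-Python threads from running bytecode in parallel, so this sketch shows how the work is divided among threads rather than a guaranteed speedup. Method 3 is not shown.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def random_matrix(n, rng):
    """Return an n-by-n matrix (list of rows) of random integers in [0, 255]."""
    return [[rng.randint(0, 255) for _ in range(n)] for _ in range(n)]


def rows_of_xy_plus_x(X, Y, row_start, row_end):
    """Compute rows [row_start, row_end) of X*Y + X with a plain triple loop."""
    n = len(X)
    block = []
    for i in range(row_start, row_end):
        x_row = X[i]
        out_row = []
        for j in range(n):
            total = x_row[j]  # the "+ X" term
            for k in range(n):
                total += x_row[k] * Y[k][j]
            out_row.append(total)
        block.append(out_row)
    return block


def method1(X, Y):
    """Method 1 (sketch): compute every row in a single thread."""
    return rows_of_xy_plus_x(X, Y, 0, len(X))


def method2(X, Y, workers=4):
    """Method 2 (sketch): give each thread a contiguous block of rows.

    The worker count of 4 is an arbitrary assumption.
    """
    n = len(X)
    bounds = [(w * n // workers, (w + 1) * n // workers) for w in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields the blocks in submission order, so rows stay in order.
        blocks = pool.map(lambda b: rows_of_xy_plus_x(X, Y, *b), bounds)
        return [row for block in blocks for row in block]


def time_method(method, X, Y):
    """Return the wall-clock seconds taken by one call to method(X, Y)."""
    start = time.perf_counter()
    method(X, Y)
    return time.perf_counter() - start


def run_experiment(exponents, seed=0):
    """Return a list of (log2(n), Method 1 seconds, Method 2 seconds)."""
    rng = random.Random(seed)
    results = []
    for e in exponents:
        n = 2 ** e
        # The matrices are generated before timing starts, so generation
        # is excluded from the measured running time.
        X, Y = random_matrix(n, rng), random_matrix(n, rng)
        results.append((e, time_method(method1, X, Y), time_method(method2, X, Y)))
    return results


if __name__ == "__main__":
    # Small exponents only; this is a demonstration, not the full experiment.
    for e, t1, t2 in run_experiment(range(4, 7)):
        print(f"log2(n)={e}  method1={t1:.4f}s  method2={t2:.4f}s")
```

Each output row depends only on one row of X and all of Y, so the row blocks can be computed independently with no locking. The same partitioning carries over to languages whose threads run in parallel, such as C++ or Java.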

Final submission: source code + report + presentation file
