Parallel CUDA Assignment

What you need to do
You are to select a real-world software application and manually parallelize it. That is, take any application that is not written in an explicitly parallel fashion and transform it so that it executes as efficiently as possible on a particular parallel computer. The software application can be whatever you like, but you will need access to its source code. It can, for example, be an open source application, an application that you have developed yourself, or perhaps one from your workplace. To be amenable to parallelization, it will need to be relatively computationally intensive, i.e. it will need to perform sufficient computation that parallelizing it could produce a noticeable difference in perceived execution time. For example, a word processing application would probably not be a good candidate, as such applications are normally already adequately responsive to user interaction. Note that some applications are more amenable to parallelization than others. A perfect linear speedup is not expected for every application; what is expected is that your parallelization achieves as much performance improvement as is available.
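One common way to estimate how much improvement is available is Amdahl's law, which bounds the speedup by the fraction of run time that can be parallelized. The sketch below is illustrative only and is not an assignment requirement; the function name `amdahl_speedup` and the 90% parallel fraction are assumptions chosen for the example.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when `parallel_fraction` of the sequential
    run time is divided perfectly across `processors` workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical application: 90% of its run time is parallelizable.
    for p in (1, 2, 4, 8, 16):
        print(f"{p:2d} processors: at most {amdahl_speedup(0.9, p):.2f}x")
```

Under this model, an application whose run time is dominated by serial work gains little from additional processors, which is why the choice of application matters.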
You can use any parallel hardware that you have access to, whether parallel computers provided by QUT or any others that you personally have access to. It can be any form of parallel computer, e.g. multi-core, cluster, SMP, shared memory, distributed memory, GPU, etc. See the criteria regarding scalable parallelism.
Again, you can use whatever software you have access to. This includes compilers, profilers, debuggers, libraries, etc. Some such software is available through QUT. You may use whatever programming language you wish and whatever parallel frameworks and libraries you have access to.
Please use the #project channel in our Slack workspace if you have any questions regarding this assessment item.
What to submit
Project Proposal: Submit online form, describing:
1. A brief description of the sequential application that you have selected to parallelize. What does it do? Where did you find it? (1 paragraph max)
2. Discuss whether you think the proposed application performs sufficient computation so that parallelizing it will potentially produce a noticeable difference in perceived execution time. (1 paragraph max)
3. What parallel hardware and parallelization language/framework are you considering? E.g. targeting an NVIDIA GPU programmed using CUDA. (1 paragraph max)

The project proposal is designed to give you constructive feedback and to ensure you are on a productive path prior to final submission.
Final Submission: a zip file including both:
1. A report of 10-15 pages (not including appendices) describing your outcomes. The report should address the following criteria:
A. An explanation of the original sequential application being parallelized: what it does (black box) and how it works (a high-level description of the software's design/architecture). This might include call graphs, class diagrams, etc.; whatever you find useful to describe the structure of the original sequential application.
B. Your analysis of potential parallelism within the application. This might include identification of existing loops or control flow constructs where parallelism might be found, and an explanation of the data and control dependences that you analysed to determine which sections of code were safe to parallelize. Which of these is likely to be of sufficient granularity to be worth exploiting? Is it scalable parallelism? Include a discussion of changes required to expose parallelism, such as replacing algorithms or code restructuring transformations.
C. How did you map computation and/or data to processors? Which parallelism abstractions or programming language constructs did you use to perform synchronization?
D. Timing and profiling results, both before and after parallelization and a speedup graph.
E. How did you test that the parallel version produced exactly the same results as the original sequential version?
F. A description of the compilers, software, tools, and techniques you used to parallelize the application.
G. The story of how you overcame performance problems or barriers to improving parallel performance (e.g. load imbalance, memory contention, granularity, data dependencies, etc.).
H. An explanation of the code that you added or modified to parallelize the application (including source code line count).
I. Reflect on your outcome: What have you learnt? How successful was your attempt? Do you think you've done as well as is possible? What might you have done differently?
2. Your source code (both before and after versions), together with instructions for compiling and running it, hardware requirements, and realistic input data sets.
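For criteria D and E of the report, a timing and correctness check might look something like the following minimal sketch. Everything in it is an assumption made for illustration: `sequential_sum_squares` and `chunked_sum_squares` are hypothetical stand-ins for your own before and after versions, and `best_time` is a hypothetical helper, not a prescribed tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def sequential_sum_squares(values):
    """Stand-in for the original sequential computation."""
    total = 0
    for v in values:
        total += v * v
    return total


def chunked_sum_squares(values, workers=4):
    """Stand-in for a parallelized version: split the input into chunks,
    reduce each chunk independently, then combine the partial sums.
    Threads are used only to keep the sketch self-contained; in CPython
    they are not expected to speed up CPU-bound code like this."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    pieces = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sequential_sum_squares, pieces))


def best_time(fn, *args, repeats=3):
    """Run fn(*args) `repeats` times; return (fastest wall time, result)."""
    best, result = float("inf"), None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


if __name__ == "__main__":
    data = list(range(200_000))
    t_seq, r_seq = best_time(sequential_sum_squares, data)
    t_par, r_par = best_time(chunked_sum_squares, data)
    # Integer arithmetic keeps an exact comparison meaningful even though
    # the chunked version adds the terms in a different order.
    assert r_par == r_seq, "parallel output differs from sequential output"
    print(f"sequential {t_seq:.4f}s, parallel {t_par:.4f}s, "
          f"speedup {t_seq / t_par:.2f}x")
```

Repeating the measurement for several worker counts gives the data points for a speedup graph. If your application uses floating-point reductions, note that changing the order of operations can change the low-order bits of the result, so the comparison method needs to be chosen with that in mind.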