6006CEM Assignment

Assignment Task
For this coursework, you are required to select a real-world problem of your choice and apply various machine learning algorithms and methods to solve the selected problem. This problem could be a classification, regression, or clustering problem.
Your first task comprises the following:
1. Select a real-world problem.
2. Select suitable dataset(s) for the chosen problem. You can combine more than one dataset.
3. Analyse the dataset and pre-process it.
4. Select at least three appropriate Machine Learning algorithms for implementing the models.
5. Evaluate the created models on the selected data.
6. Tune the models to achieve better performance.
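As an illustration of steps 4 to 6 above, the following is a minimal sketch, assuming a classification problem and the scikit-learn library. The three algorithms, the metrics and the parameter grid are hypothetical choices, not requirements of this brief. The placeholder data is generated only so that the snippet runs on its own; synthetic data is not acceptable for the actual submission.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: placeholder data for illustration only. The brief does not
# allow synthetic or scikit-learn datasets in the real submission.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 4: three candidate algorithms (hypothetical choices).
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Step 5: evaluate every model on the held-out data with more than one metric.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# Step 6: tune one model with a cross-validated grid search.
# make_pipeline names each step after its lowercased class name.
search = GridSearchCV(
    models["knn"],
    param_grid={"kneighborsclassifier__n_neighbors": [3, 5, 11]},
    cv=5,
    scoring="f1",
)
search.fit(X_train, y_train)
tuned_f1 = f1_score(y_test, search.predict(X_test))

print(scores)
print(search.best_params_, tuned_f1)
```

Placing the scaler inside each pipeline means it is re-fitted on the training folds only during cross-validation, which avoids leaking test information into the tuned model.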
• If you are attempting this assessment as a resit, you must choose a different problem and dataset(s) from those you used for your previous attempt(s).
• You are advised to choose a dataset that allows you to demonstrate your ability to perform data analysis and pre-processing techniques such as, but not limited to, handling missing, categorical, non-numeric and duplicate values; outliers; scaling; etc. The selected dataset must contain at least 1800 samples, after pre-processing. The dataset cannot be one of the scikit-learn or synthetic datasets.
• If you are not sure where to start, you may find a list of suggested resources with numerous problems and datasets in the “Open Data Repositories” section on Aula.
• You can use existing algorithms or a combination of some of them, or even come up with a new algorithm of your own.
• The required programming language is Python 3 (others are not accepted).
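The pre-processing techniques named in the bullets above (duplicates, missing values, outliers, categorical values, scaling) and the 1800-sample minimum can be sketched as follows. This is a minimal illustration assuming pandas and scikit-learn; the function name `preprocess`, the column names, the median/mode imputation and the 1.5 × IQR outlier rule are all hypothetical choices, and other strategies may suit a given dataset better.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

MIN_SAMPLES = 1800  # minimum dataset size after pre-processing, from the brief


def preprocess(
    df: pd.DataFrame, numeric_cols: list[str], categorical_cols: list[str]
) -> pd.DataFrame:
    """Hypothetical helper: one possible cleaning order, not a prescribed one."""
    # Duplicate values: drop exact duplicate rows.
    df = df.drop_duplicates().copy()

    # Missing values: median for numeric columns, most frequent for categorical.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Outliers: keep rows within 1.5 * IQR of the quartiles (assumed rule).
    for col in numeric_cols:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        df = df[df[col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

    # Categorical values: one-hot encode.
    df = pd.get_dummies(df, columns=categorical_cols)

    # Scaling: zero mean and unit variance for the numeric columns.
    df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])
    return df.reset_index(drop=True)


# Tiny made-up frame so the snippet runs; a real dataset is required in practice.
raw = pd.DataFrame(
    {
        "age": [25.0, 32.0, np.nan, 41.0, 29.0, 29.0, 35.0, 500.0],
        "income": [30.0, 50.0, 38.0, 45.0, 40.0, 40.0, np.nan, 47.0],
        "city": ["Leeds", "York", "Leeds", None, "York", "York", "Leeds", "York"],
    }
)
clean = preprocess(raw, ["age", "income"], ["city"])
meets_size_rule = len(clean) >= MIN_SAMPLES
print(len(raw), len(clean), meets_size_rule)
```

In a full modelling workflow the scaler would normally be fitted on the training split only; it is applied to the whole frame here purely to keep the sketch short.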
For the second task, you are required to submit a demonstration video recording the execution and performance of your implementation.
• The maximum length of the demonstration video is five minutes.
This document is intended for Coventry University Group students for their own use in completing their assessed work for this module. It must not be passed to third parties or posted on any website.
• You are NOT required to walk through every line of the source code, but it is important to demonstrate the execution of all stages and the corresponding outputs of the source code.
• A voice-over should be used to describe what is happening and some of the reasoning throughout the video.
• Ensure that all texts, tables, graphs, etc. are of an appropriate size to view, free from noise and not blurred. Also, ensure that the audio is clear.
• You are required to use either Jupyter Notebook on a browser or Visual Studio Code when recording the demonstration video.
Write a report (2000 words) based on a literature review and the technical work. This should include:
1. A literature review on the broad field in which your machine learning system sits.
2. A specification of the chosen problem area.
3. Comparing the approaches and results of other existing pieces of work on the same problem.
4. Analysing and pre-processing the data.
5. Applying different algorithms and methods to build learning models.
6. Making appropriate adjustments to improve the models’ performances.
7. Evaluating the models.
• The first part of your report should review literature on a machine learning task in the broad field in which your machine learning system sits, such as Natural Language Processing, Computer Vision and Image Processing, etc.
• The second part of your report should focus on how algorithms/methods/techniques are actually applied or developments that are novel and specific to your work rather than how they work theoretically.
• Your report should include appropriate outcomes such as data analysis diagrams, outcomes from the models, code snippets, etc. to support your text.
• Include all your source code as text in Appendix B at the end of the report. Do not use screenshots of your code in Appendix B. Your code must be presented as text (see coursework template).
• Ensure you use comments to demonstrate an understanding of all parts of your code.
• A coursework template is provided as a guide in the “Assessment” section on Aula.
• The word limit includes quotations, but excludes the (GitHub, datasets, OneDrive) URLs, bibliography, reference list, and appendices (see coursework template).
Submission Instructions:
The submission of your coursework must be in the form of ONE Word file through the indicated Aula submission link. Other formats (other than MS Word) will not be accepted.
The submission of the implementation and demonstration video must be in the form of:
• A URL of Coventry GitHub Repository, OR
• A URL of Coventry OneDrive shared folder
The URL must be included at the beginning of your report. Examiners will not check for the required URLs in other places.
The shared folder or repository must be accessible by examiners, and should include the following:
• The URL to the selected dataset(s) in README or a separate file.
• The dataset(s) that are used for your problem.
• The source code with appropriate comments, and
• The demonstration video.
• No other platform is accepted. Please ensure that it is COVENTRY GitHub or COVENTRY OneDrive.
• Make sure that you add , , , and as Collaborators to facilitate marking >
• The submitted source code must be in the form of a (Jupyter) notebook (.ipynb).
• You must ensure that you commit your work appropriately (with the corresponding outputs of all cells, if applicable, clearly present).
• Only include the notebook for your final submission (i.e., remove all draft notebooks).
• The following naming convention must be used for your repository or shared folder:
StudentID-Initials-s1
For example, a student Leo Messi whose student ID is 12345678 would name their repository or shared folder as 12345678-LM-s1
• A failure to use this naming convention may delay the release of marks and feedback for your coursework.
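The naming convention above can be checked before submission with a short script. This is a minimal sketch: the helper names `repo_name` and `is_valid_repo_name` are hypothetical, and it assumes that the student ID consists of digits only and that the initials are the first letter of each part of the name, neither of which the brief states explicitly.

```python
import re

# ASSUMPTION: digits-only student ID, upper-case initials, literal "s1" suffix.
NAME_PATTERN = re.compile(r"\d+-[A-Z]+-s1")


def repo_name(student_id: str, full_name: str) -> str:
    """Build a repository or folder name in the StudentID-Initials-s1 form."""
    initials = "".join(part[0].upper() for part in full_name.split())
    return f"{student_id}-{initials}-s1"


def is_valid_repo_name(name: str) -> bool:
    """Return True if the whole name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


# The example from the brief: Leo Messi, student ID 12345678.
print(repo_name("12345678", "Leo Messi"))      # 12345678-LM-s1
print(is_valid_repo_name("12345678-LM-s1"))    # True
print(is_valid_repo_name("LM-12345678"))       # False
```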
Marking and Feedback
How will my assignment be marked?
Your assignment will be marked by the module team.

How will I receive my grades and feedback?
Provisional marks will be released once all submissions, including extensions, have been marked and internally moderated.
Feedback will be provided by the module team on Aula alongside the release of grades.
Your provisional marks and feedback should be available within 2 weeks (10 working days) after the extended deadline.
What will I be marked against?
Details of the marking criteria for this task can be found at the bottom of this assignment brief.
Assessed Module Learning Outcomes
The Learning Outcomes for this module align to the marking criteria which can be found at the end of this brief. Ensure you understand the marking criteria to ensure successful achievement of the assessment task. The following module learning outcomes are assessed in this task:
On completion of this module a student should be able to:
1. Apply the knowledge behind the principles, techniques and applications of machine learning.
2. Critically evaluate existing machine learning methods and select the most appropriate ones for a certain task.
3. Analyse information, compare different machine learning techniques and produce an academic written report as a result.
4. Conceptualise the role of modern machine learning approaches and their impact on society.
Assignment Support and Academic Integrity
If you have any questions about this assignment please see the Student Guidance on Coursework for more information.
Spelling, Punctuation, and Grammar:
You are expected to use effective, accurate, and appropriate language within this assessment task.
Academic Integrity:
The work you submit must be your own, or in the case of groupwork, that of your group. All sources of information need to be acknowledged and attributed; therefore, you must provide references for all sources of information and acknowledge any tools used in the production of your work, including Artificial Intelligence (AI). We use detection software and make routine checks for evidence of academic misconduct.
Definitions of academic misconduct, including plagiarism, self-plagiarism, and collusion can be found on the Student Portal. All cases of suspected academic misconduct are referred for investigation, the outcomes of which can have profound consequences to your studies. For more information on academic integrity please visit the Academic and Research Integrity section of the Student Portal.
Support for Students with Disabilities or Additional Needs:
If you have a disability, long-term health condition, specific learning difference, mental health diagnosis or symptoms and have discussed your support needs with health and wellbeing you may be able to access support that will help with your studies.
If you feel you may benefit from additional support, but have not disclosed a disability to the University, or have disclosed but are yet to discuss your support needs it is important to let us know so we can provide the right support for your circumstances. Visit the Student Portal to find out more.
Unable to Submit on Time?
The University wants you to do your best. However, we know that sometimes events happen which mean that you cannot submit your assessment by the deadline or sit a scheduled exam. If you think this might be the case, guidance on understanding what counts as an extenuating circumstance, and how to apply is available on the Student Portal.
Administration of Assessment
Module Leader Name: Dr Dianabasi Nkantah
Module Leader Email:
Assignment Category: [Written / Artefact]
Attempt Type: [Standard]
Component Code: CW

Assessment Marking Criteria
Introduction & Required components (15%)
Literature Review (20%)
Data Analysis and Data Pre-processing (15%)
Implementation, Evaluation, Result Analysis & Conclusion (40%)
Presentation of the report (10%)
80 to 100%
The problem statement has been very clearly stated. Existing approaches have been very critically reviewed, in a manner that can be published, with relevant current in-text citations. Similarities and differences between the student’s work and existing work have been excellently discussed.
All required components have been submitted in the right format. Video is very clear and has been used to clearly demonstrate ownership of the work.
Exceptional in-depth critical analysis of literature on a broad machine learning area.
Very clear demonstration of an understanding of the approaches applied and their impact on society.
Use of a wide range of relevant and up-to-date peer-reviewed, journal/conference articles in research.
There is evidence of data visualisation, with an outstanding explanation of all plots resulting from the visualisation. The discussion on data processing has covered, in great depth, the balance of the dataset, outliers, scaling, and how non-numeric, categorical, duplicate, and missing data/values were handled.
Multiple algorithms have been implemented, with an excellent justification of the choice of algorithms for the problem. The models have been comprehensively tuned, and there is a demonstration of understanding of all the hyper-parameters available for tuning. Multiple evaluation metrics have been used, and insightfully discussed. Outstanding, in-depth analysis of results leading to excellent conclusions.
Excellent, professional flow to the report. Excellently referenced report, with in-text citations. Work completed with very high degree of accuracy and proficiency. Exceptional communication and expression throughout the report. All diagrams and tables have been properly labelled. Complete source code presented as text in Appendix B.
The problem statement has been clearly stated. Existing approaches have been critically reviewed, with relevant current in-text citations. Similarities and differences between the student’s work and existing work have been very well discussed.
All required components have been submitted in the right format. Video is very clear and has been used to clearly demonstrate ownership of the work.
Excellent in-depth critical analysis of literature on a broad machine learning area.
Clear demonstration of an understanding of the approaches applied and their impact on society.
Use of relevant and up-to-date peer-reviewed, journal/conference articles in research.
There is evidence of data visualisation, with an excellent explanation of all plots resulting from the visualisation. The discussion on data processing has covered the balance of the dataset, outliers, scaling, and how non-numeric, categorical, duplicate, and missing data/values were handled.
Multiple algorithms have been implemented, with a good justification of the choice of algorithms for the problem. The models have been tuned, and there is a demonstration of understanding of the hyper-parameters used for tuning. Multiple evaluation metrics have been used, and insightfully discussed. Excellent, in-depth analysis of results leading to excellent conclusions.
Excellent flow to the report. Excellently referenced report, with in-text citations. Work completed with very high degree of accuracy and proficiency. Excellent communication and expression throughout the report. All diagrams and tables have been properly labelled. Complete source code presented as text in Appendix B.
The problem statement has been clearly stated. Existing approaches have been critically reviewed, with relevant in-text citations. Similarities and differences between the student’s work and existing work have been well discussed. All required components have been submitted in the right format. Video is very clear and has been used to clearly demonstrate ownership of the work.
Very good in-depth critical analysis of literature on a broad machine learning area.
Clear demonstration of an understanding of the approaches applied.
Use of relevant and up-to-date peer-reviewed, journal/conference articles in research.
There is evidence of data visualisation, with a very good explanation of all plots resulting from the visualisation. The discussion on data processing has covered most of the balance of the dataset, outliers, scaling, and how non-numeric, categorical, duplicate, and missing data/values were handled.
Multiple algorithms have been implemented, with some justification of the choice of algorithms for the problem. The models have been tuned, and there is a demonstration of understanding of some of the hyper-parameters used for tuning. Multiple evaluation metrics have been used and discussed. Very good, in-depth analysis of results leading to very good conclusions.
Very good flow to the report. Very well referenced report, with in-text citations. Work completed with a high degree of accuracy and proficiency. Very good communication and expression throughout the report. All diagrams and tables have been properly labelled. Complete source code presented as text in Appendix B.
The problem statement has been stated. Existing approaches have been reviewed, with in-text citations. Similarities and differences between the student’s work and existing work have been discussed.
All required components have been submitted in the right format. Video is clear and has been used to demonstrate ownership of the work.
Good critical analysis of literature on a broad machine learning area.
Demonstration of an understanding of the approaches applied.
Use of relevant peer-reviewed articles in research.
There is evidence of data visualisation, with an explanation of plots resulting from the visualisation. The discussion on data processing has covered some of the balance of the dataset, outliers, scaling, and how non-numeric, categorical, duplicate, and missing data/values were handled.
Multiple algorithms have been implemented, with limited justification of the choice of algorithms for the problem. The models have been tuned, and there is an awareness of the hyper-parameters used for tuning. Multiple evaluation metrics have been used, with a limited attempt at discussing them. An analysis of results leading to good conclusions.
Good flow to the report. Well referenced report, with in-text citations. Work completed with a high degree of accuracy and proficiency. Good communication and expression throughout the report. Most diagrams and tables have been properly labelled. Complete source code presented as text in Appendix B.
A limited attempt has been made at stating the problem statement. Existing approaches have been identified. A limited attempt has been made to discuss the similarities and differences between the student’s work and existing work. All required components have been submitted in the right format. Video has been used to demonstrate ownership of the work.
An attempt at an analysis of literature on a broad machine learning area.
An attempt to demonstrate an understanding of the approaches applied.
Use of articles in research.
There is a limited attempt at data visualisation, with little to no explanation of plots resulting from the visualisation. The discussion on data processing has covered a few of the balance of the dataset, outliers, scaling, and how non-numeric, categorical, duplicate, and missing data/values were handled.
Multiple algorithms have been implemented, with no justification of the choice of algorithms for the problem. The models have been tuned, but there is no awareness of the hyper-parameters used for tuning. A few evaluation metrics have been used. Fairly good analysis of results leading to conclusions.
Fairly good flow to the report. Report has been referenced. Good communication and expression but not throughout the report. Some diagrams and tables have been properly labelled. Report would benefit from thorough proof-reading. Complete source code presented as text in Appendix B.
Minimal attempt has been made at stating the problem statement. Existing approaches have not been identified. Required components have not been submitted in the right format. Video has been used but does not demonstrate an understanding of the work.
Poor attempt at analysis of literature on a broad machine learning area.
Minimal demonstration of an understanding of the approaches applied.
Use of irrelevant and outdated articles in research.
Very limited attempt at data visualisation, with little to no explanation of plots resulting from the visualisation. The discussion on data pre-processing is limited.
One algorithm has been implemented, with no justification of the choice of algorithm for the problem. The model has not been tuned. A few evaluation metrics have been used. Limited analysis of results leading to conclusions.
The report is difficult to read; would benefit from thorough proof-reading. Limited evidence of referencing. A few diagrams and tables have been properly labelled. Incomplete source code presented as text in Appendix B.

Very minimal attempt has been made at stating the problem statement. Existing approaches have not been identified. Some required components are either missing or not in the right format.
Minimal attempt at analysis of literature on a broad machine learning area.
An understanding of the approaches applied has not been demonstrated.
There is no evidence of research.
No attempt at data visualisation and analysis. Very little to no evidence of data pre-processing.
It is not clear if an algorithm has been implemented. The model has not been tuned. No evidence of model evaluation. No analysis of results leading to conclusions.
The report is difficult to read; would benefit from thorough proof-reading. No evidence of referencing. Diagrams and tables have not been properly labelled. Source code has not been presented as text in Appendix B.