Description
Group Details:
CRN:
Name:
ID:
Name:
ID:
Name:
ID:
Instructions:
• You must submit two separate copies (one Word file and one PDF file) using the Assignment Template on Blackboard via the allocated folder. These files must not be in compressed format.
• It is your responsibility to check and make sure that you have uploaded both the correct files.
• A zero mark will be given if you try to bypass SafeAssign (e.g., misspelling words, removing spaces between words, hiding characters, using different character sets, converting text into images or into languages other than English, or any other kind of manipulation).
• Email submission will not be accepted.
• You are advised to make your work clear and well-presented. This includes filling in your information on the cover page.
• You must use this template; failure to do so will result in a zero mark.
• You MUST show all your work, and text must not be converted into an image, unless specified otherwise by the question.
• Late submission will result in a ZERO mark.
• The work must be your own; copying from other students or other resources will result in a ZERO mark.
• Use Times New Roman font for all your answers.
Restricted – مقيد
Learning Outcome(s): CLO 1, 2, 5
1. Demonstrate an understanding of the concepts of decision analysis and decision support systems (DSS) including probability, modelling, decisions under uncertainty, and real-world problems.
2. Describe advanced Business Intelligence, Business Analytics, Data Visualization, and Dashboards.
5. Improve hands-on skills using Excel and Orange for building Decision Support Systems.
14 Marks
Project
Students can form groups of three and send their names to their instructors before 5th October 2024. Otherwise, the instructors will form the groups randomly and assign datasets to them. No two groups will work on the same dataset.
Select one dataset from the datasets provided in the links below.
For 28 Data Analysis Projects to Boost Your Skills [2024 Guide]:
https://www.springboard.com/blog/data-analytics/da…
For more free public datasets for EDA:
https://www.tableau.com/learn/articles/free-public…
✓ After the dataset is selected (or assigned), analyze the data using Microsoft Excel or Orange Data Mining to discover the structure of data, trends, patterns, or any anomalies in the data based on your own hypothesis.
✓ Perform the following seven tasks.
✓ You should use visualization to aid your answers.
Your project will include two main parts:
1. The final project report, which must incorporate all seven tasks below and be written using the provided template. (12 marks distributed among the tasks below, as marked on each task.)
2. A presentation that illustrates the tasks you completed in the project. (2 marks)
==========================================================
Task 1: Understand and describe the nature and structure of the selected dataset. (2 marks)
• Describe the dataset. Your description should answer the following questions: Is it reliable? How was it collected? What is its size?
• Identify the features of the dataset.
• Propose a hypothesis or assumption (between two numerical variables) to validate.
Task 2: Check if your selected features have any of the following issues. Describe how you conducted the tests and how you addressed the issues. Support your answers with screenshots of the issues before and after the fixes. (1 mark)
• Missing values (0.25 for the test, fix, and screenshot)
• Duplicate values (0.25)
• Data outliers (0.25)
• Any noise or irregularities (0.25)
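The assignment requires Excel or Orange for these checks; the Python sketch below is only an illustrative cross-check of the logic behind the first three of them. The sample rows, the column meaning (id, price), and the 1.5 × IQR outlier rule are assumptions chosen for illustration, not part of the assignment.

```python
from statistics import quantiles

# Hypothetical sample: (id, price) rows; None marks a missing value.
rows = [(1, 10.0), (2, 12.0), (3, None), (4, 11.0),
        (4, 11.0), (5, 13.0), (6, 95.0), (7, 12.5)]

def missing_count(rows):
    """Count rows that have at least one missing (None) field."""
    return sum(1 for r in rows if any(v is None for v in r))

def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen = set()
    unique = []
    for r in rows:
        if r not in seen:
            seen.add(r)
            unique.append(r)
    return unique

def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# One possible fix for missing values: drop incomplete rows.
complete = [r for r in rows if all(v is not None for v in r)]
deduped = drop_duplicates(complete)
prices = [price for _, price in deduped]

print("rows with missing values:", missing_count(rows))
print("duplicate rows removed:", len(complete) - len(deduped))
print("outliers by IQR rule:", iqr_outliers(prices))
```

Dropping incomplete rows is only one possible fix; replacing missing values with the mean or median is another common choice, and the report should state which one was used and why.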
Task 3: Provide descriptive statistics for the selected features using statistical methods to understand the dataset better and answer the following analysis questions: (2 marks)
• Include measures of central tendency, such as the mean, median, and mode.
• Describe the spread of your data. This may include measures such as variance, standard deviation, skewness, and kurtosis.
(You are encouraged to pose other analysis questions based on any trend you notice in the dataset.)
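As a rough illustration of the measures listed in Task 3, the sketch below computes them with Python's standard library. The sample values are hypothetical, and skewness and kurtosis use the plain population-moment definitions, which is an assumption; Excel's SKEW and KURT apply sample-size corrections, so their results can differ slightly on small samples.

```python
from statistics import mean, median, mode, variance, stdev

def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    m = mean(values)
    return sum((v - m) ** k for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2^2 - 3 (0 for a normal curve)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

def describe(values):
    """Collect central-tendency and spread measures in one dictionary."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
        "variance": variance(values),   # sample variance (n - 1)
        "std_dev": stdev(values),       # sample standard deviation
        "skewness": skewness(values),
        "excess_kurtosis": excess_kurtosis(values),
    }

# Hypothetical feature values, for illustration only.
sample = [2, 4, 4, 5, 7, 9, 4]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness suggests a longer right tail, and a mean noticeably above the median points the same way, which is the kind of observation Task 3 asks you to interpret.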
Task 4: Validate the hypothesis in Task 1 by investigating the relationship between the two quantitative variables you have chosen, using correlation, regression, and R-squared, with possible conclusions. (2 marks)
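To show how the three quantities in Task 4 relate, here is a minimal Python sketch of Pearson correlation, a least-squares line, and R-squared. The variable names and data (study hours versus score) are hypothetical; the project itself should produce these values in Excel or Orange.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: study hours (x) and exam score (y).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 71]

r = pearson_r(hours, score)
slope, intercept = fit_line(hours, score)
r2 = r_squared(hours, score, slope, intercept)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.3f}")
```

For a regression with a single predictor, R-squared equals the square of the correlation coefficient, which gives a quick consistency check on the figures reported for this task.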
Task 5: Show a visual representation of your analysis (hint: use the right chart/graph for your data analysis). (1 mark)
Task 6: Choose at least two ML models compatible with your dataset (e.g., Decision Tree, k-NN, Random Forest, SVM, k-means) and train the selected models using either Orange or Microsoft Excel (if applicable). Then, evaluate them using a confusion matrix. (2 marks)
Task 7: Build an active dashboard that summarizes the most crucial factors (variables) that will help in the decision-making process, and then demonstrate the effectiveness of your selection of those factors in the decision-making process. (2 marks)
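The confusion matrix in Task 6 would normally come from Orange's evaluation output; the sketch below only illustrates how such a matrix is built and read. The labels and predictions are made up for illustration and do not come from any trained model.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical true labels and model predictions.
labels = ["yes", "no"]
actual = ["yes", "yes", "no", "no", "yes", "no", "no", "yes"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "yes"]

matrix = confusion_matrix(actual, predicted, labels)
print("labels:", labels)
for label, row in zip(labels, matrix):
    print(f"actual {label}: {row}")
print("accuracy:", accuracy(matrix))
```

Building one matrix per model from the same test data makes the two models directly comparable, which is the comparison the task asks for.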
Deadline: Tuesday 04/12/2024 @ 23:59
[Total Mark for this Project is 14]
Project Report