Computer Science Homework 2

Homework 2.

Question 1. Decision Tree Classifier [10 Points]

Data: The zip file “hw2.q1.data.zip” contains 3 CSV files:

· “hw2.q1.train.csv” contains 10,000 rows and 26 columns. The first column ‘y’ is the output variable with 2 classes: 0, 1. The remaining 25 columns contain input features: x_1, …, x_25.

· “hw2.q1.test.csv” contains 2,000 rows and 26 columns. The first column ‘y’ is the output variable with 2 classes: 0, 1. The remaining 25 columns contain input features: x_1, …, x_25.

· “hw2.q1.new.csv” contains 30 rows and 26 columns. The first column ‘ID’ is an identifier for 30 unlabeled samples. The remaining 25 columns contain input features: x_1, …, x_25.
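As a minimal sketch of how files with this layout could be read, the helper below splits a labeled CSV into features and output. It assumes each file has a header row and that pandas is available; the function name load_labeled and the tiny in-memory CSV are hypothetical stand-ins, not the homework data.

```python
import io

import pandas as pd


def load_labeled(csv_source):
    """Read a labeled CSV whose first column is 'y'; return (X, y)."""
    df = pd.read_csv(csv_source)   # assumes a header row is present
    y = df.iloc[:, 0]              # first column: output variable
    X = df.iloc[:, 1:]             # remaining columns: input features
    return X, y


# Made-up three-row stand-in for "hw2.q1.train.csv" (only 3 features).
demo_csv = io.StringIO(
    "y,x_1,x_2,x_3\n"
    "0,0.1,1.2,3.4\n"
    "1,0.5,0.7,2.2\n"
    "0,0.9,1.1,0.3\n"
)
X_demo, y_demo = load_labeled(demo_csv)
print(X_demo.shape, y_demo.tolist())
```

With the real files, the same call would take a path such as "hw2.q1.train.csv" instead of the in-memory buffer.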

Task 1. [4 points]

Use 5-fold cross-validation with the 10,000 labeled examples from “hw2.q1.train.csv” to determine the fewest rules with which a decision tree classifier can achieve a mean cross-validation accuracy of at least 0.96. Report the number of rules needed, the cross-validation accuracy obtained, and all the hyper-parameter values for the DecisionTreeClassifier.

Fewest number of rules needed: ………………. (to achieve mean cross-validation accuracy of at least 0.96)

Mean cross-validation accuracy: ………………………. (rounded to 4 decimal places)

Non-default hyper-parameter values for selected DecisionTreeClassifier model:
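One possible way to approach this search is sketched below. It assumes that "number of rules" means the number of leaves (each root-to-leaf path being one rule) and controls it through max_leaf_nodes; other readings, such as limiting max_depth, are equally plausible. The helper name fewest_leaves and the synthetic data are hypothetical and only make the sketch runnable.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


def fewest_leaves(X, y, target=0.96, max_leaves=64, cv=5, seed=0):
    """Return (n_leaves, mean_cv_accuracy) for the smallest max_leaf_nodes
    whose mean cross-validation accuracy reaches `target`, else None."""
    for n_leaves in range(2, max_leaves + 1):
        clf = DecisionTreeClassifier(max_leaf_nodes=n_leaves, random_state=seed)
        scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
        mean_acc = float(scores.mean())
        if mean_acc >= target:
            return n_leaves, round(mean_acc, 4)
    return None


# Synthetic stand-in (NOT the homework data): y depends on two features only.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 5))
y_demo = ((X_demo[:, 0] > 0) & (X_demo[:, 1] > 0)).astype(int)

result = fewest_leaves(X_demo, y_demo)
print(result)
```

The loop stops at the first leaf count that meets the target, so the value it returns is the smallest one within the searched range.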

Task 2. [2 Points]

Train a DecisionTreeClassifier with the hyper-parameter values determined in Task 1 on all 10,000 training samples and use it to predict the output class ‘y’ for the 2,000 examples in “hw2.q1.test.csv”. Report the following:

· Accuracy on 2,000 test examples: …………………… (rounded to 4 decimal places)

· Classification report for the 2,000 test examples:

· Confusion matrix for the 2,000 test examples:
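The three requested items can be produced with scikit-learn's metrics module. The sketch below is one way to do it; the helper name fit_and_evaluate, the max_leaf_nodes=3 setting, and the synthetic train/test split are assumptions for illustration and are not the values the homework expects.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.tree import DecisionTreeClassifier


def fit_and_evaluate(clf, X_train, y_train, X_test, y_test):
    """Fit on the training data and summarize performance on the test data."""
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    return {
        "accuracy": round(float(accuracy_score(y_test, y_pred)), 4),
        "report": classification_report(y_test, y_pred, digits=4),
        "confusion": confusion_matrix(y_test, y_pred),  # rows: true, cols: predicted
    }


# Synthetic stand-in data (NOT the homework files): 800 train / 200 test rows.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)

results = fit_and_evaluate(
    DecisionTreeClassifier(max_leaf_nodes=3, random_state=0),
    X[:800], y[:800], X[800:], y[800:],
)
print(results["accuracy"])
print(results["report"])
print(results["confusion"])
```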

Task 3. [2 Points]

Use the model trained in Task 2 to predict the output class ‘y’ for the 30 examples in “hw2.q1.new.csv”. Specify the predicted classes in the table below:

ID	predicted y
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
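A table in this shape can be built directly from the model's predictions. The sketch below assumes the unlabeled file has been read into a DataFrame whose first column is named ID; the helper name prediction_table and the small synthetic frames are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier


def prediction_table(clf, new_df, id_col="ID"):
    """Return a two-column table: the identifier and the predicted class."""
    features = new_df.drop(columns=[id_col])   # the model must not see the ID
    return pd.DataFrame({
        id_col: new_df[id_col].to_numpy(),
        "predicted y": clf.predict(features),
    })


# Synthetic stand-in data (NOT the homework files).
rng = np.random.default_rng(0)
train = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x_1", "x_2", "x_3"])
labels = (train["x_1"] > 0).astype(int)
clf = DecisionTreeClassifier(max_leaf_nodes=2, random_state=0).fit(train, labels)

new_df = pd.DataFrame(rng.normal(size=(5, 3)), columns=["x_1", "x_2", "x_3"])
new_df.insert(0, "ID", range(1, 6))

table = prediction_table(clf, new_df)
print(table.to_string(index=False))
```

Dropping the ID column before calling predict matters: the identifier is not an input feature, and passing it would change the number of columns the model expects.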

Task 4. [2 Points]

Of the 25 input variables, which ones are relevant for this classification task?

The following … input variables are relevant for this classification task: …………………

Display your trained decision tree:
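One hedged way to answer both parts is to treat a feature as relevant when the fitted tree actually splits on it, that is, when its entry in feature_importances_ is greater than zero, and to print the tree with export_text. This criterion is an assumption rather than the grader's definition, and the helper name relevant_features and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text


def relevant_features(clf, feature_names):
    """List the features that the fitted tree uses in at least one split."""
    return [
        name
        for name, importance in zip(feature_names, clf.feature_importances_)
        if importance > 0
    ]


# Synthetic stand-in (NOT the homework data): only x_1 and x_2 determine y.
rng = np.random.default_rng(0)
names = [f"x_{i}" for i in range(1, 6)]
X = rng.normal(size=(1000, 5))
y = ((X[:, 0] > 0) & (X[:, 1] > 0)).astype(int)

clf = DecisionTreeClassifier(max_leaf_nodes=3, random_state=0).fit(X, y)
used = relevant_features(clf, names)
tree_text = export_text(clf, feature_names=names)

print(used)
print(tree_text)
```

For a graphical rendering instead of text, sklearn.tree.plot_tree draws the same fitted tree with matplotlib.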

Question 2. Supervised machine learning classifiers [10 Points]

Data: The zip file “hw2.q2.data.zip” contains 3 CSV files:

· “hw2.q2.train.csv” contains 8,000 rows and 11 columns. The first column ‘y’ is the output variable with 4 classes: 0, 1, 2, 3. The remaining 10 columns contain input features: x1, …, x10.

· “hw2.q2.test.csv” contains 2,000 rows and 11 columns. The first column ‘y’ is the output variable with 4 classes: 0, 1, 2, 3. The remaining 10 columns contain input features: x1, …, x10.

· “hw2.q2.new.csv” contains 30 rows and 11 columns. The first column ‘ID’ is an identifier for 30 unlabeled samples. The remaining 10 columns contain input features: x1, …, x10.

Task 1. [6 points]

Use 4-fold cross-validation with the 8,000 labeled examples from “hw2.q2.train.csv” to identify a classifier that achieves a mean cross-validation accuracy of at least 0.96. You should try several Scikit-Learn classifiers, including: GaussianNB, DecisionTreeClassifier, RandomForestClassifier, ExtraTreesClassifier, KNeighborsClassifier, LogisticRegression, SVC, and MLPClassifier. Try different hyper-parameter values for the better-performing classifiers to obtain a good set of hyper-parameter values. Then select the best-performing model. Report the following:

Selected model with hyper-parameter values:

Mean cross-validation accuracy: ………………………. (rounded to 4 decimal places)
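A rough sketch of this comparison is shown below: each listed classifier is scored with 4-fold cross-validation, and one of them is then tuned with GridSearchCV. The synthetic 4-class data, the choice to wrap scale-sensitive models in a StandardScaler pipeline, and the small n_neighbors grid are all assumptions for illustration, not settings prescribed by the homework.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (NOT the homework data): 10 features, 4 classes.
X, y = make_classification(n_samples=400, n_features=10, n_informative=6,
                           n_redundant=0, n_classes=4, random_state=0)

candidates = {
    "GaussianNB": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(random_state=0),
    "ExtraTrees": ExtraTreesClassifier(random_state=0),
    # Scale-sensitive models get a scaler that is refit inside every fold.
    "KNeighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "LogisticRegression": make_pipeline(StandardScaler(),
                                        LogisticRegression(max_iter=1000)),
    "SVC": make_pipeline(StandardScaler(), SVC()),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=1000, random_state=0)),
}

# Mean 4-fold cross-validation accuracy for every candidate.
scores = {
    name: float(cross_val_score(model, X, y, cv=4, scoring="accuracy").mean())
    for name, model in candidates.items()
}
for name, acc in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:20s} {acc:.4f}")

# Tune one of the candidates over a small, illustrative grid.
grid = GridSearchCV(
    make_pipeline(StandardScaler(), KNeighborsClassifier()),
    param_grid={"kneighborsclassifier__n_neighbors": [3, 5, 9]},
    cv=4,
    scoring="accuracy",
)
grid.fit(X, y)
print(grid.best_params_, round(float(grid.best_score_), 4))
```

Putting the scaler inside the pipeline keeps each validation fold out of the scaling statistics, so the reported cross-validation accuracy is not inflated by leakage.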

Task 2. [2 Points]

Train the classifier with the hyper-parameter values determined in Task 1 on all 8,000 training samples and use it to predict the output class ‘y’ for the 2,000 examples in “hw2.q2.test.csv”. Report the following:

· Accuracy on 2,000 test examples: …………………… (rounded to 4 decimal places)

· Classification report for the 2,000 test examples:

· Confusion matrix for the 2,000 test examples:

Task 3. [2 Points]

Use the model trained in Task 2 to predict the output class ‘y’ for the 30 examples in “hw2.q2.new.csv”. Specify the predicted classes in the table below:

ID	predicted y
ID_001
ID_002
ID_003
ID_004
ID_005
ID_006
ID_007
ID_008
ID_009
ID_010
ID_011
ID_012
ID_013
ID_014
ID_015
ID_016
ID_017
ID_018
ID_019
ID_020
ID_021
ID_022
ID_023
ID_024
ID_025
ID_026
ID_027
ID_028
ID_029
ID_030
