
MGT6203 HW3 Part 2 (60 Points Total)

Instructions:

For Homework 3 Part 2, please use this R notebook in Vocareum to submit your solutions. Vocareum is an educational cloud platform for programming in several languages; it is based on the Jupyter notebook environment. This platform allows us to move homework assignments to the cloud. The advantages are that all of you will be working in the same coding environment AND peer reviewers will be able to run your R code easily. This way we eliminate some issues we might encounter when working on an individual/local basis, such as library installations and RStudio OS requirements; R notebooks work on mobile platforms and tablets.

With R notebooks, you will be learning a new way of presenting data analysis reports that is neat and flexible, where formatted (English) text and (R) code can easily coexist on the same page. Notebooks can also be collaborative when needed. For now, we are asking each of you to do your own work for homework. Think of R notebooks as interactive program-based Google Docs or MS Office 365 docs; these are gradually replacing local files on our computers.

Many of you are new to the R notebook and Vocareum platforms. We will provide TA help on Piazza with specific code if you have questions. Here we list some important things to get you started. Please read through them carefully.

1. Even though we are moving from your local environment to the cloud, our expectations for your homework will remain the same. The same goes for the rubrics.

2. Vocareum has its own cloud-based file system; the data files you will be using for the assignments are stored in the cloud with the path "../resource/asnlib/publicdata/FILENAME.csv".

3. You will be able to import them with the same method as you do in RStudio; simply substitute the path name with the one specified in the instructions. You won't be able to modify these data files. You will be able to find the data files on Canvas/edX if you would like to explore them offline.

4. For coding questions, you will be graded on the R code as well as the output in your submission.

5. For interpretations or short-response questions, please type the answers in the notebook's markdown cells. To change a code cell to a markdown cell, click on the cell, and in the dropdown menu above, switch the type of the cell block from "code" to "markdown". Adding print statements to code cells for short-response/interpretation questions is also fine, as long as we can clearly see the output of your response.

6. You don't need to, but if you would like to learn more about how to format your markdown cells, visit the following site: https://www.earthdatascience.org/courses/intro-to-earth-data-science/file-formats/use-text-files/format-text-with-markdown-jupyter-notebook/. Jupyter notebooks also support LaTeX.

7. Feel free to delete or add as many additional cells as you need, but please try to keep your notebook clean and keep your solution to a question directly under that question to avoid confusion.

8. You may delete the #SOLUTION BEGINS/ENDS HERE comments from the cell blocks; they are just pointers that indicate where to put your solutions.

9. When you have finished the assignment, remember to rerun your notebook to check that it runs correctly. You can do so by going to Kernel -> Restart & Run All. You may lose points if your solutions do not run successfully.

10. Click the "Submit" button in the top right corner to turn in your assignment. Your assignment will then enter the next phase for peer review.

11. You are allowed a total of 2 submissions for this assignment, so make sure that you submit your responses carefully. You will be able to come back and resubmit your assignment as long as it is before the start of the peer review period.

12. Please remember to finish the peer reviews after you have submitted your assignment. You are responsible for grading the work of three of your peers thoroughly and in adherence to the rubrics, and you will be held accountable for peer grading. There will be a 30% penalty to your grade if you fail to complete one or more peer reviews in proper fashion.

13. Feel free to post your questions and concerns, and provide any feedback, on Piazza. We will continuously try to improve going forward.

14. Good luck!

About Package Installation:

Most of the packages (if not all) that you will need to complete this assignment are already installed in this environment. An easy way to check this is to run the command library(PackageName). If this command runs successfully, then the package was already installed and has been successfully attached to the code. If the command gives an error saying the package was not found, then follow the steps below to install the package and attach it to the code:

Use the installed.packages() command to return a table of the packages that are preinstalled in the environment.

To attach a preinstalled library in Vocareum, simply use library(PackageName).

To install a package that does not come with the provided environment, please use the following syntax:

install.packages("PackageName", lib = "../work/")

To attach a library you just installed, use the following syntax:

library(PackageName, lib.loc = "../work/")

Make sure the file location is the same as in the above code snippets ("../work/").
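To make the pattern concrete, here is a minimal sketch of the check-then-install workflow described above; dplyr is used purely as an illustration, and any required package works the same way.

# Check whether the package is already available; if not, install it into the
# writable "../work/" directory and attach it from there
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr", lib = "../work/")
  library(dplyr, lib.loc = "../work/")   # lib.loc is the formal argument name in library()
} else {
  library(dplyr)
}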

Instructions for Q1 to Q2:

Please use the Facebook Ad dataset 'KAG.csv' for the next set of questions. We advise solving these questions using R (preferably using the dplyr library wherever applicable) after reviewing the code provided for Week 11 and the other resources provided for learning dplyr in the R Learning Guide.

Load the dataset as below:

data <- read.csv("../resource/asnlib/publicdata/KAG.csv", stringsAsFactors = FALSE)

IMPORTANT NOTE: For no clicks and no amount spent, please consider CPC as 0.
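Since Q1 works with cost per click, one way the note above could be encoded is sketched below. This is an illustrative sketch only: the Clicks column name and the data_CPC object are assumptions for illustration (only Spent, Impressions, and the conversion columns appear elsewhere in these solutions), and it assumes dplyr has been attached.

# Illustrative CPC construction following the note above; "Clicks" is an assumed column name
data_CPC <- data %>%
  mutate(CPC = ifelse(Clicks == 0 & Spent == 0, 0, round(Spent / Clicks, 2)))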

Q1. (8 Points)

a. Among the ads that have the least CPC, which ad leads to the most impressions? (Provide ad_id as the answer.) (4 Points)


Solution Method 2:

# Apply data transformations:
# 1) Create a new column to represent the ad-level CPM (named "CPM" in this example)
# 2) Group the data by campaign_id
# 3) Apply groupwise summarization of CPM via averaging (named "campaign_CPM" in this example);
#    warning messages about "summarise()" can be safely ignored
# 4) Filter the data according to the maximum average campaign CPM
# 5) Select and return the resulting campaign_id
data_Q1B_2 <- data_Q1 %>%
  mutate(CPM = round((Spent / Impressions) * 1000, 2)) %>%
  group_by(campaign_id) %>%
  summarise(campaign_CPM = mean(CPM)) %>%
  filter(campaign_CPM == max(campaign_CPM)) %>%
  select(campaign_id)

# Print the output (the extra text isn't necessary when grading; only the answer is needed)
print(paste(paste("Campaign_id ", data_Q1B_2), ' spent the least efficiently on brand awareness on average'))

[1] "Campaign_id 1178 spent the least efficiently on brand awareness on average"

Q2. (8 Points)

Assume each conversion ('Total_Conversion') is worth 10 dollars and each approved conversion ('Approved_Conversion') is worth 50 dollars. ROAS (return on advertising spend) is revenue divided by advertising spend. Calculate ROAS and round it to two decimals. (Use 'Spent' as the Cost in the given ROAS formula.)

a. Make a boxplot of the ROAS grouped by gender for interest_id = 15, 21, 101 in one graph. Try to use the function '+ scale_y_log10()' in ggplot to make the visualization look better. The x-axis label should be 'Interest ID' while the y-axis label should be 'ROAS'; each interest_id will have two boxplots (one boxplot for each gender). (4 Points)

Hint: Remember to filter out the advertisements where there is no advertising spend.

Hint: ROAS should be rounded to the second decimal.

There are two possible solutions to this question.

Solution Method 1:

Hint: ROAS = Revenue / Spent = (10 × Total_Conversion + 50 × Approved_Conversion) / Spent × 100%
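As a quick worked example of the formula (the numbers are made up for illustration): an ad with Total_Conversion = 2, Approved_Conversion = 1, and Spent = 20 has revenue 10 × 2 + 50 × 1 = 70, so ROAS = 70 / 20 = 3.5, or 350% when expressed as a percentage as in the hint.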

# Load the ggplot2 library, which we will use for creating the boxplots in this question
library(ggplot2)

# Load the dataset (same data as in Q1, just reloading and renaming for clarity)
data_Q2 <- read.csv("/cloud/project/KAG.csv", stringsAsFactors = FALSE)

# Apply the following transformations to the data:
# 1) Filter out all rows of the dataset where Spent is equal to zero
# 2) Create a new column that represents Return on Ad Spend (ROAS) via the formula above
#    (named "ROAS" in this example)
# 3) Filter rows of the dataset to only include the specified interest ids (101, 15, 21)
data_Q2_A <- data_Q2 %>%
  filter(Spent != 0) %>%
  mutate(ROAS = round((10 * Total_Conversion + 50 * Approved_Conversion) / Spent, 2)) %>%
  filter(interest == 15 | interest == 21 | interest == 101)

# Use ggplot2 to create the boxplot
# Be sure to map fill to "gender" after converting it to a factor variable
# Be sure to include the log base 10 scale for the y-axis
ggplot(data = data_Q2_A, aes(x = factor(interest), y = ROAS, fill = factor(gender))) +
  geom_boxplot() +
  scale_y_log10() +
  ggtitle("Boxplot for the ROAS grouped by gender vs interest id") +
  labs(x = "Interest Id", y = "ROAS")

Solution Method 2:

# Apply the following transformations to the data:
# 1) Filter out all rows of the dataset where Spent is equal to zero
# 2) Create a new column that represents Return on Ad Spend (ROAS) via the formula above
#    (named "ROAS" in this example)
data_Q2_B <- data_Q2 %>%
  filter(Spent != 0) %>%
  mutate(ROAS = round((10 * Total_Conversion + 50 * Approved_Conversion) / Spent, 2))

# Beginning with the dataset above, apply the following transformations:
# 1) Filter out all rows except those with campaign_id 1178
# 2) Group the data on the gender variable
# 3) Apply groupwise summarization of ROAS via the median (named "medianROAS" in this example)
# 4) Apply groupwise summarization of ROAS via the mean (named "meanROAS" in this example);
#    warning messages about "summarise()" can be safely ignored
data_Q2_B <- data_Q2_B %>%
  filter(campaign_id == 1178) %>%
  group_by(gender) %>%
  summarise(medianROAS = round(median(ROAS), 2), meanROAS = round(mean(ROAS), 2))

# Print the resulting table showing the mean and median ROAS grouped by gender
print(data_Q2_B)

gender   medianROAS   meanROAS
 <int>        <dbl>      <dbl>
     0            1         3.
     1            0         1.
2 rows

Solution Method 2:

# Apply the following transformations to the data:
# 1) Filter out all rows of the dataset where Spent is equal to zero
# 2) Create a new column that represents Return on Ad Spend (ROAS) via the formula above, times 100
#    (named "ROAS" in this example)
data_Q2_B_2 <- data_Q2 %>%
  filter(Spent != 0) %>%
  mutate(ROAS = round(((10 * Total_Conversion + 50 * Approved_Conversion) / Spent) * 100, 2))

# Beginning with the dataset above, apply the following transformations:
# 1) Filter out all rows except those with campaign_id 1178
# 2) Group the data on the gender variable
# 3) Apply groupwise summarization of ROAS via the median (named "medianROAS" in this example)
# 4) Apply groupwise summarization of ROAS via the mean (named "meanROAS" in this example);
#    warning messages about "summarise()" can be safely ignored
data_Q2_B_2 <- data_Q2_B_2 %>%
  filter(campaign_id == 1178) %>%
  group_by(gender) %>%
  summarise(medianROAS = round(median(ROAS), 2), meanROAS = round(mean(ROAS), 2))

# Print the resulting table showing the mean and median ROAS grouped by gender
print(data_Q2_B_2)

gender   medianROAS   meanROAS
 <int>        <dbl>      <dbl>
     0          159       314.
     1           92       192.
2 rows

Instructions for Q3 to Q5:

Use the Advertising dataset and the following setup instructions to solve the questions.

# Load the pROC library, which will be used to generate a ROC curve plot
library(pROC)
# Load the caret library, which will be used to create and plot a confusion matrix
library(caret)
# Load the dplyr library for data manipulation
library(dplyr)
# Load the ggplot2 library for use with plotting our outputs
library(ggplot2)

# Load the Q3 data
data_Q3 <- read.csv("/cloud/project/Advertising.csv", header = TRUE, stringsAsFactors = FALSE)

# Transform the "Clicked.on.Ad" column into a factor variable
data_Q3$Clicked.on.Ad <- as.factor(data_Q3$Clicked.on.Ad)

# Show the top rows of the dataframe
head(data_Q3)

  Daily.Time.Spent.on.Site   Age   Area.Income   Daily.Internet.Usage
                     <dbl> <int>         <dbl>                  <dbl>
1                       68    35         61833                   256.
2                       80    31         68441                   193.
3                       69    26         59785                   236.
4                       74    29         54806                   245.
5                       68    35         73889                   225.
6                       59    23         59761                   226.
6 rows | 1-5 of 10 columns

Q3. (7 Points)

Make a scatter plot of 'Daily.Internet.Usage' against 'Age'. Separate the data points by different shapes and/or colors based on whether the data point clicked on the ad or not (Clicked.on.Ad = 0 means no, and Clicked.on.Ad = 1 means yes). Based on the general trends in the scatter plot you created, consider a new data point where an individual has a 'Daily.Internet.Usage' less than or equal to 150 and an age of 40. Would this new individual be likely to click the ad or not click the ad?
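A minimal ggplot2 sketch of one way to build such a scatter plot is shown below. It is an illustrative example rather than the graded solution, and it assumes the setup cell above (with Clicked.on.Ad converted to a factor) has been run.

# Illustrative scatter plot: Daily.Internet.Usage against Age, with point color and shape
# distinguishing clickers from non-clickers
ggplot(data = data_Q3, aes(x = Age, y = Daily.Internet.Usage,
                           color = Clicked.on.Ad, shape = Clicked.on.Ad)) +
  geom_point() +
  labs(x = "Age", y = "Daily Internet Usage")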


# Create a logistic regression model using glm() and the specified features
logistic_reg_model <- glm(Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Area.Income + Age,
                          data = data_Q3, family = binomial(link = 'logit'))

# Print and show the model summary as output
summary(logistic_reg_model)

Call:
glm(formula = Clicked.on.Ad ~ Daily.Time.Spent.on.Site + Area.Income + Age,
    family = binomial(link = "logit"), data = data_Q3)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
    -2.      -0.      -0.       0.       2.

Coefficients:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)                  1e+01      1e+00      10   <2e-16 ***
Daily.Time.Spent.on.Site    -2e-01      1e-02     -13   <2e-16 ***
Area.Income                 -1e-04      1e-05      -9   <2e-16 ***
Age                          1e-01      1e-02       9   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1386  on 999  degrees of freedom
Residual deviance:  391  on 996  degrees of freedom
AIC: 399.

Number of Fisher Scoring iterations: 7

Q5. (7 Points)

Given the output above, how many false negative occurrences do you observe? Recall that a false negative means an instance where the model predicts the case to be false when in reality it is true. For this example, this refers to cases where the ad is clicked but the model predicts that it isn't. Using the pROC library, use the roc() function to create and plot a ROC curve of our predictions and true labels.

There are two possible solutions to this question, both of which will receive full credit. The first is based on using a positive reference case of 0, which is the default behavior of the confusion matrix function. The second acceptable method would be setting the optional argument "positive" to 1.

Solution Method 1:

# Create a new dataframe view of the columns needed for solving Q5
data_Q5 <- data_Q3 %>% select(Daily.Time.Spent.on.Site, Age, Area.Income, Clicked.on.Ad)

# Use the predict function to solve for the training dataset predictions
pred <- predict(logistic_reg_model, data_Q5[, 1:3], type = 'response')

# Append these resulting predictions to the dataframe as a new column
data_Q5 <- cbind(data_Q5, pred)

# Create a new column using mutate() which applies the 0.80 thresholding of the training set predictions
data_Q5 <- data_Q5 %>% mutate(pred_class = as.numeric(pred >= 0.80))

# Create a confusion matrix showing the results of the training set predictions compared to ground truth
# Use the thresholded prediction labels as the predicted class
# Use the known true labels as the reference case
confusion <- confusionMatrix(data = as.factor(data_Q5$pred_class), reference = data_Q5$Clicked.on.Ad)

# Print/plot the resulting confusion matrix output
print(confusion)

Confusion Matrix and Statistics

          Reference
Prediction   0   1
         0 488  87
         1  12 413

               Accuracy : 0.
                 95% CI : (0., 0.)
    No Information Rate : 0.
    P-Value [Acc > NIR] : < 2.2e-16

                  Kappa : 0.

 Mcnemar's Test P-Value : 1e-

            Sensitivity : 0.
            Specificity : 0.
         Pos Pred Value : 0.
         Neg Pred Value : 0.
             Prevalence : 0.
         Detection Rate : 0.
   Detection Prevalence : 0.
      Balanced Accuracy : 0.

       'Positive' Class : 0

Solution: Given the resulting confusion matrix with the 'positive' class set to 0, we can read the count of false negatives from the entry in the table where Prediction = 1 (since 1 is the negative class in this case) and Reference = 0 (since 0 is the positive class in this case). Thus the number of false negatives is 12.
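As an illustrative check under the alternative convention (positive = 1), the false negatives as defined in the question's wording can also be counted directly; this snippet is not part of the original solution.

# Count cases where the ad was actually clicked (1) but the thresholded model predicted 0;
# with the clicked class treated as positive, this matches the 87 in the matrix above
sum(data_Q5$Clicked.on.Ad == 1 & data_Q5$pred_class == 0)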

Solution Method 2:

# Use the pROC library and the static prediction threshold to plot the ROC curve
# Warning messages about setting levels or direction can be safely ignored
roc <- roc(data_Q5$Clicked.on.Ad, data_Q5$pred_class)

Setting levels: control = 0, case = 1
Setting direction: controls < cases

plot(roc)

# Print the observed AUC value of the curve
print(paste("AUC value for logistic regression: ", roc$auc))

[1] "AUC value for logistic regression: 0"

Another acceptable solution would be to plot the standard ROC curve, which is not based on any single threshold value, as shown below.

Solution Method 2:

# Install and load the ROCR library
# install.packages("ROCR", lib = "../work/")
library(ROCR)

# Create a prediction object using our unthresholded predictions and ground truth labels
predictions <- prediction(as.numeric(data_Q5$pred), as.numeric(data_Q5$Clicked.on.Ad))

# Use the performance function to solve for and plot the ROC curve
roc2 <- performance(predictions, "tpr", "fpr")

# Plot the ROC curve
plot(roc2, main = "ROC curve for GLM model")

# Use the performance function to identify the curve's AUC
auc_ROCR <- performance(predictions, measure = "auc")

# Extract the AUC value from the results of the performance function
auc_ROCR <- auc_ROCR@y.values[[1]]

# Print the plot's AUC
print(paste("AUC value for logistic regression: ", auc_ROCR))

[1] "AUC value for logistic regression: 0"

Instructions for Questions Q6-Q8:

In response to the ongoing pandemic, a local restaurant has implemented social distancing measures, which include closing its in-person dining areas. In order to keep the business going, the restaurant will now rely on drive-through lines to handle customer ordering and service. After implementing the new ordering system, management observes that customers arrive at a rate of 62 customers per hour. Under the current system the restaurant has only 5 servers, with a total service rate of 70 customers/hour.
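As a worked point of reference, and assuming the same average-wait formula used in the Q7 code below also applies to the current 5-server setup (62 arrivals/hour against a 70 customers/hour total service rate), the current average wait comes out to roughly 6.6 minutes.

# Average wait under the current system, using the same formula as the Q7 code below:
# (arrival_rate * 60) / (service_rate * (service_rate - arrival_rate))
current_wait_minutes <- (62 * 60) / (70 * (70 - 62))
print(round(current_wait_minutes, 2))   # about 6.64 minutes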

arrival_rate <-

service_rate_per_server <- 14

# Create a vector that will represent each of the various possible numbers of servers
number_of_servers <- seq(7, 20)

# Compute the total service rate for each possible number of servers
service_rates <- number_of_servers * service_rate_per_server

# Convert the vector of number of servers to a data frame and bind the vector of service rates to it as another column
data_Q7 <- as.data.frame(number_of_servers)
data_Q7 <- cbind(data_Q7, service_rates)

# Create a new column using mutate which applies our formula for average wait time for each server/service rate
data_Q7 <- data_Q7 %>%
  mutate(average_waits = (arrival_rate * 60) / (service_rates * (service_rates - arrival_rate)))

# Plot the resulting entries of average wait time for each of the possible numbers of servers
wait_plot <- plot(x = data_Q7$number_of_servers, y = data_Q7$average_waits, type = "b",
                  xlab = 'Number of Servers on Duty', ylab = 'Average Customer Wait Times (in Minutes)')

Q8. (4 Points)

a. Based on your plot above, at what number of servers does the average customer wait time drop
below 3 minutes? (2 Points)

# Plot not necessary for full credit; only shown for clarity/visualization
plot(x = data_Q7$number_of_servers, y = data_Q7$average_waits, type = "b",
     xlab = 'Number of Servers on Duty', ylab = 'Average Customer Wait Times (in Minutes)')

# Create a horizontal line that shows when we fall below 3 minutes in average wait time
abline(h = 3, col = 'blue')

# Create a vertical line that shows the first number of servers that satisfies our 3-minute wait constraint
abline(v = 8, col = 'red')

Solution: As we can see from the plot above, the first valid (integer) number of servers for which the wait time falls below 3 minutes is 8 servers. The plot and annotations are not required for full credit; students only need to specify that the correct answer is 8 servers.

b. Based on your plot above, describe the behavior of the chart and give some commentary about the relationship between the two variables. For example, is the chart increasing or decreasing? What value does the average wait time approach if we continue to add more and more servers (to infinity)? (2 Points)

Solution: The student should provide some general commentary on the graph, must discuss the limit of the graph as servers on duty approach infinity, and must note that the wait time is decreasing over the interval. Because the denominator of the wait-time formula, service_rates × (service_rates − arrival_rate), grows without bound as more servers are added, the average wait time asymptotically approaches 0.

Instructions for Questions Q9-Q10:

With the rise in popularity of cryptocurrency, XYZ Bank must be creative to retain its current customers.
To compete with cryptocurrencies like XRP, XYZ Bank wants to develop incentive programs and target
customers with a high risk of churning.

Note: For more information on how to use business analytics to analyze churn, refer to the Advanced Topics in the Marketing Modules.

Q9. (2 Points)

XYZ Bank wants to identify which segments are significant (at the 95% confidence level) in determining which customers are more likely to churn. They want to focus on the following segments: CreditScore, Age, Tenure, Balance, NumOfProducts, HasCrCard, IsActiveMember, EstimatedSalary.
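A minimal sketch of how such a significance check could be set up is below. The dataset name data_churn and the churn indicator column Exited are assumptions for illustration and are not taken from the original solution; the fitted object is called model here only so that it lines up with the predict(model, ...) call used further below.

# Hypothetical sketch: regress the churn indicator on the listed segments and check p-values.
# "data_churn" and "Exited" are assumed names.
model <- glm(Exited ~ CreditScore + Age + Tenure + Balance + NumOfProducts +
               HasCrCard + IsActiveMember + EstimatedSalary,
             data = data_churn, family = binomial(link = 'logit'))
# Segments with Pr(>|z|) below 0.05 are significant at the 95% confidence level
summary(model)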

Name     CreditScore   Age   Tenure   Balance   NumOfProducts   HasCrCard   IsActiveMember   EstimatedSalary
John             800    25        7     83456               1         Yes              Yes            105000
Jamal            520    23        2       983               3         Yes               No             92000
Denise           580    32        3     12999               2         Yes               No             45000
Jamie            720    35       10      8765               1         Yes              Yes            135000

# Turn each of the table features above into vectors for later use in creating a dataframe
CreditScore <- c(800, 520, 580, 720)
Age <- c(25, 23, 32, 35)
Tenure <- c(7, 2, 3, 10)
Balance <- c(83456, 983, 12999, 8765)
NumOfProducts <- c(1, 3, 2, 1)
HasCrCard <- c("Yes", "Yes", "Yes", "Yes")
IsActiveMember <- c("Yes", "No", "No", "Yes")
EstimatedSalary <- c(105000, 92000, 45000, 135000)
Names <- c("John", "Jamal", "Denise", "Jamie")

# Create a dataframe of the feature vectors
data <- data.frame(CreditScore = CreditScore, Age = Age, Tenure = Tenure, Balance = Balance,
                   NumOfProducts = NumOfProducts, HasCrCard = HasCrCard,
                   IsActiveMember = IsActiveMember, EstimatedSalary = EstimatedSalary)

# Use the previously fitted model to predict each customer's probability of churning
preds <- predict(model, data, type = "response")
output_frame <- data.frame(Name = Names, Predicted_Churn = preds)
output_frame

  Name     Predicted_Churn
  <chr>              <dbl>
1 John                  0.
2 Jamal                 0.
3 Denise                0.
4 Jamie                 0.
4 rows

Solution: Given the output prediction dataframe above, it is clear that customer "Denise" has the highest predicted probability of churn. Thus, "Denise" is the customer most likely to churn.
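An illustrative one-liner (not part of the original solution) that selects the highest-risk customer programmatically:

# Pick the row of output_frame with the largest predicted churn probability
output_frame[which.max(output_frame$Predicted_Churn), ]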
