Explain whether each scenario below is a classification or regression problem, and indicate whether we are most interested in inference or prediction. Finally, provide n and p. Here n is the number of training examples and p is the number of predictors (features).
(a) We collect a set of data on the top 500 firms in the US. For each
firm we record profit, number of employees, industry and the
CEO salary. We are interested in understanding which factors
affect CEO salary.
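Ans: This is a supervised regression problem, because the response (CEO salary) is quantitative. We are interested in inference, since the goal is to understand which factors affect CEO salary. n=500 (firms), p=3 (profit, number of employees, industry).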
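The counting rule used in this answer (p is every recorded variable except the response) can be sketched in a few lines of Python. The helper `count_predictors` and the column names are hypothetical, chosen only for illustration.

```python
def count_predictors(columns, response):
    """Return p: the number of recorded columns other than the response."""
    if response not in columns:
        raise ValueError(f"response {response!r} is not among the columns")
    return len([c for c in columns if c != response])


# Hypothetical column names for scenario (a); one row per firm.
columns_a = ["profit", "num_employees", "industry", "ceo_salary"]
n_a = 500  # top 500 firms
p_a = count_predictors(columns_a, response="ceo_salary")
print(f"n={n_a}, p={p_a}")
```

The same rule gives p=13 in scenario (b): three named variables plus ten others, with success/failure as the response.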
(b) We are considering launching a new product and wish to know
whether it will be a success or a failure. We collect data on 20
similar products that were previously launched. For each product
we have recorded whether it was a success or failure, price
charged for the product, marketing budget, competition price,
and ten other variables.
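Ans: This is a supervised classification problem, because the response (success or failure) is categorical. We are interested in prediction. n=20 (products), p=13 (price charged, marketing budget, competition price, and ten other variables).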
(c) We are interested in predicting the % change in the US dollar in
relation to the weekly changes in the world stock markets. Hence
we collect weekly data for all of 2012. For each week we record
the % change in the dollar, the % change in the US market,
the % change in the British market, and the % change in the
German market.
Ans: This is a regression problem, and we are interested in prediction. n=52, because the data are weekly and there are 52 weeks in a year. p=3: the features are the % change in the US market, the % change in the British market, and the % change in the German market. The target variable is the % change in the dollar.
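As a quick check of the numbers in (c), here is a minimal Python sketch. It assumes one observation per full week of 2012, and the predictor names are hypothetical labels for the three market variables.

```python
from datetime import date


def full_weeks_in_year(year: int) -> int:
    """Number of complete 7-day weeks in the given calendar year."""
    days = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    return days // 7


# Assumption: one row per full week of 2012.
n = full_weeks_in_year(2012)

# Hypothetical names for the three predictors; the response
# (% change in the dollar) is not counted in p.
predictors = [
    "us_market_pct_change",
    "british_market_pct_change",
    "german_market_pct_change",
]
p = len(predictors)
print(f"n={n}, p={p}")
```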
(d) Our website has collected the ratings of 1000 different restaurants by 10,000 customers. Each customer has rated about 100 restaurants, and we would like to recommend restaurants to customers who have not yet been there.
Ans: This is a supervised classification problem, and we are interested in prediction. For a given customer and restaurant the question is whether we should recommend the restaurant or not, so it is a two-class classification problem. If we assume that each customer has rated exactly 100 restaurants, there are 10,000 × 100 = 1,000,000 ratings in total, so each of the 1000 restaurants receives 1,000 ratings on average. Treating the ratings of one restaurant as the training data gives n=1000. The only feature we know about a restaurant is its rating, so p=1.
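The average number of ratings per restaurant used in this answer follows from simple arithmetic. A minimal Python sketch, under the stated assumption that every customer rated exactly 100 restaurants (the variable names are illustrative only):

```python
n_customers = 10_000
n_restaurants = 1_000
ratings_per_customer = 100  # assumption: exactly 100, not "about 100"

# Every rating belongs to exactly one (customer, restaurant) pair.
total_ratings = n_customers * ratings_per_customer

# Average number of ratings each restaurant receives.
avg_ratings_per_restaurant = total_ratings // n_restaurants
print(total_ratings, avg_ratings_per_restaurant)
```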
(e) Is this TV series/movie/ad campaign going to be successful or not? Data are given for 1000 samples, and for each sample we collect money spent, talent, running time, producer, TV channel, air time slot, etc.
Ans: It is a binary (two-class) classification problem. The response (target variable) is whether the series or movie will be successful or not, and we want to predict it. The features (predictors) are money spent, talent, running time, producer, TV channel, and air time slot, so p=6, counting only the features listed by name. Since we have 1000 samples, n=1000.
(f) Should this applicant be admitted into Utkal University, Vanivihar for the MCA programme or not? We have 1000 samples, and each sample contains the applicant's 10th percentage, 12th percentage, graduation percentage, and OJEE score.
Ans: It is a binary (two-class) classification problem. The response (target variable) is admit or not admit, and it is a prediction problem. The features (predictors) are 10th percentage, 12th percentage, graduation percentage, and OJEE score, so p=4. Since we have 1000 samples, n=1000.