Sunday, July 23, 2017

Flexible learning method vs Inflexible learning method

A method is said to be flexible if it can fit many different possible functional forms of the unknown function f. The downside is that flexible models can follow the error, or noise, in the data too closely.

1) What are the advantages and disadvantages of a very flexible (versus a less flexible) approach for regression or classification? Under what circumstances might a more flexible approach be preferred to a less flexible approach? When might a less flexible approach be preferred?

Ans: 

The advantages of a very flexible approach are that it may give a better fit when the true relationship is non-linear, and that it decreases the bias.

The disadvantages of a very flexible approach are that it requires estimating a greater number of parameters, it can follow the noise too closely (overfitting), and it increases the variance.

A more flexible approach would be preferred to a less flexible approach when we are interested in prediction and not the interpretability of the results.

A less flexible approach would be preferred to a more flexible approach when we are interested in inference and the interpretability of the results.
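
The bias and variance claims in this answer can be checked with a small simulation. The sketch below is only an illustration under assumptions that are not part of the original post: the true function is taken to be sin(2πx), the noise is Gaussian, and a degree-1 polynomial stands in for the inflexible method while a degree-9 polynomial stands in for the flexible one. The helper name bias_variance is made up for this example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def bias_variance(degree, n_train=30, sigma=0.3, n_reps=300, seed=0):
    """Estimate squared bias and variance of a polynomial fit of a given degree."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(2 * np.pi * x)  # assumed "true" function (illustrative)
    x_train = np.linspace(0.0, 1.0, n_train)
    x_test = np.linspace(0.0, 1.0, 101)

    # Refit the model on many independent noisy training sets.
    preds = np.empty((n_reps, x_test.size))
    for r in range(n_reps):
        y_train = f(x_train) + rng.normal(0.0, sigma, n_train)
        preds[r] = Polynomial.fit(x_train, y_train, degree)(x_test)

    # Squared bias: how far the average fit is from the true function.
    bias_sq = float(np.mean((preds.mean(axis=0) - f(x_test)) ** 2))
    # Variance: how much the fit changes from one training set to the next.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance


bias_lin, var_lin = bias_variance(degree=1)    # inflexible
bias_flex, var_flex = bias_variance(degree=9)  # flexible
print(f"degree 1: bias^2 = {bias_lin:.4f}, variance = {var_lin:.4f}")
print(f"degree 9: bias^2 = {bias_flex:.4f}, variance = {var_flex:.4f}")
```

In this setup the flexible fit should show the smaller squared bias and the larger variance, which is the trade-off described in the answer.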




2) For each of the examples below, decide whether a flexible method would perform better or worse than an inflexible method.

a) The sample size n is extremely large, and the number of predictors p is small.

Ans: A flexible method would perform better than an inflexible (restrictive) approach. It fits the data more closely, and the large sample size keeps it from overfitting.
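
The effect of sample size can be sketched with a simulation. Everything specific here is an assumption made for illustration: a single predictor, a true function sin(2πx), Gaussian noise with standard deviation 1.5, and polynomials of degree 1 (inflexible) and degree 8 (flexible). The helper name error_vs_truth is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial


def error_vs_truth(degree, n_train, sigma=1.5, n_reps=200, seed=0):
    """Average squared distance between the fitted curve and the true f."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(2 * np.pi * x)  # assumed "true" function (illustrative)
    x_train = np.linspace(0.0, 1.0, n_train)
    x_test = np.linspace(0.0, 1.0, 101)
    errors = []
    for _ in range(n_reps):
        y_train = f(x_train) + rng.normal(0.0, sigma, n_train)
        fit = Polynomial.fit(x_train, y_train, degree)
        errors.append(np.mean((fit(x_test) - f(x_test)) ** 2))
    return float(np.mean(errors))


results = {}
for n in (20, 5000):
    results[n] = {
        "inflexible": error_vs_truth(degree=1, n_train=n),
        "flexible": error_vs_truth(degree=8, n_train=n),
    }
    print(n, results[n])
```

With n = 5000 the flexible fit should come out well ahead, while with n = 20 the same flexible fit should do worse than the straight line, because there is too little data to pin down its extra parameters.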

b) The number of predictors p is extremely large, and the number of observations n is small.

Ans: A flexible method would overfit the small number of observations, so a less flexible method will do better here.
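
A rough way to see this overfitting is to compare training and test error when p is close to n. The sketch below makes several assumptions that are not in the original post: ordinary least squares on all predictors plays the role of the more flexible method, ridge regression with a fixed penalty plays the role of the less flexible one, and only the first of 30 simulated predictors actually affects the response. The names simulate and mse are made up for the example.

```python
import numpy as np


def mse(X, y, beta):
    """Mean squared prediction error of a linear model with coefficients beta."""
    return float(np.mean((y - X @ beta) ** 2))


def simulate(n=40, p=30, sigma=1.0, lam=10.0, n_reps=50, seed=0):
    """Average train/test MSE of least squares and ridge over repeated data sets."""
    rng = np.random.default_rng(seed)
    beta_true = np.zeros(p)
    beta_true[0] = 1.0  # assumption: only the first predictor matters
    totals = np.zeros(4)
    for _ in range(n_reps):
        X = rng.normal(size=(n, p))
        y = X @ beta_true + rng.normal(0.0, sigma, n)
        X_new = rng.normal(size=(1000, p))
        y_new = X_new @ beta_true + rng.normal(0.0, sigma, 1000)

        # More flexible: unpenalised least squares on all p predictors.
        b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
        # Less flexible: ridge regression, which shrinks the coefficients.
        b_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

        totals += np.array([
            mse(X, y, b_ols), mse(X_new, y_new, b_ols),
            mse(X, y, b_ridge), mse(X_new, y_new, b_ridge),
        ])
    return totals / n_reps


train_ols, test_ols, train_ridge, test_ridge = simulate()
print(f"least squares: train = {train_ols:.2f}, test = {test_ols:.2f}")
print(f"ridge:         train = {train_ridge:.2f}, test = {test_ridge:.2f}")
```

Least squares should show the lower training error but the higher test error, which is what overfitting a small sample looks like.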

c) The variance of the error terms, i.e. σ² = Var(ϵ), is extremely high.
 
Ans: The inflexible model is preferred, because a flexible model might fit everything, including the errors, and this data set has large errors. The inflexible model is therefore a good option for learning the trend and the pattern of the data.

d) The relationship between the predictors and response is highly non-linear, and σ² is small.

Ans: The flexible model is preferred. We are told that the relationship is highly non-linear, so a linear model would not give an accurate estimate for this data set. Moreover, the variance of the error term is small, so we need not worry much about the flexible model fitting the errors as well.

e) The relationship between the predictors and response is highly non-linear, and σ² is large.

Ans: The inflexible model is preferred. Although the underlying relationship is highly non-linear, a flexible model might mistakenly fit the large errors as well. An inflexible model can do better here because it gives a clearer and more stable picture of the overall trend in the data set.
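
The contrast between small and large error variance for a non-linear relationship can be sketched numerically. The setup is assumed purely for illustration: 20 training points, a true function sin(2πx), Gaussian noise with standard deviation 0.1 (small σ²) or 2.0 (large σ²), and polynomials of degree 1 and degree 8 as the inflexible and flexible methods. The helper name noise_experiment is hypothetical.

```python
import numpy as np
from numpy.polynomial import Polynomial


def noise_experiment(degree, sigma, n_train=20, n_reps=200, seed=0):
    """Average squared distance between the fitted curve and the true f."""
    rng = np.random.default_rng(seed)
    f = lambda x: np.sin(2 * np.pi * x)  # assumed non-linear "true" function
    x_train = np.linspace(0.0, 1.0, n_train)
    x_test = np.linspace(0.0, 1.0, 101)
    errors = []
    for _ in range(n_reps):
        y_train = f(x_train) + rng.normal(0.0, sigma, n_train)
        fit = Polynomial.fit(x_train, y_train, degree)
        errors.append(np.mean((fit(x_test) - f(x_test)) ** 2))
    return float(np.mean(errors))


noise_results = {}
for sigma in (0.1, 2.0):
    noise_results[sigma] = {
        "inflexible": noise_experiment(degree=1, sigma=sigma),
        "flexible": noise_experiment(degree=8, sigma=sigma),
    }
    print(sigma, noise_results[sigma])
```

With the small noise level the flexible fit should be closer to the true curve, and with the large noise level the straight line should win even though the true curve is far from linear. How large σ² has to be before this happens depends on the sample size, so the numbers here apply only to this small-sample setup.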
