Q. In linear regression, what does the term 'overfitting' refer to?
A. The model performs well on training data but poorly on unseen data
B. The model is too simple to capture the underlying trend
C. The model has too few features
D. The model is perfectly accurate
Solution
Overfitting occurs when a model learns the noise in the training data instead of the actual underlying pattern, leading to poor performance on unseen data.
Correct Answer: A (The model performs well on training data but poorly on unseen data)
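As a minimal sketch of this gap (assuming scikit-learn and NumPy are available, and using a made-up one-dimensional dataset whose true relationship is linear), a very flexible polynomial fit typically scores near-perfectly on its own training data but much worse on held-out data:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(30, 1))            # small, noisy synthetic dataset
y = 2 * X.ravel() + rng.normal(0, 0.3, 30)     # true relationship is linear

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# A degree-15 polynomial has enough flexibility to memorise the training noise.
model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_train, y_train)

print("train R^2:", model.score(X_train, y_train))  # typically close to 1
print("test  R^2:", model.score(X_test, y_test))    # typically far lower, often negative
```

The degree and sample size here are arbitrary choices for illustration; the point is only the large gap between the training and test scores.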
Q. What does multicollinearity in linear regression refer to?
A. High correlation between the dependent variable and independent variables
B. High correlation among independent variables
C. Low variance in the dependent variable
D. Independence of errors
Solution
Multicollinearity occurs when two or more independent variables in a regression model are highly correlated, which can affect the stability of coefficient estimates.
Correct Answer: B (High correlation among independent variables)
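A short sketch of the instability, assuming NumPy only and a synthetic pair of nearly identical predictors; the variance inflation factor (VIF) shown at the end is a standard way to quantify how strongly one predictor is explained by the others:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# x2 is almost a copy of x1, so the two predictors are highly correlated.
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
y = 3 * x1 + 2 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS coefficients:", beta)        # individual estimates can drift far from 3 and 2

# VIF for x1: with a single other predictor, R^2 is the squared correlation.
r2 = np.corrcoef(x1, x2)[0, 1] ** 2
print("VIF for x1:", 1 / (1 - r2))      # values well above 10 signal severe multicollinearity
```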
Q. What is the purpose of cross-validation in the context of linear regression?
A. To increase the number of features
B. To assess the model's performance on unseen data
C. To reduce the training time
D. To improve the model's accuracy
Solution
Cross-validation is used to assess how the results of a statistical analysis will generalize to an independent data set, helping to evaluate model performance.
Correct Answer: B (To assess the model's performance on unseen data)
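A minimal sketch of k-fold cross-validation, assuming scikit-learn and NumPy and a synthetic dataset; each fold is held out once and scored on data the model never saw during fitting:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)

# 5-fold cross-validation: the mean held-out R^2 estimates out-of-sample performance.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("fold R^2 scores:", scores)
print("mean R^2:", scores.mean())
```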
Q. Which of the following assumptions is NOT required for linear regression?
A. Linearity
B. Homoscedasticity
C. Independence of errors
D. Normality of predictors
Solution
While linear regression assumes linearity, homoscedasticity, and independence of errors, it does not require the predictors to be normally distributed; where a normality assumption is used (for inference), it applies to the error term, not to the predictors.
Correct Answer: D (Normality of predictors)
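A small sketch of the distinction, assuming NumPy and SciPy and a synthetic dataset with a deliberately skewed predictor; the usual normality diagnostic targets the residuals, not the predictor's distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 500

# A deliberately non-normal (exponential, right-skewed) predictor.
x = rng.exponential(scale=2.0, size=n)
y = 1.0 + 0.8 * x + rng.normal(scale=0.5, size=n)   # the errors, not x, are normal

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

print("predictor normality p-value:", stats.shapiro(x).pvalue)         # expected to be tiny
print("residual  normality p-value:", stats.shapiro(residuals).pvalue) # typically not small
```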
Q. Which technique can be used to handle multicollinearity in linear regression?
A. Increasing the sample size
B. Removing one of the correlated variables
C. Using a more complex model
D. All of the above
Solution
To handle multicollinearity, one can increase the sample size, remove one of the correlated variables, or use more complex models such as Ridge regression.
Correct Answer: D (All of the above)
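A brief sketch of the Ridge option, assuming scikit-learn and NumPy and a synthetic pair of nearly collinear predictors; the penalty strength alpha=1.0 is an arbitrary illustrative choice:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(4)
n = 100

x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)       # nearly collinear with x1
y = 3 * x1 + 2 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)             # L2 penalty stabilises the estimates

print("OLS coefficients:  ", ols.coef_)        # often far from the true 3 and 2, possibly with opposite signs
print("Ridge coefficients:", ridge.coef_)      # shrunk toward each other and more stable
```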