Q. In the context of model selection, what does cross-validation help to prevent?
A. Overfitting
B. Underfitting
C. Data leakage
D. Bias
Solution: Cross-validation helps to prevent overfitting by evaluating the model on held-out folds, so model selection is based on performance on unseen data rather than on the training set alone.
Correct Answer: A — Overfitting
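A minimal sketch of 5-fold cross-validation with scikit-learn; the iris dataset and logistic regression classifier are illustrative choices, not part of the question:

```python
# Minimal sketch: 5-fold cross-validation with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Each fold is held out once; the scores estimate performance on unseen data.
scores = cross_val_score(model, X, y, cv=5)
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```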
Q. What is the effect of using polynomial features in a linear regression model?
A. It reduces the model complexity
B. It can capture non-linear relationships
C. It increases the risk of underfitting
D. It eliminates multicollinearity
Solution: Polynomial features allow the model to capture non-linear relationships between the features and the target variable.
Correct Answer: B — It can capture non-linear relationships
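A minimal sketch of fitting a non-linear (quadratic) target with PolynomialFeatures plus LinearRegression; the synthetic data and degree are illustrative choices:

```python
# Minimal sketch: PolynomialFeatures lets linear regression fit a non-linear curve.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * X[:, 0] ** 2 - X[:, 0] + rng.normal(scale=0.2, size=200)  # quadratic target

# Degree-2 features (1, x, x^2) let a linear model capture the curvature.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print("R^2 on training data:", model.score(X, y))
```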
Q. What is the main advantage of using ensemble methods like Random Forest over a single decision tree?
A. They are faster to train
B. They reduce variance and improve prediction accuracy
C. They are easier to interpret
D. They require less data
Solution: Ensemble methods like Random Forest reduce variance by averaging multiple decision trees, leading to improved prediction accuracy.
Correct Answer: B — They reduce variance and improve prediction accuracy
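A minimal sketch comparing a single decision tree to a Random Forest via cross-validation; the breast-cancer dataset and 200 trees are illustrative choices:

```python
# Minimal sketch: single decision tree vs. Random Forest, scored by cross-validation.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

# Averaging many trees trained on bootstrap samples reduces variance.
print("Single tree  :", cross_val_score(tree, X, y, cv=5).mean())
print("Random Forest:", cross_val_score(forest, X, y, cv=5).mean())
```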
Q. What is the purpose of using regularization techniques in model selection?
A. To increase the model's complexity
B. To reduce the training time
C. To prevent overfitting by penalizing large coefficients
D. To improve the interpretability of the model
Solution: Regularization techniques prevent overfitting by adding a penalty for large coefficients in the model.
Correct Answer: C — To prevent overfitting by penalizing large coefficients
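A minimal sketch of L2 regularization with Ridge, showing how the penalty shrinks coefficients relative to ordinary least squares; the synthetic data and alpha=1.0 are illustrative choices:

```python
# Minimal sketch: Ridge regression penalizes large coefficients (L2 regularization).
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))      # 20 features, only 2 of which carry signal
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=50)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)  # alpha controls the penalty strength

# The penalty shrinks coefficients toward zero, reducing overfitting.
print("OLS   |coef| sum:", np.abs(ols.coef_).sum())
print("Ridge |coef| sum:", np.abs(ridge.coef_).sum())
```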
Q. Which feature transformation technique is used to normalize the range of features?
A. One-Hot Encoding
B. Min-Max Scaling
C. Label Encoding
D. Feature Extraction
Solution: Min-Max Scaling normalizes the range of features to a specified range, typically [0, 1].
Correct Answer: B — Min-Max Scaling
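A minimal sketch of Min-Max Scaling with scikit-learn's MinMaxScaler; the toy matrix is illustrative:

```python
# Minimal sketch: MinMaxScaler maps each feature into [0, 1].
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

scaler = MinMaxScaler()            # default feature_range=(0, 1)
X_scaled = scaler.fit_transform(X)
print(X_scaled)                    # each column now spans 0..1
```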
Q. Which of the following is a common method for handling missing data in a dataset?
A. Removing all rows with missing values
B. Replacing missing values with the mean or median
C. Ignoring the missing values during training
D. All of the above
Solution: Replacing missing values with the mean or median is a common method, though other methods can also be used depending on the context.
Correct Answer: B — Replacing missing values with the mean or median
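A minimal sketch of mean imputation with pandas; the toy DataFrame and column names are illustrative:

```python
# Minimal sketch: replace missing values with the column mean using pandas.
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 40, 35],
                   "income": [50000, 60000, np.nan, 80000]})

# Fill NaNs in each column with that column's mean.
df_filled = df.fillna(df.mean())
print(df_filled)
```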
Q. Which of the following is a common method for handling missing data?
A. Removing all rows with missing values
B. Imputing missing values with the mean or median
C. Ignoring missing values during training
D. Using a more complex model
Solution: Imputing missing values with the mean or median is a common method to handle missing data.
Correct Answer: B — Imputing missing values with the mean or median
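A minimal sketch of median imputation with scikit-learn's SimpleImputer, as an alternative to the pandas approach above; the toy array is illustrative:

```python
# Minimal sketch: median imputation with scikit-learn's SimpleImputer.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan],
              [4.0, 6.0]])

imputer = SimpleImputer(strategy="median")   # "mean" is also a common choice
X_imputed = imputer.fit_transform(X)
print(X_imputed)
```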
Q. Which of the following is a disadvantage of using decision trees for model selection?
A. They are easy to interpret
B. They can easily overfit the training data
C. They handle both numerical and categorical data
D. They require less data preprocessing
Solution: Decision trees can easily overfit the training data, especially if they are not pruned or if the tree is too deep.
Correct Answer: B — They can easily overfit the training data
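A minimal sketch contrasting an unpruned tree with a depth-limited one; the gap between training and test accuracy illustrates overfitting. The dataset and max_depth=3 are illustrative choices:

```python
# Minimal sketch: an unpruned tree overfits; limiting depth trades training fit for generalization.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)           # grows until leaves are pure
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print("Deep tree   train/test:", deep.score(X_tr, y_tr), deep.score(X_te, y_te))
print("Pruned tree train/test:", shallow.score(X_tr, y_tr), shallow.score(X_te, y_te))
```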
Q. Which of the following is a disadvantage of using too many features in a model?
A. Increased interpretability
B. Higher computational cost
C. Better model performance
D. Reduced risk of overfitting
Solution: Using too many features can lead to higher computational costs and may increase the risk of overfitting.
Correct Answer: B — Higher computational cost
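A minimal sketch of trimming a wide synthetic dataset down to its most informative features with SelectKBest, which cuts both dimensionality and the cost of training downstream models; all parameters here are illustrative:

```python
# Minimal sketch: SelectKBest keeps only the k most informative features.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# 100 features, only 10 of which carry signal (synthetic data).
X, y = make_classification(n_samples=500, n_features=100, n_informative=10, random_state=0)

selector = SelectKBest(score_func=f_classif, k=10)
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)   # (500, 100) -> (500, 10)
```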
Q. Which of the following techniques is NOT typically used in feature selection?
A. Recursive Feature Elimination
B. Principal Component Analysis
C. Random Forest Importance
D. K-Means Clustering
Solution: K-Means Clustering is an unsupervised learning algorithm used for clustering, not for feature selection.
Correct Answer: D — K-Means Clustering
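A minimal sketch of Recursive Feature Elimination driven by Random Forest importances on synthetic data; the estimator and parameters are illustrative choices:

```python
# Minimal sketch: Recursive Feature Elimination using Random Forest importances.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)

# RFE repeatedly drops the least important features according to the estimator.
rfe = RFE(estimator=RandomForestClassifier(n_estimators=100, random_state=0),
          n_features_to_select=5)
rfe.fit(X, y)
print("Selected feature indices:", [i for i, keep in enumerate(rfe.support_) if keep])
```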