Q. How does Random Forest handle missing values in the dataset?
A. It ignores missing values completely
B. It uses mean imputation for missing values
C. It can use surrogate splits to handle missing values
D. It requires complete data without any missing values
Solution: Random Forest implementations that support surrogate splits (a technique from CART) can route samples with missing values down the tree using a correlated substitute feature, allowing predictions even with incomplete data.
Correct Answer: C — It can use surrogate splits to handle missing values

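The idea behind a surrogate split can be sketched in a few lines. Note this is a simplified illustration, not a library API: scikit-learn's trees, for example, do not implement surrogate splits, and `surrogate_agreement` is a hypothetical helper.

```python
import numpy as np

def surrogate_agreement(primary_goes_left, feature, threshold):
    """Fraction of samples a candidate split routes the same way as the primary split."""
    surrogate_goes_left = feature <= threshold
    return float(np.mean(surrogate_goes_left == primary_goes_left))

# Primary split on x0 (x0 <= 0.5); x1 is a correlated feature used as a stand-in.
x0 = np.array([0.1, 0.3, 0.7, 0.9])
x1 = np.array([5.0, 8.0, 12.0, 20.0])
primary = x0 <= 0.5
agreement = surrogate_agreement(primary, x1, 10.0)
print(agreement)  # 1.0: "x1 <= 10" routes every sample the same way as the
                  # primary split, so it can substitute for x0 when x0 is missing
```

A tree that stores such high-agreement surrogates can still route a sample when its primary feature is absent.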
Q. In a Decision Tree, what does the term 'Gini impurity' refer to?
A. A measure of the tree's depth
B. A metric for evaluating model performance
C. A criterion for splitting nodes
D. A method for pruning trees
Solution: Gini impurity is a criterion used to measure the impurity of a node, helping to determine the best feature and threshold to split on.
Correct Answer: C — A criterion for splitting nodes

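The criterion itself is simple to compute: for class proportions p_k in a node, the Gini impurity is 1 minus the sum of the squared proportions. A minimal sketch (the `gini_impurity` helper is written here for illustration):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

print(gini_impurity([1, 1, 1, 1]))  # 0.0: a pure node
print(gini_impurity([0, 0, 1, 1]))  # 0.5: the worst case for two classes
```

A split is chosen to maximize the drop from the parent node's impurity to the weighted impurity of its children.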
Q. In Decision Trees, what does the Gini impurity measure?
A. The accuracy of the model
B. The purity of a node
C. The depth of the tree
D. The number of features used
Solution: Gini impurity measures the impurity or disorder of a node, helping to determine the best split at each node.
Correct Answer: B — The purity of a node

Q. In Random Forests, what does the term 'out-of-bag error' refer to?
A. Error on the training set
B. Error on unseen data
C. Error calculated from the samples not used in training a tree
D. Error from the final ensemble model
Solution: Out-of-bag error is an estimate of the model's performance calculated using the data points that were not included in the bootstrap sample for each tree.
Correct Answer: C — Error calculated from the samples not used in training a tree

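In scikit-learn this estimate is exposed through the `oob_score` option; a sketch on a synthetic dataset (the data and parameter values are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
# Each tree's bootstrap sample leaves out roughly 1/3 of the rows; with
# oob_score=True those left-out rows act as a built-in validation set.
clf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
clf.fit(X, y)
print(clf.oob_score_)  # OOB accuracy, an estimate of generalization performance
```

The OOB score often tracks cross-validated accuracy closely, without needing a separate hold-out set.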
Q. In the context of Decision Trees, what does 'pruning' refer to?
A. Adding more branches to the tree
B. Removing branches to reduce complexity
C. Increasing the depth of the tree
D. Changing the splitting criteria
Solution: Pruning is the process of removing branches from a Decision Tree to prevent overfitting and improve generalization.
Correct Answer: B — Removing branches to reduce complexity

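One concrete form is minimal cost-complexity pruning, available in scikit-learn via `ccp_alpha`; a sketch with illustrative data and an arbitrary alpha:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
full = DecisionTreeClassifier(random_state=0).fit(X, y)
# ccp_alpha > 0 enables minimal cost-complexity pruning: branches whose
# impurity reduction does not justify their added complexity are cut off.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)
print(full.tree_.node_count, pruned.tree_.node_count)
```

The pruned tree has fewer nodes, which usually hurts training accuracy slightly while improving performance on unseen data.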
Q. What does the term 'feature importance' refer to in the context of Random Forests?
A. The number of features used in the model
B. The contribution of each feature to the model's predictions
C. The correlation between features
D. The total number of trees in the forest
Solution: Feature importance indicates how much each feature contributes to the model's predictions, helping to identify the most influential variables.
Correct Answer: B — The contribution of each feature to the model's predictions

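In scikit-learn these scores are exposed as `feature_importances_` (mean decrease in impurity). A sketch on synthetic data where only the first two of five features carry signal:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Only the first two features are informative (shuffle=False keeps them first).
X, y = make_classification(n_samples=400, n_features=5, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.feature_importances_)  # one score per feature; the scores sum to 1
```

The two informative features should receive most of the importance mass, which is exactly the diagnostic this metric is used for.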
Q. What is a common method for feature importance evaluation in Random Forests?
A. Permutation importance
B. Gradient boosting
C. K-fold cross-validation
D. Principal component analysis
Solution: Permutation importance is a common method used to evaluate feature importance in Random Forests, measuring the increase in prediction error when a feature's values are randomly permuted.
Correct Answer: A — Permutation importance

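scikit-learn provides this as `sklearn.inspection.permutation_importance`; a sketch with synthetic data (dataset and parameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=5, n_informative=2,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
# Shuffle one feature at a time on held-out data; the resulting drop in score
# is that feature's permutation importance.
result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
print(result.importances_mean)
```

Unlike impurity-based importance, this is computed on held-out data, so it is less biased toward high-cardinality features.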
Q. What is a common use case for Random Forests in real-world applications?
A. Image recognition
B. Natural language processing
C. Credit scoring
D. Time series forecasting
Solution: Random Forests are widely used in credit scoring due to their ability to handle large tabular datasets and provide robust predictions.
Correct Answer: C — Credit scoring

Q. What is a primary advantage of using Random Forests over a single Decision Tree?
A. Lower computational cost
B. Higher accuracy due to ensemble learning
C. Easier to interpret
D. Requires less data
Solution: Random Forests combine multiple Decision Trees to improve accuracy and reduce overfitting, leveraging ensemble learning.
Correct Answer: B — Higher accuracy due to ensemble learning

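The advantage is easy to see empirically; the following sketch compares cross-validated accuracy on a synthetic dataset (the data and seeds are illustrative, and the exact scores will vary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)
# Compare a single tree against an ensemble of 100 trees with 5-fold CV.
tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                           X, y, cv=5).mean()
forest_acc = cross_val_score(RandomForestClassifier(n_estimators=100,
                                                    random_state=0),
                             X, y, cv=5).mean()
print(tree_acc, forest_acc)  # the forest usually scores noticeably higher
```

Averaging many decorrelated trees cancels much of the variance that makes a single deep tree overfit.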
Q. What is the main disadvantage of using a Decision Tree?
A. High bias
B. High variance
C. Requires a lot of data
D. Difficult to interpret
Solution: Decision Trees are prone to high variance, meaning they can overfit the training data and perform poorly on unseen data.
Correct Answer: B — High variance

Q. What is the main purpose of using cross-validation when training a Decision Tree?
A. To increase the size of the training set
B. To tune hyperparameters
C. To assess the model's generalization ability
D. To visualize the tree structure
Solution: Cross-validation helps in assessing how the model will generalize to an independent dataset, thus providing a better estimate of its performance.
Correct Answer: C — To assess the model's generalization ability

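With scikit-learn this is a one-liner via `cross_val_score`; a sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
# 5-fold CV: each fold is held out once, yielding five out-of-sample accuracy
# estimates instead of a single optimistic training-set score.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5)
print(scores.mean(), scores.std())
```

The spread of the fold scores also gives a rough sense of how sensitive the tree is to the particular training split.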
Q. What is the purpose of the 'bootstrap' sampling method in Random Forests?
A. To create a balanced dataset
B. To ensure all features are used
C. To generate multiple subsets of the training data
D. To improve model interpretability
Solution: Bootstrap sampling allows Random Forests to create multiple subsets of the training data, which helps in building diverse trees.
Correct Answer: C — To generate multiple subsets of the training data

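A bootstrap sample is just sampling with replacement; a minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)
# Sampling with replacement: the bootstrap sample has the same size as the
# original, but some rows repeat and others are left out entirely.
bootstrap = rng.choice(data, size=data.size, replace=True)
print(np.sort(bootstrap))
```

Each tree in the forest is trained on a different such sample, which is what makes the trees diverse; the rows a tree never sees become its out-of-bag samples.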
Q. What is the purpose of the 'n_estimators' parameter in a Random Forest model?
A. To define the maximum depth of each tree
B. To specify the number of trees in the forest
C. To set the minimum samples required to split a node
D. To determine the number of features to consider at each split
Solution: 'n_estimators' specifies the number of trees in the Random Forest, which affects the model's performance and stability.
Correct Answer: B — To specify the number of trees in the forest

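In scikit-learn the fitted trees are stored in `estimators_`; a sketch (dataset and tree count are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)
# n_estimators controls how many trees are built; more trees give more stable
# predictions at a higher training cost, with diminishing returns.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(len(clf.estimators_))  # 50 fitted Decision Trees
```
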
Q. What is the role of 'bootstrap sampling' in Random Forests?
A. To select features for each tree
B. To create multiple subsets of the training data
C. To evaluate model performance
D. To increase the depth of trees
Solution: Bootstrap sampling involves creating multiple subsets of the training data by sampling with replacement, which helps in building diverse trees.
Correct Answer: B — To create multiple subsets of the training data

Q. What is the role of 'max_features' in Random Forests?
A. To limit the number of trees in the forest
B. To control the maximum depth of each tree
C. To specify the maximum number of features to consider when looking for the best split
D. To determine the minimum number of samples required to split an internal node
Solution: 'max_features' controls how many features are considered for splitting at each node, which helps in reducing correlation among trees.
Correct Answer: C — To specify the maximum number of features to consider when looking for the best split

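A sketch of the parameter in scikit-learn (dataset sizes are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=16, random_state=0)
# max_features="sqrt" considers sqrt(16) = 4 randomly chosen candidate
# features at each split, so different trees pick different splits and
# become less correlated with each other.
clf = RandomForestClassifier(n_estimators=10, max_features="sqrt",
                             random_state=0).fit(X, y)
print(clf.estimators_[0].max_features)  # each tree inherits the setting
```

Restricting the candidate features per split is, together with bootstrap sampling, the main source of diversity in the forest.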
Q. What is the role of the 'max_depth' parameter in a Decision Tree?
A. It determines the maximum number of features to consider
B. It limits the number of samples at each leaf
C. It restricts the maximum depth of the tree
D. It controls the minimum number of samples required to split an internal node
Solution: The 'max_depth' parameter limits how deep the Decision Tree can grow, helping to prevent overfitting.
Correct Answer: C — It restricts the maximum depth of the tree

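A sketch in scikit-learn (dataset and depth value are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
# Without a limit the tree grows until every leaf is pure; max_depth caps the
# number of split levels, trading some training fit for better generalization.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(shallow.get_depth())  # never more than 3
```
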
Q. Which algorithm is typically faster to train on large datasets?
A. Decision Trees
B. Random Forests
C. Both are equally fast
D. Neither, both are slow
Solution: A single Decision Tree is generally faster to train than a Random Forest, since a Random Forest must train many trees.
Correct Answer: A — Decision Trees

Q. Which evaluation metric is most appropriate for assessing the performance of a Decision Tree on a binary classification problem?
A. Mean Squared Error
B. Accuracy
C. Silhouette Score
D. R-squared
Solution: Accuracy is a common metric for evaluating the performance of classification models, including Decision Trees.
Correct Answer: B — Accuracy

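Accuracy is simply the fraction of correct predictions; a minimal sketch using scikit-learn's `accuracy_score` on made-up labels:

```python
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]  # one of the five predictions is wrong
print(accuracy_score(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```

Note that on heavily imbalanced classes, accuracy can be misleading, and metrics like F1 may be preferred.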
Q. Which of the following metrics is commonly used to evaluate the performance of a Decision Tree?
A. Mean Squared Error
B. Accuracy
C. Silhouette Score
D. F1 Score
Solution: Accuracy is a common metric for evaluating the performance of classification Decision Trees.
Correct Answer: B — Accuracy

Q. Which of the following techniques can be used to handle missing values in Decision Trees?
A. Imputation
B. Ignoring missing values
C. Using a separate category for missing values
D. All of the above
Solution: All of the mentioned techniques can be used to handle missing values in Decision Trees, depending on the context.
Correct Answer: D — All of the above

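Imputation is the most common of these in practice; a sketch using scikit-learn's `SimpleImputer` on a small made-up matrix:

```python
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [7.0, np.nan]])
# Mean imputation: each NaN is replaced by its column's mean.
X_filled = SimpleImputer(strategy="mean").fit_transform(X)
print(X_filled)  # NaN in column 0 -> mean(1, 7) = 4.0
                 # NaN in column 1 -> mean(2, 3) = 2.5
```

Other strategies ("median", "most_frequent", "constant") follow the same pattern, and the imputer can sit in a pipeline ahead of the tree model.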