Q. In K-Means clustering, what does the 'K' represent?
- A. The number of features
- B. The number of clusters
- C. The number of iterations
- D. The number of data points
Solution
'K' represents the number of clusters that the algorithm will create from the data.
Correct Answer: B. The number of clusters
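To make the role of K concrete, here is a minimal sketch of the standard K-Means loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, the sample points, and the naive choice of the first K points as starting centroids are assumptions made for illustration; library implementations usually initialize more carefully.

```python
def kmeans(points, k, iterations=100):
    """Return (centroids, labels) for a list of equal-length tuples."""
    # Assumption: use the first k points as starting centroids (naive init).
    centroids = list(points[:k])
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.9, 8.1)]
centroids, labels = kmeans(points, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Whatever value is passed as `k` is exactly the number of centroids, and therefore clusters, that the function returns.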
Q. What is DBSCAN primarily used for in clustering?
- A. To find spherical clusters
- B. To identify noise and outliers
- C. To classify data points
- D. To reduce dimensionality
Solution
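DBSCAN is a density-based clustering algorithm: it groups points in dense regions into clusters of arbitrary shape and marks points in sparse regions as noise. Of the options listed, only B describes something it does; it does not assume spherical clusters, classify points, or reduce dimensionality.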
Correct Answer: B. To identify noise and outliers
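The noise-labeling behavior can be seen in a small sketch of the DBSCAN procedure in plain Python. The function name `dbscan`, the parameter names `eps` and `min_pts`, the brute-force neighbor search, and the sample points are assumptions for illustration; this is not how an optimized library version is written.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (50, 50) has too few neighbors to start or join a cluster, so it keeps the noise label -1.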
Q. What is the main difference between K-Means and DBSCAN clustering algorithms?
- A. K-Means is faster than DBSCAN
- B. DBSCAN can find clusters of arbitrary shape
- C. K-Means requires labeled data
- D. DBSCAN is only for high-dimensional data
Solution
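DBSCAN can find clusters of arbitrary shape because it grows clusters through dense regions, while K-Means assigns each point to the nearest centroid and therefore tends to produce roughly spherical clusters. Neither algorithm requires labeled data.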
Correct Answer: B. DBSCAN can find clusters of arbitrary shape
Q. What is the main limitation of K-Means clustering?
- A. It is computationally expensive
- B. It requires a predefined number of clusters
- C. It can only handle numerical data
- D. It is sensitive to outliers
Solution
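K-Means requires the user to specify the number of clusters in advance, and a poorly chosen K gives a poor partition. K-Means is also sensitive to outliers (option D), because each centroid is a mean, but the predefined K is the limitation this question targets.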
Correct Answer: B. It requires a predefined number of clusters
Q. What is the primary goal of clustering in unsupervised learning?
- A. To predict future outcomes
- B. To group similar data points together
- C. To label data points
- D. To reduce dimensionality
Solution
Clustering aims to group similar data points together based on their features without prior labels.
Correct Answer: B. To group similar data points together
Q. What type of data is best suited for clustering?
- A. Labeled data
- B. Time series data
- C. Unlabeled data
- D. Sequential data
Solution
Clustering is an unsupervised learning technique, making it best suited for unlabeled data.
Correct Answer: C. Unlabeled data
Q. Which evaluation metric is most suitable for assessing clustering performance?
- A. Accuracy
- B. F1 Score
- C. Adjusted Rand Index
- D. Mean Absolute Error
Solution
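The Adjusted Rand Index measures the agreement between two clusterings, corrected for chance, so it suits evaluating a clustering against reference labels when they are available. Accuracy and F1 Score depend on matching specific class labels, and Mean Absolute Error is a regression metric.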
Correct Answer: C. Adjusted Rand Index
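As a rough illustration of how the Adjusted Rand Index is computed, the sketch below applies the usual pair-counting formula over the contingency table of two labelings. The function name `adjusted_rand_index` and the sample labelings are assumptions for illustration, and `math.comb` requires Python 3.8 or later.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same items."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length, at least 2 items")
    # Pairs of items placed together in both labelings (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each labeling on its own.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. both are one big cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Same grouping with the label names swapped still scores 1.0.
same = adjusted_rand_index([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])
# One cluster split in two lowers the score.
split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
print(same, round(split, 4))
```

Because the score depends only on which items are grouped together, it is unaffected by how the clusters happen to be numbered, which is why it fits clustering better than Accuracy.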
Q. Which of the following algorithms is commonly used for clustering?
- A. Linear Regression
- B. K-Means
- C. Support Vector Machine
- D. Decision Tree
Solution
K-Means is a popular clustering algorithm that partitions data into K distinct clusters.
Correct Answer: B. K-Means
Q. Which of the following applications can benefit from clustering?
- A. Customer segmentation
- B. Spam detection
- C. Image classification
- D. Time series forecasting
Solution
Customer segmentation is a common application of clustering, where customers are grouped based on purchasing behavior.
Correct Answer: A. Customer segmentation
Q. Which of the following is a real-world application of clustering?
- A. Spam detection in emails
- B. Image classification
- C. Market segmentation
- D. Sentiment analysis
Solution
Market segmentation is a common application of clustering, where customers are grouped based on purchasing behavior.
Correct Answer: C. Market segmentation
Q. Which of the following is NOT a characteristic of hierarchical clustering?
- A. Creates a tree-like structure
- B. Can be agglomerative or divisive
- C. Requires the number of clusters to be specified in advance
- D. Can visualize data relationships
Solution
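Hierarchical clustering does not require the number of clusters to be specified in advance; it builds a full hierarchy of clusters (a dendrogram), and a specific number of clusters can be obtained afterwards by cutting that hierarchy at the desired level.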
Correct Answer: C. Requires the number of clusters to be specified in advance
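The sketch below shows the agglomerative variant under one assumed merge rule, single linkage (distance between the closest members of two clusters). The function names `single_linkage` and `cut`, the brute-force search, and the one-dimensional sample points are assumptions for illustration only.

```python
def single_linkage(points):
    """Agglomerative clustering; returns the merge history (the tree's content)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    clusters = [[i] for i in range(len(points))]  # start: every point alone
    merges = []
    while len(clusters) > 1:
        # Find the two clusters whose closest members are nearest to each other.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


def cut(merges, n_points, k):
    """Replay merges until k clusters remain; k is chosen after the tree is built."""
    clusters = [{i} for i in range(n_points)]
    for left, right, _ in merges[: n_points - k]:
        merged = set(left) | set(right)
        clusters = [c for c in clusters if not c <= merged] + [merged]
    return sorted(sorted(c) for c in clusters)


points = [(0.0,), (1.0,), (5.0,), (6.5,), (20.0,)]
merges = single_linkage(points)
print(cut(merges, len(points), 3))  # [[0, 1], [2, 3], [4]]
print(cut(merges, len(points), 2))  # [[0, 1, 2, 3], [4]]
```

The tree is built once without any cluster count; `cut` then reads off three clusters or two from the same merge history.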