Q. In hierarchical clustering, what does 'agglomerative' refer to?
A.A method that starts with all points as individual clusters
B.A method that requires the number of clusters to be predefined
C.A technique that merges clusters based on distance
D.A type of clustering that uses a centroid
Solution
Agglomerative clustering begins with each data point as its own cluster and merges them iteratively based on distance until a single cluster is formed.
Correct Answer: A — A method that starts with all points as individual clusters
Q. In hierarchical clustering, what is agglomerative clustering?
A.A bottom-up approach to cluster formation
B.A top-down approach to cluster formation
C.A method that requires prior knowledge of clusters
D.A technique that uses K-means as a base
Solution
Agglomerative clustering is a bottom-up approach where each data point starts as its own cluster and pairs of clusters are merged as one moves up the hierarchy.
Correct Answer: A — A bottom-up approach to cluster formation
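The bottom-up process described above can be illustrated with a minimal sketch. It assumes 1-D points and single linkage (closest-member distance); the function name agglomerative_1d and the sample data are hypothetical and chosen only for illustration.

```python
def agglomerative_1d(points):
    """Single-linkage agglomerative clustering of 1-D points.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    # Start with every point in its own cluster (the bottom of the hierarchy).
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        # Merge cluster j into cluster i, then drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


history = agglomerative_1d([1.0, 2.0, 6.0, 7.0, 20.0])
for left, right, dist in history:
    print(left, "+", right, "at distance", dist)
```

The merge distances in this run are 1.0, 1.0, 4.0, and 13.0; a dendrogram plots exactly this kind of merge history as a tree.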
Q. In hierarchical clustering, what does a dendrogram represent?
A.A visual representation of the clustering process
B.A table of cluster centroids
C.A list of data points in each cluster
D.A summary of the clustering algorithm's performance
Solution
A dendrogram is a tree diagram that shows the order in which clusters are merged and the distance at which each merge occurs.
Correct Answer: A — A visual representation of the clustering process
Q. What is a common application of clustering in marketing?
A.Predicting customer behavior
B.Segmenting customers into distinct groups
C.Optimizing supply chain logistics
D.Forecasting sales trends
Solution
Clustering is often used in marketing to segment customers into distinct groups based on purchasing behavior or demographics.
Correct Answer: B — Segmenting customers into distinct groups
Q. What is a common application of K-means clustering in the real world?
A.Image segmentation
B.Spam detection
C.Sentiment analysis
D.Time series forecasting
Solution
K-means clustering is often used in image segmentation, where pixels are grouped by similarity of color or intensity.
Correct Answer: A — Image segmentation
Q. What is a key advantage of using hierarchical clustering over K-means?
A.It requires less computational power
B.It does not require the number of clusters to be specified in advance
C.It is always more accurate
D.It can handle larger datasets
Solution
Hierarchical clustering does not require the number of clusters to be predetermined, allowing for more flexibility in exploring data.
Correct Answer: B — It does not require the number of clusters to be specified in advance
Q. What is a key characteristic of DBSCAN compared to K-means?
A.It requires the number of clusters to be specified
B.It can find clusters of arbitrary shape
C.It is faster than K-means for all datasets
D.It uses centroids to define clusters
Solution
DBSCAN can identify clusters of arbitrary shape and does not require the number of clusters to be specified in advance.
Correct Answer: B — It can find clusters of arbitrary shape
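As a rough illustration of the density-based idea, here is a minimal DBSCAN sketch written from the standard description of the algorithm. The function name dbscan, the parameter names eps and min_pts, and the sample points are assumptions made for this example; it is not a tuned or library-grade implementation.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # core point: keep growing the cluster
        cluster_id += 1
    return labels


chain = [(float(x), 0.0) for x in range(10)]  # long, thin cluster
blob = [(0.0, 5.0), (0.5, 5.0), (0.0, 5.5), (0.5, 5.5)]  # compact cluster
outlier = [(20.0, 20.0)]
labels = dbscan(chain + blob + outlier, eps=1.0, min_pts=3)
print(labels)
```

In this run the elongated chain is recovered as one cluster, the blob as another, and the isolated point is labeled -1 (noise). The number of clusters is never passed in; it follows from eps and min_pts.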
Q. What is the main advantage of hierarchical clustering over K-means?
A.It does not require the number of clusters to be specified in advance
B.It is faster and more efficient
C.It can handle larger datasets
D.It is less sensitive to outliers
Solution
Hierarchical clustering does not require the number of clusters to be predetermined, allowing for more flexibility in analysis.
Correct Answer: A — It does not require the number of clusters to be specified in advance
Q. What is the main advantage of using hierarchical clustering over K-means?
A.It is faster and more efficient
B.It does not require the number of clusters to be specified
C.It can handle large datasets better
D.It is less sensitive to outliers
Solution
Hierarchical clustering does not require the number of clusters to be specified in advance, allowing for more flexibility in cluster formation.
Correct Answer: B — It does not require the number of clusters to be specified
Q. What is the main difference between K-means and K-medoids clustering?
A.K-means uses centroids, while K-medoids uses actual data points
B.K-medoids is faster than K-means
C.K-means can only handle numerical data, while K-medoids can handle categorical data
D.K-medoids requires the number of clusters to be specified, while K-means does not
Solution
K-means represents each cluster by its centroid (the mean of its points, which need not be an actual data point), while K-medoids uses an actual data point as the cluster center, which makes K-medoids more robust to outliers.
Correct Answer: A — K-means uses centroids, while K-medoids uses actual data points
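The difference can be seen in a small sketch, assuming 1-D values and absolute difference as the distance. The helper names centroid and medoid and the sample cluster are hypothetical.

```python
def centroid(values):
    # Arithmetic mean: need not coincide with any observed value.
    return sum(values) / len(values)


def medoid(values):
    # The observed value with the smallest total distance to all the others.
    return min(values, key=lambda c: sum(abs(c - v) for v in values))


cluster = [1.0, 2.0, 3.0, 4.0, 100.0]  # 100.0 is a deliberate outlier
print(centroid(cluster))  # 22.0
print(medoid(cluster))    # 3.0
```

The single outlier drags the centroid to 22.0, far from most of the points, while the medoid stays at 3.0.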
Q. What is the primary objective of the K-means clustering algorithm?
A.To minimize the distance between points in the same cluster
B.To maximize the distance between different clusters
C.To create a hierarchical structure of clusters
D.To classify data into predefined categories
Solution
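K-means aims to minimize the within-cluster sum of squared distances between each point and its cluster centroid. It does this by alternating two steps: assigning each point to the nearest centroid, then recomputing each centroid as the mean of its assigned points.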
Correct Answer: A — To minimize the distance between points in the same cluster
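The assign-and-update loop and the objective it reduces can be sketched as follows. This is a minimal illustration under assumptions: the helper names assign, update, and wcss, the fixed starting centroids, and the toy data are all made up for the example, and real implementations add better initialization and a convergence check.

```python
import math


def assign(points, centroids):
    # Index of the nearest centroid (Euclidean distance) for each point.
    return [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points]


def update(points, labels, centroids):
    # Move each centroid to the mean of the points assigned to it.
    new = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            new.append(old)  # keep an empty cluster's centroid where it was
        else:
            new.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new


def wcss(points, labels, centroids):
    # Within-cluster sum of squared distances: the quantity K-means reduces.
    return sum(math.dist(p, centroids[label]) ** 2
               for p, label in zip(points, labels))


points = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,)]
centroids = [(1.0,), (2.0,)]  # deliberately poor starting guess
history = []
for _ in range(5):
    labels = assign(points, centroids)
    centroids = update(points, labels, centroids)
    history.append(wcss(points, labels, centroids))

print(centroids)
print(history)
```

With this data the centroids settle at (2.0,) and (11.0,), and the objective never increases from one iteration to the next, ending at 4.0.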
Q. Which distance metric is commonly used in K-means clustering?
A.Manhattan distance
B.Cosine similarity
C.Euclidean distance
D.Hamming distance
Solution
K-means typically uses Euclidean distance to measure the distance between data points and centroids.
Correct Answer: C — Euclidean distance
Q. Which of the following clustering methods is best suited for discovering clusters with irregular, non-convex shapes?
A.K-means
B.Hierarchical clustering
C.DBSCAN
D.Gaussian Mixture Models
Solution
DBSCAN groups points by density rather than by distance to a centroid, so it can recover clusters of arbitrary, non-convex shape, unlike K-means, which favors roughly spherical clusters.
Correct Answer: C — DBSCAN
Q. Which of the following clustering methods is sensitive to outliers?
A.K-means
B.Hierarchical clustering
C.DBSCAN
D.Gaussian Mixture Models
Solution
K-means is sensitive to outliers because they can significantly affect the position of the centroid, leading to poor clustering results.
Correct Answer: A — K-means
Q. Which of the following is NOT a linkage criterion used in hierarchical clustering?
A.Single linkage
B.Complete linkage
C.K-means linkage
D.Average linkage
Solution
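Single, complete, and average linkage are criteria for measuring the distance between two clusters in hierarchical clustering. There is no 'K-means linkage'; K-means is a separate, partition-based algorithm.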
Correct Answer: C — K-means linkage
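The three genuine linkage criteria differ only in how they turn pairwise point distances into one cluster-to-cluster distance. A minimal sketch, assuming 1-D points and absolute difference as the distance (the function names and sample clusters are hypothetical):

```python
def single_linkage(a, b):
    # Distance between the two closest members.
    return min(abs(x - y) for x in a for y in b)


def complete_linkage(a, b):
    # Distance between the two farthest members.
    return max(abs(x - y) for x in a for y in b)


def average_linkage(a, b):
    # Mean of all pairwise distances.
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


left, right = [1.0, 2.0], [5.0, 9.0]
print(single_linkage(left, right))    # 3.0
print(complete_linkage(left, right))  # 8.0
print(average_linkage(left, right))   # 5.5
```

The same pair of clusters is 3.0, 8.0, or 5.5 apart depending on the criterion, which is why the choice of linkage changes the order of merges.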
Q. Which of the following scenarios is best suited for hierarchical clustering?
A.When the number of clusters is known
B.When the data is high-dimensional
C.When a hierarchy of clusters is desired
D.When speed is a priority
Solution
Hierarchical clustering is ideal when a hierarchy of clusters is needed, because it produces a nested tree of clusters that can be examined at any level of granularity.
Correct Answer: C — When a hierarchy of clusters is desired
Q. Which of the following scenarios is K-means clustering NOT suitable for?
A.When clusters are spherical and evenly sized
B.When the number of clusters is known
C.When clusters have varying densities
D.When outliers are present in the data
Solution
K-means is not suitable for clusters with varying densities, as it assumes clusters are spherical and of similar size.
Correct Answer: C — When clusters have varying densities