Java: June 2017

For which of the following tasks might K-means clustering be a suitable algorithm? Select all that apply.

Given a database of information about your users, automatically group them into different market segments. CORRECT

Given sales data from a large number of products in a supermarket, figure out which products tend to form coherent groups (say, are frequently purchased together) and thus should be put on the same shelf. CORRECT

Given historical weather records, predict the amount of rainfall tomorrow (this would be a real-valued output). WRONG

Given sales data from a large number of products in a supermarket, estimate future sales for each of these products. WRONG

K-means is an iterative algorithm, and two of the following steps are repeatedly carried out in its inner loop. Which two?

The cluster assignment step, where the parameters c^{(i)} are updated. CORRECT

Move the cluster centroids, where the centroids μ_{k} are updated. CORRECT

Randomly initialize the cluster centroids. WRONG

Test on the cross-validation set. WRONG
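The two inner-loop steps can be sketched in plain Python. This is only an illustration, not a reference implementation: the function names, the sample points, and the starting centroids are assumptions made up for the example, and leaving an empty cluster's centroid where it is happens to be just one common convention.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_clusters(points, centroids):
    """Cluster assignment step: c[i] is the index of the centroid closest to points[i]."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def move_centroids(points, assignments, centroids):
    """Move-centroid step: each centroid becomes the mean of its assigned points.

    A centroid with no assigned points is left unchanged (an assumed convention).
    """
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, c in zip(points, assignments) if c == k]
        if not members:
            new_centroids.append(old)
            continue
        new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new_centroids


# Hypothetical data and starting centroids, chosen only for illustration.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

c = assign_clusters(points, centroids)            # updates c^(i)
centroids = move_centroids(points, c, centroids)  # updates mu_k
```

Random initialization happens once before this loop, and cross-validation is not part of the algorithm at all, which is why the other two options are wrong.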

Suppose you have an unlabeled dataset {x^{(1)}, \dots, x^{(m)}}. You run K-means with 50 different random initializations, and obtain 50 different clusterings of the data. What is the recommended way for choosing which one of these 50 clusterings to use?

Plot the data and the cluster centroids, and pick the clustering that gives the most "coherent" cluster centroids. WRONG

Compute the distortion function J(c^{(1)}, \dots, c^{(m)}, μ_{1}, \dots, μ_{K}), and pick the one that minimizes this. CORRECT

Manually examine the clusterings, and pick the best one. WRONG

Use the elbow method. WRONG
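A rough sketch of that selection rule follows, assuming the distortion is the mean squared distance from each example to its assigned centroid. The helper names, the four-point dataset, and the fixed seed are illustrative assumptions, not part of the quiz.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, assignments, centroids):
    """J = (1/m) * sum over i of ||x^(i) - mu_{c^(i)}||^2."""
    total = sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignments))
    return total / len(points)


def run_kmeans(points, k, rng, max_iters=100):
    """One K-means run from a random initialization; returns (J, assignments, centroids)."""
    centroids = rng.sample(points, k)  # k examples at distinct positions in the list
    assignments = None
    for _ in range(max_iters):
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no example changed cluster
        assignments = new_assignments
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return distortion(points, assignments, centroids), assignments, centroids


# Hypothetical dataset: two tight pairs of points, far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
rng = random.Random(0)  # fixed seed, only so the sketch is repeatable

runs = [run_kmeans(points, k=2, rng=rng) for _ in range(50)]
best_J, best_assignments, best_centroids = min(runs, key=lambda run: run[0])
```

On this data an unlucky initialization (both starting centroids taken from the same pair) converges to a much worse local optimum, so comparing J across runs is what separates the good clusterings from the bad ones. The elbow method addresses a different question, namely how to choose K.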

Which of the following statements are true? Select all that apply.

K-means will always give the same results regardless of the initialization of the centroids. WRONG

On every iteration of K-means, the cost function J(c^{(1)}, \dots, c^{(m)}, μ_{1}, \dots, μ_{K}) (the distortion function) should either stay the same or decrease; in particular, it should not increase. CORRECT

Once an example has been assigned to a particular centroid, it will never be reassigned to a different centroid. WRONG

A good way to initialize K-means is to select K (distinct) examples from the training set and set the cluster centroids equal to these selected examples. CORRECT
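The two correct statements can be checked with a small sketch: initialize from K distinct training examples, then record J after every assignment step and every move-centroid step. The function names, the sample data, and the seed are assumptions made for illustration; the small tolerance in the final check only allows for floating-point rounding.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distortion(points, assignments, centroids):
    """J = (1/m) * sum over i of ||x^(i) - mu_{c^(i)}||^2."""
    total = sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignments))
    return total / len(points)


def init_centroids(points, k, rng):
    """Pick K distinct training examples as the initial centroids."""
    distinct = sorted(set(points))  # drop duplicate examples before sampling
    return rng.sample(distinct, k)


def kmeans_with_history(points, k, rng, iters=10):
    """Run a fixed number of iterations, recording J after every step."""
    centroids = init_centroids(points, k, rng)
    history = []
    for _ in range(iters):
        # Cluster assignment step: centroids fixed, choose the best c^(i).
        assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        history.append(distortion(points, assignments, centroids))
        # Move-centroid step: assignments fixed, choose the best mu_k.
        for j in range(k):
            members = [p for p, c in zip(points, assignments) if c == j]
            if members:  # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
        history.append(distortion(points, assignments, centroids))
    return history


# Hypothetical 2-D examples, chosen only for illustration.
points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
history = kmeans_with_history(points, k=2, rng=random.Random(1))

# Each step minimizes J over one set of variables, so J should never go up.
non_increasing = all(
    later <= earlier + 1e-9 for earlier, later in zip(history, history[1:])
)
```

If J ever increased between consecutive entries of `history`, that would point to a bug in the implementation rather than to normal K-means behavior.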


Tuesday, June 13, 2017

K-means quiz
