First, download the data zipfile found here, unzip it, and read the included README file.
Two speeches are connected in the graph (with some probability) if they take the same stance: either both speeches are in favor (although they may be in favor of two different bills) or both speeches are against (although they may be against two different bills). Two speeches are also connected (with some probability) if they occur in the debate on the same legislation. In addition, noisy links may occur with some low probability.
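To make the linking rules concrete, here is a minimal simulation sketch of a graph built this way. The function name `sample_speech_graph` and the three probabilities are assumptions chosen for illustration only; the actual probabilities behind the data are not given, and the real data may have been generated differently.

```python
import numpy as np

def sample_speech_graph(stance, debate, p_stance=0.05, p_debate=0.10,
                        p_noise=0.005, seed=0):
    """Sample a symmetric 0/1 adjacency matrix under the three linking rules.

    All probabilities are hypothetical placeholders, not values from the data.
    """
    rng = np.random.default_rng(seed)
    stance = np.asarray(stance)
    debate = np.asarray(debate)
    n = len(stance)

    same_stance = stance[:, None] == stance[None, :]
    same_debate = debate[:, None] == debate[None, :]

    # Each rule fires independently; a link exists if at least one rule fires.
    fires = (
        (same_stance & (rng.random((n, n)) < p_stance))
        | (same_debate & (rng.random((n, n)) < p_debate))
        | (rng.random((n, n)) < p_noise)
    )

    # Keep only pairs with i < j (no self-loops), then mirror to make it symmetric.
    upper = np.triu(fires, k=1)
    return (upper | upper.T).astype(int)

# Toy example: six speeches, stance 0 = Against / 1 = For, two debates.
stance = np.array([0, 0, 1, 1, 0, 1])
debate = np.array([0, 0, 0, 1, 1, 1])
adj = sample_speech_graph(stance, debate)
```

Because the same-debate rule links speeches regardless of stance, the graph contains a second group structure that competes with the For/Against split.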
Challenge: Two clusters, “For” and “Against”
The challenge is to cluster the 2740 speeches into two groups, the first group consisting of “Against” speeches and the second group consisting of “For” speeches. To start you off, you are told that points (rows, where the first row is number 0) 2, 13, 18, 24 are examples of speeches that belong to the “Against” category (= label 0) and speeches 1, 3, 27, 177 are examples of speeches that belong to the “For” category (= label 1). Your goal is to predict, for every speech, whether it is for or against. The competition will be hosted on Vocareum. Aim to make as few mistakes as possible.
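One possible starting point, not a prescribed solution, is a spectral split: take the eigenvector for the second-smallest eigenvalue of the normalized graph Laplacian, split the speeches by its sign, and use the known examples to decide which side is “For”. The sketch below is a minimal version of that idea run on a small synthetic graph. The function name `spectral_bipartition`, the synthetic edge probabilities, and the seed indices in the demo are all assumptions; loading the real graph depends on the format described in the README.

```python
import numpy as np

def spectral_bipartition(adj, against_seeds, for_seeds):
    """Return one label per node: 0 = "Against", 1 = "For"."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # Isolated nodes (degree 0) get a zero scaling factor instead of 1/0.
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    # eigh sorts eigenvalues in ascending order; column 1 is the second smallest.
    _, vecs = np.linalg.eigh(lap)
    labels = (vecs[:, 1] > 0).astype(int)
    # The eigenvector's sign is arbitrary, so orient the split with the seeds.
    agree = np.sum(labels[for_seeds] == 1) + np.sum(labels[against_seeds] == 0)
    if agree < (len(for_seeds) + len(against_seeds)) / 2:
        labels = 1 - labels
    return labels

# Synthetic demo (hypothetical data): two planted groups of 100 nodes each.
rng = np.random.default_rng(0)
n = 200
truth = np.repeat([0, 1], n // 2)
same = truth[:, None] == truth[None, :]
prob = np.where(same, 0.30, 0.02)          # dense within groups, sparse across
upper = np.triu(rng.random((n, n)) < prob, k=1)
adj = (upper | upper.T).astype(float)

pred = spectral_bipartition(adj, against_seeds=[0, 1, 2, 3],
                            for_seeds=[100, 101, 102, 103])
accuracy = np.mean(pred == truth)
```

On the real data, the same-debate links may dominate this eigenvector, so it is worth checking the result against the eight labeled speeches and, if needed, looking at further eigenvectors.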