(a) What is a similarity measure?
(b) What is a similarity matrix?
(c) Suppose that you are clustering documents based on co-occurrence of citations. Suggest a similarity measure that you might use.
(d) Explain the ideas behind the inverted file algorithm for calculating a similarity matrix.