Goal
The goal of this task is to estimate the number of downloads that a paper
receives in its first two months in the arXiv.
Timeline
The task and data will be available April 6, 2003. Submissions must be
completed by July 21, 2003.
Input
Contestants will be given:
- all of the datasets available for Task 1: Citation Prediction.
- for each paper published in the following months, the downloads received from
  the main site on each of its first 60 days in the arXiv:
  - February and March of 2000
  - February and April of 2001
  - March and April of 2002
Output
For each paper P submitted during the periods:
- April 2000
- March 2001
- February 2002
contestants should report the estimated total number of downloads of P during
its first 60 days in the arXiv. Note that this is a single number for each
paper P, whereas the download data provided as input gives a day-by-day log
for the 60 days.
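To illustrate the relationship between the two formats, here is a minimal
sketch, assuming the daily log for one paper is available as a list of integer
counts, one per day. The function name and the sample values are hypothetical
and are not part of the task data.

```python
def total_downloads(daily_counts, days=60):
    """Collapse a per-day download log into a single total.

    daily_counts: list of integer download counts, one per day (assumed format).
    days: number of leading days to include (60 in this task).
    """
    if len(daily_counts) < days:
        raise ValueError("log covers fewer days than requested")
    return sum(daily_counts[:days])


# Hypothetical log: 5 downloads a day for 10 days, then 1 a day for 50 days.
example_log = [5] * 10 + [1] * 50
print(total_downloads(example_log))  # 100
```

The reported value for each paper in the output periods has the same form as
this total, but it must be estimated because no log is given for those papers.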
Evaluation
For each of the output periods (April 2000, March 2001, February 2002), the 50
papers with the greatest number of downloads in their first 60 days are
selected. The target result is a vector X with one coordinate for each of these
150 papers (50 from each period). For each such paper P, the value of the P-th
coordinate is the number of downloads of P during its first 60 days.
Based on a contestant's download estimates, a vector Y will be constructed over
the same set of 150 papers (50 from each period); the P-th coordinate of Y
will be the estimated number of downloads of P during its first 60 days.
The score of a prediction vector Y will be equal to the L_1 difference between
the vectors X and Y, that is, the sum over all papers P of |X_P - Y_P|.
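The evaluation described above can be sketched as follows for a single output
period. This is an illustration under stated assumptions, not the official
scoring code: the function name, the dictionary format, and the sample numbers
are hypothetical, a missing estimate is treated as 0, and ties at the top-50
cutoff are broken arbitrarily because the task text does not specify these
cases.

```python
def l1_score(actual, estimated, top_k=50):
    """L_1 difference over the top_k most-downloaded papers of one period.

    actual:    dict mapping paper id -> true 60-day download count.
    estimated: dict mapping paper id -> contestant's estimate.
    A paper with no estimate is scored as 0 (an assumption of this sketch).
    """
    # Select the papers with the greatest true download counts.
    top_papers = sorted(actual, key=actual.get, reverse=True)[:top_k]
    # Sum the absolute differences between true and estimated counts.
    return sum(abs(actual[p] - estimated.get(p, 0)) for p in top_papers)


# Hypothetical data with top_k=2, so only papers "a" and "b" are scored.
actual = {"a": 100, "b": 50, "c": 10}
estimated = {"a": 90, "b": 65, "c": 0}
print(l1_score(actual, estimated, top_k=2))  # 25
```

Because X and Y each cover the 50 selected papers from all three periods, the
overall L_1 difference is the sum of the three per-period values computed this
way. Note that estimates for papers outside the true top 50 do not affect the
score.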