The goal of this task is to estimate the number of downloads that a paper
receives in its first two months in the arXiv.
The task and data will be available April 6, 2003. Submissions must be
completed by July 21, 2003.
Contestants will be given:
all of the datasets available for Task 1;
for papers published in the following months, the number of downloads that
each paper received from the main site on each of its first 60 days in the
arXiv:
February and March of 2000
February and April of 2001
March and April of 2002
For each paper P submitted during April 2000, March 2001, or February 2002,
contestants should report the estimated total number of downloads of P during
its first 60 days in the arXiv. Note that this is a single number for each
paper P, whereas the download data given to contestants provides a day-by-day
download log for the sixty days.
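The relationship between the per-day logs and the single reported number can
be sketched as follows. This is only an illustration: the in-memory layout
(a mapping from a paper identifier to a list of 60 daily counts), the paper
identifiers, and the function name total_downloads are assumptions, not the
format of the distributed data.

```python
from typing import Dict, List

DAYS = 60  # length of the observation window described in the task


def total_downloads(daily_log: List[int]) -> int:
    """Collapse a per-day download log into a single 60-day total."""
    if len(daily_log) != DAYS:
        raise ValueError(f"expected {DAYS} daily counts, got {len(daily_log)}")
    return sum(daily_log)


# Hypothetical logs: paper ID -> downloads on each of its first 60 days.
training_logs: Dict[str, List[int]] = {
    "paper-A": [5] * 60,
    "paper-B": [10] * 30 + [1] * 30,
}

# One number per paper, which is the quantity contestants must estimate.
totals = {pid: total_downloads(log) for pid, log in training_logs.items()}
```

For the training months the total can be computed directly in this way; for
the output months it has to be predicted.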
For each of the output periods (April 2000, March 2001, Feb 2002), the target
result is a vector X with one coordinate for each of the 50 papers with the
greatest number of downloads in their first 60 days. For each of these papers
P, the value of the P-th coordinate of X is the actual number of downloads of
P during its first 60 days.
Based on a contestant's download estimates, a vector Y will be constructed
over the same set of 150 papers (50 from each period); the P-th coordinate of Y
will consist of the estimated number of downloads of P during its first 60
days.
The score of a prediction vector Y will be equal to the L_1 difference between
the vectors X and Y, that is, the sum over the 150 papers of the absolute
difference between the actual and the estimated number of downloads.
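As an illustration of the scoring rule, here is a minimal sketch of the L_1
difference between X and Y. The dictionary representation, the paper
identifiers, the numbers, and the name l1_score are hypothetical; in
particular, raising an error for a paper with no estimate is an assumption of
this sketch, not a stated contest rule.

```python
from typing import Dict


def l1_score(actual: Dict[str, float], estimated: Dict[str, float]) -> float:
    """Sum of absolute differences over the papers listed in `actual`."""
    missing = [p for p in actual if p not in estimated]
    if missing:
        # Assumption: every scored paper needs an estimate.
        raise KeyError(f"no estimate for papers: {missing}")
    # Estimates for papers outside `actual` are ignored.
    return sum(abs(actual[p] - estimated[p]) for p in actual)


# Toy data with three scored papers instead of 150.
x = {"P1": 1200, "P2": 950, "P3": 700}           # actual 60-day downloads
y = {"P1": 1000, "P2": 1000, "P3": 700, "P4": 50}  # contestant estimates

score = l1_score(x, y)  # |1200-1000| + |950-1000| + |700-700| = 250
```

Because the score is a sum of absolute errors, a value of 0 means that every
estimate matched the actual count exactly.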