Task I: Citation Prediction

Winner: J N Manjunatha, Raghavendra Pandey, Sivaramakrishnan R., and
M Narasimha Murty (1329)

Second place: Claudia Perlich, Foster Provost, and Sofus Macskassy
(1360)

Third place: David Vogel (1398)
The number in parentheses after each winner is the L_1 difference between the
solution and the submission.
The solution for Task 1 is now available. The
first column is the hep-th arXiv id and the second column is (# of citations
from May-July) - (# of citations from Feb-April) for all papers that received
at least 6 citations between Feb and April.
In addition, the full list of new citations
for all papers between May and July is also available.
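As a rough illustration of the L_1 difference used for scoring, the sketch
below sums the absolute differences between a solution and a submission. The
function name, the paper ids, and the choice to treat a missing paper as 0 are
assumptions made for the example, not details of the official scoring script.

```python
def l1_difference(solution, submission):
    """Sum of |solution - submission| over every paper id in either mapping.

    Assumption: a paper missing from one side counts as 0 on that side.
    """
    ids = set(solution) | set(submission)
    return sum(abs(solution.get(i, 0) - submission.get(i, 0)) for i in ids)


# Hypothetical paper ids mapped to the change in citation count.
solution = {"0101001": 3, "0101002": -2, "0101003": 0}
submission = {"0101001": 1, "0101002": 0, "0101003": 4}

print(l1_difference(solution, submission))  # |3-1| + |-2-0| + |0-4| = 8
```

A lower score means the submission is closer to the solution.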
Task II: Data Cleaning

Winner: David Vogel (421,582)

Second place: Sunita Sarawagi, Kapil M. Bhudhia, Sumana Srinivasan,
and V.G.Vinod Vydiswaran (516,242)

Third place: Martine Cadot and Joseph di Martino (538,013)
The number in parentheses after each winner is the size of the symmetric
difference between the submission and the solution.
The solution for Task 2 is a citation graph provided by SLAC/SPIRES for hep-ph
papers available as a zip file. Papers in the
left column cite papers in the right column.
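The size of the symmetric difference can be sketched as follows, assuming each
citation is represented as a (citing paper, cited paper) pair. The function
name and the paper ids are hypothetical and only serve the example.

```python
def symmetric_difference_size(submission_edges, solution_edges):
    """Count citations that appear in exactly one of the two graphs."""
    return len(set(submission_edges) ^ set(solution_edges))


# Hypothetical (citing, cited) pairs.
solution_edges = {("p1", "p2"), ("p1", "p3"), ("p4", "p2")}
submission_edges = {("p1", "p2"), ("p4", "p3")}

# Two citations are missing from the submission and one is spurious.
print(symmetric_difference_size(submission_edges, solution_edges))  # 3
```

Both missing and spurious citations add to the count, so a smaller number
means a better reconstruction of the graph.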
Task III: Download Estimation

Winner: Janez Brank and Jure Leskovec (21,232)

Second place: Joseph Milana, Joseph Sirosh, Joel Carleton, Gabriela
Surpi, Daragh Hartnett, and Michinari Momma (21,950.6)

Third place: Kohsuke Konishi (23,759)
The number in parentheses after each winner is the L_1 difference between the
contestant's submission and the solution.
The actual download counts for the top 150 papers (50 from each of the three
missing periods) are available. The left
column is the number of downloads the paper received in its first 60 days and
the right column is the hep-th arXiv id.
Task IV: Open Task

Winner: Amy McGovern, Lisa Friedland, Michael Hay, Brian Gallagher,
Andrew Fast, Jennifer Neville, and David Jensen. "Exploiting Relational
Structure to Understand Publication Patterns in High-Energy Physics"

Second place: Shou-de Lin and Hans Chalupsky. "Using Unsupervised
Link Discovery Methods to Find Interesting Facts and Connections in a
Bibliography Dataset"

Third place: Shawndra Hill and Foster Provost. "The Myth of the
Double-Blind Review"
The submissions for Task 4 were evaluated by a small program committee
consisting of the three KDD Cup 2003 co-chairs, Mark Craven (University of
Wisconsin-Madison), David Page (University of Wisconsin-Madison), and
Soumen Chakrabarti (Indian Institute of Technology Bombay).
