Archive policy for the Cornell Computer Science Department

On December 1, 2004, the Cornell Computer Science Faculty by consensus adopted the following policy, which was proposed by Bill Arms, Joe Halpern and Steve Vavasis.
All papers emanating from the department will be saved in a publicly accessible archive like arXiv.
This document will attempt to explain the rationale and implementation of the policy.

Rationale for the policy

A crisis has been evolving in the past few years in the realm of scholarly publishing because commercial journals have raised their prices substantially without a proportional benefit to the community of authors or readers. For example, the EMPS (Engineering, Math and Physical Sciences) library at Cornell has seen a 9% subscription increase in just the past year. The worst offender seems to be Elsevier, which publishes many CS journals.

A second looming concern with scholarly publishing is that commercial publishers are using pricing policies to push libraries into switching to all-electronic subscriptions. An all-electronic subscription gives the commercial publisher unprecedented control over who can read articles and for what purposes those articles are used. Furthermore, an electronic subscription means that the publisher expands its role to become the archivist of the material as well. There is no reason to believe that a company like Elsevier is qualified to usurp the role traditionally filled by libraries as the archivist of scholarly work over a period of decades or centuries. For more information about the problems faced by university libraries, please visit the home page of the SPARC project of the Association of Research Libraries.

An obvious solution to these problems is for the academic community as a whole to create its own archive under the control of scholars rather than a corporate board of directors. This is the goal behind arXiv. We believe that all academics ought to include their publications in this kind of archive. Therefore, we are establishing this as a departmental policy. We would like to establish it as a policy for the whole world, but we have to start somewhere!

Naturally, a member of the department could easily follow this policy on his or her own initiative without the existence of a departmentwide policy. Indeed, several of us already archive our papers as a matter of course because archiving brings several benefits to the author including enhanced visibility of the result and proof of precedence of discovery. But we believe there are three reasons why it is useful to make archiving an official policy of the department.

  1. By making it a policy, we are making a public statement in favor of open archiving.
  2. There is clearly a snowball effect at work: the more computer scientists who archive, the more useful the archive becomes, and hence more people will archive, etc.
  3. If archiving becomes a policy, then the University Library, which has considerable expertise in the copyright issues involved, can help us to make sure that we protect our right to archive and distribute our materials when we sign journal copyright transfer agreements.

FAQ concerning this policy

What is archiving?

Archiving means storing and managing a document in a way that will ensure its availability over a long period of time, e.g., decades or centuries. University libraries are usually considered archival repositories. In contrast, personal home pages are generally not considered archival repositories.

What makes archiving special?

In the past, archiving has meant that the document should be printed with long-lasting inks on long-lasting papers and stored in a stable environment (away from direct sunlight, high humidity, etc.). Archiving also implies good cataloging mechanisms to make sure documents can be found when they are sought. In the internet era, archiving means that documents should be stored in a stable and well-backed-up medium. It also means that a group of archivists must, over the years, take responsibility for updating digital documents as encoding standards (such as PDF) and retrieval protocols (such as HTTP) evolve.

What is arXiv?

arXiv is a web-based archival repository for scientific documents. Currently, it has three subject areas: physics, mathematics and computer science. It is supported by the Cornell University Library and contains hundreds of thousands of papers from around the world. It is run by a self-perpetuating committee of academics. It was founded by Paul Ginsparg while he was at Los Alamos National Laboratory. Paul is currently a Professor of Physics and of Computing and Information Science at Cornell.

Why is arXiv considered archival?

Why should I want to submit my paper to arXiv?

Please see the rationale section of this document for some reasons why archiving your paper is beneficial to your career.

But what if I want to keep my LaTeX source confidential?

There are several answers to this question. When you submit your LaTeX paper, you can check a box indicating that the LaTeX source should not be distributed. In addition, if your paper has comments in it that you would prefer to keep confidential, you can run a Perl script (available on the arXiv website) that strips comments prior to submission.
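To illustrate what comment stripping involves, here is a minimal Python sketch. It is not the Perl script mentioned above, and the function name strip_latex_comments is hypothetical. It assumes that a comment starts at the first unescaped % on a line, and it does not handle verbatim-style environments, where % is ordinary text, so check the output before submitting.

```python
def strip_latex_comments(source: str) -> str:
    """Remove LaTeX comments from `source` (illustrative sketch only).

    Assumption: a comment begins at the first '%' that is not escaped
    by a preceding backslash. Verbatim environments are NOT handled.
    """
    cleaned = []
    for line in source.splitlines():
        i = 0
        cut = None
        while i < len(line):
            if line[i] == "\\":
                # Skip the backslash and the character it escapes,
                # so that "\%" is not mistaken for a comment.
                i += 2
                continue
            if line[i] == "%":
                cut = i
                break
            i += 1
        if cut is None:
            # No comment on this line: keep it unchanged.
            cleaned.append(line)
        elif line[:cut].strip():
            # Keep the text before the comment, plus a bare '%' so that
            # LaTeX still ignores the line break as it did before.
            cleaned.append(line[:cut] + "%")
        # Lines that contain only a comment are dropped entirely.
    result = "\n".join(cleaned)
    if source.endswith("\n"):
        result += "\n"
    return result
```

For example, the line "Cost rose 9\% this year % check this figure" would become "Cost rose 9\% this year %", and a line consisting only of a comment would be removed.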

But many journals, e.g., those of ACM and SIAM, think that PDF is good enough. Really, why can't arXiv just use PDF?

The maintainers of arXiv have found incompatibilities between versions of PDF that may render your document unreadable by scholars over a very long period of time. There is currently a proposal for an archival version of PDF called PDF/A. If this proposal becomes reality, then arXiv will probably allow PDF/A submission.

What if my document is in Word?

In this case, arXiv allows you to submit the PDF version of your document.

Doesn't archiving violate a journal's copyright policy?

First, note that you can alter copyright transfer agreements to preserve more rights for yourself. Naturally, a journal might reject the paper if it disagrees with your alterations to the agreement, but we have heard that many people have successfully altered these agreements without adverse consequences. Later, we will post some possible alterations that people have successfully used on copyright transfer agreements.

Assuming you don't alter the agreement, you are subject to the terms of it. Here are the policies of some of the larger CS publishers.

What about embargo policies?

Some journals have embargo policies stating that a result may not be disclosed in any form prior to journal submission. Some very well known journals like Science have a firm embargo policy. An example of a CS publication with an embargo policy is SIGGRAPH proceedings. SIGGRAPH uses a blind reviewing system (i.e., paper reviewers are not told the names of the authors), and any web-distribution of a paper would undermine the possibility that it could be blindly reviewed. We are currently investigating specifically whether there is a workaround for SIGGRAPH. In the case of Cornell University employees, the matter appears to be moot since, as mentioned above, ACM gives authors permission to post papers after acceptance on their employers' websites.

Naturally, the journal's policy overrides this policy, i.e., we are not suggesting that anyone should violate a journal's policy in order to follow this one. On the other hand, if you are a believer in archiving and regularly submit papers to journals and conferences with embargo policies, then you can use this document as an argument to convince the journal to loosen its embargo policies.

Does arXiv undermine the traditional refereeing process? In other words, what will happen if everyone starts submitting papers to arXiv and ceases submitting them to refereed journals?

This is an interesting question that will need to be periodically revisited. The experience so far in physics (which is the largest subject area of arXiv) is that the refereeing process has not been abandoned, i.e., the papers usually end up in refereed journals as well.

What if I simply don't want to archive my papers?

Compliance with this new policy of archiving papers is entirely voluntary. On the other hand, if you choose to ignore the policy, give some consideration to the reasons for this choice. If there are technical reasons not already covered by this FAQ, please bring them to the attention of an arXiv board member, for example, Joe Halpern.


I received helpful comments on this writeup from Bill Arms, Joe Halpern and Ross Atkinson.
Stephen A. Vavasis, Department of Computer Science, 4130 Upson Hall, Cornell University, Ithaca, NY 14853, Last update: April 12, 2005.