Donna Bergmark January, 2006
Cornell Computer Science
Ithaca, NY 14850 USA Email: bergmark@cs.cornell.edu

RESUME/VITA

WORK HISTORY AFTER RETIREMENT

I am interested in carrying our Web crawling work forward, replacing the impressive Mercator crawler with the new open source Heritrix crawler from the Internet Archive. Heritrix is, in many ways, the descendant of Mercator, and thus represents an attractive transition from our previous work.

A very good use for a highly extensible crawler is to build a focused crawler, one that can quickly and efficiently build up collections of URLs of Web pages relevant to a given topic.

Five Cornell M. Eng. students in the Fall of 2002 put together a very respectable collection building system using a pluggable Web crawler, based on Mercator. My retirement project is to migrate that project from Mercator to Heritrix. If the project succeeds, then we will have a nascent collection synthesis system based on an open source Web crawler.

So far it is going well, and I am having lots of fun.

WORK HISTORY BEFORE RETIREMENT

1999-Jan 2004: Researcher and Programmer/Analyst Specialist for the Cornell Digital Library Research Group.  My projects included reference linking, Web crawling, and the National Science Digital Library (NSDL) project. The work on automatic collection building by web crawling won the best paper award at JCDL 2002.

1998-1999: Programmer/Analyst Specialist for the Cornell Network Research Group, specializing in Computer Telephony and integrating the PSTN with the Internet.  Led a research project of 9 students in the Spring and Summer of 1999, which produced a component-based telephony API.  Solaris, Microsoft NT; Java, JTAPI, C/C++,  Lucent PBX and Dialogic gateway programming.   Streaming media and sound technologies.

1992-1998: Sr. Parallel Computing Specialist for the Cornell Theory Center (CTC), managing the Parallel Programming Tools team. Located, evaluated, and developed parallel programming tools for all parallel platforms at the center. Provided on-line documentation for tools and parallel languages; tested and diagnosed problems with tools, compilers, and operating systems. Provided parallel programming support. IBM RS/6000 (SP), Unix, AIX, Sun OS, Irix; HPF, F90, C++, C, and HTML.

1988-1992: Manager of the Technical Integration Group within the CTC, consisting of 8 programmers. Under general direction of CTC management, provided technical information to Theory Center staff and users. Evaluated new technologies, developed hardware and software alternatives, and assisted in transferring useful products into the Theory Center. CMS, Unix 370, large IBM multiprocessors; Parallel Fortran, Fortran, C, LaTeX, Expect.

1985-1988: Sr. Technical Advisor in the Cornell Theory Center. Benchmarked the FPS T-Series; retargeted a Parallel Pascal compiler to it. (The Parallel Pascal work won the best paper award at the FPS Users Meeting in 1987.) Transputers, Unix, VAX VMS; Occam, Pascal, Assembler, Fortran, LaTeX.

1981-1985: Assistant Director in the Academic Computing section of Cornell Computer Services (now Cornell Information Technologies). Administered 13 programmers, the graphics facility, custom programming, applications software, and scientific support. CMS, VM370; Assembler, Cobol, Pascal, PL/I.

1982-1983: (on leave from Cornell) Senior Programmer at Intermetrics, Inc., Cambridge, MA. Helped implement the semantic analysis phase of a large optimizing compiler for a high-level language. IBM uniprocessors, Sun Unix; PL/I.

1978-1981: Senior Computer Staff Specialist in the Scientific Support Group of Cornell Computer Services. Co-authored APTRAN, a Fortran compiler for the FPS AP-190L. This compiler was in use at Cornell from April 1978 until summer 1986. As of September 1980, managed  the Software Support Group, with responsibility for all applications software on Cornell's mainframe. IBM 360, FPS Array Processor, CMS, VM370; Fortran, APAL.

1971 to 1978: Lecturer, Instructor, and Assistant Professor in Mathematics at Ithaca College. Taught Fortran, PL/I, COBOL, Assembler, machine architecture, operating systems, data processing, and general computer science. Helped develop the computer science curriculum at Ithaca College. Directed several independent research projects.

1970 to 1971: Coordinator of Academic Computer Services, Ithaca College, Computer Center. Responsible for applications programming, systems programming, user documentation, and user services. Wrote a batch monitor for running student jobs, the student job accounting system, and a load-and-go Fortran compiler. RCA Spectra; Assembler 360.

1969 to 1970: Systems Analyst for Tectonics, Inc., Ithaca NY. Wrote and maintained data processing programs for several construction firms; taught programming courses as required. IBM 1130; Fortran.

1968 to 1969: Systems Programmer, SUNY Computer Center, Buffalo NY. Interfaced CDC, IBM, and UNIVAC remote batch terminals with SUNY's CDC 6400 computer, which involved modification and testing of software provided by various vendors, as well as dealing with communications hardware. The project, a success, was presented at the 1968 IFIP Congress as one of the first experiments in university networking using multi-vendor hardware. CDC 6400; CDC Assembler.

1966 to 1967: Programmer Analyst for American Cyanamid, Bound Brook NJ. Worked on several scientific Fortran programs (simulations, gas chromatograph dye matching, real-time process control). IBM 1800; Fortran, Assembler.

1964 to 1966: Programmer for the Harvard Business School, Cambridge, Mass., and grader for Written Analysis of Cases. Maintained the Harvard Business School Game. IBM 7094; Fortran.

EDUCATION

Carleton College and Boston University, BA in History
Cornell University, MS in Computer Science
Recent extramural courses in: computer science, materials science, chemistry, astronomy, classics, linguistics and physics.
Completed Alexander Hamilton Seminar on Installing & Managing NT Server 4.0.

RESEARCH INTERESTS

Compilers and languages, particularly for parallel processing systems. General interest in high performance computing, especially for science and engineering applications. Recent interest in network programming (Java, LDAP, IP Telephony and Web crawling).

SELECTED PUBLICATIONS AND PRESENTATIONS

D. Bergmark, Steve Hitchcock et al.  Open Citation Linking.  D-Lib Magazine (8,10).  October 2002.

D. Bergmark, C. Lagoze, and A. Sbityakov.  Focused Crawls, Tunneling, and Digital Libraries.  ACM European Conference on Digital Libraries, Rome, Italy, September 16-18, 2002. (Preprint)

D. Bergmark.  Using High Performance Systems to Build Collections for a Digital Library.  Proceedings of the 2002 International Conference on Parallel Processing Workshops (ICPP 2002 Workshops), Vancouver, Canada, August 18--21, 2002. (Postscript Preprint)

D. Bergmark.  Collection Synthesis.  ACM Proceedings of the Joint Conference on Digital Libraries 2002, Portland, Ore. (best paper award), July, 2002. (official)

D. Bergmark, P. Phempoonpanich, and S. Zhao.  Scraping the ACM Digital Library.  SIGIR Forum (35,2).  Fall 2001.

D. Bergmark and C. Lagoze. An Architecture for Automatic Reference Linking. Proceedings of the European Conference on Digital Libraries, Darmstadt, DE, September 2001. (pdf)

D. Bergmark.  Automatic Extraction of Reference Linking Information from Online Documents.  Technical Report TR 2000-1821, Computer Science Department, Cornell University, November, 2000. (postscript)(pdf)

D. Bergmark and C. Lagoze.  Reference Linking the Web's Scholarly Papers.   Technical Report TR 2001-1835, Computer Science Department, Cornell University, February, 2001.  (postscript)(pdf)

D. Bergmark, W. Arms, and C. Lagoze.  An Architecture for Reference Linking.  Technical Report TR 2000-1820, Computer Science Department, Cornell University, October, 2000. (postscript)(pdf)

D. Bergmark.  Link Accessibility in Electronic Journal Articles.  Technical Report TR 2000-1793, Computer Science Department, Cornell University, March, 2000. (postscript)(pdf)(html)

D. Bergmark and S. Keshav.  Building Blocks for IP Telephony. IEEE Communications Magazine, pages 88-94, April 2000. (postscript version)

D. Bergmark.  ITX Programmer's Guide.  Cornell Computer Science Technical Report TR99-1768. http://www.cs.cornell.edu/cnrg/telephony/JavaDocs/HTML_Guide/HTML_Guide.html.

D. Bergmark. Tools for HPF Programmers. A tutorial presented at Supercomputing '97, November 1997.

B. Appelbe and D. Bergmark. Software tools for high-performance computing: Survey and recommendations. Scientific Programming, pages 239-249, Fall 1996.

D. Bergmark. Optimization and parallelization of a commodity trade model for the IBM SP2 using parallel programming tools. In Proceedings of 1995 International Conference on Supercomputing, Barcelona Spain, pages 227-236, July 1995.

D. Bergmark. The optimization and parallelization of an economic model using KSR programming tools. Invited talk, KSR Users' Group Meeting, Manchester, U.K., July 1994.

C.M. Pancake and D. Bergmark. Do parallel languages respond to the needs of scientific programmers? Computer, 23(12):13-23, December 1990.

A longer list of my work can be found at http://www.cs.cornell.edu/bergmark/resume_long.html.