Cornell University

Fall 2007

The past decade has seen a convergence of social and technological networks, with systems such as the World Wide Web characterized by the interplay between rich information content, the millions of individuals and organizations who create it, and the technology that supports it. This course covers recent research on the structure and analysis of such networks, and on models that abstract their basic properties. Topics include combinatorial and probabilistic techniques for link analysis, centralized and decentralized search algorithms, network models based on random graphs, and connections with work in the social sciences.

The course prerequisites include introductory-level background in algorithms, graphs, probability, and linear algebra.

The work for the course will consist primarily of two problem sets, a short reaction paper, and a more substantial project.

- Problem Set 1. (due 9/14)
- Problem Set 2. (due 9/28)

(1) **Complex Networks and the Web**

- Background on the Web
- V. Bush. As We May Think. Atlantic Monthly, July 1945.
- World Wide Web Consortium. A Little History of the World Wide Web, 1945-1995.
- A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener. Graph structure in the web. 9th International World Wide Web Conference, May 2000.
- V. Bush. As We May Think. Atlantic Monthly, July 1945.

- Background on the Web

(2) **Small-World Properties in Networks**

- Survey Paper
- J. Kleinberg. Complex Networks and Decentralized Search Algorithms. Proceedings of the International Congress of Mathematicians (ICM), 2006.

- Empirical studies of the Small-World Phenomenon
- S. Milgram. The small world problem. Psychology Today 1(1967).
- J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry 32(1969).
- P. Killworth and H. Bernard, Reverse small world experiment. Social Networks 1(1978).
- J. Kleinfeld. Could it be a Big World After All? The `Six Degrees of Separation' Myth. Society, April 2002.
- Peter Sheridan Dodds, Roby Muhamad, Duncan J. Watts. An Experimental Study of Search in Global Social Networks. Science 301(2003), 827.
- Duncan Watts and I discuss the small-world problem on NPR's Science Friday (8 August 2003).
- P. Killworth, C. McCarty, H.R. Bernard, M. House. The accuracy of small-world chains in social networks. Social Networks 28(2006), 85-96.
- S. Milgram. The small world problem. Psychology Today 1(1967).

- Models for Small-World Networks and Decentralized Search
- Watts, D. J. and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393:440-42(1998).
- J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000.
- J. Kleinberg. Small-World Phenomena and the Dynamics of Information. Advances in Neural Information Processing Systems (NIPS) 14, 2001.
- D. J. Watts, P. S. Dodds, M. E. J. Newman Identity and Search in Social Networks. Science, 296, 1302-1305, 2002.
- Lada A. Adamic, Rajan M. Lukose, Amit R. Puniyani, Bernardo A. Huberman. Search in Power-Law Networks. Phys. Rev. E, 64 46135 (2001).
- A. Clauset and C. Moore How Do Networks Become Navigable? preprint at arxiv.org, 2003.
- Oskar Sandberg and Ian Clarke. The Evolution of Navigable Small-World Networks. arxiv cs.DS/0607025, July 2006.
- Watts, D. J. and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393:440-42(1998).

- Studies of Small-World Effects in On-Line Datasets
- F. Menczer. Growing and Navigating the Small World Web by Local Content. Proc. Natl. Acad. Sci. USA 99(22): 14014-14019, 2002
- L. Adamic, E. Adar. How To Search a Social Network. Social Networks, 27(3):187-203, July 2005.
- Lada A. Adamic and Bernardo A. Huberman Information dynamics in a networked world. in: Eli Ben-Naim, Hans Frauenfelder, Zoltan Toroczkai, (Eds.), 'Complex Networks', Lecture Notes in Physics, Springer, 2003.
- D. Liben-Nowell, J. Novak, R. Kumar, P. Raghavan, A. Tomkins. Geographic routing in social networks. Proc. Natl. Acad. Sci. USA, 102(Aug 2005).
- Jure Leskovec, Eric Horvitz. Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network. MSR-TR-2006-186, 2007.
- F. Menczer. Growing and Navigating the Small World Web by Local Content. Proc. Natl. Acad. Sci. USA 99(22): 14014-14019, 2002

- Survey Paper

(3) ** Decentralized Search in Peer-to-Peer Networks **

- Survey Papers
- H. Balakrishnan, M.F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking up data in P2P systems. Communications of the ACM 46:43-48, February 2003.
- E-K Lua, J. Crowcroft, M. Pias, R. Sharma and S. Lim. A Survey and Comparison of Peer-to-Peer Overlay Network Schemes. {\em IEEE Communications Surveys and Tutorials}, 7(2005).
- H. Balakrishnan, M.F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking up data in P2P systems. Communications of the ACM 46:43-48, February 2003.

- Unstructured Approaches
- I. Clarke, O. Sandberg, B. Wiley, T. Hong. Freenet: A Distributed Anonymous Information Storage and Retrieval System. International Workshop on Design Issues in Anonymity and Unobservability, 2000. T. Hong. Performance. Peer-to-Peer: Harnessing the Power of Disruptive Technologies. (A. Oram, editor), O'Reilly and Associates, 2001.
- A. Goel, H. Zhang, and R. Govindan. Using the Small-World Model to Improve Freenet Performance. IEEE Infocom, 2002.
- I. Clarke, O. Sandberg, B. Wiley, T. Hong. Freenet: A Distributed Anonymous Information Storage and Retrieval System. International Workshop on Design Issues in Anonymity and Unobservability, 2000. T. Hong. Performance. Peer-to-Peer: Harnessing the Power of Disruptive Technologies. (A. Oram, editor), O'Reilly and Associates, 2001.

- Structured Approaches
- C. Greg Plaxton, Rajmohan Rajaraman, Andrea W. Richa. Accessing Nearby Copies of Replicated Objects in a Distributed Environment. ACM Symposium on Parallel Algorithms and Architectures, SPAA 1997.
- S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker. A Scalable Content-Addressable Network. ACM SIGCOMM, 2001
- A. Rowstron, P. Druschel. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. 18th IFIP/ACM International Conference on Distributed Systems Platforms (Middleware 2001).
- I. Stoica, R. Morris, D. Karger, F. Kaashoek, H. Balakrishnan. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. ACM SIGCOMM, 2001.
- B. Y. Zhao, J. D. Kubiatowicz, A. D. Joseph, Tapestry: An Infrastructure for Fault-Tolerant Wide-Area Location and Routing. UC Berkeley Computer Science Division, Report No. UCB/CSD 01/1141, April 2001.
- Sylvia Ratnasamy, Scott Shenker and Ion Stoica. Routing Algorithms for DHTs: Some Open Questions. 1st International Workshop on Peer-to-Peer Systems (IPTPS), 2002.
- Dalia Malkhi, Moni Naor, David Ratajczak. Viceroy: A Scalable and Dynamic Emulation of the Butterfly. ACM Symposium on Principles of Distributed Computing, 2002.
- G. S. Manku, M. Bawa, P. Raghavan. Symphony: Distributed hashing in a small world. Proc. 4th USENIX Symposium on Internet Technologies and Systems, 2003.
- G. Manku, M. Naor, and U. Wieder. Know Thy Neighbor's Neighbor: The Power of Lookahead in Randomized P2P Networks. In Proc. of ACM Symp. on Theory of Computing (STOC), 2004.
- C. Greg Plaxton, Rajmohan Rajaraman, Andrea W. Richa. Accessing Nearby Copies of Replicated Objects in a Distributed Environment. ACM Symposium on Parallel Algorithms and Architectures, SPAA 1997.

- Survey Papers

(4) **Cascading Behavior in Networks**

- Survey Paper
- J. Kleinberg. Cascading Behavior in Networks: Algorithmic and Economic Issues. In
__Algorithmic Game Theory__(N. Nisan, T. Roughgarden, E. Tardos, V. Vazirani, eds.), Cambridge University Press, 2007.- J. Kleinberg. Cascading Behavior in Networks: Algorithmic and Economic Issues. In

- Diffusion of Innovation in Social Networks
- M. Granovetter. Threshold models of collective behavior. American Journal of Sociology 83(6):1420-1443, 1978.
- T. Schelling. Micromotives and Macrobehavior. Norton, 1978.
- An applet by Sean Luke that simulates a version of the Schelling segregation model. (There are many other simulations on the Web as well.)
- D. Strang and S. Soule. Diffusion in organizations and social movements: From hybrid corn to poison pills. Annual Review of Sociology, 24:265--290, 1998.
- S. Morris. Contagion. Review of Economic Studies 67 (2000), 57-78.
- H. Peyton Young. The Diffusion of Innovations in Social Networks. Santa Fe Institute Working Paper 02-04-018.
- E. Berger. Dynamic Monopolies of Constant Size. Journal of Combinatorial Theory Series B 83(2001), 191-200.
- P. Dodds and D. J. Watts. Universal Behavior in a Generalized Model of Contagion. Phyical Review Letters, 2004.
- E Lieberman, C Hauert, MA Nowak (2005). Evolutionary Dynamics on Graphs. Nature 433: 312-316
- D. Centola, M. Macy, V. Eguiluz. Cascade Dynamics of Multiplex Propagation. Physica A, to appear.
- L. Backstrom, D. Huttenlocher, J. Kleinberg, X. Lan. Group Formation in Large Social Networks: Membership, Growth, and Evolution. Proc. 12th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2006.
- Petter Holme and M. E. J. Newman. Nonequilibrium phase transition in the coevolution of networks and opinions. arXiv:physics/0603023v3
- Jure Leskovec, Lada Adamic, Bernardo Huberman. The Dynamics of Viral Marketing. ACM Conference on Electronic Commerce (EC 2006), Ann Arbor, MI, USA, 2006.
- N. Immorlica, J. Kleinberg, M. Mahdian, T. Wexler. The Role of Compatibility in the Diffusion of Technologies Through Social Networks. Proc. 8th ACM Conference on Electronic Commerce, 2007.
- M. Jackson, L. Yariv. Diffusion of Behavior and Equilibrium Properties in Network Games. American Economic Review (Papers and Proceedings), Vol 97, No. 2, pp 92-98, 2007.
- M. Granovetter. Threshold models of collective behavior. American Journal of Sociology 83(6):1420-1443, 1978.

- Finding Influential Nodes based on Cascade Models
- Pedro Domingos, Matt Richardson. Mining Knowledge-Sharing Sites for Viral Marketing. Eighth International Conference on Knowledge Discovery and Data Mining, KDD-2002.
- D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence through a Social Network. Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
- E. Mossel and S. Roch. On the Submodularity of Influence in Social Networks. ACM Symposium on Theory of Computing, 2007.
- Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance. Cost-effective Outbreak Detection in Networks. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM KDD), 2007.
- Pedro Domingos, Matt Richardson. Mining Knowledge-Sharing Sites for Viral Marketing. Eighth International Conference on Knowledge Discovery and Data Mining, KDD-2002.

- Cascading Effects Among Blogs and News Sources
- D. Gruhl, R. Guha, D. Liben-Nowell, A. Tomkins. Information Diffusion through Blogspace. Proc. International WWW Conference, 2004.
- E. Adar, L. Zhang, L. A. Adamic, R. M. Lukose. Implicit Structure and the Dynamics of Blogspace. Workshop on the Weblogging Ecosystem, at the International WWW Conference, 2004.
- Jure Leskovec, Mary McGlohon, Christos Faloutsos, Natalie Glance, Matthew Hurst. Cascading Behavior in Large Blog Graphs. SIAM International Conference on Data Mining (SDM) 2007.
- D. Gruhl, R. Guha, D. Liben-Nowell, A. Tomkins. Information Diffusion through Blogspace. Proc. International WWW Conference, 2004.

- Epidemic Algorithms in Networks
- Alan J. Demers, Daniel H. Greene, Carl Hauser, Wes Irish, John Larson, Scott Shenker, Howard E. Sturgis, Daniel C. Swinehart, Douglas B. Terry. Epidemic Algorithms for Replicated Database Maintenance. Operating Systems Review 22(1): 8-32 (1988)
- R. van Renesse. Scalable and secure resource location. Hawaii International Conference on System Sciences, 2000.
- R. van Renesse, K. Birman, W. Vogels. Astrolabe: A Robust and Scalable Technology For Distributed System Monitoring, Management, and Data Mining. to appear in ACM Transactions on Computer Systems, 2003.
- Chalee Asavathiratham. The Influence Model: A Tractable Representation for the Dynamics of Networked Markov Chains. Ph.D. Thesis, MIT 2000.
- Mor Harchol-Balter, Tom Leighton, Daniel Lewin. Resource Discovery in Distributed Networks. ACM Symposium on Principles of Distributed Computing (PODC), 1999, pp. 229-238.
- Shay Kutten, David Peleg Deterministic Distributed Resource Discovery. ACM Symposium on Principles of Distributed Computing (PODC), 2000.
- R. Karp, C. Schindelhauer, S. Shenker, B. Vocking. Randomized Rumor Spreading. 41st IEEE Symposium on Foundations of Computer Science, 2000.
- D. Kempe, J. Kleinberg, A. Demers. Spatial gossip and resource location protocols. Proc. 33rd ACM Symposium on Theory of Computing, 2001.
- Alan J. Demers, Daniel H. Greene, Carl Hauser, Wes Irish, John Larson, Scott Shenker, Howard E. Sturgis, Daniel C. Swinehart, Douglas B. Terry. Epidemic Algorithms for Replicated Database Maintenance. Operating Systems Review 22(1): 8-32 (1988)

- Survey Paper

(5) **Power-Law Distributions**

- Survey Paper
- M. Mitzenmacher A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics, vol 1, No. 2, pp. 226-251, 2004.

- Models that Generate Power-Law Degrees in Networks
- A.-L. Barabasi, Reka Albert, and Hawoong Jeong. Mean-field theory for scale-free random networks. Physica A 272 173-187 (1999).
- Bernardo A. Huberman, Lada A. Adamic. Growth dynamics of the World-Wide Web. Nature, 399 (1999) 130.
- R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins, and E. Upfal. Stochastic models for the Web graph. 41th IEEE Symp. on Foundations of Computer Science, 2000, pp. 57-65.
- W. Aiello, F. Chung, L. Lu. Random evolution of massive graphs. Handbook of Massive Data Sets, (Eds. James Abello et al.), Kluwer, 2002, pages 97-122.
- B. Bollobas, C. Borgs, J. Chayes, and O. Riordan Directed scale-free graphs Proceedings of the 14th ACM-SIAM Symposium on Discrete Algorithms (2003), 132-139.
- R. Kleinberg, J. Kleinberg, Isomorphism and Embedding Problems for Infinite Limits of Scale-Free Graphs. Proc. 15th ACM-SIAM Symposium on Discrete Algorithms, 2005.
- A. Fabrikant, E. Koutsoupias, C. Papadimitriou. Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet. 29th International Colloquium on Automata, Languages, and Programming (ICALP), 2002.
- Noam Berger, Christian Borgs, Jennifer Chayes, Raissa D'Souza, and Robert Kleinberg. Competition-Induced Preferential Attachment. Proceedings of the 31st International Colloquium on Automata, Languages, and Programming (ICALP 2004), pages 208-221.
- M. Molloy and B. Reed. A Critical Point for Random Graphs with a Given Degree Sequence. Random Structures and Algorithms 6(1995) 161-180.
- M. E. J. Newman, S. H. Strogatz and D. J. Watts, Random graphs with arbitrary degree distributions and their applications. Phys. Rev. E 64, 026118 (2001).
- J. Carlson and J. Doyle. Highly Optimized Tolerance: A Mechanism for Power Laws in Designed Systems. Physical Review E 60:2(1999).
- A.-L. Barabasi, Reka Albert, and Hawoong Jeong. Mean-field theory for scale-free random networks. Physica A 272 173-187 (1999).

- Survey Paper

(6) **Economic Models for Behavior in Networks**

- Survey Papers
- M. Jackson. A Survey of Models of Network Formation: Stability and Efficiency. In Group Formation in Economics; Networks, Clubs and Coalitions, (G. Demange and M. Wooders, eds.), Cambridge University Press, 2004.
- Christos H. Papadimitriou, Algorithms, Games, and the Internet. Proc. 33rd ACM Symposium on Theory of Computing, 2001.
- E. Tardos. Network Games. Proc. 36th ACM Symposium on Theory of Computing, 2004.
- T. Roughgarden. Selfish Routing. Ph.D. thesis, Cornell University, 2002.
- M. Jackson. A Survey of Models of Network Formation: Stability and Efficiency. In Group Formation in Economics; Networks, Clubs and Coalitions, (G. Demange and M. Wooders, eds.), Cambridge University Press, 2004.

- Strategic Behavior in Network Design
- M. Jackson and A. Wolinsky. A Strategic Model of Social and Economic Networks. Journal of Economic Theory, Vol. 71, No. 1, 1996, pp 44--74.
- J. Feibenbaum, C. Papadimitriou, and S. Shenker. Sharing the Cost of Multicast Transmissions. Journal of Computer and System Sciences, 63:21--41, 2001.
- Alex Fabrikant, Ankur Luthra, Elitza Maneva, Christos H. Papadimitriou, and Scott Shenker, In Proc. of 2003 PODC, pages 347-351.`
- E. Anshelevich, A. Dasgupta, E. Tardos, T. Wexler. Near-Optimal Network Design with Selfish Agents. Proc. 35th ACM Symposium on Theory of Computing, 2003
- E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, T. Roughgarden. The Price of Stability for Network Design with Fair Cost Allocation. In FOCS 2004.
- R. Johari and J. N. Tsitsiklis. Efficiency Loss in a Network Resource Allocation Game. Mathematics of Operations Research, 2004.
- Jacomo Corbo and David C. Parkes. The Price of Selfish Behavior in Bilateral Network Formation. In the Proc. 24rd ACM Symp. on Principles of Distributed Computing (PODC'05), Las Vegas, Nevada, USA, pages 99-107, 2005.
- S. Goyal, F. Vega-Redondo. Learning, Network Formation, and Coordination, Games and Economic Behavior 50, 2005
- E. Even-Dar, M. Kearns, and S. Suri. A Network Formation Game for Bipartite Exchange Economies. ACM-SIAM Symposium on Discrete Algorithms, 2007.
- M. Jackson and A. Wolinsky. A Strategic Model of Social and Economic Networks. Journal of Economic Theory, Vol. 71, No. 1, 1996, pp 44--74.

- Economic Interaction on Networks
- Cook, K., Emerson, R., Gillmore, M. & Yamagishi, T. 1983. The distribution of power in exchange networks. American Journal of Sociology 89: 275-305.
- R. Kranton, D. Minehart. Commpetition in Buyer-Seller Networks. Review of Economic Design, 5(3), September 2000, pp. 301-331. S. Kakade, M. Kearns, L. Ortiz, R. Pemantle, and S. Suri. Economic Properties of Social Networks. Proceedings of NIPS 2004.
- L. Blume, D. Easley, J. Kleinberg, E. Tardos. Trading Networks with Price-Setting Agents. Proc. 8th ACM Conference on Electronic Commerce, 2007.
- Douglas Gale and Shachar Kariv. Financial Networks. American Economic Review: Papers and Proceedings.
- Cook, K., Emerson, R., Gillmore, M. & Yamagishi, T. 1983. The distribution of power in exchange networks. American Journal of Sociology 89: 275-305.

- Survey Papers

(7) **Link Analysis for Web search**

- Survey Papers
- S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Mining the link structure of the World Wide Web. IEEE Computer, August 1999.
- A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, S. Raghavan. Searching the Web. ACM Transactions on Internet Technology 1(1): 2-43 (2001)
- S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Mining the link structure of the World Wide Web. IEEE Computer, August 1999.

- Hubs and Authorities, PageRank, and variants
- J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997.
- S. Brin and L. Page. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Proc. 7th International World Wide Web Conference, 1998.
- S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, S.R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins, Mining the link structure of the World Wide Web. IEEE Computer, August 1999.
- A. Borodin, J. S. Rosenthal, G. O. Roberts, P. Tsaparas, Finding Authorities and Hubs From Link Structures on the World Wide Web. 10th International World Wide Web Conference, May 2001.
- Dimitris Achlioptas, Amos Fiat, Anna Karlin, Frank McSherry, Web Search via Hub Synthesis. 42nd IEEE Symposium on Foundations of Computer Science, 2001, p.611-618.
- Davood Rafiei, Alberto Mendelzon. What is this Page Known for? Computing Web Page Reputations. Proc. WWW9 Conference, Amsterdam, May 2000
- Pedro Domingos, Matt Richardson. The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank. Advances in Neural Information Processing Systems 14, 2002.
- Taher H. Haveliwala. Topic-Sensitive PageRank. 11th International World Wide Web Conference, 2002.
- Alon Altman and Moshe Tennenholtz. Ranking Systems: The PageRank Axioms. Proceedings of ACM EC 2005.
- J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998. Extended version in Journal of the ACM 46(1999). Also appears as IBM Research Report RJ 10076, May 1997.

- Connections with Social Networks and Citation Analysis.
- G. Pinski, F. Narin. Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing and Management, 12(1976), pp. 297--312.
- L. Katz. A new status index derived from sociometric analysis. Psychometrika 18(1953).
- C.H. Hubbell. An input-output approach to clique identification. Sociometry 28(1965).
- P. Bonacich. Power and Centrality: A family of measures. American Journal of Sociology 92(1987).
- G. Pinski, F. Narin. Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics. Information Processing and Management, 12(1976), pp. 297--312.

- Survey Papers

(8) **Spectral Analysis**

- Spectral Analysis of Data
- Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W. and Harshman, R. A. Indexing by latent semantic analysis. Journal of the Society for Information Science, 41(6), 391-407 (1990).
- Christos Papadimitriou, Prabhakar Raghavan Hisao Tamaki, Santosh Vempala. Latent Semantic Indexing: A Probabilistic Analysis. 17th ACM Symposium on the Principles of Database Systems, 1998.
- Yossi Azar, Amos Fiat, Anna Karlin, Frank McSherry and Jared Saia. Spectral Analysis of Data. 33rd ACM Symposium on Theory of Computing, 2001.
- D. Donoho. High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality. Notes to accompany lecture at AMS Conference on Mathematical Challenges of the 21st Century, August 2000.
- Deerwester, S., Dumais, S. T., Landauer, T. K., Furnas, G. W. and Harshman, R. A. Indexing by latent semantic analysis. Journal of the Society for Information Science, 41(6), 391-407 (1990).

- Spectral Analysis of Networks
- N. Alon. Eigenvalues and Expanders. Combinatorica 6(1986), pp. 83-96.
- A. Sinclair and M. Jerrum. Approximate Counting, Uniform Generation, and Rapidly Mixing Markov Chains. Information and Computation 82(1989), pp. 93-133.
- F.R.K. Chung. Spectral Graph Theory. AMS Press. 1997.
- Daniel A. Spielman and Shang-Hua Teng. Spectral Partitioning Works: Planar graphs and finite element meshes. Proceedings of the 37th Annual IEEE Conference on Foundations of Computer Science, 1996. and UC Berkeley Technical Report number UCB CSD-96-898.
- Dimitris Achlioptas, Amos Fiat, Anna Karlin, Frank McSherry, Web Search via Hub Synthesis. 42nd IEEE Symposium on Foundations of Computer Science, 2001, p.611-618.
- A. Y. Ng, A. X. Zheng, and M. I. Jordan. Link analysis, eigenvectors, and stability. International Joint Conference on Artificial Intelligence (IJCAI), 2001.
- A. Y. Ng, A. X. Zheng, and M. I. Jordan. Stable algorithms for link analysis. 24th International Conference on Research and Development in Information Retrieval (SIGIR 2001). Hoff, P.D., Raftery, A.E., and Handcock, M.S. Latent Space Approaches to Social Network Analysis. Journal of the American Statistical Association , vol. 97(2002), no. 460, 1090-1098.
- N. Alon. Eigenvalues and Expanders. Combinatorica 6(1986), pp. 83-96.

- Random Walks on Networks
- L. Lovasz. Random Walks on Graphs: A Survey. Combinatorics: Paul Erdos is Eighty (vol. 2), 1996, pp. 353-398.
- Ziv Bar-Yossef, Alexander Berg, Steve Chien, Jittat Fakcharoenphol, and Dror Weitz. Approximating Aggregate Queries about Web Pages via Random Walks. 26th International Conference on Very Large Databases (VLDB), 2000, pages 535-544.
- Soumen Chakrabarti, Mukul Joshi, Kunal Punera, and David M. Pennock. The structure of broad topics on the Web. 11th World Wide Web conference, May 2002.
- L. Lovasz. Random Walks on Graphs: A Survey. Combinatorics: Paul Erdos is Eighty (vol. 2), 1996, pp. 353-398.

- Spectral Analysis of Data

(9) **Rank Aggregation and Meta-Search**

- Social Choice Theory
- K. Arrow. Social Choice and Individual Values. Wiley, 1951.
- H.P. Young. Condorcet's Theory of Voting. American Political Science Review 82(1988), pp. 1231-1244.
- K. Arrow. Social Choice and Individual Values. Wiley, 1951.

- Rank Aggregation Algorithms
- Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar. Rank Aggregation Methods for the Web. 10th International World Wide Web Conference, May 2001.
- Weiyi Meng, Clement Yu, King-Lup Liu. Building Efficient and Effective Metasearch Engines. ACM Computing Surveys 34(2002).
- William W. Cohen, Robert E. Schapire, and Yoram Singer. Learning to order things. Journal of Artificial Intelligence Research, 10: 243--270, 1999.
- T. Joachims. Optimizing Search Engines Using Clickthrough Data. Eighth International Conference on Knowledge Discovery and Data Mining, KDD-2002.
- R. Fagin, R. Kumar, M. Mahdian, D. Sivakumar, and E. Vee. Comparing and Aggregating Rankings with Ties. 23rd ACM Symposium on Principles of Database Systems (PODS 2004).
- Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar. Rank Aggregation Methods for the Web. 10th International World Wide Web Conference, May 2001.

- Social Choice Theory

(10) **The Time Axis**

- Survey Paper
- J. Kleinberg. Temporal Dynamics of On-Line Information Streams. Draft chapter for the forthcoming book
*Data Stream Management: Processing High-Speed Data Streams*(M. Garofalakis, J. Gehrke, R. Rastogi, eds.), Springer.- J. Kleinberg. Temporal Dynamics of On-Line Information Streams. Draft chapter for the forthcoming book

- Short Time Scales: Usage Data and Bursty Dynamics
- L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proc. IEEE, Vol. 77, No. 2, pp. 257-286, Feb. 1989
- J. Allan, J.G. Carbonell, G. Doddington, J. Yamron, Y. Yang, Topic Detection and Tracking Pilot Study: Final Report. Proc. DARPA Broadcast News Transcription and Understanding Workshop, Feb. 1998.
- R. Swan, J. Allan, Automatic generation of overview timelines. Proc. SIGIR Intl. Conf. on Research and Development in Information Retrieval, 2000.
- S. Havre, B. Hetzler, L. Nowell, ThemeRiver: Visualizing Theme Changes over Time. Proc. IEEE Symposium on Information Visualization, 2000.
- J. Kleinberg. Bursty and Hierarchical Structure in Streams. Proc. 8th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2002.
- J. Aizen, D. Huttenlocher, J. Kleinberg, A. Novak. Traffic-Based Feedback on the Web. Proceedings of the National Academy of Sciences 101(Suppl.1):5254-5260, 2004.
- R. Kumar, J. Novak, P. Raghavan, A. Tomkins. On the bursty evolution of Blogspace. Proc. International WWW Conference, 2003.
- Y. Zhu and D. Shasha. Efficient Elastic Burst Detection in Data Streams. Proc. ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
- E. Gabrilovich, S. Dumais, E. Horvitz. NewsJunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty. Proceedings of the Thirteenth International World Wide Web Conference. May 2004.
- M. Vlachos, C. Meek, Z. Vagena, D. Gunopulos. Identifying Similarities, Periodicities and Bursts for Online Search Queries. Proc. ACM SIGMOD International Conference on Management of Data, 2004.
- A..-L. Barabasi. The origin of bursts and heavy tails in human dynamics. Nature 435, 207-211 (2005).
- Micah Dubinko, Ravi Kumar, Joseph Magnani, Jasmine Novak, Prabhakar Raghavan, Andrew Tomkins. Visualizing Tags over Time. WWW2006 Conference. See also the demo of flickr tag visualization.
- Xuerui Wang and Andrew McCallum. Topics over Time: A Non-Markov Continuous-Time Model of Topical Trends. Conference on Knowledge Discovery and Data Mining (KDD) 2006.
- Xuanhui Wang, ChengXiang Zhai, Xiao Hu, and Richard Sproat, Mining Correlated Bursty Topic Patterns from Coordinated Text Streams. Proceedings of the 2007 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'07 ), pages 784-793.
- L.R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. In Proc. IEEE, Vol. 77, No. 2, pp. 257-286, Feb. 1989

- Longer Time Scales: Network Change and Evolution
- D. Fetterly, M. Manasse, M. Najork, J.L. Wiener. A large-scale study of the evolution of web pages. WWW 2003.
- A. Ntoulas, J. Cho, C. Olston. What's new on the web? The evolution of the web from a search engine perspective. WWW 2004.
- W. Koehler. A longitudinal study of Web pages continued: a consideration of document persistence. Information Research 9(2), paper 174, 2004.
- D. Spinellis. The decay and failures of web references. Communications of the ACM 46:71-77, 2003.
- S. Lawrence, D.M. Pennock, G.W. Flake, R. Krovetz, F.M. Coetzee, E. Glover, F.A. Nielsen, A. Kruger, C.L. Giles. Persistence of web references in scientific research. IEEE Computer 34:26-31, 2001.
- Z. Bar-Yossef, A.Z. Broder, R. Kumar, A. Tomkins. Sic Transit Gloria Telae: Towards an understanding of the web's decay. WWW 2004.
- J. Leskovec, J. Kleinberg, C. Faloutsos. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. Proc. 11th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2005.
- Gueorgi Kossinets and Duncan J. Watts. Empirical Analysis of an Evolving Social Network Science 6 January 2006: Vol. 311. no. 5757, pp. 88 - 90
- R. Kumar, J. Novak and A. Tomkins. Structure and Evolution of Online Social Networks. In Proceedings of the Twelfth ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2006.
- D. Fetterly, M. Manasse, M. Najork, J.L. Wiener. A large-scale study of the evolution of web pages. WWW 2003.

- Survey Paper

(11) **Clustering, Classification, and Community Structures**

- Clustering and Communities in Networks
- M. Granovetter. The strength of weak ties. American Journal of Sociology, 78(6):1360-1380, 1973.
- R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Trawling the web for emerging cyber-communities. 8th International World Wide Web Conference, May 1999.
- R. Agrawal, R. Srikant. Fast Algorithms for Mining Association Rules. 20th Int'l Conference on Very Large Databases (VLDB), 1994.
- S. Dill, R. Kumar, K. McCurley, S. Rajagopalan, D. Sivakumar, A. Tomkins. Self-similarity in the Web. 27th International Conference on Very Large Data Bases, 2001.
- Gary Flake, Steve Lawrence, C. Lee Giles, Frans Coetzee. Self-Organization and Identification of Web Communities. IEEE Computer, 35:3, March 2002.
- Gary Flake, K. Tsioutsiouliklis, R.E. Tarjan. Graph Clustering Techniques based on Minimum Cut Trees. Technical Report 2002-06, NEC, Princeton, NJ, 2002.
- M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA 99, 8271-8276 (2002).
- J. Kleinberg. An Impossibility Theorem for Clustering. Advances in Neural Information Processing Systems (NIPS) 15, 2002.
- J. Hopcroft, O. Khan, B. Kulis, and B. Selman. Natural communities in large linked networks. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 541--546,
- C. Faloutsos, K. McCurley and A. Tomkins. Fast Discovery of Connection Subgraphs. Tenth ACM SIGKDD Conference, Seattle, WA, 2004.
- M. Granovetter. The strength of weak ties. American Journal of Sociology, 78(6):1360-1380, 1973.

- Labeling and Classification using Networks of Pairwise Relationships
- J. Besag. Spatial interaction and the statistical analysis of lattice systems. J. Royal Statistical Society B, 36(1974).
- Soumen Chakrabarti, Byron E. Dom, and Piotr Indyk. Enhanced hypertext categorization using hyperlinks. Proceedings of the ACM International Conference on Management of Data, SIGMOD 1998, pages 307-318.
- O. Veksler. Efficient Graph-Based Energy Minimization Methods in Computer Vision. Ph.D. Thesis, Cornell University, 1999.
- J. Kleinberg, E. Tardos. Approximation Algorithms for Classification Problems with Pairwise Relationships: Metric Labeling and Markov Random Fields. Proc. 40th IEEE Symposium on Foundations of Computer Science, 1999.
- A. Broder, R. Krauthgamer, and M. Mitzenmacher. Improved Classification via Connectivity Information. ACM-SIAM Symposium on Discrete Algorithms, 2000.
- Avrim Blum, Shuchi Chawla. Learning from Labeled and Unlabeled Data using Graph Mincuts. International Conference on Machine Learning (ICML), 2001.
- T. Joachims, N. Cristianini, and J. Shawe-Taylor. Composite Kernels for Hypertext Categorisation. International Conference on Machine Learning (ICML), 2001.
- B. Taskar, P. Abbeel and D. Koller. Discriminative Probabilistic Models for Relational Data. Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI 2002).
- J. Besag. Spatial interaction and the statistical analysis of lattice systems. J. Royal Statistical Society B, 36(1974).

- Clustering and Communities in Networks

- Collaboration and citation networks: For the 2003 KDD Cup competition, Johannes Gehrke, Paul Ginsparg, and I provided a dataset based on the arXiv pre-print database, which allows one to study the networks of co-authorships and citations among a large community of physicists. Here is the KDD Cup dataset and a paper describing the competition in more detail.
- KDD Cup dataset.
- P. Ginsparg, J. Gehrke, J. Kleinberg. Overview of the 2003 KDD Cup. SIGKDD Explorations, 2004.
- KDD Cup dataset.

- Internet topology: The network structure of the Internet can be studied at several levels of resolution. Here is a dataset at the autonomous system (AS) level.
- Web subgraphs: There are many such datasets available for download. One set is maintained by Panayiotis Tsaparas; the experiments that used this data are described in his Ph.D. thesis, and in other papers linked from his home page.
- Web data from Panayiotis Tsaparas.
- P. Tsaparas, Link Analysis Ranking, Ph.D. Thesis, Department of Computer Science, University of Toronto, 2004.
- Web data from Panayiotis Tsaparas.

- Semantic networks: Free association datasets for words have been collected by cognitive scientists; these are constructed by compiling the free responses of test subjects when presented with cue words. (For example, a test subject presented with the cue word `ice' might react with the word `cold,' `cream,' or `water.')
- Collaboration and citation networks: For the 2003 KDD Cup competition, Johannes Gehrke, Paul Ginsparg, and I provided a dataset based on the arXiv pre-print database, which allows one to study the networks of co-authorships and citations among a large community of physicists. Here is the KDD Cup dataset and a paper describing the competition in more detail.

Here is a list of the main topics for each lecture. The topics are based on a mixture of content from papers in the outline above; a paper (in many cases a survey) providing the conceptual starting point for each lecture is given as part of the list.

- Lecture 01 (08/24): Course overview.
- Lecture 02 (08/27): Graph-theoretic structure in the Web.
- Starting point (Lec. 2):
- A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener. Graph structure in the web. 9th International World Wide Web Conference, May 2000.

- Starting point (Lec. 2):

- Lecture 03 (08/29): Small-world experiments in social networks.
- Starting points (Lec. 3):
- J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry 32(1969).
- J. Kleinfeld. Could it be a Big World After All? The `Six Degrees of Separation' Myth. Society, April 2002.
- Peter Sheridan Dodds, Roby Muhamad, Duncan J. Watts. An Experimental Study of Search in Global Social Networks. Science 301(2003), 827.
- J. Travers and S. Milgram. An experimental study of the small world problem. Sociometry 32(1969).

- Starting points (Lec. 3):

- Lecture 04 (08/31): Some Basic Random Graph Models.
- Lecture 05 (09/03): Random Graphs and Expansion
- Starting point (Lecs. 4-5):

- Lecture 06 (09/05): Models of Small-World Networks
- Lecture 07 (09/07): Decentralized Search in Small Worlds
- Lecture 08 (09/10): Hierarchical Network Models
- Lecture 09 (09/12): Network Datasets with Small-World Properties
- Starting point (Lecs. 6-9):
- J. Kleinberg. Complex Networks and Decentralized Search Algorithms. Proceedings of the International Congress of Mathematicians (ICM), 2006.

- Starting point (Lecs. 6-9):

- Lecture 10 (09/14): Decentralized Search in Peer-to-Peer Systems
- Starting point (Lec. 10):
- H. Balakrishnan, M.F. Kaashoek, D. Karger, R. Morris, and I. Stoica. Looking up data in P2P systems. Communications of the ACM 46:43-48, February 2003.

- Starting point (Lec. 10):

- Lecture 11 (09/17): Nearest-Neighbor Search in Metric Spaces
- Starting point (Lec. 11):
- David R. Karger, Matthias Ruhl. Finding nearest neighbors in growth-restricted metrics. STOC 2002: 741-750

- Starting point (Lec. 11):

- Lecture 12 (09/19): Models of Collective Action in Networks
- Starting point (Lec. 12):
- M. Granovetter. Threshold models of collective behavior. American Journal of Sociology 83(6):1420-1443, 1978.

- Starting point (Lec. 12):

- Lecture 13 (09/21): Game-Theoretic and Threshold-Based Models of Diffusion in Networks
- Lecture 14 (09/24): Simple Probabilistic Models of Contagion
- Lecture 15 (09/26): Probabilistic Contagion in a Network
- Lecture 16 (09/28): Finding Influential Sets of Nodes
- Lecture 17 (10/01): Empirical Studies of Cascades in Networks
- Starting point (Lecs. 13-17):
- J. Kleinberg. Cascading Behavior in Networks: Algorithmic and Economic Issues. In
__Algorithmic Game Theory__(N. Nisan, T. Roughgarden, E. Tardos, V. Vazirani, eds.), Cambridge University Press, 2007.- J. Kleinberg. Cascading Behavior in Networks: Algorithmic and Economic Issues. In

- Starting point (Lecs. 13-17):

- Lecture 18 (10/03): Collective Problem-Solving in Networks: Behavioral Graph Coloring (Lecturer: Sid Suri)
- Starting point (Lec. 18):
- M. Kearns, S. Suri, and N. Montfort. An Experimental Study of the Coloring Problem on Human Subject Networks. Science 313(5788), August 2006, pp. 824-827.

- Starting point (Lec. 18):

- Lecture 19 (10/05): Heavy-Tailed Distributions in Networks
- Lecture 20 (10/10): Preferential Attachment and Rich-get-Richer Processes
- Starting point (Lecs. 19-20):
- M. Mitzenmacher A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics, vol 1, No. 2, pp. 226-251, 2004.

- Starting point (Lecs. 19-20):

- Lecture 21 (10/12): Hierarchical Network Models and Zipf's Law
- Starting point (Lec. 21):
- Mitzenmacher survey and: J. Leskovec, J. Kleinberg, C. Faloutsos. Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations. Proc. 11th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2005.

- Starting point (Lec. 21):

- Lecture 22 (10/15): Power Laws through Optimization
- Starting point (Lec. 22):
- Mitzenmacher survey and: A. Fabrikant, E. Koutsoupias, C. Papadimitriou. Heuristically Optimized Trade-offs: A New Paradigm for Power Laws in the Internet. 29th International Colloquium on Automata, Languages, and Programming (ICALP), 2002.

- Starting point (Lec. 22):

- Lecture 23 (10/17): Game-Theoretic Models of Network Formation
- Starting point (Lec. 23):
- Alex Fabrikant, Ankur Luthra, Elitza Maneva, Christos H. Papadimitriou, and Scott Shenker, On a Network Creation Game. In Proc. of 2003 PODC, pages 347-351.

- Starting point (Lec. 23):

- Lecture 24 (10/19): Strategic Behavior in Network Cost-Sharing
- Starting point (Lec. 24):
- E. Anshelevich, A. Dasgupta, J. Kleinberg, E. Tardos, T. Wexler, T. Roughgarden. The Price of Stability for Network Design with Fair Cost Allocation. In FOCS 2004.

- Starting point (Lec. 24):

- Lecture 25 (10/22): Network Models of Markets
- Starting point (Lec. 25):
- R. Kranton, D. Minehart. Commpetition for Goods in Buyer-Seller Networks. Review of Economic Design, 5(3), September 2000, pp. 301-331.

- Starting point (Lec. 25):

- Lecture 26 (10/24): Link Analysis Algorithms
- Lecture 27 (10/26): Link Analysis and Web Search
- Starting point (Lecs. 26-27):
- A. Arasu, J. Cho, H. Garcia-Molina, A. Paepcke, S. Raghavan. Searching the Web. ACM Transactions on Internet Technology 1(1): 2-43 (2001)

- Starting point (Lecs. 26-27):

- Lectures 28-29 (10/29-31): Temporal Evolution of Information Networks (Lecturer: David Williamson)
- Starting point (Lecs. 28-29):
- D. Fetterly, M. Manasse, M. Najork, J.L. Wiener. A large-scale study of the evolution of web pages. WWW 2003.
- A. Ntoulas, J. Cho, C. Olston. What's new on the web? The evolution of the web from a search engine perspective. WWW 2004.
- W. Koehler. A longitudinal study of Web pages continued: a consideration of document persistence. Information Research 9(2), paper 174, 2004.
- Z. Bar-Yossef, A.Z. Broder, R. Kumar, A. Tomkins. Sic Transit Gloria Telae: Towards an understanding of the web's decay. WWW 2004.
- J. Fowler, S. Jeon. The Authority of Supreme Court Precedent. Social Networks (to appear).
- D. Fetterly, M. Manasse, M. Najork, J.L. Wiener. A large-scale study of the evolution of web pages. WWW 2003.

- Starting point (Lecs. 28-29):

- Lecture 30 (11/2): Enumerating Web Communities
- Starting point (Lec. 30):
- R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Trawling the web for emerging cyber-communities. 8th International World Wide Web Conference, May 1999.
- R. Agrawal, R. Srikant. Fast Algorithms for Mining Association Rules. 20th Int'l Conference on Very Large Databases (VLDB), 1994.
- R. Kumar, P. Raghavan, S. Rajagopalan, A. Tomkins. Trawling the web for emerging cyber-communities. 8th International World Wide Web Conference, May 1999.

- Starting point (Lec. 30):

- Lecture 31 (11/5): Spectral Analysis of Graphs: Connectivity
- Lectures 32-33 (11/7-9): Spectral Analysis of Graphs: Partitioning
- Starting point (Lecs. 31-33):
- Notes on Spectral Analysis of Graphs.
- Daniel A. Spielman and Shang-Hua Teng. Spectral Partitioning Works: Planar graphs and finite element meshes. Proceedings of the 37th Annual IEEE Conference on Foundations of Computer Science, 1996. and UC Berkeley Technical Report number UCB CSD-96-898.
- Notes on Spectral Analysis of Graphs.

- Starting point (Lecs. 31-33):

- Lecture 34 (11/12): Random Walks on Graphs: Stationary Distributions
- Lecture 35 (11/14): Random Walks on Graphs: Mixing Rates
- Starting point (Lecs. 34-35):

- Lecture 36 (11/16): Minimum Cuts for Graph Partitioning
- Starting point (Lec. 36):
- Yuri Boykov, Olga Veksler and Ramin Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence vol. 23, no. 11 pages 1222-1239 (2001).
- Avrim Blum, Shuchi Chawla. Learning from Labeled and Unlabeled Data using Graph Mincuts. International Conference on Machine Learning (ICML), 2001.
- Yuri Boykov, Olga Veksler and Ramin Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence vol. 23, no. 11 pages 1222-1239 (2001).

- Starting point (Lec. 36):

- Lecture 37 (11/19): Clustering in Networks
- Starting point (Lec. 37):
- Gary Flake, K. Tsioutsiouliklis, R.E. Tarjan. Graph Clustering Techniques based on Minimum Cut Trees. Technical Report 2002-06, NEC, Princeton, NJ, 2002.
- M.E.J. Newman, M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 (2004).
- Gary Flake, K. Tsioutsiouliklis, R.E. Tarjan. Graph Clustering Techniques based on Minimum Cut Trees. Technical Report 2002-06, NEC, Princeton, NJ, 2002.

- Starting point (Lec. 37):