Hari Shreedharan's homepage

My Research

My research is primarily focused on Distributed Systems. My interest in this area grew out of earlier projects that dealt with large amounts of data and large-scale processing.

My initial projects here at Cornell mostly revolved around algorithms for processing and classifying large amounts of HTML data. I worked on this with Prof. John E Hopcroft during Spring 2009. The algorithms we developed can be applied to any data, provided a large database of preclassified information is available as training data. We are currently working on automating this and hope to publish it.

I then tried to reason about how such algorithms would work when the data runs into several terabytes, and was introduced to MapReduce and Hadoop as a result. Together with Dr. Alan Demers, Principal Research Scientist, Cornell University, I have been trying to come up with algorithms that perform specific relational operations cheaply using MapReduce. This work is currently in progress.

My current focus is on Fault Tolerant Replication. I am working with the Live Distributed Objects Group here at Cornell, where my work centers on fault-tolerant membership systems. More updates on this project will be up soon.