Asif Haque [resume]
Ph.D., Cornell University
Computer Science (minor: Operations Research)
Committee: Paul Ginsparg [wiki], Eric Friedman, David Williamson
Email: asif@cs.cornell.edu


Research

The broad aim of my research is to develop methodologies for understanding the interaction between information systems and social systems. Currently I am exploring arXiv, a scholarly communication system, to develop techniques that would ultimately generalize to a variety of settings.

As a system that has been functional and growing for almost two decades, arXiv provides an ideal testbed for investigating scholarly behavior. Answering sociological queries requires a combination of author and citation network analysis, text analysis, and mining of the wealth of log data such as downloads, referral logs, and blog trackbacks. Furthermore, the analysis has temporal dimensions, ranging from tasks as simple as normalizing across time to ones as computationally challenging as tracing evolution. While conventional machine learning is appropriate for many prediction tasks, the issue of scalability suggests augmenting it with distributed computing paradigms such as Map-Reduce programming, which are better suited to large-scale computation. [pagerank]
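
As a rough illustration of the Map-Reduce style of computation mentioned above, here is a minimal single-machine sketch of PageRank written as a map phase and a reduce phase. It is not the implementation from the tech report; the function names, the toy graph, the damping factor of 0.85, and the handling of dangling nodes are all assumptions made for the example.

```python
from collections import defaultdict


def map_phase(graph, ranks):
    """Emit (target, contribution) pairs, as a mapper would.

    Assumes every link target also appears as a key of `graph`.
    """
    for node, out_links in graph.items():
        if out_links:
            # A node splits its current rank evenly over its out-links.
            share = ranks[node] / len(out_links)
            for target in out_links:
                yield target, share
        else:
            # Dangling node (assumption): spread its rank over all nodes.
            share = ranks[node] / len(graph)
            for target in graph:
                yield target, share


def reduce_phase(pairs, nodes, damping=0.85):
    """Sum the contributions per node and apply the damping factor."""
    totals = defaultdict(float)
    for target, share in pairs:
        totals[target] += share
    n = len(nodes)
    return {node: (1 - damping) / n + damping * totals[node] for node in nodes}


def pagerank(graph, iterations=50, damping=0.85):
    """Run a fixed number of map/reduce rounds from a uniform start."""
    ranks = {node: 1.0 / len(graph) for node in graph}
    for _ in range(iterations):
        ranks = reduce_phase(map_phase(graph, ranks), graph, damping)
    return ranks


# Hypothetical toy graph: node -> list of nodes it links to.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_graph)
```

In a real distributed setting the pairs emitted by the map phase would be shuffled by key across machines before the reduce phase; here a generator stands in for that step.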

In our investigation [positional effects] of how the position of a paper in the daily arXiv listing affects its citation and readership, we thoroughly examined the reasons for, and the consequences of, authors jockeying for the top positions. Using machine learning methods, we correlated long-term citations with a variety of readership features and suggested a hybrid measure of popularity. Further investigation [additional positional effects] revealed interesting procrastination effects for the last few positions of the daily listing. We ruled out geographic bias as a strong reason for the effects associated with positions.

Complementary to the metadata analysis is our attempt [phrases] to extract subtopical concepts, characterized by phrases, through a combination of text and network analysis, augmented by logs of search and readership data. Tracking concepts can provide a useful temporal overview of linked corpora [click trends], and extracting subtopics automatically in this way appears promising. A principled way of using n-grams for categorization and subdocument similarity is also being explored.

I am contributing to an ethnographic study that compares and contrasts co-author networks in different fields of science. The quantitative aspect of the study involves [mesoscopic analysis] clustering co-author networks, characterizing subnetworks of clusters, and identifying node roles in the network. We have devised an author name disambiguation algorithm [disambiguation] that uses co-authorship and self-citation, and have measured its effectiveness on manually disambiguated samples from our data set, with an eye on the network structure.
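
To make the idea of disambiguating names through co-authorship and self-citation concrete, here is a small sketch under simplifying assumptions. It is not the algorithm from the paper: it treats two papers carrying the same name string as belonging to the same person whenever they share another co-author or one cites the other, and merges them with a union-find structure. The record layout (`authors`, `cites`) and the sample data are hypothetical.

```python
from collections import defaultdict


def disambiguate(papers, name):
    """Cluster the papers listing `name`, one cluster per presumed person."""
    ids = [pid for pid, rec in papers.items() if name in rec["authors"]]
    parent = {pid: pid for pid in ids}

    def find(x):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            # Evidence 1 (assumed rule): a shared co-author besides `name`.
            shared = (set(papers[p]["authors"]) & set(papers[q]["authors"])) - {name}
            # Evidence 2 (assumed rule): one paper cites the other.
            self_cite = q in papers[p]["cites"] or p in papers[q]["cites"]
            if shared or self_cite:
                parent[find(p)] = find(q)

    clusters = defaultdict(list)
    for pid in ids:
        clusters[find(pid)].append(pid)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical records: paper id -> author names and cited paper ids.
papers = {
    "p1": {"authors": ["J. Smith", "A. Lee"], "cites": []},
    "p2": {"authors": ["J. Smith", "A. Lee", "B. Roy"], "cites": []},
    "p3": {"authors": ["J. Smith"], "cites": ["p2"]},
    "p4": {"authors": ["J. Smith", "C. Kim"], "cites": []},
}
clusters = disambiguate(papers, "J. Smith")
# clusters == [["p1", "p2", "p3"], ["p4"]]
```

The pairwise comparison is quadratic in the number of papers per name, which is fine for a sketch; the merge rules are deliberately crude and would over-merge when co-author names are themselves ambiguous.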


Publications

  •     Information and Social System Interaction [abstract] [fulltext]
    Asif-ul Haque
    PhD Dissertation, Cornell University (2011)

  •     Phrases as Subtopical Concepts in Scholarly Text
    Asif-ul Haque, Paul Ginsparg
    Joint Conference on Digital Libraries 2011

  •     Resolving Author Name Homonymy to Improve Resolution of Structures in Co-author Networks [preprint]
    Theresa Velden, Asif-ul Haque, Carl Lagoze
    Joint Conference on Digital Libraries 2011
    Nominated for the best paper award

  •     Last but not Least: Additional Positional Effects on Citation and Readership in arXiv [preprint] [article]
    Asif-ul Haque, Paul Ginsparg
    Journal of the American Society for Information Science and Technology
    Vol 61, Issue 12, pages 2381--2388 (2010)

  •     Positional Effects on Citation and Readership in arXiv [preprint] [article]
    Asif-ul Haque, Paul Ginsparg
    Journal of the American Society for Information Science and Technology
    Vol 60, Issue 11, pages 2203--2218 (2009)

  •     A New Approach to Analyzing Patterns of Collaboration in Co-authorship Networks: Mesoscopic Analysis and Interpretation [preprint] [article]
    Theresa Velden, Asif-ul Haque, Carl Lagoze
    Scientometrics
    Vol 85, Number 1, pages 219--242 (2010)

  •     PageRank Calculation using Map Reduce [article]
    Vijayanand Chokkapu, Asif-ul Haque
    Cornell University Web Lab Tech Report (2008) [Web Lab Project]

  •     Drawing Lines by Uniform Packing [article]
    Asif-ul Haque, M Saifur Rahman, Mehedi Bakht, M Kaykobad
    Computers & Graphics, Vol 30, pages 207--212 (2006)

  •     On Average Length of Cycles in Complete Graphs
    Asif-ul Haque, M Saifur Rahman, M Sohel Rahman, M Kaykobad
    In Proceedings of the International Conference on Computer and Information Technology (2002)


Personal


Last Update: July 12, 2011