Example of the one-every-lecture project proposals in CS6742

As stated in the “overall course structure” page, the middle third of the course involved “Discussion of student project proposals, based on the readings for that class meeting. Each class meeting thus involves everyone reading at least one of the two assigned papers and posting a new research proposal based on the reading to Piazza.” Here is one of these proposals, posted with Elizabeth Murnane's permission.

(Incidentally, the reason I use "Professor Lee" as my name on Piazza is that so far there have always been a few students each year who need extra reminders that I am a "real professor". I think once students encounter female computer-science professors with some regularity, such measures will no longer be needed.)



Piazza question [posted by Elizabeth Murnane]

A1 Proposal: Wikipedia Talk page conversations as indicators of article popularity and quality
(I figured I'd get the ball rolling with an idea I have for Assignment 1. I've presented it loosely in the style of a standard Abstract/Introduction found in a research paper. I hope the writeup isn't overly formal and specific, but I thought for the first assignment, extra detail would be better than not enough.)



This proposal attempts to pull together ideas from Mishne & Glance and Gilbert & Karahalios in order to study how language can be used to better understand the collaborative behaviors of Wikipedia editors during the consensus-making process and how predictive such social dynamics are of article popularity and quality. The dataset of choice is the Wikipedia conversations corpus.

Mishne & Glance identify a number of linguistic attributes of blog comments (e.g., level of subjectivity, question mark usage, total number of comments) indicative of the corresponding blog posts' popularity. Drawing analogies between blog posts and Wikipedia articles and between blog comments and Wikipedia Talk page content, we ask whether these same properties of a Talk page are related to the article's popularity, where (again, following Mishne & Glance) popularity is defined in terms of page views and in-links.

Going further, we propose extracting more fine-grained measures from Talk page content in addition to those that Mishne & Glance collect; and we take additional inspiration from Gilbert & Karahalios, who recognize that traits of Amazon users such as amateur vs. pro expertise levels can influence the type of reviews they write, the reactions they have to their own and others' reviews, and the underlying motivations for writing reviews in the first place.

Specifically, we focus on two sorts of measures -- behavioral patterns of the editors posting on the Talk page as well as syntactic, semantic, and psychometric qualities of the language used in those utterances. To offer a variety of examples for each (though note only a subset of these can be investigated during this assignment for feasibility's sake):
- For behavior, we can capture the total number of utterances made on the Talk page, the amount of time that passes between utterances, the number of unique editors who made an utterance, the average number of comments per unique editor, co-edit histories of these editors, how "old" (account age) an editor is, and her Administrator status.
- For language-use, we can capture the number of words used in an utterance; its sophistication or readability (using word length or a readability scale like LIX); its positive and negative emotion as well as arousal, dominance, and valence; and the topic similarity between utterances. (A rough code sketch of a couple of these measures appears just after this list.)
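
To make a couple of the simpler language-use measures concrete, here is a minimal sketch of how utterance length and LIX readability might be computed. It assumes the standard LIX definition (words per sentence plus the percentage of words longer than six characters); the function and field names are placeholders, not anything taken from the corpus:

    import re

    def lix(text):
        # LIX readability: (words / sentences) + 100 * (long words / words),
        # where "long" means more than six characters (standard LIX definition).
        words = re.findall(r"[A-Za-z']+", text)
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        if not words or not sentences:
            return 0.0
        long_words = [w for w in words if len(w) > 6]
        return len(words) / len(sentences) + 100.0 * len(long_words) / len(words)

    def utterance_features(utterance_text):
        # A few of the language-use measures listed above; the feature names
        # are illustrative only.
        words = re.findall(r"[A-Za-z']+", utterance_text)
        return {
            "num_words": len(words),
            "avg_word_length": sum(map(len, words)) / max(len(words), 1),
            "lix_readability": lix(utterance_text),
        }

    # Example: utterance_features("This is a short talk-page comment. It has two sentences.")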

Additional or alternative directions for this work also exist -- for instance, evaluating not only an article's popularity but also its quality, which we could attempt to operationalize by measuring a number of article attributes drawn from [1], such as article length, the number of unique editors, and median revert time.

[1] B. Stvilia, M. B. Twidale, L. C. Smith, and L. Gasser. Assessing information quality of a community-based encyclopedia. Proceedings of the International Conference on Information Quality (ICIQ 2005), pages 442-454, 2005.

instructors' answer [posted by Professor Lee]

Thanks for getting the ball rolling! In terms of interestingness and level of detail, this proposal is great.

I think one property of the particular dataset supplied might not be right for what you're thinking of, which is that the "talk pages" in the Wikipedia conversations corpus (WCC) aren't article talk pages, but user talk pages. So "popularity of the article" isn't a relevant notion for this corpus, unfortunately. (You were not expected to know this ahead of time!) Would you be interested in the (arguably more social) problem of trying to identify whether a participant in a conversation is an admin or not --- a question arguably of social status, arguably related to popularity, and also arguably related to Gilbert and Karahalios' pro/amateur distinction? You can then use our "Echoes of Power" paper as a performance reference.

For the purposes of this course and exercise (not for your further research or interest, natch), I suggest focusing on the language-use features, and perhaps just one or two of most interest to you. Sentiment might be a possibility (are admins generally positive people?). Readability (I had not heard of LIX; I'm more familiar with things like Flesch-Kincaid): are admins more "sophisticated"? On the other hand, these features might be most easily inferred from the body of comments the particular user has made, not just the comments made in the WCC. I'd be most excited by seeing something that explicitly incorporated the interactions that are recorded in the WCC. Like, perhaps person A, an admin, typically talks in a "grade 12" fashion with other admins, but a "grade 6" level with non-admins; that kind of audience adaptation [Bell, Allan. Language style as audience design. Language in Society 13 (2): 145-204, 1984] would be a cool feature to explore.
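
To make the audience-adaptation idea concrete, here is a rough sketch of the kind of comparison described above: each speaker's average Flesch-Kincaid grade when addressing admins vs. non-admins. The Flesch-Kincaid formula is the standard one, but the syllable counter is only a crude vowel-group heuristic, and the (speaker, addressee-is-admin, text) triples are assumed to have been extracted separately from the corpus's reply structure and admin labels:

    import re
    from collections import defaultdict

    def count_syllables(word):
        # Crude syllable estimate: number of consecutive-vowel groups.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    def fk_grade(text):
        # Standard Flesch-Kincaid grade level:
        # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
        words = re.findall(r"[A-Za-z']+", text)
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()] or [text]
        if not words:
            return 0.0
        syllables = sum(count_syllables(w) for w in words)
        return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

    def audience_adaptation(utterances):
        # utterances: iterable of (speaker, addressee_is_admin, text) triples,
        # assumed to be pulled from the corpus's reply structure and admin labels.
        # Returns each speaker's mean grade level toward admins vs. non-admins.
        grades = defaultdict(lambda: {True: [], False: []})
        for speaker, addressee_is_admin, text in utterances:
            grades[speaker][addressee_is_admin].append(fk_grade(text))
        mean = lambda xs: sum(xs) / len(xs) if xs else float("nan")
        return {speaker: {"to_admins": mean(d[True]), "to_non_admins": mean(d[False])}
                for speaker, d in grades.items()}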

For the purposes of this assignment, I think it's OK to not necessarily go all the way to classification studies (i.e., seeing how well one can predict concept Y); it may suffice to show that feature X is correlated with concept Y, given the time limits.
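
For example, a minimal version of such a check, assuming you've already lined up a per-user (or per-utterance) numeric feature with binary admin labels, could be a point-biserial correlation (the variable names below are just illustrative):

    from scipy import stats

    def feature_vs_admin(feature_values, is_admin):
        # Point-biserial correlation between a numeric feature and a binary
        # admin label; the two lists are assumed to be parallel.
        r, p = stats.pointbiserialr([1 if a else 0 for a in is_admin], feature_values)
        return r, p

    # Example: r, p = feature_vs_admin(lix_scores, admin_labels)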

ANYWAY, I hadn't really thought about how to handle the proposal-refinement process on Piazza, but I guess editing the original and including a note in the same post/question as to what you changed might be a good way to go...although I'm not sure if we'll be able to see the fact that the changes occurred... if you can think of a way to make that obvious --- perhaps by replying to this post --- that would be super awesome.

Also, let me know if you'd prefer, rather than rephrasing the question, to grab your own corpus that is more relevant to your question. Is that going to be feasible for you (perhaps in collaboration with others) in the time allotted?


followup [posted by Professor Lee]

A quick follow-up: Table 3 of the paper cited below shows how certain language attributes like word and sentence complexity vary for reviewers over time, with some differences by gender, which earlier work finds to correlate with helpfulness (something perhaps worth remarking on in and of itself). [Otterbacher, Jahna and Libby Hemphill. 2012. Learning the lingo? Gender, prestige and linguistic adaptation in review communities. Proceedings of CSCW.]

reply to followup [posted by Elizabeth Murnane]

OH! When I'd briefly browsed the corpus I hadn't realized they were user talk pages -- thanks for pointing that out! In that case, I'd still prefer to work with the same WCC dataset and would definitely be interested in attempting to relate characteristics of the conversation to attributes of the users and participants instead of an article.

Following your suggestion, accessible attributes could include admin status, # of awarded barnstars, account "age", # of edited articles, and so on -- and I agree that the more "social" metrics (e.g., the first 2 of those) that relate to reputation and recognition would be the best ones to focus on.

I think concentrating on the interactions makes great sense too. In addition to your suggestion to look at how someone treats admins vs. non-admins, I can imagine it also might be promising to investigate the personal histories of editors (i.e., mutual past interactions). For example, users who have co-edited articles previously or who have more frequently communicated on the talk pages (for a number of potential reasons, e.g., homophily of interests, ongoing contention, etc.) may demonstrate quite different behaviors than editors who are more like strangers.

reply to reply [posted by Professor Lee]

Nice ideas all.