Attacked from within

CS6742: Excerpts from Attacked from within, by anaesthetica

Links have been included from the original piece, unmodified, and so may be broken in a number of places.

Traditional methods for protecting community from the effects of scale and poor behavior are now manifestly unfeasible. Raising barriers to entry, relying on the assumption that users will maintain only one registered account, and placing faith in the ability of admins and user moderation to reproduce a forum's organic culture are all easily circumvented, gamed, and/or ineffective when faced with the problems of scale. Moreover, they tend to reinforce self-destructive behaviors, by increasing returns to the most persistent rather than the most constructive, reinforcing groupthink, and providing ample targets for trolling and griefing.

This article attempts to fundamentally rethink what constitutes community and society on the web, and what possibilities exist for their maintenance and reconstruction in the face of scale and malicious users. The recommendations reached, after analyzing the weaknesses of the web forums we all know and love, are:

[snip]
Moderation should not focus on users or on comments in isolation, but on the relational quality of comments.
Passive moderation filters can mitigate problems of scale.
Preservation of community must shift from being based on exclusion to being based on demonstrated constructive interaction.
Forums should discriminate between content types: original content, links, and personal content.
Story promotion and front page position should be driven by conversation, not voting.

[snip]

According to the mythology we've received from the neckbeards we find squirreled away in server rooms, Eternal September [changed] the Internet [so it was no longer] a place of constructive conversation and engagement [snip]

[snip]

3 — "Technological solutions for social problems"

[snip]

While there may not be technological solutions for social problems, there may be institutional solutions for social problems. [Clay] Shirky is correct insofar as social dynamics on the web have a technological base: patterns of interaction are shaped by the software used to interact. Knowledge of the capabilities and constraints imposed by forum software conditions how users act, what possibilities they perceive, what type of behavior they expect, and (most importantly) how the system can be gamed. Software is the institutional context within which users act, and within which the collective action problem of maintaining a culture of quality interaction is (hopefully) overcome, despite the problems of scaling, multiple identities, bad behavior, and limited capacity of moderators.

What are the technological (really, institutional) problems that need fixing then?

[snip]
[snip]
[snip]
Moderation that better reflects quality, as opposed to simple agreement.
Moderation that lightens the load on admins.
[snip]

[snip]

Fourth, moderation systems ought to be geared toward identifying quality contributions, rather than signaling agreement. Current moderation systems are based on the premise that better comments will end up with better scores. This approach is wrongheaded and flawed. As anyone familiar with Digg's wretched comments can attest, clicking 'thumbs up' on a snarky, flamebaiting, or erroneous one-liner signals almost nothing about the actual quality of the comment. Approval voting systems, wherein comment worth is represented by a raw number score, create an "I agree with this post" dynamic to moderation. There is precious little difference between numerical score-based moderation and the <AOL>Me too!!!</AOL> posts that began flooding into Usenet in September 1993.

Slashdot is the only major forum with a comment moderation system that takes a step in the right direction. While all of its moderation options are either +1 or -1, they all include some kind of descriptor allowing the moderator to assert why the post deserves a higher (or lower) score: insightful, informative, interesting, funny, offtopic, troll, flamebait, etc. Yet they're still wedded to a score-based moderation system. A set of moderation options that reflected quality rather than "I agree with this post" would be a further step in the right direction. No numerical score ought to be visible. The moderation options would be the descriptions of the comments we'd like to see—informative, informative links, engages parent directly, witty—and of the comments we'd like to see less of—one-liner, personal attack, flamebait, troll, abusive links, spam, offtopic. Options to express agreement could be provided too, in order to prevent the descriptive moderation options from standing in as proxies for agreement (moderators rating comments they disagreed with highly in terms of quality might be given extra weight, assuming they're moderating in good faith). Score-based moderation systems foster groupthink and the promotion of content-less one-liners to the detriment of actual conversation. Moderation centered around what makes a good post provides an institutional foundation for altering the dynamics of users' moderation behavior.
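To make the proposal concrete, here is a minimal Python sketch of descriptor-based moderation with no visible score. The names (Moderation, summarize, DISAGREE_BONUS) and the 1.5 weighting are assumptions made for illustration; they are not taken from Slashdot or any existing forum software.

```python
from collections import Counter
from dataclasses import dataclass

# Descriptor vocabulary taken from the essay's list of options.
POSITIVE = {"informative", "informative links", "engages parent directly", "witty"}
NEGATIVE = {"one-liner", "personal attack", "flamebait", "troll",
            "abusive links", "spam", "offtopic"}

# Assumed bonus for a moderator who rates a comment as good despite disagreeing.
DISAGREE_BONUS = 1.5


@dataclass(frozen=True)
class Moderation:
    label: str    # one of POSITIVE or NEGATIVE
    agrees: bool  # agreement is recorded separately from quality


def summarize(mods):
    """Return (descriptors shown to readers, hidden internal weight).

    Readers only ever see the descriptors; the weight stays internal.
    """
    labels = Counter()
    weight = 0.0
    for m in mods:
        if m.label not in POSITIVE | NEGATIVE:
            raise ValueError(f"unknown descriptor: {m.label}")
        labels[m.label] += 1
        if m.label in POSITIVE:
            weight += 1.0 if m.agrees else DISAGREE_BONUS
        else:
            weight -= 1.0
    shown = [label for label, _ in labels.most_common(2)]
    return shown, weight
```

Keeping agreement as its own field is the design point: the quality descriptors no longer have to double as a proxy for "I agree with this post."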

[snip]

Fifth, and closely related to the above point, moderation systems need to be designed to lighten the load on moderators, whether they are admins or the regular users themselves. [snip] The solution that xkcd's Randall Munroe hit upon after reviewing the standard options faced by all rapidly scaling communities—restricted entry, moderators, user moderation, and sub-communities—was a system of passive moderation. Moderation would be automatically applied according to a predetermined set of criteria specifying what qualities a good comment would have. In Munroe's case, originality was the key, and any commenters attempting to say something that had already been said before would be penalized by increasing mute times. A similar project, the StupidFilter, being developed by one of our own, uses Bayesian logic to identify stupid comments based on a seed group of human-identified stupid comments. The criteria for stupidity include: over- or under-capitalization, too many text message abbreviations, excessive use of 'LOL' or exclamation points, and so on. Spam identification systems for email and blog comments (e.g. Akismet for WordPress) do much the same thing, identifying commonalities in junk messages and containing them in a junk/spam purgatory awaiting moderation.
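A rough sketch of an originality filter in this spirit is given below. It is not Munroe's actual implementation: the normalization rule, the two-second base mute, and the doubling schedule are all assumptions chosen to keep the example small.

```python
import re


class OriginalityFilter:
    """Mutes users who repeat something that has already been said."""

    def __init__(self, base_mute_seconds=2):
        self.seen = set()       # normalized text of every accepted comment
        self.violations = {}    # user -> number of unoriginal posts so far
        self.base = base_mute_seconds

    @staticmethod
    def normalize(text):
        # Ignore case, punctuation, and spacing so trivial variations still match.
        stripped = re.sub(r"[^a-z0-9\s]", "", text.lower())
        return " ".join(stripped.split())

    def submit(self, user, text):
        """Return the mute time in seconds (0 if the comment is original)."""
        key = self.normalize(text)
        if key in self.seen:
            count = self.violations.get(user, 0) + 1
            self.violations[user] = count
            return self.base * 2 ** (count - 1)  # assumed doubling penalty
        self.seen.add(key)
        return 0
```

A first repeat costs the user two seconds of silence, a second repeat four, and so on, so persistence is penalized rather than rewarded.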

Passive moderation can help solve the problem of moderator overload, just as spam filters aid in managing one's email inbox or blog comments. Reducing the number of full-time admins needed to do moderation reduces the proclivities toward the "iron law of bureaucracy" and toward user-moderation abuse. Like the above passive systems, a Robot9000++ could be set to identify general characteristics of comments that make them good or bad: not only originality, but also ideal length of the post (with diminishing returns after a certain point), presence of links, paragraph structure, and so on. Likewise, it could identify posts that fit the typical profile of destructive or idiotic behavior: one-liners, ad hominems, common insults, links to shock sites, etc. False positives would be an issue, though hopefully less of one over time if it had a Bayesian capacity to learn. But effective admin intervention and/or user moderation could correct erroneously downmodded comments.
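The kind of heuristic a "Robot9000++" might apply can be sketched as follows. Every threshold, weight, and the tiny INSULTS list here is a hypothetical placeholder; a real system would presumably learn these from human-labeled comments rather than hard-code them.

```python
import math
import re

# Hypothetical stand-in for a learned list of common insults.
INSULTS = {"idiot", "moron", "stupid"}


def passive_score(text):
    """Heuristic quality score; higher is better. All weights are assumptions."""
    words = re.findall(r"[a-z']+", text.lower())
    score = math.log1p(len(words))            # length, with diminishing returns
    if re.search(r"https?://", text):
        score += 0.5                          # presence of a link
    if "\n\n" in text.strip():
        score += 0.5                          # more than one paragraph
    if len(words) < 8:
        score -= 2.0                          # one-liner penalty
    score -= 1.5 * sum(w in INSULTS for w in words)
    return score
```

The logarithm is one simple way to get diminishing returns on length: going from 10 to 100 words helps far more than going from 1,000 to 1,090.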

Sixth, effective moderation systems will function best when pushed as far into the background of user interaction with the forum as possible. Munroe discovered, as did moot shortly thereafter, that announcing the rules of the game results in conversations and threads being overwhelmed by meta discussion and boundary-testing. Those with a stake in circumventing moderation (trolls, griefers, spammers, crapflooders, the usual set of malcontents) quickly discover the limits, whereas those who don't have the time to invest in circumventing the controls remain constrained ("when moderation is the law, only outlaws will be unmoderated"). Passive moderation and wordfilters ought not be immediately perceivable by the user: instead of blocking the user, muting them, or preventing the comment from being posted, the systems should let the comment through. An automatic downmoderation ought to be applied to the offending post such that it will be below the threshold of normal comment viewing. However, downmodded comments ought to be discoverable and correctable by user moderation in the case of false positives. By obfuscating passive moderation systems, forums can achieve 'society through obscurity,' keeping moderation criteria from being easily discovered and gamed.
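One way to picture this silent downmoderation is the sketch below. The threshold of 0, the penalty of -2, and the names Comment, accept, and visible are illustrative assumptions; the point is only that the comment is always accepted and that user moderation can lift it back into view.

```python
from dataclasses import dataclass

VIEW_THRESHOLD = 0   # assumed default viewing threshold
AUTO_PENALTY = -2    # assumed size of the silent downmoderation


@dataclass
class Comment:
    text: str
    auto_penalty: int = 0      # applied silently by the passive filter
    user_adjustment: int = 0   # later corrections from user moderation

    @property
    def visible_by_default(self):
        return self.auto_penalty + self.user_adjustment >= VIEW_THRESHOLD


def accept(text, flagged):
    """Always post the comment; the poster is never blocked or notified."""
    return Comment(text, auto_penalty=AUTO_PENALTY if flagged else 0)


def visible(comments, browse_hidden=False):
    """Comments a reader sees; browsing hidden ones allows false positives to be found."""
    return [c for c in comments if browse_hidden or c.visible_by_default]
```

Because nothing is rejected, a troll probing the filter gets no immediate signal about where the limits lie.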

4 — The lowest common denominator

[snip]

Constructive conversation is central to community, not ideological like-mindedness or commonality of interests. Too many forums attempt to provide sub-communities on the basis of user self-selection, allowing the user to place themselves in categories of ideology, allegiance, or taste (e.g. Facebook groups, Last.fm groups, Wikipedian userboxes, etc.). Just as trading flames and ad hominems does not make for lasting interaction, groups of like-minded users decrying offenses to their objects of veneration and offering 'me too!' posts are among the least interesting forms of interaction on the internet. ([Clay] Shirky discusses these self-destructive forms of group interaction, citing psychoanalyst W. R. Bion's 1961 volume Experiences in Groups.)

Placing conversation at the center of analysis changes how we think about constructive interaction. Current moderation schemes focus on discrete comments as the unit of analysis: a comment is either good or bad in and of itself. Slashdot's foster care practice of reparenting highly moderated comments attached to poorly rated parents is indicative of this comment-as-island-unto-itself mode of thought. But if constructive conversation is the goal, the comment itself is the wrong unit of analysis. The conversation—the series of comments responding to one another—is the proper unit of analysis, and the most important aspects are not inherent to the comments themselves but are relational.

Conversation and moderation are not just content creation or judgments. Replying to other users and moderating comments are expressions of relationship. A one-liner, flippant, or flaming reply expresses at best a weak relationship, and more usually a negative one, between two users. Likewise a negative moderation is an indication of one user's low esteem for the contribution of another user. Conversely, a longer post that directly responds to another user (not necessarily agreeing; respectfully disagreeing or providing informative links is just as good) or a positive moderation provides an indication of constructive relations between two users.

Changing the unit of analysis from comments to conversations is the first step in determining how community might emerge from an anonymous society. We can take a two-comment dyad as an example and apply AND logic to the pair of comments' worth (as judged by both passive and user moderation):

Low value: a short snarky comment with an equally short snarky reply. Throwaway comments are throwaway interactions.
Low value: a constructive comment with a flame or one-liner reply. An unconstructive response doesn't indicate potential for a relationship.
Low value: a flamebait or troll garnering a nonetheless long and thought-out response. Feeding trolls, even if done calmly and patiently, is not constructive interaction.
High value: a medium- to long-sized thoughtful comment followed by a thoughtful response of similar length.
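The four cases above reduce to a logical AND over the two comments' quality. A minimal sketch, assuming each comment already has a numeric quality estimate from passive and user moderation and that 1.0 is the (arbitrary) cutoff for "constructive":

```python
def dyad_value(parent_quality, reply_quality, threshold=1.0):
    """A dyad is high value only if BOTH comments clear the threshold."""
    both_constructive = parent_quality >= threshold and reply_quality >= threshold
    return "high" if both_constructive else "low"


# The four cases from the list, with made-up quality numbers.
cases = {
    "snark answered by snark": dyad_value(0.2, 0.1),
    "constructive comment, flame reply": dyad_value(2.0, 0.1),
    "troll fed with a long reply": dyad_value(0.1, 2.0),
    "thoughtful comment, thoughtful reply": dyad_value(2.0, 1.8),
}
```

Note that the third case scores low even though the reply itself is good: under AND logic, a careful answer cannot redeem a troll parent.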

Constructive comment dyads are the best indicator of a potential relationship between two anonymous users. Positive moderation of one user for another user's comment does express relationship potential, but less so than commenting, because moderation is quick and one-way, whereas writing a comment that engages the other user signals greater potential for interaction. The implicit relationship forming of commenting is also a better indicator of interaction potential than self-selecting membership in groups. Even people with common interests or common ascriptive identities will not necessarily interact fruitfully. In this sense, grouping along a priori lines is based on the dubious assumption that people will interact best with 'their own kind.' The reality is that providing people with labels and identity/interest groupings is more likely to artificially divide users against one another and to reinforce the negative modes of group interaction identified by Bion.

Communal groups ought not be based on self-selection by users into predefined ascriptive categories, but will function best when they emerge from proven ability to interact constructively. Father-of-sociology Émile Durkheim labeled these different organizing principles 'mechanical solidarity' (in which individual differences are minimized) and 'organic solidarity' (in which differentiated individuals cooperate). The problem then becomes, how can the forum determine, on the basis of comments and moderation, which users belong in the same community as other users?

If we reconceptualize commenting and moderation behavior as links between users expressing a relationship, a method for community's emergence from the broader social milieu becomes clear. Just as hyperlinks between web pages express a relationship of value, as Sergey Brin and Larry Page realized by 1998, so too do replies and moderation create a network of interlinked users. The problem of scaling rendered Yahoo!'s categorization scheme obsolete, and the problem of fraudulent/malicious tagging left AltaVista's meta tag crawling fatally compromised—Google introduced a system capable both of scaling and resisting attack (significantly, resistant to a greater degree than Advogato's trust metric). A modified PageRank algorithm could take into account the positive and negative links between users, establishing overall assessments of users useful for distinguishing malicious users from normal users and for dispensing selective incentives to users producing valued contributions.
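The essay does not specify how PageRank would be modified, so the following is only one plausible reading, sketched under stated assumptions: ordinary PageRank is run over the positive links, and each user is then docked a one-step penalty from negative links, weighted by the standing of whoever cast them. The function name signed_rank and the edge-dictionary format are hypothetical.

```python
def signed_rank(users, pos, neg, damping=0.85, iters=50):
    """pos and neg map (source, target) -> positive weight.

    Returns user -> score; PageRank over positive links minus a penalty
    from negative links cast by well-regarded users.
    """
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    out = {u: sum(w for (s, _), w in pos.items() if s == u) for u in users}

    for _ in range(iters):
        # Users with no positive out-links spread their rank evenly.
        dangling = sum(rank[u] for u in users if out[u] == 0)
        new = {u: (1 - damping) / n + damping * dangling / n for u in users}
        for (s, d), w in pos.items():
            new[d] += damping * rank[s] * w / out[s]
        rank = new

    # One-step distrust: a negative link costs more when its source ranks highly.
    neg_out = {u: sum(w for (s, _), w in neg.items() if s == u) for u in users}
    penalty = {u: 0.0 for u in users}
    for (s, d), w in neg.items():
        penalty[d] += rank[s] * w / neg_out[s]

    return {u: rank[u] - penalty[u] for u in users}
```

Because negative links never feed rank forward, a ring of sock-puppet accounts downmodding a target carries little weight unless those accounts have themselves earned positive links from established users.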

Analyzing user interactions as a network of positive and negative links also opens up further possibilities for assessing and grouping users. Small-world network theory is premised on the study of network nodes that exhibit clustering behavior. A clustering coefficient can be used to determine how self-contained a group of interlinked nodes is (what Durkheim would have called the group's dynamic density). A substantial number of software projects aim to analyze social networks in this manner. The advantage of analyzing networks rather than relying on ascriptive categories to generate communities is that each user's community will be a different set of users, which prevents systemic groupthink and the negative group dynamics that occur in closed/exclusive communities. The key criterion in maintaining a given user's community group will be their ability to maintain a level of consistent, constructive interaction with the users in their network neighborhood.
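For reference, the local clustering coefficient of a user is the fraction of pairs of that user's neighbors who are also linked to each other. A small sketch, assuming the interaction network has been reduced to an undirected graph in which an edge means two users have a constructive relationship (how that reduction is done is left open here):

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Local clustering coefficient of `node`.

    adj maps each user to the set of users they are linked with (undirected).
    """
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to check
    linked_pairs = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * linked_pairs / (k * (k - 1))


# A triangle (a, b, c) with one extra user d attached only to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

In this graph b and c sit in a fully closed neighborhood (coefficient 1.0), while a, who also talks to the outsider d, has a coefficient of 1/3.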

[snip]

[How should items be promoted?]

The Wikimedia essay 'voting considered harmful' encapsulates the solution neatly. They grapple with many of the same problems considered here: the problem of multiple identities (dupe voting), tactical/malicious voting, avoiding groupthink, and the stifling of constructive discourse. But whereas Wikimedia aims for consensus decisions, a healthy web forum might settle instead for constructive conversation. That is, instead of voting +1/-1, users would vote with their comments.

Two extreme cases of commenting demonstrate the value of this approach. First, fluff submissions (e.g. images on Digg) tend to get very few comments, and the majority are of low quality ("cool pic! thanks for submitting!"). Second, sensationalist flamebait articles will rack up high numbers of low quality comments, as users post indignant one-liners, flames, personal attacks, and trolls. With passive moderation as described above, a great many of these comments would have a hard time making it above a normal viewing threshold. With user moderation focused on comment quality rather than 'I agree with this post,' and an evaluation of quality that depends on comment dyads rather than single comments, back-and-forth flamewar threads, even if they racked up an impressive quantity of comments, would still have a very low quality of comments.

Constructive conversations (dyads of highly moderated comments) would be the key determinant of story promotion, not throwaway comments or flames. Because thoughtful comments take longer to construct and are premised on there being substantive content in the article (whether original content or a link), basing story promotion on comments will mitigate the problem of fluff articles on the front page. This method would also place the emphasis on the things important to sustaining a good site: user involvement and interaction. Any site can offer a collection of links, and those that do make commenting take the back seat (e.g. Digg and Reddit). Better sites offer a mix between being story driven and comment driven (e.g. Slashdot and k5). Still, a move toward being fully comment driven needs to take place.
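As an illustration of promotion driven by conversation rather than votes, the sketch below scores a story by counting its constructive dyads. The thread representation (a dictionary from comment id to parent id and quality) and the 1.0 threshold are assumptions for the example, not a prescribed format.

```python
def promotion_score(comments, threshold=1.0):
    """Count reply/parent pairs in which both comments are highly moderated.

    comments maps comment_id -> (parent_id or None, quality).
    """
    score = 0
    for parent_id, quality in comments.values():
        if parent_id is None or parent_id not in comments:
            continue  # top-level comment: not part of a dyad as the reply
        parent_quality = comments[parent_id][1]
        if quality >= threshold and parent_quality >= threshold:
            score += 1
    return score


# Six-comment flamewar versus a three-comment thoughtful exchange.
flamewar = {i: (i - 1 if i > 0 else None, 0.1) for i in range(6)}
discussion = {0: (None, 2.0), 1: (0, 1.5), 2: (1, 1.8)}
```

Here the flamewar scores 0 despite having twice as many comments, while the short discussion scores 2, which is the ordering the essay argues for.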

Second, Graham contrasts the top-down vs. bubble-up front pages of Slashdot & Digg and Reddit & Delicious/popular respectively. Top-down front pages are a simple temporal ordering of new stories, with no regard to the quality of conversation they produce. Graham notes that these encourage gaming of the story submission and promotion process, because new stories will occupy the top spot on the page and automatically command attention and clickthroughs. Bubble-up front pages allow the forum to decide on a criterion for a story's ascent to the top of the front page, balanced by a time-decay function. Delicious/popular pushes links up based on the number of bookmarks they've received, whereas Reddit and Hacker News move links based on up or down voting. (4chan occupies a median point between top-down and bubble-up methods, bumping threads to the top of Page 0 when they receive a new comment, tempered by limits on the maximum number of posts and images, and on the number of times each unique user can bump the thread.) Our hypothetical comment-driven forum would push stories up based on quality of conversation. Even if gaming the system could promote a story, it would not capture the top of the front page without being able to sustain users' interest enough to post thoughtful comments in response to the story and to one another.
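A bubble-up front page of this kind could be sketched as a conversation score divided by a time-decay term. The power-law decay, the gravity value of 1.8, and the two-hour offset are assumptions picked for illustration; the essay does not commit to any particular decay function.

```python
def front_page_rank(conversation_score, age_hours, gravity=1.8):
    """Conversation quality balanced against age; newer and better floats up."""
    return conversation_score / (age_hours + 2) ** gravity


def front_page(stories):
    """stories: list of (title, conversation_score, age_hours) tuples.

    Returns titles ordered from top of the front page to bottom.
    """
    ranked = sorted(stories, key=lambda s: front_page_rank(s[1], s[2]), reverse=True)
    return [title for title, _, _ in ranked]


stories = [
    ("fluff", 0, 1),       # new, but no constructive conversation
    ("old-good", 10, 24),  # strong conversation, a day old
    ("new-good", 6, 3),    # decent conversation, recent
]
```

With these numbers the recent story with a decent conversation leads, the day-old story with a strong conversation follows, and the brand-new fluff item sits last because newness alone earns nothing.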

[snip]