10/22/13: Annotating social acts: authority claims and alignment moves...; What to do about bad language

Many people remarked on looking at out-of-vocabulary (OOV) rates. (We later described these as lexical innovations, and then as neologisms.)

Q: When is something a true new lexical item, as opposed to a typo?
- The SRILM toolkit suggests using a character-based model when you hit an unrecognized token.
- Think about "successful" innovations (see refs on website).
An unusual case: adoption from one language into another (see the "Beefmoves" paper).

For alignment/authority: for other datasets, are there "cheap" labels? (E.g., in Thomas/Pang/Lee, we used votes as signals of [coarse-grained] agreement.)
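
As a concrete reference point for the OOV discussion above, here is a minimal sketch of how an OOV rate might be computed. The function name `oov_rate`, the toy vocabulary, and the whitespace tokenization are all assumptions made for illustration; they do not come from the readings or from any particular toolkit.

```python
from collections import Counter


def oov_rate(tokens, vocabulary):
    """Return (token-level, type-level) OOV rates of `tokens` against `vocabulary`.

    Hypothetical helper: assumes tokens are already normalized (e.g. lowercased)
    the same way the vocabulary was.
    """
    if not tokens:
        return 0.0, 0.0
    counts = Counter(tokens)
    # Types that never appear in the reference vocabulary.
    oov_types = [t for t in counts if t not in vocabulary]
    # Total occurrences of those types in the text.
    oov_tokens = sum(counts[t] for t in oov_types)
    return oov_tokens / len(tokens), len(oov_types) / len(counts)


# Toy data, made up for this example.
vocabulary = {"the", "cat", "sat", "on", "mat"}
tokens = "the cat sat on teh mat lol lol".split()
token_rate, type_rate = oov_rate(tokens, vocabulary)
```

Note that this count treats "teh" (likely a typo) and "lol" (arguably an innovation) identically. Telling them apart is exactly the open question raised above, and it needs further evidence such as frequency, spread across authors, or a character-level model.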