Proposals

    The first step is to provide the algorithm with sufficient world knowledge.  To extend the world knowledge, we should first investigate large sets of rules specifying which discourse entities can be assumed to be inter-inferable.  WordNet may be helpful for this purpose, though the problem then becomes strongly related to word sense disambiguation, which is itself an intricate problem.  Second, the algorithm should be scaled up to handle global discourse coherence.  The algorithm needs some way to acknowledge the title of the text.  Conceptually, having “read” the title, the algorithm would “point” to the corresponding set of relevant rules it derived from WordNet and map the noun phrases it encounters in the body of the text to that set of rules.  This corresponds to a human reading a title, recalling some existing knowledge of the topic, and then relating what (s)he recalls to what (s)he goes on to read.  The third step is to keep a separate list of frequently appearing words that occur in subject position.  The more frequently a word appears, the higher it is ranked in this second discourse entity list.  The most highly ranked element that matches the target referent becomes the antecedent when no antecedent can be found in the original discourse entity list.  Only words in subject position are included, because of the strong evidence (GGG93) that the subject is likely to be referred to again.  Since the algorithm processes the text incrementally, each element in the second entity list reflects only the number of occurrences of that element so far.  But just as the subject of a sentence (the probable center of the utterance) tends to appear toward the beginning of the sentence, the key words of a document also tend to appear toward the beginning of the document.  Thus the second entity list should provide the algorithm with a “global picture”, encouraging global coherence.
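
    To make the third step concrete, the sketch below (in Python, purely illustrative) shows one way the frequency-ranked subject list could act as a fallback when the original discourse entity list yields no antecedent.  The names SubjectFrequencyList, add_subject, best_match, resolve, and the agrees callback are hypothetical stand-ins; agrees abstracts over the three constraints the algorithm actually enforces.

        from collections import Counter

        class SubjectFrequencyList:
            """Fallback list of head nouns seen in subject position, ranked by
            how often each has appeared in the text read so far (hypothetical)."""

            def __init__(self):
                self.counts = Counter()

            def add_subject(self, head_noun):
                # Called whenever a noun phrase is encountered in subject position;
                # the count is incremental, so it reflects only the text so far.
                self.counts[head_noun.lower()] += 1

            def best_match(self, agrees):
                # Return the most frequent subject head that passes the caller's
                # agreement test (the three constraints), or None if nothing does.
                for head, _ in self.counts.most_common():
                    if agrees(head):
                        return head
                return None

        def resolve(primary_entities, subject_list, agrees):
            # Try the ordinary discourse entity list first; only if it yields no
            # antecedent fall back to the frequency-ranked subject list.
            for entity in primary_entities:
                if agrees(entity):
                    return entity
            return subject_list.best_match(agrees)

    Ties in frequency are broken arbitrarily here; a fuller implementation could prefer the more recently mentioned subject.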
 
    I also propose a solution to the problems arising from (a) incremental update and (b) a noun phrase of the form “A and B”, which evokes two different sets of discourse entities: the combined entity “A and B”, and the individual entities “A” and “B”.  The idea imitates what human beings do.  A human reads a sentence from left to right, processing each word incrementally.  On encountering one of the two situations above, the reader assigns the target referent an “unknown” tag without affecting his/her big-picture understanding of the document so far.  In situation (a), the reader knows from experience that the identity behind the “unknown” tag will be revealed later in the sentence, and replaces the tag with the first discourse entity that satisfies the three constraints.  In situation (b), there is more freedom and the tag may never be reassigned.  The reader knows that if this is the case, it is intended by the writer, and whichever interpretation (s)he chooses will not affect his/her comprehension of the document.  An algorithm can imitate what a human does in situation (a) with relative ease.  For situation (b), the algorithm must have both interpretations ready in the discourse entity list.  Their relative order does not matter, since at most one will be chosen, if any.
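
    The sketch below (again Python, purely illustrative) shows how an algorithm might realise both behaviours: an “unknown” tag that is filled in later for situation (a), and the insertion of both interpretations of “A and B” into the discourse entity list for situation (b).  PendingReferent, add_conjoined_np, try_resolve, and satisfies_constraints are hypothetical names; satisfies_constraints stands for the three constraints mentioned above.

        class PendingReferent:
            """A target referent assigned an 'unknown' tag during incremental
            processing; antecedent stays None if it is never resolved."""

            def __init__(self, surface_form):
                self.surface_form = surface_form
                self.antecedent = None

        def add_conjoined_np(np_a, np_b, entity_list):
            # Case (b): "A and B" contributes both the combined entity and the
            # two individual entities to the discourse entity list.  Their
            # relative order is immaterial, since at most one will be chosen.
            entity_list.extend([(np_a, np_b), np_a, np_b])

        def try_resolve(pending, entity_list, satisfies_constraints):
            # Case (a): replace the 'unknown' tag with the first discourse
            # entity, seen later in the sentence, that satisfies the three
            # constraints.
            for entity in entity_list:
                if satisfies_constraints(pending.surface_form, entity):
                    pending.antecedent = entity
                    return True
            # The tag stays 'unknown', which does not harm comprehension.
            return False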