Linkable.Analysis
Class AuthorSection

java.lang.Object
  |
  +--Linkable.Analysis.AuthorSection

public class AuthorSection
extends java.lang.Object

This class handles the parsing of the author section of a document. It decides whether to stay in the author section, and whether to skip text or look for an author name. The constructor takes an array of tag strings that are used to look for a potential author or author list. The SAX events (handleStartTag, handleEndTag, and handleText) are mirrored here in AuthorSection, which handles them specifically for the author section. None of these routines is called unless it is known that we are parsing the author section. After one of them returns false, none of them is called again for this paper.
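The calling contract described above (events are forwarded only while the author section is active, and never again after a false return) can be sketched roughly as follows. This is a minimal illustration, not the actual caller: TextHandler and AuthorSectionDriver are hypothetical names that do not appear in this package, and only the text event is modeled.

```java
/** Hypothetical stand-in for the text-handling part of AuthorSection. */
interface TextHandler {
    /** Returns false once the author section is considered finished. */
    boolean handleText(String textString);
}

/** Hypothetical caller that honors the "never call again after false" contract. */
final class AuthorSectionDriver {
    private final TextHandler handler;
    private boolean inAuthorSection = true;

    AuthorSectionDriver(TextHandler handler) {
        this.handler = handler;
    }

    /** Forwards text only while the handler still wants it; returns true if it was forwarded. */
    boolean text(String textString) {
        if (!inAuthorSection) {
            return false; // the handler already returned false once; do not call it again
        }
        inAuthorSection = handler.handleText(textString);
        return true;
    }

    boolean isInAuthorSection() {
        return inAuthorSection;
    }
}
```

Keeping the flag in the caller means the handler itself never needs to guard against late calls.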


Field Summary
private  ContextSection cs
           
private static boolean DEBUG
           
private  boolean grabAuthor
           
private  int index
           
private static java.lang.String ME
           
private  boolean notInTable
           
private  java.lang.String[] possibleStartTags
           
private  java.util.Vector startAuthorTags
           
private  java.util.Vector v
           
 
Constructor Summary
AuthorSection(ContextSection _cs, java.lang.String[] st)
           
 
Method Summary
private  boolean addText(char[] text, int offset, int length, java.lang.String textString)
          addText is given a hunk of text from the author section of a paper that is not a header (as determined by the parser) and parses it into one or more author names.
 Author[] getAuthors()
          getAuthors() returns an array of the authors seen by this AuthorSection object so far.
protected  boolean gotAuthors()
           
private  void handleAuthor(java.lang.String textString)
           
protected  void handleEndTag(java.lang.String name)
          Handles an end tag.
protected  boolean handleStartTag(java.lang.String name, org.xml.sax.AttributeList attrs)
          Given a tag and its attributes, determine whether this could be the start of a new author list.
protected  boolean handleText(char[] text, int offset, int length, java.lang.String textString)
          Given a string of text, check whether it could be the start of the document body; otherwise, if grabAuthor is true, parse it as an author name or author list.
private static boolean isEndOfAuthorSection(java.lang.String textString)
          isEndOfAuthorSection examines this hunk of text, which should be a header (as determined by the parser) and returns true if this could be the start of the body of the text.
private  void putAuthor(java.lang.String authorName)
           
 void setTable(boolean sw)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

ME

private static final java.lang.String ME

DEBUG

private static final boolean DEBUG

v

private java.util.Vector v

possibleStartTags

private java.lang.String[] possibleStartTags

startAuthorTags

private java.util.Vector startAuthorTags

index

private int index

cs

private ContextSection cs

notInTable

private boolean notInTable

grabAuthor

private boolean grabAuthor

Constructor Detail

AuthorSection

public AuthorSection(ContextSection _cs,
                     java.lang.String[] st)

Method Detail

setTable

public void setTable(boolean sw)

handleStartTag

protected boolean handleStartTag(java.lang.String name,
                                 org.xml.sax.AttributeList attrs)
Given a tag and its attributes, determine whether this could be the start of a new author list. If we are just beginning, this is also where the sequence of tags that starts up an author list is remembered.
Parameters:
name - The name of the tag that has been encountered, e.g. "p"
attrs - The AttributeList on that tag (relevant for HTML, TeX)
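A rough sketch of the idea behind handleStartTag: compare the incoming tag against the configured candidates and, while still at the beginning, remember the sequence of tags that led up to the author list. The field names possibleStartTags, startAuthorTags, and grabAuthor come from this class; the matching rule (case-insensitive equality), the method name couldStartAuthorList, and the omission of the attrs parameter are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the actual AuthorSection logic. */
final class StartTagSketch {
    private final String[] possibleStartTags;
    private final List<String> startAuthorTags = new ArrayList<>();
    private boolean grabAuthor = false;

    StartTagSketch(String[] possibleStartTags) {
        this.possibleStartTags = possibleStartTags.clone();
    }

    /** Returns true if the tag is one of the configured author-list start tags. */
    boolean couldStartAuthorList(String name) {
        boolean matches = false;
        for (String candidate : possibleStartTags) {
            if (candidate.equalsIgnoreCase(name)) { // assumed matching rule
                matches = true;
                break;
            }
        }
        if (!grabAuthor) {
            // Still "just beginning": remember the tag sequence that opens the author list.
            startAuthorTags.add(name);
        }
        if (matches) {
            grabAuthor = true; // from here on, text may be parsed as author names
        }
        return matches;
    }

    /** Returns a copy of the remembered start-tag sequence. */
    List<String> rememberedTags() {
        return new ArrayList<>(startAuthorTags);
    }
}
```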

handleEndTag

protected void handleEndTag(java.lang.String name)
Handles an end tag by backing out of properly nested tags. Once we have backed out all the way, we can consider tags that start new authors or a Context Section.
Parameters:
name - The name of the element or environment that is being ended.
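"Backing out of properly nested tags" can be pictured with a small stack of open tags. The sketch below is an assumption about the general technique, not a description of how this class tracks nesting (it may well use the index field instead); NestingSketch and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of tracking nesting depth with a stack of open tags. */
final class NestingSketch {
    private final Deque<String> open = new ArrayDeque<>();

    void startTag(String name) {
        open.push(name);
    }

    /**
     * Pops the innermost open tag if it matches the one being ended.
     * Returns true once every open tag has been backed out of.
     */
    boolean endTag(String name) {
        if (!open.isEmpty() && open.peek().equals(name)) {
            open.pop();
        }
        return open.isEmpty();
    }
}
```

A true result corresponds to the point at which new author-start tags or a Context Section could be considered again.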

handleText

protected boolean handleText(char[] text,
                             int offset,
                             int length,
                             java.lang.String textString)
Given a string of text, check whether it could be the start of the document body; otherwise, if grabAuthor is true, parse it as an author name or author list.
Parameters:
textString - The string of text, including newlines and whitespace

isEndOfAuthorSection

private static boolean isEndOfAuthorSection(java.lang.String textString)
isEndOfAuthorSection examines this hunk of text, which should be a header (as determined by the parser) and returns true if this could be the start of the body of the text.
Parameters:
textString - The hunk of text as a String
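The documentation does not say which headers count as the start of the body. As a purely illustrative guess, a check of this kind might normalize the header and compare it against a few common section titles; the keyword list and the stripping of leading section numbers below are assumptions, and EndOfAuthorSectionSketch is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical heuristic; the real isEndOfAuthorSection may differ entirely. */
final class EndOfAuthorSectionSketch {
    // Assumed list of headers that typically open the body of a paper.
    private static final Set<String> BODY_HEADERS =
            new HashSet<>(Arrays.asList("abstract", "introduction", "summary"));

    static boolean isEndOfAuthorSection(String textString) {
        if (textString == null) {
            return false;
        }
        String header = textString.trim()
                // Drop a leading section number such as "1." or "1.2 ".
                .replaceFirst("^\\d+(\\.\\d+)*\\.?\\s*", "")
                .toLowerCase(Locale.ROOT);
        return BODY_HEADERS.contains(header);
    }
}
```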

addText

private boolean addText(char[] text,
                        int offset,
                        int length,
                        java.lang.String textString)
addText is given a hunk of text from the author section of a paper that is not a header (as determined by the parser) and parses it into one or more author names.
Parameters:
text - Hunk of text in char[] form
offset - Where in the text the author string starts
length - How long the string is
textString - Hunk of text in String form
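To make the text, offset, and length parameters concrete, here is a minimal sketch of turning such a hunk into individual names. The separators used (commas, semicolons, ampersands, the word "and", and newlines) are an assumption; the actual splitting rules of addText are not documented here, and AuthorTextSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of splitting an author hunk into names. */
final class AuthorTextSketch {
    // Assumed separators between author names.
    private static final String SEPARATORS = "\\s*(?:,|;|&|\\band\\b|\\n)\\s*";

    static List<String> parseAuthors(char[] text, int offset, int length) {
        // Only the window [offset, offset + length) belongs to this hunk.
        String hunk = new String(text, offset, length);
        List<String> names = new ArrayList<>();
        for (String piece : hunk.split(SEPARATORS)) {
            String name = piece.trim();
            if (!name.isEmpty()) { // skip gaps such as the one in ", and"
                names.add(name);
            }
        }
        return names;
    }
}
```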

getAuthors

public Author[] getAuthors()
getAuthors() returns an array of the authors seen by this AuthorSection object so far.
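One plausible way to produce such an array is to collect authors in a Vector and copy it out on request. In the sketch below, the Author class is a hypothetical minimal stand-in (the real Author type is defined elsewhere), and the assumption that a Vector backs the list is suggested only by the private Vector fields of this class.

```java
import java.util.Vector;

/** Hypothetical minimal stand-in for the real Author class. */
final class Author {
    private final String name;

    Author(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

/** Hypothetical sketch of accumulating authors and exposing them as an array. */
final class AuthorListSketch {
    private final Vector<Author> authors = new Vector<>();

    void putAuthor(String authorName) {
        authors.add(new Author(authorName));
    }

    boolean gotAuthors() {
        return !authors.isEmpty();
    }

    /** Returns a snapshot; authors added later do not appear in an earlier array. */
    Author[] getAuthors() {
        return authors.toArray(new Author[0]);
    }
}
```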

handleAuthor

private void handleAuthor(java.lang.String textString)

putAuthor

private void putAuthor(java.lang.String authorName)

gotAuthors

protected boolean gotAuthors()