Linkable.Analysis
Class AuthorSection

java.lang.Object
  |
  +--Linkable.Analysis.AuthorSection

public class AuthorSection
extends java.lang.Object

This class handles the parsing of the author section of a document. It decides whether to stay in the author section, and whether to skip text or look for an author name. The constructor takes an array of tag strings that are used to look for a potential author or author list. The SAX events (handleStartTag, handleEndTag, and handleText) are mirrored here in AuthorSection, which handles them specifically for the author section. None of these routines is called unless it is known that we are parsing the author section. After one of these routines returns false, none of them will be called again for this paper.
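
The calling contract described above (events are forwarded only while the author section is active, and never again once a handler returns false) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual parser code: `SectionHandler`, `AuthorSectionDriver`, and `RecordingHandler` are hypothetical names, the callbacks use simplified signatures, and the "Abstract" check is only a stand-in for whatever the real class does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three mirrored callbacks (simplified signatures).
interface SectionHandler {
    boolean handleStartTag(String name);
    void handleEndTag(String name);
    boolean handleText(String textString);
}

// Forwards parser events to the handler only while the author section is active.
class AuthorSectionDriver {
    private final SectionHandler handler;
    private boolean inAuthorSection = true;

    AuthorSectionDriver(SectionHandler handler) {
        this.handler = handler;
    }

    void startTag(String name) {
        if (inAuthorSection && !handler.handleStartTag(name)) {
            inAuthorSection = false; // a false return ends forwarding for this paper
        }
    }

    void endTag(String name) {
        if (inAuthorSection) {
            handler.handleEndTag(name);
        }
    }

    void text(String textString) {
        if (inAuthorSection && !handler.handleText(textString)) {
            inAuthorSection = false;
        }
    }

    boolean isInAuthorSection() {
        return inAuthorSection;
    }
}

// Toy handler (assumption): records text until it sees an "Abstract" header.
class RecordingHandler implements SectionHandler {
    final List<String> seen = new ArrayList<>();

    public boolean handleStartTag(String name) { return true; }

    public void handleEndTag(String name) { }

    public boolean handleText(String textString) {
        if (textString.trim().equalsIgnoreCase("Abstract")) {
            return false;
        }
        seen.add(textString.trim());
        return true;
    }
}
```

Keeping the "still in the author section" flag in the caller means the handler itself never needs to guard against late events.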

Field Summary

private ContextSection cs


private static boolean DEBUG


private boolean grabAuthor


private int index


private static java.lang.String ME


private boolean notInTable


private java.lang.String[] possibleStartTags


private java.util.Vector startAuthorTags


private java.util.Vector v


Constructor Summary

AuthorSection(ContextSection _cs, java.lang.String[] st)


Method Summary

private boolean addText(char[] text, int offset, int length, java.lang.String textString)
          addText is given a hunk of text from the author section of a paper that is not a header (as determined by the parser) and parses it into one or more author names.

Author[] getAuthors()
          getAuthors() returns an Author[] array of the authors seen by this AuthorSection object so far.

protected boolean gotAuthors()


private void handleAuthor(java.lang.String textString)


protected void handleEndTag(java.lang.String name)
          Handles an end tag.

protected boolean handleStartTag(java.lang.String name, org.xml.sax.AttributeList attrs)
          Given a tag and its attributes, determine whether this could be the start of a new author list.

protected boolean handleText(char[] text, int offset, int length, java.lang.String textString)
          Given a string of text, check whether it could be the start of the document body; otherwise, if grabAuthor is true, parse it as an author name or author list.

private static boolean isEndOfAuthorSection(java.lang.String textString)
          isEndOfAuthorSection examines this hunk of text, which should be a header (as determined by the parser), and returns true if it could be the start of the body of the text.

private void putAuthor(java.lang.String authorName)


void setTable(boolean sw)


Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait

Field Detail

ME

private static final java.lang.String ME

DEBUG

private static final boolean DEBUG

v

private java.util.Vector v

possibleStartTags

private java.lang.String[] possibleStartTags

startAuthorTags

private java.util.Vector startAuthorTags

index

private int index

cs

private ContextSection cs

notInTable

private boolean notInTable

grabAuthor

private boolean grabAuthor

Constructor Detail

AuthorSection

public AuthorSection(ContextSection _cs,
                     java.lang.String[] st)

Method Detail

setTable

public void setTable(boolean sw)

handleStartTag

protected boolean handleStartTag(java.lang.String name,
                                 org.xml.sax.AttributeList attrs)

Given a tag and its attributes, determine whether this could be the start of a new author list. If we are just beginning, this is also where the sequence of tags that starts up an author list is remembered.

Parameters:
    name - the name of the tag that has been encountered, e.g. "p"
    attrs - the AttributeList on that tag (relevant for HTML, TeX)
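
One way to picture the behaviour described above is a tracker that checks each tag against the configured candidates and, while still at the beginning, records the sequence of tags that opened the author list. This is a sketch under assumptions: `StartTagTracker`, `firstTextSeen`, and `rememberedStartTags` are hypothetical names, attributes are omitted, and the meaning given to the boolean return value here ("could start an author list") is a guess rather than the documented contract.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; field names loosely echo possibleStartTags and startAuthorTags.
class StartTagTracker {
    private final List<String> possibleStartTags = new ArrayList<>();
    private final List<String> startAuthorTags = new ArrayList<>();
    private boolean justBeginning = true;

    StartTagTracker(String[] st) {
        for (String tag : st) {
            possibleStartTags.add(tag.toLowerCase(Locale.ROOT));
        }
    }

    // Returns true if this tag could be the start of a new author list.
    boolean handleStartTag(String name) {
        String tag = name.toLowerCase(Locale.ROOT);
        boolean couldStartAuthorList = possibleStartTags.contains(tag);
        if (justBeginning && couldStartAuthorList) {
            startAuthorTags.add(tag); // remember the opening sequence
        }
        return couldStartAuthorList;
    }

    // Assumption: the remembered sequence is frozen once the first text arrives.
    void firstTextSeen() {
        justBeginning = false;
    }

    // Returns a copy so callers cannot modify the remembered sequence.
    List<String> rememberedStartTags() {
        return new ArrayList<>(startAuthorTags);
    }
}
```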

handleEndTag

protected void handleEndTag(java.lang.String name)

Handles an end tag by backing out of properly nested tags. Once we have backed out all the way, we can consider tags that start new authors or a Context Section.

Parameters:
    name - the name of the element or environment that is being ended
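
The "backing out" of nested tags can be illustrated with a simple stack. The class below is a hypothetical sketch (`TagNesting` and its methods are not part of this API), and ignoring stray end tags is an assumption made only to keep the example short.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of tracking properly nested tags.
class TagNesting {
    private final Deque<String> open = new ArrayDeque<>();

    void startTag(String name) {
        open.push(name);
    }

    // Pops the innermost open tag if it matches; stray end tags are ignored.
    void endTag(String name) {
        if (!open.isEmpty() && open.peek().equals(name)) {
            open.pop();
        }
    }

    // True once every opened tag has been closed again.
    boolean backedOutCompletely() {
        return open.isEmpty();
    }
}
```

Only when `backedOutCompletely()` holds would a caller go back to looking for tags that start a new author or a Context Section.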

handleText

protected boolean handleText(char[] text,
                             int offset,
                             int length,
                             java.lang.String textString)

Given a string of text, check whether it could be the start of the document body; otherwise, if grabAuthor is true, parse it as an author name or author list.

Parameters:
    textString - the string of text, including newlines and whitespace

isEndOfAuthorSection

private static boolean isEndOfAuthorSection(java.lang.String textString)

isEndOfAuthorSection examines this hunk of text, which should be a header (as determined by the parser), and returns true if it could be the start of the body of the text.

Parameters:
    textString - the hunk of text as a String
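
The actual test applied to the header is not documented here. Purely as an illustration, a keyword heuristic of the following kind would fit the description; the class name `HeaderHeuristic`, the keyword list, and the handling of leading section numbers are all assumptions.

```java
import java.util.Locale;

// Hypothetical heuristic; the keyword list is an assumption, not the documented behaviour.
class HeaderHeuristic {
    private static final String[] BODY_HEADERS = {"abstract", "introduction", "summary"};

    static boolean isEndOfAuthorSection(String textString) {
        // Normalize case and whitespace, then drop a leading section number such as "1." or "1)".
        String header = textString.trim()
                .toLowerCase(Locale.ROOT)
                .replaceFirst("^[0-9]+[.)]?\\s*", "");
        for (String keyword : BODY_HEADERS) {
            if (header.startsWith(keyword)) {
                return true;
            }
        }
        return false;
    }
}
```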

addText

private boolean addText(char[] text,
                        int offset,
                        int length,
                        java.lang.String textString)

addText is given a hunk of text from the author section of a paper that is not a header (as determined by the parser) and parses it into one or more author names.

Parameters:
    text - the hunk of text in char[] form
    offset - where in the text the author string starts
    length - how long the string is
    textString - the hunk of text in String form
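
How the text is actually split into names is not specified. As a rough sketch, assuming authors are separated by commas, semicolons, ampersands, or the word "and", the char[]/offset/length triple could be handled like this (`AuthorTextSplitter` and `splitAuthors` are hypothetical names):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the separator rules are assumptions.
class AuthorTextSplitter {
    static List<String> splitAuthors(char[] text, int offset, int length) {
        // Materialize only the slice of the buffer that holds the author string.
        String hunk = new String(text, offset, length);
        List<String> names = new ArrayList<>();
        // Split on commas, semicolons, ampersands, or the standalone word "and".
        for (String part : hunk.split("\\s*(?:,|;|&|\\band\\b)\\s*")) {
            String name = part.trim().replaceAll("\\s+", " ");
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

The word-boundary match keeps names such as "Sandra" intact, since "and" is only treated as a separator when it stands alone.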

getAuthors

public Author[] getAuthors()

getAuthors() returns an Author[] array of the authors seen by this AuthorSection object so far.
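
A minimal sketch of the accumulate-then-snapshot pattern this suggests, assuming the authors are collected in a Vector as the field summary hints. The `Author` class below is a hypothetical stand-in with a single name field (the real Author class is not documented on this page), and `AuthorCollector` is an illustrative name.

```java
import java.util.Vector;

// Hypothetical stand-in for the real Author class.
class Author {
    final String name;

    Author(String name) {
        this.name = name;
    }
}

// Hypothetical sketch of collecting authors and exposing them as an array.
class AuthorCollector {
    private final Vector<Author> v = new Vector<>();

    void putAuthor(String authorName) {
        v.add(new Author(authorName));
    }

    boolean gotAuthors() {
        return !v.isEmpty();
    }

    // Returns a new array holding the authors seen so far.
    Author[] getAuthors() {
        return v.toArray(new Author[0]);
    }
}
```

Because `getAuthors()` copies into a new array, authors added later do not change an array that was returned earlier.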

handleAuthor

private void handleAuthor(java.lang.String textString)

putAuthor

private void putAuthor(java.lang.String authorName)

gotAuthors

protected boolean gotAuthors()