uk.ac.soton.harvester
Class EntityEncoder

java.lang.Object
  |
  +--uk.ac.soton.harvester.EntityEncoder

public class EntityEncoder
extends java.lang.Object

EntityEncoder is a convenience class that allows the deciter class to encode entity strings directly, without using an EntityWriter. Its purpose is to change non-ASCII characters in strings to their ISO-Latin-1 entity name equivalents.


Field Summary
(package private)  java.util.Dictionary d
          d provides a reverse lookup from character number to entity name
 
Constructor Summary
EntityEncoder()
           
 
Method Summary
(package private)  java.lang.String encode(java.lang.String s)
          encode encodes any unusual characters in a string as ISO-Latin-1 entities.
(package private)  java.lang.String entName(char ch)
          entName is a wrapper function which guarantees a safe name for a character position.
(package private)  java.lang.String PCDATA(java.lang.String s)
          PCDATA is just an alias for encode.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

d

java.util.Dictionary d
d provides a reverse lookup from character number to entity name
Constructor Detail

EntityEncoder

public EntityEncoder()
Method Detail

entName

java.lang.String entName(char ch)
entName is a wrapper function which guarantees a safe name for a character position. It defaults to "unknown" for pathological cases.
Parameters:
ch - character value to look up
Returns:
ISO-Latin-1 entity name of the character parameter (or "unknown" in pathological cases).
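
The lookup-with-fallback behaviour described for entName might look roughly like the sketch below. This is an illustration only, not the actual harvester source: the class name EntNameSketch, the static table, and its four sample entries are assumptions; only the "unknown" fallback and the use of a java.util.Dictionary keyed by character number come from this page.

```java
import java.util.Dictionary;
import java.util.Hashtable;

// Hypothetical stand-in for EntityEncoder; not the real implementation.
class EntNameSketch {
    // Reverse lookup from character number to entity name (sample entries only).
    static final Dictionary<Integer, String> d = new Hashtable<>();
    static {
        d.put(38, "amp");      // '&'
        d.put(60, "lt");       // '<'
        d.put(62, "gt");       // '>'
        d.put(233, "eacute");  // e with acute accent
    }

    // Returns the entity name for ch, or "unknown" when the table has no entry.
    static String entName(char ch) {
        // Cast to int so the key boxes to Integer, matching the table's key type.
        String name = d.get((int) ch);
        return name != null ? name : "unknown";
    }

    public static void main(String[] args) {
        System.out.println(entName('\u00e9'));
        System.out.println(entName('\u4e2d'));
    }
}
```

Because the fallback is applied inside entName, callers never have to handle a null name, which appears to be the "safe name" guarantee the description refers to.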

PCDATA

java.lang.String PCDATA(java.lang.String s)
PCDATA is just an alias for encode.

encode

java.lang.String encode(java.lang.String s)
encode encodes any unusual characters in a string as ISO-Latin-1 entities. Ordinary ASCII characters are left untouched. Some "ordinary" characters ('&', '<', '>') must also be escaped to conform to the XML standard. For example, "Carr & René" is transformed into "Carr &amp; Ren&eacute;".
Parameters:
s - the string to process
Returns:
the string with embedded characters replaced by entity names
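
As a rough illustration of the behaviour described for encode and PCDATA, the following sketch walks the string one character at a time. It is not the harvester's own code: the class name EncodeSketch, the NAMES table, and its sample entries are assumptions, and the use of "unknown" for unmapped characters is borrowed from the entName description rather than confirmed for encode.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for EntityEncoder; not the real implementation.
class EncodeSketch {
    // Sample entity table (assumed); a full table would cover all of ISO-Latin-1.
    static final Map<Character, String> NAMES = new HashMap<>();
    static {
        NAMES.put('&', "amp");
        NAMES.put('<', "lt");
        NAMES.put('>', "gt");
        NAMES.put('\u00e9', "eacute");
    }

    // Replaces XML-reserved and non-ASCII characters with named entities.
    static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            boolean reserved = ch == '&' || ch == '<' || ch == '>';
            if (ch < 128 && !reserved) {
                out.append(ch); // ordinary ASCII is left untouched
            } else {
                String name = NAMES.get(ch);
                out.append('&').append(name != null ? name : "unknown").append(';');
            }
        }
        return out.toString();
    }

    // PCDATA simply delegates to encode, mirroring the alias described above.
    static String PCDATA(String s) {
        return encode(s);
    }

    public static void main(String[] args) {
        // \u00e9 is e with acute accent, written as an escape to avoid source-encoding issues.
        System.out.println(encode("Carr & Ren\u00e9"));
    }
}
```

Under these assumptions, encode("Carr & Ren\u00e9") yields "Carr &amp; Ren&eacute;", matching the example in the method description.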