uk.ac.soton.harvester
Class EntityReader
java.lang.Object
|
+--java.io.Reader
|
+--java.io.BufferedReader
|
+--uk.ac.soton.harvester.EntityReader
- public class EntityReader
- extends java.io.BufferedReader
EntityReader extends the behaviour of BufferedReader
so that any ISO-Latin-1 entities are replaced by their
ASCII/Unicode characters.
This class accompanies EntityWriter to allow
the processor to read data in from and write data out
to XML-based files.
Field Summary |
(package private) java.util.Dictionary |
d
d provides a lookup from entity name
to character number |
Fields inherited from class java.io.Reader |
lock |
Constructor Summary |
(package private) |
EntityReader(java.io.Reader in)
The main constructor allows an EntityReader
to be based on any kind of Reader. |
Method Summary |
(package private) java.lang.String |
entLookup(java.lang.String name)
entLookup is a wrapper function which guarantees a char
for an entity name. |
(package private) java.lang.String |
entString(java.lang.String s)
entString decodes any unusual characters in a string from
ISO-Latin-1 entities. |
java.lang.String |
readLine()
|
Methods inherited from class java.io.BufferedReader |
close,
mark,
markSupported,
read,
read,
readLine,
ready,
reset,
skip |
Methods inherited from class java.io.Reader |
read |
Methods inherited from class java.lang.Object |
clone,
equals,
finalize,
getClass,
hashCode,
notify,
notifyAll,
toString,
wait,
wait,
wait |
d
java.util.Dictionary d
- d provides a lookup from entity name
to character number
EntityReader
EntityReader(java.io.Reader in)
- The main constructor allows an EntityReader
to be based on any kind of Reader.
entLookup
java.lang.String entLookup(java.lang.String name)
- entLookup is a wrapper function which guarantees a char
for an entity name. It defaults to "_" for unrecognised entities.
- Parameters:
name
- entity name to be looked up. name may in fact be a
number of the form #n according to the rules of XML.
- Returns:
- String value of length 1, whose first character is the
character represented by the entity name given as a parameter
(or "_" in pathological cases).
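The lookup described above can be sketched as follows. This is a minimal illustration, not the original source: the class name EntLookupSketch, the table D and its four entries are assumptions, and only the decimal #n form mentioned above is handled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for EntityReader.entLookup; not the original source.
class EntLookupSketch {
    // Assumed subset of the entity table: entity name -> character number.
    static final Map<String, Integer> D = new HashMap<>();
    static {
        D.put("amp", 38);
        D.put("lt", 60);
        D.put("gt", 62);
        D.put("eacute", 233);
    }

    // Always returns a String of length 1; "_" stands in for anything unrecognised.
    static String entLookup(String name) {
        if (name == null || name.isEmpty()) {
            return "_";
        }
        if (name.charAt(0) == '#') {
            // Numeric reference of the form #n (decimal only in this sketch).
            try {
                int code = Integer.parseInt(name.substring(1));
                if (code >= 0 && code <= 0xFFFF) {
                    return String.valueOf((char) code);
                }
            } catch (NumberFormatException e) {
                // Malformed number: fall through to the default below.
            }
            return "_";
        }
        Integer code = D.get(name);
        return code == null ? "_" : String.valueOf((char) code.intValue());
    }

    public static void main(String[] args) {
        System.out.println(entLookup("amp"));
        System.out.println(entLookup("#233"));
        System.out.println(entLookup("nosuch"));
    }
}
```

Returning "_" rather than throwing keeps the caller's decoding loop simple, at the cost of hiding malformed input.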
entString
java.lang.String entString(java.lang.String s)
- entString decodes any unusual characters in a string from
ISO-Latin-1 entities. Ordinary ASCII characters are
left untouched. Some "ordinary" characters ('&', '<', '>')
are also encoded as entities to conform to the XML standard.
e.g.
"Carr &amp; Ren&eacute;" is transformed into
"Carr & René".
- Parameters:
s
- the string to process
- Returns:
- the string with embedded entity names replaced.
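One way to implement the decoding described above is a single regular-expression pass over the string. This is a sketch under stated assumptions, not the original implementation: the class name EntStringSketch, the pattern ENTITY, the helper lookup and the small TABLE are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for EntityReader.entString; not the original source.
class EntStringSketch {
    // Assumed subset of the entity table: entity name -> character.
    static final Map<String, Character> TABLE = new HashMap<>();
    static {
        TABLE.put("amp", '&');
        TABLE.put("lt", '<');
        TABLE.put("gt", '>');
        TABLE.put("eacute", '\u00e9');
    }

    // Matches &name; and &#n; references.
    static final Pattern ENTITY = Pattern.compile("&(#?[A-Za-z0-9]+);");

    // Resolves one entity name, defaulting to "_" when it is not recognised.
    static String lookup(String name) {
        if (name.startsWith("#")) {
            try {
                int code = Integer.parseInt(name.substring(1));
                if (code >= 0 && code <= 0xFFFF) {
                    return String.valueOf((char) code);
                }
            } catch (NumberFormatException e) {
                // Malformed number: fall through to the default below.
            }
            return "_";
        }
        Character c = TABLE.get(name);
        return c == null ? "_" : String.valueOf(c);
    }

    // Replaces every entity reference in s; all other text is copied unchanged.
    static String entString(String s) {
        Matcher m = ENTITY.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement stops '$' and '\' in the result being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(lookup(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(entString("Carr &amp; Ren&eacute;"));
    }
}
```

Because the string is scanned once, text produced by a replacement is not decoded a second time.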
readLine
public java.lang.String readLine()
throws java.io.IOException
- Overrides:
- readLine in class java.io.BufferedReader
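The page does not describe what the override does. Given the class description, a plausible reading is that each line is read by BufferedReader and then decoded. The sketch below illustrates that pattern only: the class DecodingReaderSketch and its decoder parameter are hypothetical, standing in for EntityReader and entString.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.UnaryOperator;

// Hypothetical illustration of a BufferedReader whose readLine() decodes each line.
class DecodingReaderSketch extends BufferedReader {
    // Assumed hook: the per-line decoding step (entString in the real class).
    private final UnaryOperator<String> decoder;

    DecodingReaderSketch(Reader in, UnaryOperator<String> decoder) {
        super(in);
        this.decoder = decoder;
    }

    @Override
    public String readLine() throws IOException {
        String line = super.readLine();
        // null marks end of stream and must be passed through undecoded.
        return line == null ? null : decoder.apply(line);
    }

    public static void main(String[] args) throws IOException {
        Reader source = new StringReader("Carr &amp; Co\nsecond line");
        try (DecodingReaderSketch in =
                 new DecodingReaderSketch(source, s -> s.replace("&amp;", "&"))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```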