java.lang.Object
  anl.aida.reader.AbstractAIDAComponentReader
    anl.aida.reader.ProMedReader
public class ProMedReader extends AbstractAIDAComponentReader
AIDAComponentReader implementation for reading archived ProMed files. "Archived" means that the messages are saved as individual files in a directory rather than in an email format. The file names should match ProMedMail_*.txt, where * is a number. The reader processes all the files in the directory.
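The naming rule above can be illustrated with a small, self-contained sketch. `ProMedFileNames` is a hypothetical helper and is not part of `anl.aida.reader`; it assumes that `*` stands for one or more digits and that sorting by path is an acceptable order. The order in which ProMedReader itself visits files is not documented here.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of anl.aida.reader. */
final class ProMedFileNames {
    // ProMedMail_<digits>.txt (assumption: "*" means one or more digits).
    private static final Pattern NAME = Pattern.compile("ProMedMail_\\d+\\.txt");

    private ProMedFileNames() {}

    /** Returns true if the file name follows the ProMedMail_*.txt convention. */
    static boolean matches(String fileName) {
        return NAME.matcher(fileName).matches();
    }

    /** Lists the matching regular files in dir, sorted by path for a repeatable order. */
    static List<Path> list(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        // The glob narrows the candidates; the regex then enforces the digit rule.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "ProMedMail_*.txt")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && matches(p.getFileName().toString())) {
                    result.add(p);
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : list(dir)) {
            System.out.println(p);
        }
    }
}
```

Note that sorting by path is lexicographic, so `ProMedMail_10.txt` sorts before `ProMedMail_2.txt`; a numeric sort would need to parse the digits.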
Field Summary | |
---|---|
private java.util.Set<java.lang.String> | badTitleWords |
private java.util.Date | date |
private PreTextExtractor | extractor |
private java.io.Reader | in |
static java.lang.String | INDEX_FILE |
private ProMedPreProcessor | proc |
private ProMedTxtReader | reader |
Fields inherited from class anl.aida.reader.AbstractAIDAComponentReader |
---|
indexIter, lineItems, location, processors, startDate |
Fields inherited from interface anl.aida.reader.AIDAComponentReader |
---|
MESSAGE_DIGEST |
Constructor Summary | |
---|---|
ProMedReader() | |
Method Summary | |
---|---|
void | close() - Closes this MIFSComponentReader and any resources it may have opened. |
protected java.lang.String | getDocumentURL() - Gets the URL of the current document. |
protected java.lang.String | getIndexFileKey() - Gets the name of the parameter key for the index file. |
protected ReaderResult | getNextResult() - Gets the next ReaderResult. |
boolean | hasNext() - Gets whether or not this MIFSComponentReader has more documents to process. |
private void | incrementFile() |
void | initialize(org.apache.uima.resource.ConfigurableResource resource, java.util.Date cacheStartDate) - Initializes this MIFSComponentReader, optionally using the resource. |
private boolean | isUsefulContent(java.lang.String title) |
private void | next() |
protected void | postNext() - Called at the completion of AbstractAIDAComponentReader.getNext(CAS). |
Methods inherited from class anl.aida.reader.AbstractAIDAComponentReader |
---|
checkDate, getNext |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String INDEX_FILE
private ProMedTxtReader reader
private ProMedPreProcessor proc
private java.io.Reader in
private PreTextExtractor extractor
private java.util.Set<java.lang.String> badTitleWords
private java.util.Date date
Constructor Detail |
---|
public ProMedReader()
Method Detail |
---|
public void close() throws java.io.IOException
  Description copied from interface: AIDAComponentReader
  Closes this MIFSComponentReader and any resources it may have opened.
  close in interface AIDAComponentReader
  close in class AbstractAIDAComponentReader
  Throws: java.io.IOException - if there is an error closing the reader.
protected ReaderResult getNextResult() throws java.io.IOException
  Description copied from class: AbstractAIDAComponentReader
  Gets the next ReaderResult.
  getNextResult in class AbstractAIDAComponentReader
  Throws: java.io.IOException - if there is an error getting the result.
protected void postNext() throws java.io.IOException
  Description copied from class: AbstractAIDAComponentReader
  Called at the completion of AbstractAIDAComponentReader.getNext(CAS).
  postNext in class AbstractAIDAComponentReader
  Throws: java.io.IOException - if there is an error getting the result.
protected java.lang.String getIndexFileKey()
  Description copied from class: AbstractAIDAComponentReader
  Gets the name of the parameter key for the index file.
  getIndexFileKey in class AbstractAIDAComponentReader
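The relationship between getNextResult() and postNext() described above resembles a template method with a completion hook. The following is a minimal sketch of that pattern under stated assumptions: `SketchReader` and `SketchListReader` are hypothetical stand-ins, `String` replaces ReaderResult, the CAS argument is omitted, and the methods declare no checked exceptions, whereas the real methods declare java.io.IOException. Whether the real base-class methods are abstract is not stated in this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for AbstractAIDAComponentReader; the real class works on a UIMA CAS. */
abstract class SketchReader {
    /** Template method: fetch the next result, then run the completion hook. */
    public final String getNext() {
        String result = getNextResult();
        postNext(); // runs once getNext has obtained its result
        return result;
    }

    /** Supplies the next result; String stands in for ReaderResult. */
    protected abstract String getNextResult();

    /** Hook called at the completion of getNext. */
    protected abstract void postNext();
}

/** Hypothetical subclass playing the role of ProMedReader, backed by an in-memory list. */
final class SketchListReader extends SketchReader {
    private final List<String> docs;
    private int position = 0;
    /** Records the call order so the sequence can be inspected. */
    final List<String> log = new ArrayList<>();

    SketchListReader(List<String> docs) {
        this.docs = docs;
    }

    boolean hasNext() {
        return position < docs.size();
    }

    @Override
    protected String getNextResult() {
        log.add("getNextResult");
        return docs.get(position);
    }

    @Override
    protected void postNext() {
        log.add("postNext");
        position++; // advance only after the current document has been handed out
    }
}
```

In this sketch the subclass advances its position inside the hook, which keeps getNextResult() free of side effects on the cursor; that split is a design choice of the example, not a documented property of ProMedReader.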
public void initialize(org.apache.uima.resource.ConfigurableResource resource, java.util.Date cacheStartDate) throws org.apache.uima.resource.ResourceInitializationException
  Description copied from interface: AIDAComponentReader
  Initializes this MIFSComponentReader, optionally using the resource.
  initialize in interface AIDAComponentReader
  initialize in class AbstractAIDAComponentReader
  Parameters: resource - the resource to use for configuration
  Throws: org.apache.uima.resource.ResourceInitializationException - if there is an error initializing the reader
private void next() throws java.io.IOException
  Throws: java.io.IOException
private boolean isUsefulContent(java.lang.String title)
private void incrementFile() throws java.io.IOException
  Throws: java.io.IOException
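This page does not document what isUsefulContent(String title) checks. Given the badTitleWords field, one plausible reading is that a title is rejected when it contains an excluded word. The sketch below illustrates that idea only; `TitleFilter`, the case-insensitive matching, the word-splitting rule, and the sample words are all assumptions, not the behavior of ProMedReader.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical title filter; the real isUsefulContent logic is not documented. */
final class TitleFilter {
    private final Set<String> badTitleWords;

    TitleFilter(Set<String> badTitleWords) {
        // Normalize to lower case so matching is case-insensitive (an assumption).
        this.badTitleWords = new HashSet<>();
        for (String word : badTitleWords) {
            this.badTitleWords.add(word.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns false for blank titles and for titles containing an excluded word. */
    boolean isUsefulContent(String title) {
        if (title == null || title.trim().isEmpty()) {
            return false;
        }
        // Split on runs of non-word characters and test each token against the set.
        for (String word : title.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (badTitleWords.contains(word)) {
                return false;
            }
        }
        return true;
    }
}
```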
public boolean hasNext() throws java.io.IOException, org.apache.uima.collection.CollectionException
  Description copied from interface: AIDAComponentReader
  Gets whether or not this MIFSComponentReader has more documents to process.
  hasNext in interface AIDAComponentReader
  hasNext in class AbstractAIDAComponentReader
  Throws: java.io.IOException - if there is an error in determining if there are more docs to process.
  Throws: org.apache.uima.collection.CollectionException
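A caller typically drives a reader of this shape by polling hasNext(), fetching documents, and closing the reader when done. The sketch below shows that loop with hypothetical types: `DocSource` mirrors only the hasNext/close part of the API, `next()` stands in for the CAS-based getNext, and `InMemorySource` exists purely so the loop can run without UIMA. The real hasNext() also declares CollectionException, which is omitted here.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical, reduced view of a component reader. */
interface DocSource extends Closeable {
    boolean hasNext() throws IOException;

    String next() throws IOException;
}

/** Hypothetical in-memory source used to exercise the loop. */
final class InMemorySource implements DocSource {
    private final Iterator<String> it;
    boolean closed = false;

    InMemorySource(List<String> docs) {
        this.it = docs.iterator();
    }

    @Override
    public boolean hasNext() {
        return it.hasNext();
    }

    @Override
    public String next() {
        return it.next();
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class ReaderLoop {
    private ReaderLoop() {}

    /** Reads every document, closing the source even if reading fails. */
    static List<String> drain(DocSource source) {
        List<String> out = new ArrayList<>();
        try (DocSource s = source) {
            while (s.hasNext()) {
                out.add(s.next());
            }
        } catch (IOException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Using try-with-resources guarantees that close() runs on both the normal and the failure path, which matches the intent of a close() method that releases any resources the reader opened.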
protected java.lang.String getDocumentURL()
  Description copied from class: AbstractAIDAComponentReader
  Gets the URL of the current document.
  getDocumentURL in class AbstractAIDAComponentReader