anl.aida.reader
Class AbstractAIDAComponentReader

java.lang.Object
  extended by anl.aida.reader.AbstractAIDAComponentReader
All Implemented Interfaces:
AIDAComponentReader
Direct Known Subclasses:
AlbertLeaTribuneReader, ChicagoTribuneReader, ColumbusTelegramReader, ContentComponentReader, DailyGateReader, DailyIowanReader, DailySentinelReader, FremontTribuneReader, GoldenTriangleReader, GrandIslandIndependentReader, GuardianReader, JournalStarReader, LATimesReader, MuscatineJournalReader, NTARCReader, OmahaNewsStandReader, ProMedReader, QuadCityTimesReader, SiouxCityReader, YorkNewsTimesReader

public abstract class AbstractAIDAComponentReader
extends java.lang.Object
implements AIDAComponentReader

Abstract implementation of AIDAComponentReader that processes a ReaderResult into a CAS by iterating over a "standard" index and retrieving the results. Subclasses are responsible for producing the ReaderResult.
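
The division of labor described above (the base class drives the iteration, subclasses produce each result) resembles a template-method arrangement. The following is a minimal sketch of that idea only, not the actual implementation: SketchResult, SketchReader, UpperCaseReader and SketchReaderDemo are hypothetical stand-ins, a plain Iterator replaces IndexIterator, and a List replaces the CAS.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for ReaderResult: a URL plus the document text.
class SketchResult {
    final String url;
    final String text;

    SketchResult(String url, String text) {
        this.url = url;
        this.text = text;
    }
}

// Hypothetical stand-in for the abstract base class: it owns the iteration
// over the index and leaves result production to subclasses.
abstract class SketchReader {
    protected Iterator<String> indexIter; // stand-in for IndexIterator
    protected String location;            // current index entry

    void initialize(List<String> indexEntries) {
        indexIter = indexEntries.iterator();
    }

    boolean hasNext() {
        return indexIter != null && indexIter.hasNext();
    }

    // Fixed sequence: advance the index, ask the subclass for a result,
    // copy it into the sink (stand-in for the CAS), then run the hook.
    void getNext(List<String> sink) throws IOException {
        location = indexIter.next();
        SketchResult result = getNextResult();
        sink.add(result.url + " -> " + result.text);
        postNext();
    }

    protected abstract SketchResult getNextResult() throws IOException;

    // Optional hook; does nothing by default in this sketch.
    protected void postNext() throws IOException { }
}

// Hypothetical subclass: builds a result from the current index entry.
class UpperCaseReader extends SketchReader {
    @Override
    protected SketchResult getNextResult() {
        return new SketchResult(location, location.toUpperCase());
    }
}

class SketchReaderDemo {
    // Drains a reader over the given index entries and returns what it produced.
    static List<String> readAll(List<String> index) {
        SketchReader reader = new UpperCaseReader();
        reader.initialize(index);
        List<String> sink = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                reader.getNext(sink);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink;
    }
}
```

In this sketch a concrete reader only has to override getNextResult(); the loop in readAll mirrors how a caller would alternate hasNext() and getNext(CAS).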


Field Summary
protected  IndexIterator indexIter
           
protected  java.lang.String[] lineItems
           
protected  java.lang.String location
           
protected  java.util.List<ReaderResultProcessor> processors
           
protected  java.util.Date startDate
           
 
Fields inherited from interface anl.aida.reader.AIDAComponentReader
MESSAGE_DIGEST
 
Constructor Summary
AbstractAIDAComponentReader()
           
 
Method Summary
protected  void checkDate(java.util.Date date)
          Checks if the date is before the cache start date and logs a warning.
 void close()
          Closes this AIDAComponentReader and any resources it may have opened.
protected  java.lang.String getDocumentURL()
          Gets the URL of the current document.
protected abstract  java.lang.String getIndexFileKey()
          Gets the name of the parameter key for the index file.
 void getNext(org.apache.uima.cas.CAS cas)
          Gets the next document and its associated data into the CAS.
protected abstract  ReaderResult getNextResult()
          Gets the next ReaderResult.
 boolean hasNext()
          Gets whether or not this AIDAComponentReader has more documents to process.
 void initialize(org.apache.uima.resource.ConfigurableResource resource, java.util.Date cacheStartDate)
          Initializes this AIDAComponentReader, optionally using the resource.
protected  void postNext()
          Called at the completion of getNext(CAS).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

indexIter

protected IndexIterator indexIter

lineItems

protected java.lang.String[] lineItems

location

protected java.lang.String location

processors

protected java.util.List<ReaderResultProcessor> processors

startDate

protected java.util.Date startDate
Constructor Detail

AbstractAIDAComponentReader

public AbstractAIDAComponentReader()
Method Detail

getNext

public void getNext(org.apache.uima.cas.CAS cas)
             throws java.io.IOException,
                    org.apache.uima.collection.CollectionException
Description copied from interface: AIDAComponentReader
Gets the next document and its associated data into the CAS.

Specified by:
getNext in interface AIDAComponentReader
Parameters:
cas - the CAS to put the document into
Throws:
java.io.IOException - if there is an error reading the document
org.apache.uima.collection.CollectionException - if there is an error reading the document

checkDate

protected void checkDate(java.util.Date date)
Checks if the date is before the cache start date and logs a warning.

Parameters:
date - the date to check
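
As an illustration of the check described above, here is a minimal sketch assuming the comparison is a strict "before" against the startDate field and that the warning goes to a java.util.logging logger. DateCheckSketch is a hypothetical class, and the boolean return value exists only so the behavior can be observed; the documented method returns void.

```java
import java.util.Date;
import java.util.logging.Logger;

// Hypothetical stand-in showing one way the date check could behave.
class DateCheckSketch {
    private static final Logger LOG =
            Logger.getLogger(DateCheckSketch.class.getName());

    private final Date startDate; // stand-in for the cache start date

    DateCheckSketch(Date startDate) {
        this.startDate = startDate;
    }

    // Logs a warning and returns true when the date is strictly before
    // the cache start date; otherwise returns false without logging.
    boolean checkDate(Date date) {
        if (startDate != null && date != null && date.before(startDate)) {
            LOG.warning("Document date " + date
                    + " is before cache start date " + startDate);
            return true;
        }
        return false;
    }
}
```

Under this assumption a date equal to the cache start date does not trigger the warning, because Date.before is a strict comparison.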

getNextResult

protected abstract ReaderResult getNextResult()
                                       throws java.io.IOException
Gets the next ReaderResult.

Returns:
the next ReaderResult.
Throws:
java.io.IOException - if there is an error getting the result

getDocumentURL

protected java.lang.String getDocumentURL()
Gets the URL of the current document.

Returns:
the URL of the current document.

postNext

protected void postNext()
                 throws java.io.IOException
Called at the completion of getNext(CAS).

Throws:
java.io.IOException - if an I/O error occurs during this post-processing step

close

public void close()
           throws java.io.IOException
Description copied from interface: AIDAComponentReader
Closes this AIDAComponentReader and any resources it may have opened.

Specified by:
close in interface AIDAComponentReader
Throws:
java.io.IOException - if there is an error closing the reader.

hasNext

public boolean hasNext()
                throws java.io.IOException,
                       org.apache.uima.collection.CollectionException
Description copied from interface: AIDAComponentReader
Gets whether or not this AIDAComponentReader has more documents to process.

Specified by:
hasNext in interface AIDAComponentReader
Returns:
true if there are more documents to process, otherwise false.
Throws:
java.io.IOException - if there is an error in determining whether there are more documents to process.
org.apache.uima.collection.CollectionException - if there is an error in determining whether there are more documents to process.

getIndexFileKey

protected abstract java.lang.String getIndexFileKey()
Gets the name of the parameter key for the index file. The index file lists the links and related entries to read.

Returns:
the name of the parameter key for the index file.
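
To illustrate how a subclass-supplied key might be used, the sketch below looks up the index file path in a configuration map. This is an assumption about usage, not the documented behavior: IndexKeySketch, ExampleIndexKeyReader, resolveIndexFile and the key name "ExampleIndexFile" are hypothetical, and a Map stands in for the ConfigurableResource passed to initialize.

```java
import java.util.Map;

// Hypothetical stand-in for the base class: it asks the subclass for the
// parameter key and resolves the index file path from the configuration.
abstract class IndexKeySketch {
    protected abstract String getIndexFileKey();

    // Returns the path stored under the subclass-specific key, or fails
    // with a clear message when the parameter is missing.
    String resolveIndexFile(Map<String, String> params) {
        String key = getIndexFileKey();
        String path = params.get(key);
        if (path == null) {
            throw new IllegalArgumentException(
                    "Missing configuration parameter: " + key);
        }
        return path;
    }
}

// Hypothetical subclass: each concrete reader names its own key.
class ExampleIndexKeyReader extends IndexKeySketch {
    @Override
    protected String getIndexFileKey() {
        return "ExampleIndexFile"; // illustrative key name only
    }
}
```

Keeping the key behind an abstract method lets each concrete reader point at its own index file while sharing the lookup logic.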

initialize

public void initialize(org.apache.uima.resource.ConfigurableResource resource,
                       java.util.Date cacheStartDate)
                throws org.apache.uima.resource.ResourceInitializationException
Description copied from interface: AIDAComponentReader
Initializes this AIDAComponentReader, optionally using the resource.

Specified by:
initialize in interface AIDAComponentReader
Parameters:
resource - the resource to use for configuration
cacheStartDate - the cache start date; checkDate(Date) logs a warning for dates before it
Throws:
org.apache.uima.resource.ResourceInitializationException - if there is an error initializing the reader