anl.aida.reader
Interface DocumentProcessor

All Known Implementing Classes:
AlbertLeaIndexMaker.ArchiveMaker, AllAfricaHTMLReader.Processor, ChicagoTribuneHTMLReader.Processor, ChicagoTribuneIndexMaker.ArchiveMaker, ColumbusTelegramHTMLReader.Processor, ColumbusTelegramIndexMaker.ArchiveMaker, DailyGateHTMLReader.Processor, DailyGateIndexMaker.ArchiveMaker, DailyIowanIndexMaker.ArchiveMaker, DailySentinelHTMLReader.Processor, DailySentinelIndexMaker.ArchiveMaker, FremontTribuneHTMLReader.Processor, FremontTribuneIndexMaker.ArchiveMaker, GoldenTriangleHTMLReader.Processor, GoldenTriangleIndexMaker.ArchiveMaker, GrandIslandIndependentHTMLReader.Processor, GrandIslandIndexMaker.ArchiveMaker, GuardianHTMLReader.Processor, GuardianIndexMaker.ArchiveMaker, IndependentHTMLReader.Processor, IndependentIndexMaker.ArchiveMaker, JournalStarHTMLReader.Processor, JournalStarIndexMaker.ArchiveMaker, LATimesHTMLReader.Processor, LATimesIndexMaker.ArchiveMaker, MuscatineJournalHTMLReader.Processor, MuscatineJournalIndexMaker.ArchiveMaker, NTARCHtmlReader.NTARProcessor, NTARCIndexMaker.ArchiveMaker, NYTimesHTMLReader.NullProcessor, NYTimesHTMLReader.Processor, OmahaNewsStandHTMLReader.Processor, OmahaNewsStandIndexMaker.ArchiveMaker, PreTextExtractor.DocProc, ProMedIndexMaker.ArchiveMaker, QuadCityIndexMaker.ArchiveMaker, QuadCityTimesHTMLReader.Processor, SiouxCityHTMLReader.Processor, SiouxCityIndexMaker.ArchiveMaker, StandardHTMLReader.Processor, WordNetReader.ArchiveMaker, YorkNewsTimesIndexMaker.ArchiveMaker, YorkTimesHTMLReader.Processor

public interface DocumentProcessor

Interface for classes that process lines of text, such as lines extracted from an HTML page.


Method Summary
 boolean done()
          Returns true once this processor needs no more input, so the code feeding it lines can stop.
 void processLine(java.lang.String line)
          Processes the line.
 void reset()
          Resets the processor so it may be used with another document.
 

Method Detail

processLine

void processLine(java.lang.String line)
Processes the line. The line may be text or an HTML link.

Parameters:
line - the line to process

done

boolean done()
Returns true once this processor needs no more input, so the code feeding it lines can stop.

Returns:
true when the processor is done, otherwise false.
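
The following is a minimal sketch of how a caller might honor done() while feeding lines. It is an illustration under stated assumptions, not code from the anl.aida.reader package: the interface is redeclared locally so the example compiles on its own, and UntilMarkerProcessor, LineFeeder, and the end-marker string are hypothetical names.

```java
import java.util.List;

// Local stand-in that mirrors the three methods documented on this page.
// Real code would import anl.aida.reader.DocumentProcessor instead.
interface DocumentProcessor {
    void processLine(String line);
    boolean done();
    void reset();
}

// Hypothetical implementation: collects text until a marker line is seen.
class UntilMarkerProcessor implements DocumentProcessor {
    private final String marker;
    private final StringBuilder text = new StringBuilder();
    private boolean finished = false;

    UntilMarkerProcessor(String marker) {
        this.marker = marker;
    }

    @Override
    public void processLine(String line) {
        if (finished) {
            return; // ignore anything fed after the marker
        }
        if (line.contains(marker)) {
            finished = true; // the marker line itself is not collected
            return;
        }
        text.append(line).append('\n');
    }

    @Override
    public boolean done() {
        return finished;
    }

    @Override
    public void reset() {
        text.setLength(0);
        finished = false;
    }

    String text() {
        return text.toString();
    }
}

// Hypothetical driver: stops feeding as soon as the processor reports done().
class LineFeeder {
    // Returns the number of lines actually passed to the processor.
    static int feed(List<String> lines, DocumentProcessor processor) {
        int fed = 0;
        for (String line : lines) {
            if (processor.done()) {
                break;
            }
            processor.processLine(line);
            fed++;
        }
        return fed;
    }
}
```

Checking done() before each call lets a processor cut off the rest of a long page once it has what it needs; whether the real readers check before or after each line is not stated here.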

reset

void reset()
Resets the processor so it may be used with another document.
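
As a rough illustration of the reuse that reset() allows, the sketch below runs one processor instance over several documents in turn. The interface is redeclared locally so the example is self-contained, and LineCountingProcessor and BatchRunner are hypothetical names that do not appear in the package.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in that mirrors the three methods documented on this page.
// Real code would import anl.aida.reader.DocumentProcessor instead.
interface DocumentProcessor {
    void processLine(String line);
    boolean done();
    void reset();
}

// Hypothetical implementation: counts the non-blank lines of one document.
class LineCountingProcessor implements DocumentProcessor {
    private int count = 0;

    @Override
    public void processLine(String line) {
        if (!line.trim().isEmpty()) {
            count++;
        }
    }

    @Override
    public boolean done() {
        return false; // this processor never asks the caller to stop early
    }

    @Override
    public void reset() {
        count = 0; // discard all state from the previous document
    }

    int count() {
        return count;
    }
}

// Hypothetical driver: reuses a single processor across many documents.
class BatchRunner {
    static List<Integer> countAll(List<List<String>> documents) {
        LineCountingProcessor processor = new LineCountingProcessor();
        List<Integer> counts = new ArrayList<>();
        for (List<String> document : documents) {
            processor.reset(); // start each document from a clean state
            for (String line : document) {
                if (processor.done()) {
                    break;
                }
                processor.processLine(line);
            }
            counts.add(processor.count());
        }
        return counts;
    }
}
```

Calling reset() at the start of each document, rather than at the end, means a freshly constructed processor and a reused one follow the same path.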