anl.aida.reader
Interface ContentReader

All Known Implementing Classes:
AllAfricaHTMLReader, CachedContentReader, ChicagoTribuneHTMLReader, ColumbusTelegramHTMLReader, DailyGateHTMLReader, DailySentinelHTMLReader, FremontTribuneHTMLReader, GoldenTriangleHTMLReader, GrandIslandIndependentHTMLReader, GuardianHTMLReader, IndependentHTMLReader, JournalStarHTMLReader, LATimesHTMLReader, MuscatineJournalHTMLReader, NTARCHtmlReader, NYTimesHTMLReader, OmahaNewsStandHTMLReader, QuadCityTimesHTMLReader, SiouxCityHTMLReader, StandardHTMLReader, YorkTimesHTMLReader

public interface ContentReader

Interface for classes that read and parse content from a URL and return a ReaderResult.


Method Summary
 ReaderResult read(java.lang.String url, java.lang.String title, java.util.Date date, java.lang.String author)
          Reads the content from the url.
 

Method Detail

read

ReaderResult read(java.lang.String url,
                  java.lang.String title,
                  java.util.Date date,
                  java.lang.String author)
                  throws java.io.IOException
Reads the content from the url. The title, date, and author are passed in by the caller and should be returned in the ReaderResult.

Parameters:
url - the url to read
title - the document title
date - the document's date
author - the document's author (may be an empty string)
Returns:
the ReaderResult
Throws:
java.io.IOException - if there is an error reading the content
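Example:
The contract above can be illustrated with a minimal sketch of an implementing class. This page does not document ReaderResult's constructor or fields, so the ReaderResult class below is a hypothetical stand-in, and PlainTextReader and ContentReaderSketch are illustrative names that are not part of anl.aida.reader. The sketch only shows the shape of read: fetch the content from the url, pass the supplied title, date, and author through to the result, and let any IOException propagate to the caller.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Date;

class ContentReaderSketch {

    // Hypothetical stand-in for anl.aida.reader.ReaderResult.
    // The real class's constructor and fields are not shown on this page.
    static final class ReaderResult {
        final String url;
        final String title;
        final Date date;
        final String author;
        final String content;

        ReaderResult(String url, String title, Date date, String author, String content) {
            this.url = url;
            this.title = title;
            this.date = date;
            this.author = author;
            this.content = content;
        }
    }

    // Local copy of the documented interface signature.
    interface ContentReader {
        ReaderResult read(String url, String title, Date date, String author) throws IOException;
    }

    // Hypothetical implementation: returns the raw text found at the url.
    // A real HTML reader would also strip markup and site-specific boilerplate.
    static final class PlainTextReader implements ContentReader {
        @Override
        public ReaderResult read(String url, String title, Date date, String author) throws IOException {
            StringBuilder content = new StringBuilder();
            // try-with-resources closes the stream even if reading fails
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    URI.create(url).toURL().openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    content.append(line).append('\n');
                }
            }
            // Title, date, and author are passed through as supplied by the caller.
            return new ReaderResult(url, title, date, author, content.toString());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: ContentReaderSketch <url>");
            return;
        }
        ContentReader reader = new PlainTextReader();
        // The author may be an empty string, as noted in the parameter list.
        ReaderResult result = reader.read(args[0], "Untitled", new Date(), "");
        System.out.println(result.title + " (" + result.content.length() + " characters)");
    }
}
```

Because read declares java.io.IOException, callers are expected to handle or rethrow network and file errors themselves rather than rely on the reader to suppress them.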