anl.aida.reader
Class StringExtractor

java.lang.Object
  extended by anl.aida.reader.StringExtractor

public class StringExtractor
extends java.lang.Object

Extracts lines of text and links from a URL and passes them to a DocumentProcessor.


Field Summary
private  org.htmlparser.beans.StringBean bean
           
private static final java.lang.String EXCEPTION_STRING
           
 
Constructor Summary
StringExtractor()
          Creates a StringExtractor that uses the default StringBean to extract the text.
StringExtractor(org.htmlparser.beans.StringBean bean)
          Creates a StringExtractor that uses the specified bean to extract the text.
 
Method Summary
private  void processStrings(DocumentProcessor processor)
           
 void run(java.lang.String url, DocumentProcessor processor)
          Runs this StringExtractor over the specified url.
 void run(java.lang.String url, java.lang.String referer, DocumentProcessor processor)
          Runs this StringExtractor over the specified url, sending the given referer.
 void run(java.net.URLConnection conn, DocumentProcessor processor)
          Runs this StringExtractor over the specified url connection.
 void runWithManualFetch(java.lang.String url, DocumentProcessor processor)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EXCEPTION_STRING

private static final java.lang.String EXCEPTION_STRING
See Also:
Constant Field Values

bean

private org.htmlparser.beans.StringBean bean
Constructor Detail

StringExtractor

public StringExtractor(org.htmlparser.beans.StringBean bean)
Creates a StringExtractor that uses the specified bean to extract the text.

Parameters:
bean - the bean to use.

StringExtractor

public StringExtractor()
Creates a StringExtractor that uses the default StringBean to extract the text.

Method Detail

run

public void run(java.lang.String url,
                java.lang.String referer,
                DocumentProcessor processor)
         throws java.io.IOException
Runs this StringExtractor over the specified URL. Each extracted string is passed to the specified DocumentProcessor until all strings have been passed or the processor reports that it is done.

Parameters:
url - the URL to run on
referer - the URL to send in the HTTP Referer header. Optional; may be an empty string
processor - the processor to pass the extracted strings to
Throws:
java.io.IOException - if there is an error extracting the strings.
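
The delivery contract described above (pass each string in turn, stop early once the processor is done) can be sketched in isolation. The DocumentProcessor interface is not documented on this page, so the LineSink interface and the feed method below are hypothetical stand-ins used only for illustration. They are not part of anl.aida.reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EarlyStopSketch {

    /** Hypothetical stand-in for DocumentProcessor; the real interface may differ. */
    public interface LineSink {
        /** Returns true when the sink wants no further strings. */
        boolean accept(String line);
    }

    /**
     * Passes each string to the sink until the strings run out or the
     * sink reports that it is done. Returns how many strings were delivered.
     */
    public static int feed(List<String> lines, LineSink sink) {
        int delivered = 0;
        for (String line : lines) {
            delivered++;
            if (sink.accept(line)) {
                break; // the sink signalled done, so stop early
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        // This sink declares itself done after it has received two strings.
        int delivered = feed(Arrays.asList("a", "b", "c", "d"), line -> {
            seen.add(line);
            return seen.size() == 2;
        });
        System.out.println(delivered + " " + seen);
    }
}
```

A processor that never reports done receives every extracted string; one that reports done on the first call receives exactly one.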

run

public void run(java.net.URLConnection conn,
                DocumentProcessor processor)
         throws java.io.IOException
Runs this StringExtractor over the specified URL connection. Each extracted string is passed to the specified DocumentProcessor until all strings have been passed or the processor reports that it is done.

Parameters:
conn - the URL connection to run on
processor - the processor to pass the extracted strings to
Throws:
java.io.IOException - if there is an error extracting the strings.
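
When a caller needs control over request headers or timeouts, one option is to prepare the URLConnection with java.net before handing it to this overload. The sketch below uses only standard JDK calls. The helper name open, the timeout values, and the example URLs are assumptions for illustration, and the final call to run is left as a comment because the rest of this library is not shown here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;

public class RefererConnectionSketch {

    /**
     * Opens (but does not yet connect) a URLConnection, setting the
     * Referer header only when a non-empty referer is supplied.
     */
    public static URLConnection open(String url, String referer) throws IOException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        if (referer != null && !referer.isEmpty()) {
            conn.setRequestProperty("Referer", referer);
        }
        // Illustrative timeouts so that a stalled server cannot block forever.
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        URLConnection conn = open("http://example.com/", "http://example.org/start");
        System.out.println(conn.getRequestProperty("Referer"));
        // With the library on the classpath, the prepared connection would
        // then be passed on, for example:
        //   new StringExtractor().run(conn, processor);
    }
}
```

No network traffic occurs until the connection is actually used, so request properties must be set before that point.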

run

public void run(java.lang.String url,
                DocumentProcessor processor)
         throws java.io.IOException
Runs this StringExtractor over the specified URL. Each extracted string is passed to the specified DocumentProcessor until all strings have been passed or the processor reports that it is done.

Parameters:
url - the URL to run on
processor - the processor to pass the extracted strings to
Throws:
java.io.IOException - if there is an error extracting the strings.

runWithManualFetch

public void runWithManualFetch(java.lang.String url,
                               DocumentProcessor processor)
                        throws java.io.IOException
Throws:
java.io.IOException

processStrings

private void processStrings(DocumentProcessor processor)
                     throws java.io.IOException
Throws:
java.io.IOException