public abstract class BaseFileCollectionReader extends BaseLeoCollectionReader
| Modifier and Type | Field and Description |
|---|---|
| `protected FilenameFilter` | `filenameFilter`: Filters the files found by filename extension. |
| `protected String` | `fileNameFilterJSON`: JSON representation of the filename filter, used to pass this object to the CollectionReader descriptor for initialization. |
| `protected String` | `fileNameFilterName`: Class name of the filename filter this reader is using. |
| `protected String` | `inputDirectoryPath`: Path to the input directory. |
| `protected org.apache.log4j.Logger` | `LOG`: Logger for the class. |
| `protected String` | `mEncoding`: Encoding type for the files being read in. |
| `protected ArrayList<File>` | `mFileCollection`: List of File objects being processed. |
| `protected int` | `mFileIndex`: Index of the next file to be processed. |
| `protected File` | `mInDir`: File object for the input directory to be searched for available files. |
| `protected boolean` | `mRecurse`: Recurse flag; sub-directories are searched recursively if true. |

Fields inherited from class BaseLeoCollectionReader: filters, textFilters

| Constructor and Description |
|---|
| `BaseFileCollectionReader()`: Default constructor used during UIMA initialization. |
| `BaseFileCollectionReader(File inputDirectory, boolean recurse)`: Constructor that sets the input directory to be searched and the recurse flag that controls whether the reader descends into subdirectories. |
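
The recurse flag passed to the second constructor decides whether subdirectories are searched. The sketch below shows one way such a search could be written with only the JDK. The class `RecursiveFileFinder` and its `find` method are hypothetical names chosen for illustration; this is not Leo's actual `findFiles` implementation, and treating a null filter as "accept everything" is an assumption.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: illustrates a recursive directory search, not Leo's own code. */
final class RecursiveFileFinder {

    /**
     * Collect the files under dir that the filter accepts.
     * A null filter accepts every file (assumption); sub-directories are
     * visited only when recurse is true.
     */
    static List<File> find(File dir, FilenameFilter filter, boolean recurse) {
        List<File> found = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is not a readable directory
        if (entries == null) {
            return found;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                if (recurse) {
                    found.addAll(find(entry, filter, true));
                }
            } else if (filter == null || filter.accept(dir, entry.getName())) {
                found.add(entry);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        File start = new File(args.length > 0 ? args[0] : ".");
        for (File f : find(start, null, true)) {
            System.out.println(f.getPath());
        }
    }
}
```

With `recurse` set to false, only files directly inside the starting directory are returned, which matches the documented default for the flag.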

| Modifier and Type | Method and Description |
|---|---|
| `protected void` | `findFiles(File f)`: Find the list of files that meet the requirements. |
| `int` | `getCollectionSize()`: Return the number of documents in the set. |
| `int` | `getCurrentIndex()` |
| `String` | `getEncoding()`: Return the encoding format that this CollectionReader will use for the source data. |
| `FilenameFilter` | `getFilenameFilter()`: Get the filename filter that will be used to filter files found in the input directory. |
| `File` | `getInputDirectory()`: Return the input directory this reader will search for files. |
| `abstract void` | `getNext(org.apache.uima.cas.CAS aCAS)`: Get the next file to be processed in the pipeline. |
| `org.apache.uima.util.Progress[]` | `getProgress()` |
| `boolean` | `hasNext()` |
| `void` | `initialize()`: This method is called during initialization, and does nothing by default. |
| `<T extends BaseFileCollectionReader> T` | `setEncoding(String encoding)`: Set the file encoding from the encoding string provided. |
| `<T extends BaseFileCollectionReader> T` | `setFilenameFilter(FilenameFilter filenameFilter)`: Set the FilenameFilter that this object will use to filter the files found in the input directory. |
| `<T extends BaseFileCollectionReader> T` | `setInputDirectory(File inputDirectory)`: Set the inputDirectory for this FileSubReader object. |
| `<T extends BaseFileCollectionReader> T` | `setRecurseFlag(boolean recurse)`: Set the recurse flag that controls whether subdirectories are searched. |
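
The setters above are declared with the generic return type `<T extends BaseFileCollectionReader> T`, which allows calls to be chained and assigned to a subclass variable without a cast at the call site. The sketch below reproduces that signature pattern with stand-in classes. `BaseReader`, `TextReader`, `FluentSetterDemo`, and `isRecurse()` are hypothetical and not part of Leo, and the unchecked cast of `this` to `T` is an assumption about how such a setter is usually written.

```java
/** Stand-in base class: mimics only the generic fluent-setter signature. */
abstract class BaseReader {
    protected String encoding;
    protected boolean recurse;

    @SuppressWarnings("unchecked")
    public <T extends BaseReader> T setEncoding(String encoding) {
        this.encoding = encoding;
        return (T) this; // unchecked: the caller's target type determines T
    }

    @SuppressWarnings("unchecked")
    public <T extends BaseReader> T setRecurseFlag(boolean recurse) {
        this.recurse = recurse;
        return (T) this;
    }

    public String getEncoding() { return encoding; }

    public boolean isRecurse() { return recurse; }
}

/** Hypothetical concrete subclass. */
class TextReader extends BaseReader { }

class FluentSetterDemo {
    static TextReader configure() {
        // The inner call has no target type, so T falls back to BaseReader;
        // the outer call infers T = TextReader from the assignment.
        TextReader reader = new TextReader().setEncoding("UTF-8").setRecurseFlag(true);
        return reader;
    }

    public static void main(String[] args) {
        TextReader reader = configure();
        System.out.println(reader.getEncoding() + " recurse=" + reader.isRecurse());
    }
}
```

The trade-off of this design is that the cast is unchecked: assigning the result to the wrong subclass type compiles but fails at runtime with a `ClassCastException`.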
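
Methods inherited from class BaseLeoCollectionReader: addFilters, addFilters, close, generateCollectionReaderDescription, getFilters, produceCollectionReader

Methods inherited from class org.apache.uima.collection.CollectionReader_ImplBase: destroy, getCasInitializer, getProcessingResourceMetaData, initialize, isConsuming, reconfigure, setCasInitializer, typeSystemInit

Methods inherited from class org.apache.uima.resource.ConfigurableResource_ImplBase: getConfigParameterValue, getConfigParameterValue, setConfigParameterValue, setConfigParameterValue

Methods inherited from class org.apache.uima.resource.Resource_ImplBase: getCasManager, getLogger, getMetaData, getResourceManager, getUimaContext, getUimaContextAdmin, setLogger, setMetaData

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

@LeoConfigurationParameter(description="Path to the input directory", mandatory=true) protected String inputDirectoryPath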
@LeoConfigurationParameter(description="If true then recurse in subdirectories, defaults to false.") protected boolean mRecurse
@LeoConfigurationParameter protected String mEncoding
@LeoConfigurationParameter(description="JSON representation of the filename filter.") protected String fileNameFilterJSON
@LeoConfigurationParameter(description="Cannonical name of the filename filter") protected String fileNameFilterName
protected File mInDir
protected int mFileIndex
protected FilenameFilter filenameFilter
protected org.apache.log4j.Logger LOG
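
The `filenameFilter` field holds a `java.io.FilenameFilter` that selects files by extension. A minimal sketch of such a filter follows. `ExtensionFilters` and `forExtension` are hypothetical names, and the case-insensitive comparison is an assumption rather than documented Leo behavior.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.Locale;

/** Hypothetical factory for extension-based filename filters. */
final class ExtensionFilters {

    /** Build a filter that accepts names ending in "." + extension, ignoring case. */
    static FilenameFilter forExtension(String extension) {
        final String suffix = "." + extension.toLowerCase(Locale.ROOT);
        return (dir, name) -> name.toLowerCase(Locale.ROOT).endsWith(suffix);
    }

    public static void main(String[] args) {
        FilenameFilter txt = forExtension("txt");
        File dir = new File(".");
        System.out.println(txt.accept(dir, "note.TXT")); // true
        System.out.println(txt.accept(dir, "note.xml")); // false
    }
}
```

A filter built this way could be handed to `setFilenameFilter(FilenameFilter)` on a concrete reader.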

public BaseFileCollectionReader()

public BaseFileCollectionReader(File inputDirectory, boolean recurse)
Parameters:
inputDirectory - Input directory to be searched.
recurse - Recurse flag; the reader descends into subdirectories if true. Defaults to false.

public void initialize()
                throws org.apache.uima.resource.ResourceInitializationException
initialize in class BaseLeoCollectionReader
Throws:
org.apache.uima.resource.ResourceInitializationException - if a failure occurs during initialization.

public String getEncoding()

public <T extends BaseFileCollectionReader> T setEncoding(String encoding)
Parameters:
encoding - encoding format to use.

public File getInputDirectory()

public <T extends BaseFileCollectionReader> T setInputDirectory(File inputDirectory)
Parameters:
inputDirectory - the input directory to load files from.

public FilenameFilter getFilenameFilter()

public <T extends BaseFileCollectionReader> T setFilenameFilter(FilenameFilter filenameFilter)
Parameters:
filenameFilter - the FilenameFilter to use.
T - type of the reader instance to return; extends BaseFileCollectionReader.

protected void findFiles(File f)
Parameters:
f - the file to search. This should be a directory.

public <T extends BaseFileCollectionReader> T setRecurseFlag(boolean recurse)
Parameters:
recurse - if true, subdirectories are also searched; otherwise only the specified directory is used for input.

public int getCollectionSize()

public int getCurrentIndex()

public boolean hasNext()
                throws IOException,
                       org.apache.uima.collection.CollectionException
hasNext in interface org.apache.uima.collection.base_cpm.BaseCollectionReader
hasNext in class BaseLeoCollectionReader
Throws:
IOException - if there is an error reading the data.
org.apache.uima.collection.CollectionException - if retrieval of the next file fails.

public abstract void getNext(org.apache.uima.cas.CAS aCAS)
                      throws IOException,
                             org.apache.uima.collection.CollectionException
getNext in interface org.apache.uima.collection.CollectionReader
getNext in class BaseLeoCollectionReader
Parameters:
aCAS - the CAS to populate with the next document.
Throws:
IOException - if there is an error reading the data.
org.apache.uima.collection.CollectionException - if retrieval of the next file fails.

public org.apache.uima.util.Progress[] getProgress()
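
The `mFileIndex` and `mFileCollection` fields, together with `hasNext()`, `getCurrentIndex()`, and `getCollectionSize()`, suggest index-based iteration over the discovered files. The sketch below shows one way such a cursor could work using only the JDK. `FileCursor` is a hypothetical stand-in, its `next()` method merely plays the role that `getNext(CAS)` has in a real reader, and the actual Leo logic (including how `getProgress()` is computed) is not described in this documentation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical cursor over a fixed list of files; not Leo's implementation. */
final class FileCursor {
    private final List<File> files;
    private int index = 0; // index of the next file to hand out

    FileCursor(List<File> files) {
        this.files = new ArrayList<>(files); // defensive copy
    }

    int getCollectionSize() { return files.size(); }

    int getCurrentIndex() { return index; }

    boolean hasNext() { return index < files.size(); }

    /** Return the next file and advance the index. */
    File next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more files");
        }
        return files.get(index++);
    }

    public static void main(String[] args) {
        FileCursor cursor = new FileCursor(Arrays.asList(new File("a.txt"), new File("b.txt")));
        while (cursor.hasNext()) {
            File f = cursor.next();
            System.out.println(cursor.getCurrentIndex() + "/" + cursor.getCollectionSize() + " " + f.getName());
        }
    }
}
```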
Copyright © 2018 Department of Veterans Affairs. All Rights Reserved.