public abstract class BaseFileCollectionReader extends BaseLeoCollectionReader
Modifier and Type | Field and Description |
---|---|
protected FilenameFilter | filenameFilter - Filters the files found by filename extension. |
protected String | fileNameFilterJSON - JSON representation of the filename filter, used to pass this object to the CollectionReader descriptor for initialization. |
protected String | fileNameFilterName - Class name of the filename filter this reader is using. |
protected String | inputDirectoryPath - Path to the input directory. |
protected org.apache.log4j.Logger | LOG - Logger for this class. |
protected String | mEncoding - Encoding type for the files being read in. |
protected ArrayList<File> | mFileCollection - List of File objects being processed. |
protected int | mFileIndex - Index of the next file to be processed. |
protected File | mInDir - Input directory File object to be searched for available files. |
protected boolean | mRecurse - Recurse flag; if true, sub-directories are searched recursively. |
Fields inherited from class BaseLeoCollectionReader: filters, textFilters
Constructor and Description |
---|
BaseFileCollectionReader() - Default constructor used during UIMA initialization. |
BaseFileCollectionReader(File inputDirectory, boolean recurse) - Constructor that sets the input directory to be searched and the recurse flag that controls whether or not the reader will descend into subdirectories. |
Modifier and Type | Method and Description |
---|---|
protected void | findFiles(File f) - Find the list of files that meet the requirements. |
int | getCollectionSize() - Return the number of documents in the set. |
int | getCurrentIndex() |
String | getEncoding() - Return the encoding format that this CollectionReader will use for the source data. |
FilenameFilter | getFilenameFilter() - Get the filename filter that will be used to filter files found in the input directory. |
File | getInputDirectory() - Return the input directory this reader will search for files. |
abstract void | getNext(org.apache.uima.cas.CAS aCAS) - Get the next file to be processed in the pipeline. |
org.apache.uima.util.Progress[] | getProgress() |
boolean | hasNext() |
void | initialize() - This method is called during initialization, and does nothing by default. |
<T extends BaseFileCollectionReader> T | setEncoding(String encoding) - Set the file encoding from the encoding string provided. |
<T extends BaseFileCollectionReader> T | setFilenameFilter(FilenameFilter filenameFilter) - Set the FilenameFilter that this object will use to filter the files found in the input directory. |
<T extends BaseFileCollectionReader> T | setInputDirectory(File inputDirectory) - Set the input directory for this reader. |
<T extends BaseFileCollectionReader> T | setRecurseFlag(boolean recurse) - Set the recurse flag for this reader. |
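Methods inherited from class BaseLeoCollectionReader: addFilters, addFilters, close, generateCollectionReaderDescription, getFilters, produceCollectionReader
Methods inherited from class org.apache.uima.collection.CollectionReader_ImplBase: destroy, getCasInitializer, getProcessingResourceMetaData, initialize, isConsuming, reconfigure, setCasInitializer, typeSystemInit
Methods inherited from class org.apache.uima.resource.ConfigurableResource_ImplBase: getConfigParameterValue, getConfigParameterValue, setConfigParameterValue, setConfigParameterValue
Methods inherited from class org.apache.uima.resource.Resource_ImplBase: getCasManager, getLogger, getMetaData, getResourceManager, getUimaContext, getUimaContextAdmin, setLogger, setMetaData
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait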
@LeoConfigurationParameter(description="Path to the input directory", mandatory=true) protected String inputDirectoryPath
@LeoConfigurationParameter(description="If true then recurse in subdirectories, defaults to false.") protected boolean mRecurse
@LeoConfigurationParameter protected String mEncoding
@LeoConfigurationParameter(description="JSON representation of the filename filter.") protected String fileNameFilterJSON
@LeoConfigurationParameter(description="Cannonical name of the filename filter") protected String fileNameFilterName
protected File mInDir
protected int mFileIndex
protected FilenameFilter filenameFilter
protected org.apache.log4j.Logger LOG
public BaseFileCollectionReader()
public BaseFileCollectionReader(File inputDirectory, boolean recurse)
inputDirectory - Input directory to be searched.
recurse - Recurse flag; the reader descends into subdirectories if true. Defaults to false.
public void initialize() throws org.apache.uima.resource.ResourceInitializationException
initialize in class BaseLeoCollectionReader
org.apache.uima.resource.ResourceInitializationException - if a failure occurs during initialization.
public String getEncoding()
public <T extends BaseFileCollectionReader> T setEncoding(String encoding)
encoding - encoding format to use.
public File getInputDirectory()
public <T extends BaseFileCollectionReader> T setInputDirectory(File inputDirectory)
inputDirectory - the input directory to load files from.
public FilenameFilter getFilenameFilter()
public <T extends BaseFileCollectionReader> T setFilenameFilter(FilenameFilter filenameFilter)
filenameFilter - the FilenameFilter to apply to the files found in the input directory.
<T extends BaseFileCollectionReader> - Type of the reader instance to return.
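The setters on this class declare a method-level type parameter, <T extends BaseFileCollectionReader> T, which suggests that each setter returns the reader itself so that calls can be chained without casting back to the subclass. The sketch below illustrates that general pattern only. BaseReaderSketch, TextReaderSketch and FluentSetterDemo are hypothetical stand-ins, not Leo classes, and the assumption that the real setters return this is inferred from the signatures rather than confirmed by this page.

```java
// Hypothetical stand-in for a reader base class; not part of Leo or UIMA.
abstract class BaseReaderSketch {
    protected String encoding;
    protected boolean recurse;

    // T is chosen by the caller's context, so the cast is unchecked.
    // It is only safe when the caller asks for the object's real type.
    @SuppressWarnings("unchecked")
    public <T extends BaseReaderSketch> T setEncoding(String encoding) {
        this.encoding = encoding;
        return (T) this;
    }

    @SuppressWarnings("unchecked")
    public <T extends BaseReaderSketch> T setRecurseFlag(boolean recurse) {
        this.recurse = recurse;
        return (T) this;
    }
}

// Hypothetical concrete subclass with one method of its own.
class TextReaderSketch extends BaseReaderSketch {
    String describe() {
        return encoding + ":" + recurse;
    }
}

class FluentSetterDemo {
    static String demo() {
        // The assignment target makes the compiler infer T = TextReaderSketch.
        TextReaderSketch reader = new TextReaderSketch().setEncoding("UTF-8");
        // In the middle of a chain there is no target type, so T is given explicitly.
        reader = reader.<TextReaderSketch>setRecurseFlag(true).setEncoding("UTF-16");
        return reader.describe();
    }
}
```

The trade-off of this design is that the compiler cannot verify the cast: asking for a subclass type that the object does not actually have compiles cleanly and fails with a ClassCastException at run time.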
protected void findFiles(File f)
f - the file to search. This should be a directory.
public <T extends BaseFileCollectionReader> T setRecurseFlag(boolean recurse)
recurse - if true, sub directories are also searched, otherwise just the specified directory is used for input.
public int getCollectionSize()
public int getCurrentIndex()
public boolean hasNext() throws IOException, org.apache.uima.collection.CollectionException
hasNext in interface org.apache.uima.collection.base_cpm.BaseCollectionReader
hasNext in class BaseLeoCollectionReader
IOException - if there is an error reading the data.
org.apache.uima.collection.CollectionException - if retrieval of the next file fails.
public abstract void getNext(org.apache.uima.cas.CAS aCAS) throws IOException, org.apache.uima.collection.CollectionException
getNext in interface org.apache.uima.collection.CollectionReader
getNext in class BaseLeoCollectionReader
aCAS - the CAS to populate with the next document.
IOException - if there is an error reading the data.
org.apache.uima.collection.CollectionException - if retrieval of the next file fails.
public org.apache.uima.util.Progress[] getProgress()
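Taken together, findFiles, hasNext, getNext, getCurrentIndex and getCollectionSize describe a reader that collects files once and then steps through them by index. The following is a minimal sketch of that flow using only java.io and java.nio, under the assumption that the real class behaves roughly as its field descriptions (mFileCollection, mFileIndex, mRecurse, mEncoding, filenameFilter) suggest. FileWalkerSketch is a hypothetical class: it does not extend any Leo or UIMA type, and its getNext returns the file text instead of populating a CAS.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a file-based collection reader; no UIMA or Leo dependency.
class FileWalkerSketch {
    private final List<File> fileCollection = new ArrayList<>();
    private final FilenameFilter filenameFilter; // null accepts every file
    private final boolean recurse;
    private final String encoding;
    private int fileIndex = 0; // index of the next file to be processed

    FileWalkerSketch(File inputDirectory, boolean recurse,
                     FilenameFilter filenameFilter, String encoding) {
        this.recurse = recurse;
        this.filenameFilter = filenameFilter;
        this.encoding = encoding;
        findFiles(inputDirectory);
    }

    // Collect matching files in dir, descending into subdirectories only when recurse is true.
    private void findFiles(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // dir is not a directory, or it could not be read
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                if (recurse) {
                    findFiles(entry);
                }
            } else if (filenameFilter == null || filenameFilter.accept(dir, entry.getName())) {
                fileCollection.add(entry);
            }
        }
    }

    int getCollectionSize() {
        return fileCollection.size();
    }

    int getCurrentIndex() {
        return fileIndex;
    }

    boolean hasNext() {
        return fileIndex < fileCollection.size();
    }

    // Return the text of the next file, decoded with the configured encoding, and advance the index.
    String getNext() throws IOException {
        File next = fileCollection.get(fileIndex++);
        return new String(Files.readAllBytes(next.toPath()), Charset.forName(encoding));
    }
}
```

A caller would loop with while (walker.hasNext()) { String text = walker.getNext(); }. In a real subclass the equivalent of getNext would presumably write the text into the CAS it receives, and the iteration order here is whatever File.listFiles returns, which is not guaranteed to be sorted.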
Copyright © 2018 Department of Veterans Affairs. All Rights Reserved.