|
Just to be sure this bit doesn't get lost anywhere:
Old indexer (JR-1.0.1) was using following textFilterClasses:
org.apache.jackrabbit.core.query.MsExcelTextFilter,
org.apache.jackrabbit.core.query.MsPowerPointTextFilter,
org.apache.jackrabbit.core.query.MsWordTextFilter,
org.apache.jackrabbit.core.query.PdfTextFilter,
org.apache.jackrabbit.core.query.HTMLTextFilter,
org.apache.jackrabbit.core.query.XMLTextFilter,
org.apache.jackrabbit.core.query.RTFTextFilter
New indexer (JR-1.3.3) is using following ones:
org.apache.jackrabbit.extractor.MsWordTextExtractor,
org.apache.jackrabbit.extractor.MsExcelTextExtractor,
org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
org.apache.jackrabbit.extractor.PdfTextExtractor,
org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
org.apache.jackrabbit.extractor.RTFTextExtractor,
org.apache.jackrabbit.extractor.HTMLTextExtractor,
org.apache.jackrabbit.extractor.PlainTextExtractor,
org.apache.jackrabbit.extractor.XMLTextExtractor
|