Overview
The text extractor, also called the plain text extractor, is a component that extracts text content from a file. After extraction, the content is sent to the search engine. M-Files has two extractors: CFS and dtSearch. CFS is being deprecated. M-Files 20.12.x and later use dtSearch by default. From M-Files 21.6 onwards, dtSearch is always used, and CFS cannot be forced into use.
However, you may have forced CFS as the extractor while running an M-Files version earlier than 21.6. This article describes how to switch the extractor to dtSearch in that scenario.
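The version rules above can be summarized as a small decision function. This is only an illustrative sketch: the function names and the version-string format are assumptions, not part of M-Files, and the article does not state the default for versions earlier than 20.12, so that case returns None.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Take the first two components of a version such as '20.12.x' or '21.6'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def effective_extractor(version: str, use_cfs_forced: bool) -> Optional[str]:
    """Hypothetical helper mirroring the rules described in this article."""
    v = parse_version(version)
    if v >= (21, 6):
        # From 21.6 onwards dtSearch is always used; CFS cannot be forced.
        return "dtSearch"
    if use_cfs_forced:
        # Earlier than 21.6 with useCfs set to true: CFS is in use.
        return "CFS"
    if v >= (20, 12):
        # 20.12.x and later use dtSearch by default.
        return "dtSearch"
    # The article does not state the default before 20.12.
    return None
```

For example, effective_extractor("21.4", True) falls into the scenario this article addresses, where CFS is still in use and the configuration change below applies.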
The procedure
1. Open the M-Files admin tool.
2. Navigate to the corresponding site.
3. Select Configurations > Advanced Vault Settings > Configuration > Search > Indexes (x) > [corresponding index].
4. Select the Advanced tab in the top right.
5. If the use of CFS has been forced, the additionalOptions section of the JSON contains a plainTextExtraction section with useCfs set to true. Example:
    "generatedIndexName": "MF-{17E2AF7E-D782-4573-84EF-94CAE7CD30A0}",
    "partitionCount": 2,
    "useLanguageAnalysis": false,
    "plainTextExtraction": {
        "useCfs": true
    },
6. Remove the whole plainTextExtraction section. The resulting JSON looks like this:
    "generatedIndexName": "MF-{17E2AF7E-D782-4573-84EF-94CAE7CD30A0}",
    "partitionCount": 1,
    "useLanguageAnalysis": false,
    "limitsForIndexedData": {
        "maxFieldLengthKb": 100,
        "maxTotalFileDataLengthKb": 900
    }
},
After the restart, the vault will use dtSearch as the plain text extractor.
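If you prefer to check the edit outside the admin tool, the removal of the plainTextExtraction section can be sketched in Python. This is a minimal illustration, not an M-Files tool: the helper name is hypothetical, and the sample configuration is a trimmed-down assumption built from the example fragment above, wrapped so that it is valid JSON. The result would still need to be pasted back into the Advanced tab by hand.

```python
import json
from typing import Any


def remove_plain_text_extraction(node: Any) -> Any:
    """Return a copy of the parsed JSON with every plainTextExtraction key dropped."""
    if isinstance(node, dict):
        return {
            key: remove_plain_text_extraction(value)
            for key, value in node.items()
            if key != "plainTextExtraction"
        }
    if isinstance(node, list):
        return [remove_plain_text_extraction(item) for item in node]
    return node


# Assumed sample: only the keys shown in the article's example fragment.
config_text = """
{
  "additionalOptions": {
    "generatedIndexName": "MF-{17E2AF7E-D782-4573-84EF-94CAE7CD30A0}",
    "partitionCount": 2,
    "useLanguageAnalysis": false,
    "plainTextExtraction": {
      "useCfs": true
    }
  }
}
"""

cleaned = remove_plain_text_extraction(json.loads(config_text))
print(json.dumps(cleaned, indent=2))
```

The function leaves every other setting untouched, which matches the instruction to remove only the plainTextExtraction section.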
