Overview
This document describes the settings that affect the amount of indexed data. IDOL and Smart Search prioritize speed, so as a tradeoff they do not index all of the raw text in a document. In most real-life applications this is not an issue.
The maximum amount of indexed raw text differs between search engines: Smart Search can index up to 1 Megabyte of raw text, while IDOL and dtSearch can index up to 2 Megabytes.
Solution
There are three settings in the Advanced Vault Settings that affect indexing: Maximum File Size for Indexing, Maximum Plain-Text Length, and Maximum Length of Single-File Content.
Maximum File Size for Indexing determines the maximum file size that we accept for indexing. The setting is found in Advanced Vault Settings > Configuration > Search > Full-Text Search. The default value is 256 Megabytes.
Maximum Plain-Text Length determines how much raw text data we extract from the document. Raw text means text only, i.e. it does not include pictures or other non-readable data. The setting is found in Advanced Vault Settings > Configuration > File Previews > Viewer Files. The default value is 2 Megabytes.
Maximum Length of Single-File Content is the last setting, and it defines how much of the extracted data is saved to the index. The maximum is 2 Megabytes, but note that Smart Search only indexes up to 1 Megabyte. The setting is found in Advanced Vault Settings > Configuration > Search > Indexes > [index] > Additional Options > Limits for Indexed Data. The default value is 100 Kilobytes. We recommend keeping the value at around 500 Kilobytes or less, because larger values might have an impact on performance.
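The way the three settings and the engine limit combine can be illustrated with a small calculation. The following Python sketch is only a model of the description above, not the product's actual logic: the function and parameter names are hypothetical, 1 Megabyte is assumed to be 1024 Kilobytes, and a file that exceeds Maximum File Size for Indexing is assumed to contribute no text to the index.

```python
# Illustrative model only; names, units and the oversize-file behavior are assumptions.
KB = 1024
MB = 1024 * KB

# Engine limits on indexed raw text, as stated in the Overview.
ENGINE_CAP_BYTES = {"Smart Search": 1 * MB, "IDOL": 2 * MB, "dtSearch": 2 * MB}


def indexed_text_bytes(file_size, raw_text_size, engine,
                       max_file_size=256 * MB,             # Maximum File Size for Indexing
                       max_plain_text=2 * MB,              # Maximum Plain-Text Length
                       max_single_file_content=100 * KB):  # Maximum Length of Single-File Content
    """Estimate how many bytes of one file's raw text end up in the index."""
    if file_size > max_file_size:
        return 0  # assumption: the file is not accepted for indexing at all
    extracted = min(raw_text_size, max_plain_text)
    return min(extracted, max_single_file_content, ENGINE_CAP_BYTES[engine])
```

With the default values, a 10 Megabyte file that contains 5 Megabytes of raw text would contribute 100 Kilobytes to the index under this model. Raising Maximum Length of Single-File Content to 2 Megabytes would still leave Smart Search at 1 Megabyte.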
A related setting, Maximum Length of File Contents, applies to multi-file objects. It defines the maximum amount reserved for all files inside the multi-file object combined. Each file gets an equal share of the reserved length (maxTotalFileDataLengthKb / number of files on the object), with a minimum reservation of one Kilobyte per file.
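The per-file share can be worked out as follows. This is a minimal sketch of the formula above; the function name is hypothetical, the use of integer division is an assumption (the source does not say how the result is rounded), and the value 100 is used only as an example.

```python
def per_file_share_kb(max_total_file_data_length_kb, file_count):
    """Kilobytes reserved per file in a multi-file object (illustrative only)."""
    if file_count < 1:
        raise ValueError("a multi-file object must contain at least one file")
    # Equal share of the total, but never less than one kilobyte per file.
    # Integer division is an assumption about rounding.
    return max(1, max_total_file_data_length_kb // file_count)


print(per_file_share_kb(100, 4))    # 25
print(per_file_share_kb(100, 250))  # 1 (the one-kilobyte minimum applies)
```

Note that once the number of files exceeds the configured number of kilobytes, the one-kilobyte minimum means the combined reservation is larger than the configured total.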
