Abstract
The initial indexing round of a vault can take a considerable time, and it is hard to estimate beforehand how long it will take. That is why it is good to have tools to see whether the indexing has finished and, if not, how far it has progressed. This document describes how to monitor the progress of indexing during the initial indexing round. Some of the methods are also valid after the initial phase.
NOTE 1! This document is valid for IDOL version 12.x and beyond.
NOTE 2! Using this document requires a basic understanding of IDOL components and terminology.
NOTE 3! Using this document requires a basic understanding of how to use MFAutonomyConsole.
1. Introduction
When considering the progress of indexing, it is good to understand that M-Files and IDOL are separate entities. Each has its own indexing queue, and the queues follow each other: first, M-Files extracts, analyzes and packages the raw data into a form suitable for IDOL. Then the information goes to IDOL, which processes it and delivers it to its own engines. That is why the process should be monitored from both ends.
It is also important to distinguish between the actual progress (how many items have already been indexed and how many are still waiting to be indexed) and the state in which the systems consider the indexing finished.
Some of the following methods are also valid outside the initial indexing round, when we want to see whether the system is indexing something or not.
2. Initial (re-)indexing finished?
These checks can be used during the initial indexing round or after a full re-index has been triggered from the admin tool.
M-Files server (20.4. and earlier): Check that two timestamps exist in IndexRebuild.txt (in the M-Files secondary data location, e.g. C:\Program Files\M-Files\Server Vaults\Indexes\Combined\M-Files). The first timestamp shows the start time of the indexing and the second shows the finish time.
M-Files server (20.5. and later): Index information has been moved to its own IDOL index. A specific PowerShell script has to be used to retrieve the information. Read more here:
IDOL - Retrieving index information in M-Files after May'20 release
M-Files server (all versions): The following location should be empty: C:\Program Files\M-Files\Server Vaults\[vault]\FileData\Temp\FIX\[index registry name]. This is the temporary location used during indexing. If it contains no files or folders, there is no indexing queue on the M-Files side. Note that this location can be changed via a registry setting.
IDOL, backend: If the IDOL side is ready, the following folder is empty in every engine: [drive letter]:\IDOL12\data\[name of the engine, e.g. PROD1-content-11000]\index\status
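The two folder checks above can be scripted. The following is a minimal Python sketch, not an M-Files or IDOL tool. The function names are hypothetical, and the sketch assumes that every subfolder of the IDOL data folder is a backend engine with an index\status subfolder, which may not hold in every installation. Pass in the paths that are used in your environment.

```python
from pathlib import Path


def folder_is_empty(folder):
    """Return True if the folder exists and contains no files or subfolders."""
    path = Path(folder)
    if not path.is_dir():
        raise FileNotFoundError(f"Not a folder: {path}")
    # any() stops at the first entry, so large folders are not fully listed.
    return not any(path.iterdir())


def engines_with_pending_items(idol_data_folder):
    """Return the names of engines whose index/status folder is not empty.

    Assumption: each subfolder of idol_data_folder is one backend engine.
    Entries without an index/status folder are skipped.
    """
    pending = []
    for engine in sorted(Path(idol_data_folder).iterdir()):
        status = engine / "index" / "status"
        if status.is_dir() and not folder_is_empty(status):
            pending.append(engine.name)
    return pending
```

For the M-Files side, call folder_is_empty with the FIX temp location of the vault. For the IDOL side, call engines_with_pending_items with the data folder, for example r"D:\IDOL12\data" if D: happens to be the drive letter in use. An empty list suggests that no backend engine has items waiting.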
3. Progress of indexing?
The following methods show how many items are still in the queue. These can also be used after the initial indexing.
M-Files server, MetaData (20.4. and earlier): Browse to the index location in the secondary data, usually C:\Program Files\M-Files\Server Vaults\[vault name]\Indexes\Combined\M-Files. Open the file IndexMLog.log and get the numeric value of the parameter "values". Then run the following SQL query against the M-Files database to see how many objects there are in the metadata indexing queue:
SELECT COUNT(*) FROM [vault database name].[dbo].[OBJECTTYPEITEM] WHERE [VERSIONFORMDI] > <number from the file>
Note that the method above can also be used in the future, when we start to use single-pipeline indexing (metadata and file data are indexed simultaneously). In that case, use the file IndexCLog.log.
M-Files server, MetaData (20.5. and later): Index information has been moved to its own IDOL index. A specific PowerShell script has to be used to retrieve the information. Read more here:
IDOL - Retrieving index information in M-Files after May'20 release
M-Files server, FileData (20.5. and later): Because metadata and file data are indexed simultaneously as of 20.5, the file data queue cannot be monitored separately.
After retrieving the timestamp, use the previous SQL query to get the queue length.
M-Files server, FileData (20.4. and earlier): You can do the same for the file data by using IndexFLog.log and the following query. Note that this is not valid in single-pipeline mode.
SELECT COUNT(*) FROM [vault database name].[dbo].[DOCUMENTFILE] WHERE [VERSIONFORFDI] > <number from the file>
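When the queue length is checked repeatedly, it can help to generate the two queue queries above from the number read from the log file. The following Python sketch only builds the query text and does not connect to the database. The function name and the "metadata" and "filedata" keys are hypothetical. The table and column names are the ones used in the queries above.

```python
import re

# Table and column pairs taken from the two queue queries above.
QUEUE_SOURCES = {
    "metadata": ("OBJECTTYPEITEM", "VERSIONFORMDI"),
    "filedata": ("DOCUMENTFILE", "VERSIONFORFDI"),
}


def build_queue_query(database, queue, last_indexed):
    """Build the COUNT query for the given queue ("metadata" or "filedata")."""
    # The database name is placed inside [ ], so brackets in it are rejected.
    if not re.fullmatch(r"[^\[\]]+", database):
        raise ValueError("database name must not contain square brackets")
    table, column = QUEUE_SOURCES[queue]
    # int() raises ValueError for non-numeric text, so a value copied
    # from the log file cannot smuggle extra SQL into the query.
    number = int(last_indexed)
    return (
        f"SELECT COUNT(*) FROM [{database}].[dbo].[{table}] "
        f"WHERE [{column}] > {number}"
    )
```

For example, build_queue_query("MyVault", "metadata", "12345") returns the metadata queue query for a database called MyVault, which is a made-up name. The resulting text can be pasted into the SQL tool of your choice.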
IDOL, overall indexing status. Use MFAutonomyConsole to run indexergetstatus with Summary=true against the DIH. The result shows the overall status of the indexing queue on the IDOL side.
Example: MFIDOLConsole cfg=config_frontend.txt a=indexergetstatus Summary=true > igs.xml
IDOL, number of queued objects. Use the console to run indexergetstatus with IndexStatus=-7 against a backend engine. The result shows the number of queued indexing objects in that particular backend engine. When indexing has finished, this should be zero.
Example: MFIDOLConsole cfg=config_backend1.txt a=indexergetstatus MaxResults=5 IndexStatus=-7 > igs.xml
4. Amount of objects in the index and in M-Files database?
We can compare the index with the M-Files database. It should be understood, though, that the number of objects in the index and the number in the M-Files database are not necessarily 1:1. This is because some items are indexed but are not considered objects from the M-Files point of view. For example, if we use the Network Folder Connector or the SharePoint connector, folders and even page URLs might get indexed. This comparison should therefore be used for guidance only.
IDOL, number of indexed objects. Use the console to run getstatus against the DAH. The resulting .xml file shows the number of documents overall and in each index database (=vault).
Example: MFIDOLConsole cfg=config.txt a=getstatus > gs.xml
M-Files, number of documents in the database. There are a few ways to get the number:
a. Use the admin tool: Content Replication and Archiving > One-time Export > select Export objects and files > click Preview. This is a viable method in small vaults (fewer than 500K objects). In larger vaults the admin tool might hang.
b. The faster way is to run SQL queries against the vault database in question. This way the size of the vault does not matter.
Number of all objects:
SELECT COUNT(*) FROM [OBJECTTYPEITEM];
Number of objects divided into Object Types:
SELECT [OBJECTTYPE], COUNT(*) FROM [OBJECTTYPEITEM] GROUP BY [OBJECTTYPE] ORDER BY 2 DESC;
Compare the number to the number of documents in the index. As noted above, the ratio is unlikely to be exactly 1:1, but the numbers should be close to each other.
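The comparison itself is simple arithmetic. The following Python sketch shows one way to express it. The function names are hypothetical, and the 5 % default tolerance is an arbitrary illustration, not a threshold documented by M-Files or IDOL. Choose a tolerance that fits the connectors used in your vault.

```python
def count_difference(index_count, database_count):
    """Return (absolute difference, difference relative to the database count)."""
    if database_count <= 0:
        raise ValueError("database_count must be positive")
    difference = index_count - database_count
    return difference, difference / database_count


def counts_are_close(index_count, database_count, tolerance=0.05):
    """Return True if the index count is within the tolerance of the database count.

    Assumption: the 5 % default is only an example value.
    """
    _, relative = count_difference(index_count, database_count)
    return abs(relative) <= tolerance
```

Here index_count is the document count from the getstatus result and database_count is the result of the SQL query. A positive difference means that the index holds more items than the database, which can happen when connectors index folders or page URLs.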
Keywords: IDOL, migration, monitoring
