
Searching for Exact Words Does Not Work as Expected with OCRed PDFs

Last updated on 6 December 2023

Admin
Search

Overview

When you search for an exact word, hits in OCRed PDFs may include longer words that merely contain the search term.

For instance:

Exact search term used: "biolog"

Hits highlighted in OCRed PDFs: "biolog", "biology", "biologist", etc.

This behavior is *not* seen with TXT or other directly generated files. Searching for the exact word "biolog" in TXT files returns only the places where the word "biolog" itself appears.

Solution

This is an unfortunate limitation of current OCR technology. OCR recognizes each character individually and does not distinguish between characters and words. For OCRed PDFs such as those above, the search therefore matches all of those words: it finds the character sequence "biolog" in each of them and returns each one as a hit.
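The difference between the two behaviors can be illustrated with a short Python sketch. This is not M-Files' search implementation; the function names `substring_hits` and `whole_word_hits` and the sample text are hypothetical and serve only to contrast matching a character sequence with matching a whole word.

```python
import re


def substring_hits(term, text):
    """Return every whitespace-separated token that contains `term` anywhere.

    Approximates the behavior described for OCRed PDFs, where the
    character sequence is matched without regard to word boundaries.
    """
    return [token for token in text.split() if term in token]


def whole_word_hits(term, text):
    """Return only occurrences of `term` that stand alone as a word.

    Approximates the behavior described for TXT files, using the regex
    word-boundary anchor `\\b` on both sides of the term.
    """
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.findall(pattern, text)


# Hypothetical sample content for illustration only.
sample = "biolog biology biologist geology"

print(substring_hits("biolog", sample))
print(whole_word_hits("biolog", sample))
```

With the sample text, the first call returns all three words that contain "biolog", while the second returns only the standalone word.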

An improvement with ID 168386 is currently open in our systems to find a better behavior for searches against OCRed PDFs. At this time, there does not appear to be any way to change this behavior.
