Search in File Contents (PDF, Word, Excel, etc.)
This section guides you through each step of indexing and searching media file contents.
Open up the Index table submenu, located under the Ajax Search Pro main menu.
On the General panel, under the Post types to index option, choose the Attachment - Media post type. This unlocks the Media Service and File indexing options.
Once you have a license key, you can follow this documentation on how to enable it, although it is fairly simple: put the key into the input field and hit the Activate button. Once it is activated, the plugin will automatically attempt to index all selected file types via this parser. That's it!
Each attachment has a so-called MIME type. The MIME type of a file determines what kind of file the system is dealing with.
To choose the types, simply scroll down to the File indexing options section, and select the file types you wish to index.
If you wish, you can switch to manual mode by clicking the >>Enter Manually<< link.
After entering the desired MIME types, the file content indexing options will unlock (based on which MIME types are entered).
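When entering MIME types manually, it can help to look up which MIME type belongs to which file extension. The following is a minimal Python sketch for that lookup. The mapping table and the `mime_for` function name are illustrative assumptions, not part of the plugin; the listed values are the commonly registered MIME types for these formats.

```python
import os

# Illustrative mapping from file extension to its commonly registered MIME type.
# This table is an assumption for demonstration; it is not read from the plugin.
COMMON_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xls": "application/vnd.ms-excel",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".rtf": "application/rtf",
    ".txt": "text/plain",
    ".csv": "text/csv",
}


def mime_for(filename):
    """Return the MIME type for a file name, or None if the extension is unknown."""
    extension = os.path.splitext(filename)[1].lower()
    return COMMON_MIME_TYPES.get(extension)


if __name__ == "__main__":
    for name in ("report.pdf", "Invoice.DOCX", "data.csv", "photo.heic"):
        print(name, "->", mime_for(name))
```

Extensions missing from the table return `None`, which makes it easy to spot file types that have not been considered yet.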
Click on the On/Off buttons to switch which file type contents should be indexed.
After choosing all the desired options, save the configuration at the bottom of the page, and then generate the index.
In the search instance options, go to the Search Sources -> Media Files Search panel, then change the first two options:
- Return media files as results: ON
- Search engine for media results: Index table engine
Save the options, and it is done. The search should return attachments based on their content now.
When the Media Service is not enabled, the local file parsers are used. Because these have to be executed on your local server, they depend on the local server performance as well, and they are generally less accurate and less efficient.
Some of the parser scripts require standard libraries to be installed/enabled. These modules are usually enabled by default on most server hosts.
Indexing other document types (RTF, TXT, CSV, etc.) is still possible without meeting these requirements.
The local parser libraries are highly optimized, and their performance mostly depends on the actual server performance. However, on an average server there are a few things that may greatly affect performance:
- Document length - documents over 30-60 pages can be very difficult to index and may fail, especially PDF files. It is therefore not recommended to use this feature to index long books/documents.
- File size - documents with large images/attachments can be difficult and costly to read from the server's perspective. Ideally, the document should only contain the text to be indexed, although some graphics should not be an issue at all.
- Secured or password-protected documents - these cannot be parsed.
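Two of the considerations above, file size and encryption, can be roughly checked before uploading a file. The sketch below is a hedged illustration in Python, not something the plugin provides: the 10 MiB threshold and the function names are hypothetical, and the encryption check is only a heuristic based on the `/Encrypt` entry that encrypted PDFs carry in their trailer. Page count is not checked, since that would need a PDF parsing library.

```python
import os

# Hypothetical threshold; tune it to what your own server handles comfortably.
MAX_BYTES = 10 * 1024 * 1024  # 10 MiB


def looks_encrypted_pdf(path):
    """Heuristic: report True if the raw file bytes contain an /Encrypt entry."""
    with open(path, "rb") as handle:
        return b"/Encrypt" in handle.read()


def preflight(path, max_bytes=MAX_BYTES):
    """Return a list of human-readable warnings for one file (empty if none)."""
    warnings = []
    if os.path.getsize(path) > max_bytes:
        warnings.append("file is large and may be slow or fail to index")
    if path.lower().endswith(".pdf") and looks_encrypted_pdf(path):
        warnings.append("PDF appears to be encrypted and cannot be parsed")
    return warnings
```

Calling `preflight("some-file.pdf")` returns an empty list when neither condition is detected. Because the encryption check is a plain byte search, it can produce false positives on files that merely contain that string.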