The recoll program provides the main user interface for searching. It is based on the Qt library.
recoll has two search modes:
Simple search (the default, on the main screen) has a single entry field where you can enter multiple words.
Advanced search (a panel accessed through the Tools menu or the toolbox bar icon) has multiple entry fields, which you may use to build a logical condition, with additional filtering on file type and location in the file system.
In most cases, you can enter the terms as you think them, even if they contain embedded punctuation or other non-textual characters. For example, Recoll can handle things like e-mail addresses, or arbitrary cut and paste from another text window, punctuation and all.
The main case where you should enter text differently from how it is printed is for East Asian languages (Chinese, Japanese, Korean). In this case, words composed of one or more characters should be entered separated by white space (they would typically be printed without white space).
Start the recoll program.
Possibly choose a search mode: Any term, All terms, File name or Query language.
Enter search term(s) in the text field at the top of the window.
Click the Search button or hit the Enter key to start the search.
The initial default search mode is Query language. Without special directives, this will look for documents containing all of the search terms (the ones with more terms will get better scores), just like the All terms mode, which ignores such directives. Any term will search for documents where at least one of the terms appears.
The Query Language features are described in a separate section.
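The difference between the All terms and Any term modes can be illustrated with a minimal sketch. It is not Recoll's implementation: it assumes a document is just a bag of lowercase words, and it ignores stemming, wildcards and relevance scoring. The function name matches is hypothetical.

```python
def matches(document_text, query, mode="all"):
    """Return True if the document satisfies the query in the given mode.

    mode="all": every query term must appear in the document.
    mode="any": at least one query term must appear.
    """
    doc_words = set(document_text.lower().split())
    terms = query.lower().split()
    if not terms:
        return False
    hits = [term in doc_words for term in terms]
    return all(hits) if mode == "all" else any(hits)
```

With the query quick fox, a document containing only quick is returned by the "any" mode but not by the "all" mode.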
File name will specifically look for file names. The entry will be split at white space characters, and each fragment will be separately expanded, then the search will be for file names matching all fragments (this is new in 1.15, older releases did an OR of the whole thing which did not make sense). Things to know:
The search is case- and accent-insensitive.
Fragments that contain no wildcard character and are not capitalized will have '*' prepended and appended (e.g. etc -> *etc*, but Etc -> etc). Of course, it does not make sense to have multiple fragments if one of them is capitalized (as this one will require an exact match).
If you want to search for a pattern including white space, use double quotes (ie: "admin note*").
If you have a big index (many files), excessively generic fragments may result in inefficient searches.
As an example, inst recoll would match recollinstall.in (and quite a few others...).
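The file name rules above can be sketched as follows. This is an illustration under stated assumptions, not Recoll's code: the helper names expand_fragment and file_name_matches are hypothetical, double-quoted fragments are split with the standard shlex module, matching uses shell-style patterns from fnmatch, and only case folding is handled (accent stripping is left out).

```python
import fnmatch
import shlex

WILDCARD_CHARS = set("*?[")


def expand_fragment(fragment):
    """Wrap a fragment in '*' unless it has a wildcard or is capitalized."""
    has_wildcard = any(ch in WILDCARD_CHARS for ch in fragment)
    capitalized = fragment[:1].isupper()
    if has_wildcard or capitalized:
        return fragment.lower()
    return "*" + fragment.lower() + "*"


def file_name_matches(query, file_name):
    """Return True if the file name matches all fragments of the query."""
    # shlex.split keeps a double-quoted pattern with spaces as one fragment.
    patterns = [expand_fragment(f) for f in shlex.split(query)]
    name = file_name.lower()
    return all(fnmatch.fnmatchcase(name, p) for p in patterns)
```

For instance, file_name_matches("inst recoll", "recollinstall.in") is true because both *inst* and *recoll* match, while the capitalized fragment Etc only matches a file named exactly etc (in any case).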
All search modes allow wildcards inside terms (*, ?, []). You may want to have a look at the section about wildcards for more information about this.
You can search for exact phrases (adjacent words in a given order) by enclosing the input inside double quotes. Ex: "virtual reality".
Character case has no influence on search, except that you can disable stem expansion for any term by capitalizing it. Ie: a search for floor will also normally look for flooring, floored, etc., but a search for Floor will only look for floor, in any character case. Stemming can also be disabled globally in the preferences.
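As a rough sketch of the capitalization rule, the following toy function expands a term to its stem family unless the term starts with an upper-case letter. The STEM_FAMILIES table is a hypothetical stand-in for the stemming database built at indexing time; the real expansion is performed by the index, not by a lookup table like this one.

```python
# Hypothetical stem families, for illustration only.
STEM_FAMILIES = {
    "floor": {"floor", "floors", "flooring", "floored"},
}


def expand_term(term):
    """Return the set of index terms that a search term would look for."""
    lowered = term.lower()
    if term[:1].isupper():
        # Capitalized: stem expansion is disabled, only the term itself.
        return {lowered}
    return set(STEM_FAMILIES.get(lowered, {lowered}))
```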
Recoll remembers the last few searches that you performed. You can use the simple search text entry widget (a combobox) to recall them (click on the thing at the right of the text field). Please note, however, that only the search texts are remembered, not the mode (all/any/file name).
Typing Esc Space while entering a word in the simple search entry will open a window with possible completions for the word. The completions are extracted from the database.
Double-clicking on a word in the result list or a preview window will insert it into the simple search entry field.
You can cut and paste any text into an All terms or Any term search field, punctuation, newlines and all - except for wildcard characters (single ? characters are ok). Recoll will process it and produce a meaningful search. This is what most differentiates this mode from the Query Language mode, where you have to care about the syntax.
You can use the Tools / Advanced search dialog for more complex searches.
After starting a search, a list of results will instantly be displayed in the main list window.
By default, the document list is presented in order of relevance (how well the system estimates that the document matches the query). You can sort the result by ascending or descending date by using the vertical arrows in the toolbar (the old sort tool is gone after release 1.15, because the new result table has much better capability).
Clicking on the Preview link for an entry will open an internal preview window for the document. Further Preview clicks for the same search will open tabs in the existing preview window. You can use Shift+Click to force the creation of another preview window, which may be useful to view the documents side by side. (You can also browse successive results in a single preview window by typing Shift+ArrowUp/Down in the window).
Clicking the Open link will attempt to start an external viewer. The viewer for each document type can be configured through the user preferences dialog, or by editing the mimeview configuration file. You can also check the Use desktop preferences option in the user preferences dialog to use the desktop defaults for all documents. This is probably the best option if you are using a well configured Gnome or KDE desktop.
The Preview and Open links may not be present for all entries, meaning that Recoll has no configured way to preview a given file type (which was indexed by name only), or no configured external editor for the file type. This can sometimes be adjusted simply by tweaking the mimemap and mimeview configuration files (the latter can be modified with the user preferences dialog).
The format of the result list entries is entirely configurable by using the preference dialog to edit an HTML fragment.
You can click on the Query details link at the top of the results page to see the query actually performed, after stem expansion and other processing.
Double-clicking on any word inside the result list or a preview window will insert it into the simple search text.
The result list is divided into pages (the size of which you can change in the preferences). Use the arrow buttons in the toolbar or the links at the bottom of the page to browse the results.
Apart from the preview and edit links, you can display a pop-up menu by right-clicking over a paragraph in the result list. This menu has the following entries:
Preview
Open
Copy File Name
Copy Url
Save to File
Find similar
Preview Parent document
Open Parent document
The Preview and Open entries do the same thing as the corresponding links.
The Copy File Name and Copy Url entries copy the relevant data to the clipboard, for later pasting.
Save to File allows saving the contents of a result document to a chosen file. This entry will only appear if the document does not correspond to an existing file, but is a subdocument inside such a file (ie: an email attachment). It is especially useful to extract attachments with no associated editor.
The Find similar entry will select a number of relevant terms from the current document and enter them into the simple search field. You can then start a simple search, with a good chance of finding documents related to the current result.
The Parent document entries will appear for documents which are not actually files but are part of, or attached to, a higher level document. This entry is mainly useful for email attachments and permits viewing the message to which the document is attached. Note that the entry will also appear for an email which is part of an mbox folder file, but that you can't actually visualize the folder (there will be an error dialog if you try). Recoll is unfortunately not yet smart enough to disable the entry in this case. In other cases, the Open option makes sense, for example to start a chm viewer on the parent document for a help page.
In Recoll 1.15 and newer, the results can now be shown in a spreadsheet-like display. You can switch to this presentation by clicking the table-like icon in the toolbar (this is a toggle, click again to restore the list).
Clicking on the column headers will allow sorting by the values in the column. You can click again to invert the order, and use the header right-click menu to reset sorting to the default relevance order.
Both the list and the table display the same underlying results. The sort order set from the table is still active if you switch back to the list mode. You can click twice on a date sort arrow to reset it from there.
The header right-click menu allows adding or deleting columns. The columns can be resized, and their order can be changed (by dragging). All the changes are recorded when you quit recoll.
Hovering over a table row will update the detail area at the bottom of the window with the corresponding values. You can click the row to freeze the display. The bottom area is equivalent to a classical result list paragraph, with links for starting a preview or a native application, and an equivalent right-click menu.
The preview window opens when you first click a Preview link inside the result list.
Subsequent preview requests for a given search open new tabs in the existing window (except if you hold the Shift key while clicking which will open a new window for side by side viewing).
Starting another search and requesting a preview will create a new preview window. The old one stays open until you close it.
You can close a preview tab by typing ^W (Ctrl + W) in the window. Closing the last tab for a window will also close the window.
Of course, you can also close a preview window by using the window manager button at the top of the frame.
You can display successive or previous documents from the result list inside a preview tab by typing Shift+Down or Shift+Up (Down and Up are the arrow keys).
The preview tabs have an internal incremental search function. You initiate the search either by typing a / (slash) or Ctrl+F inside the text area, or by clicking into the Search for: text field and entering the search string. You can then use the Next and Previous buttons to find the next/previous occurrence. You can also type F3 inside the text area to get to the next occurrence.
If you have a search string entered and you use Shift+Up/Shift+Down to browse the results, the search is initiated for each successive document. If the string is found, the cursor will be positioned at the first occurrence of the search string.
A right-click menu in the text area allows switching between displaying the main text or the contents of fields associated to the document (e.g. author, abstract, etc.). This is especially useful in cases where the term match did not occur in the main text but in one of the fields.
You can print the current preview window contents by typing ^P (Ctrl + P) in the window text.
The advanced search dialog helps you build more complex queries without memorizing the search language constructs. It can be opened through the Tools menu or through the main toolbar.
The dialog has three parts:
The top part allows constructing a query by combining multiple clauses of different types. Each entry field is configurable for the following modes:
All terms.
Any term.
None of the terms.
Phrase (exact terms in order within an adjustable window).
Proximity (terms in any order within an adjustable window).
Filename search.
Additional entry fields can be created by clicking the Add clause button.
When searching, the non-empty clauses will be combined either with an AND or an OR conjunction, depending on the choice made on the left (All clauses or Any clause).
Entries of all types except "Phrase" and "Near" accept a mix of single words and phrases enclosed in double quotes. Stemming and wildcard expansion will be performed as for simple search.
The next part allows filtering the results by their mime types.
The state of the file type selection can be saved as the default (the file type filter will not be activated at program start-up, but the lists will be in the restored state).
The bottom part allows restricting the search results to a sub-tree of the indexed area. If you need to do this often, you may think of setting up multiple indexes instead, as the performance will be much better.
Phrases and Proximity searches. These two clauses work in similar ways, with the difference that proximity searches do not impose an order on the words. In both cases, an adjustable number (slack) of non-matched words may be accepted between the searched ones (use the counter on the left to adjust this count). For phrases, the default count is zero (exact match). For proximity it is ten (meaning that two search terms would be matched if found within a window of twelve words). Examples: a phrase search for quick fox with a slack of 0 will match quick fox but not quick brown fox. With a slack of 1 it will match the latter, but not fox quick. A proximity search for quick fox with the default slack will match the latter, and also a fox is a cunning and quick animal.
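The slack rule can be made concrete with a small sketch. It assumes that a match requires all terms to fit inside a window of (number of terms + slack) words, which is how the twelve-word window for two terms is obtained above. The function names are hypothetical and the brute-force search is only meant for illustration, not as a description of how the index evaluates these clauses.

```python
from itertools import product


def _positions(words, term):
    """Return the word positions at which term occurs."""
    return [i for i, word in enumerate(words) if word == term]


def phrase_match(text, terms, slack=0):
    """Terms must appear in order within a window of len(terms) + slack words."""
    if not terms:
        return False
    words = text.lower().split()
    position_lists = [_positions(words, t.lower()) for t in terms]
    for combo in product(*position_lists):
        in_order = all(a < b for a, b in zip(combo, combo[1:]))
        if in_order and combo[-1] - combo[0] + 1 <= len(terms) + slack:
            return True
    return False


def proximity_match(text, terms, slack=10):
    """Terms may appear in any order within a window of len(terms) + slack words."""
    if not terms:
        return False
    words = text.lower().split()
    position_lists = [_positions(words, t.lower()) for t in terms]
    for combo in product(*position_lists):
        distinct = len(set(combo)) == len(combo)
        if distinct and max(combo) - min(combo) + 1 <= len(terms) + slack:
            return True
    return False
```

Running the examples from the text through these functions gives the same outcomes: quick brown fox only matches the phrase quick fox once the slack is raised to 1, and fox quick is only accepted by the proximity search.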
Click on the Start Search button in the advanced search dialog, or type Enter in any text field to start the search. The button in the main window always performs a simple search.
Click on the Show query details link at the top of the result page to see the query expansion.
Recoll automatically manages the expansion of search terms to their derivatives (ie: plural/singular, verb inflections). But there are other cases where the exact search term is not known. For example, you may not remember the exact spelling, or only know the beginning of the name.
The term explorer tool (started from the toolbar icon or from the Term explorer entry of the Tools menu) can be used to search the full index terms list. It has four modes of operation:
Wildcard: in this mode of operation, you can enter a search string with shell-like wildcards (*, ?, []). For example, xapi* would display all index terms beginning with xapi. See the section about wildcards for more information.
Regular expression: this mode will accept a regular expression as input. Example: word[0-9]+. The expression is implicitly anchored at the beginning: press will match pression but not expression. You can use .*press to match the latter, but be aware that this will cause a full index term list scan, which can take quite long.
Stem expansion: this mode will perform the usual stem expansion normally done as part of user input processing. As such, it is probably mostly useful to demonstrate the process.
Spelling/Phonetic: in this mode, you enter the term as you think it is spelled, and Recoll will do its best to find index terms that sound like your entry. This mode uses the Aspell spelling application, which must be installed on your system for things to work (if your documents contain non-ASCII characters, Recoll needs an aspell version newer than 0.60 for UTF-8 support). The language which is used to build the dictionary out of the index terms (which is done at the end of an indexing pass) is the one defined by your NLS environment. Weird things will probably happen if languages are mixed up.
Note that in cases where Recoll does not know the beginning of the string to search for (for example, a wildcard expression like *coll), the expansion can take quite a long time because the full index term list will have to be processed. The expansion is currently limited to 200 results for wildcards and regular expressions.
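The wildcard and regular expression modes can be approximated over a plain list of terms as follows. This is a sketch, not the term explorer's code: the function names are hypothetical, the index is modelled as a Python list, and the anchoring at the beginning is obtained with re.match. Python's regular expression dialect is also an assumption here and may differ from the one Recoll uses.

```python
import fnmatch
import re

MAX_RESULTS = 200  # expansion limit quoted in the text above


def wildcard_terms(index_terms, pattern, limit=MAX_RESULTS):
    """Return the index terms matching a shell-like wildcard pattern."""
    return [t for t in index_terms if fnmatch.fnmatchcase(t, pattern)][:limit]


def regex_terms(index_terms, expression, limit=MAX_RESULTS):
    """Return the index terms matching a regular expression.

    re.match only anchors at the start of the term, so 'press' finds
    'pression' but not 'expression'.
    """
    compiled = re.compile(expression)
    return [t for t in index_terms if compiled.match(t)][:limit]
```

Both functions scan every term, which is the slow case described above. A real index can skip most of the list when the beginning of the pattern is known.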
Double-clicking on a term in the result list will insert it into the simple search entry field. You can also cut/paste between the result list and any entry field (the end of lines will be taken care of).
Multiple Recoll databases or indexes can be created by using several configuration directories which are usually set to index different areas of the file system. A specific index can be selected for updating or searching, using the RECOLL_CONFDIR environment variable or the -c option to recoll and recollindex.
A recollindex program instance can only update one specific index.
A recoll program instance is also associated with a specific index, which is the one to be updated by its indexing thread, but it can use any number of Recoll indexes for searching. The external indexes can be selected through the external indexes tab in the preferences dialog.
Index selection is performed in two phases. A set of all usable indexes must first be defined, and then the subset of indexes to be used for searching. Of course, these parameters are retained across program executions (they are kept separately for each Recoll configuration). The set of all indexes is usually quite stable, while the active ones might typically be adjusted quite frequently.
The main index (defined by RECOLL_CONFDIR) is always active. If this is undesirable, you can set up your base configuration to index an empty directory.
As building the set of all indexes can be a little tedious when done through the user interface, you can use the RECOLL_EXTRA_DBS environment variable to provide an initial set. This might typically be set up by a system administrator so that every user does not have to do it. The variable should define a colon-separated list of index directories, ie:
export RECOLL_EXTRA_DBS=/some/place/xapiandb:/some/other/db
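For illustration, a script could read the same variable as follows. The function name extra_index_dirs is hypothetical. The only behaviour taken from the text is that RECOLL_EXTRA_DBS holds a colon-separated list of index directories; skipping empty entries is an assumption.

```python
import os


def extra_index_dirs(environ=None):
    """Return the index directories listed in RECOLL_EXTRA_DBS."""
    environ = os.environ if environ is None else environ
    raw = environ.get("RECOLL_EXTRA_DBS", "")
    # Skip empty entries such as those produced by a trailing colon.
    return [directory for directory in raw.split(":") if directory]
```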
A typical usage scenario for the multiple index feature would be for a system administrator to set up a central index for shared data, that you choose to search or not in addition to your personal data. Of course, there are other possibilities. There are many cases where you know the subset of files that should be searched, and where narrowing the search can improve the results. You can achieve approximately the same effect with the directory filter in advanced search, but multiple indexes will have much better performance and may be worth the trouble.
Documents that you actually view (with the internal preview or an external tool) are entered into the document history, which is remembered.
You can display the history list by using the Tools/Doc History menu entry.
You can erase the document history by using the Erase document history entry in the File menu.
The documents in a result list are normally sorted in order of relevance. It is possible to specify different sort parameters by using the Sort parameters dialog (located in the Tools menu).
The tool sorts a specified number of the most relevant documents in the result list, according to specified criteria. The currently available criteria are date and mime type.
The sort parameters stay in effect until they are explicitly reset, or the program exits. An activated sort is indicated in the result list header.
Sort parameters are remembered between program invocations, but result sorting is normally always inactive when the program starts. It is possible to keep the sorting activation state between program invocations by checking the Remember sort activation state option in the preferences.
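The behaviour of the sort tool, which reorders only a specified number of the most relevant documents, can be sketched like this. Results are modelled as dictionaries with hypothetical keys (relevance, date, mime). This shows the idea only and is not Recoll's code.

```python
def sort_most_relevant(results, count, key="date", descending=True):
    """Keep the `count` most relevant results, then sort them by `key`."""
    most_relevant = sorted(
        results, key=lambda r: r["relevance"], reverse=True
    )[:count]
    return sorted(most_relevant, key=lambda r: r[key], reverse=descending)
```

Documents that fall outside the first count entries by relevance are dropped from the sorted list, however recent they are.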
It is also possible to hide duplicate entries inside the result list (documents with the exact same contents as the displayed one). The test of identity is based on an MD5 hash of the document container, not only of the text contents (so that ie, a text document with an image added will not be a duplicate of the text only). Duplicates hiding is controlled by an entry in the Query configuration dialog, and is off by default.
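A minimal sketch of hash-based duplicate hiding follows, assuming each result is available as a (url, container bytes) pair. The function name and the data layout are hypothetical. Only the use of an MD5 hash over the whole container comes from the text.

```python
import hashlib


def hide_duplicates(results):
    """Return the URLs of results, keeping the first of each identical container."""
    seen_digests = set()
    unique_urls = []
    for url, container_bytes in results:
        digest = hashlib.md5(container_bytes).hexdigest()
        if digest not in seen_digests:
            seen_digests.add(digest)
            unique_urls.append(url)
    return unique_urls
```

Because the hash covers the container and not just the extracted text, two files with the same text but different embedded images produce different digests and are both kept.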
Term completion. Typing Esc Space in the simple search entry field while entering a word will either complete the current word if its beginning matches a unique term in the index, or open a window to propose a list of completions.
Picking up new terms from result or preview text. Double-clicking on a word in the result list or in a preview window will copy it to the simple search entry field.
Wildcards. Wildcards can be used inside search terms in all forms of searches. More about wildcards.
Automatic suffixes. Words like odt or ods can be automatically turned into query language ext:xxx clauses. This can be enabled in the Search preferences panel in the GUI.
Disabling stem expansion. Entering a capitalized word in any search field will prevent stem expansion (no search for gardening if you enter Garden instead of garden). This is the only case where character case should make a difference for a Recoll search. You can also disable stem expansion or change the stemming language in the preferences.
Finding related documents. Selecting the Find similar documents entry in the result list paragraph right-click menu will select a set of "interesting" terms from the current result, and insert them into the simple search entry field. You can then possibly edit the list and start a search to find documents which may be related to the current result.
File names. File names are added as terms during indexing, and you can specify them as ordinary terms in normal search fields (Recoll used to index all directories in the file path as terms. This has been abandoned as it did not seem really useful). Alternatively, you can use the specific file name search which will only look for file names, and may be faster than the generic search especially when using wildcards.
Phrases and Proximity searches. A phrase can be looked for by enclosing it in double quotes. Example: "user manual" will look only for occurrences of user immediately followed by manual. You can use the This phrase field of the advanced search dialog to the same effect. Phrases can be entered alongside simple terms in all simple or advanced search entry fields (except This exact phrase).
AutoPhrases. This option can be set in the preferences dialog. If it is set, a phrase will be automatically built and added to simple searches when looking for Any terms. This will not radically change the results, but will give a relevance boost to the results where the search terms appear as a phrase. For example, searching for virtual reality will still find all documents where either virtual or reality or both appear, but those which contain virtual reality should appear sooner in the list.
Using fields. You can use the query language and field specifications to only search certain parts of documents. This can be especially helpful with email, for example only searching emails from a specific originator: search tips from:helpfulgui
Query explanation. You can get an exact description of what the query looked for, including stem expansion, and Boolean operators used, by clicking on the result list header.
Browsing the result list inside a preview window. Entering Shift-Down or Shift-Up (Shift + an arrow key) in a preview window will display the next or the previous document from the result list. Any secondary search currently active will be executed on the new document.
Scrolling the result list from the keyboard. You can use PageUp and PageDown to scroll the result list, Shift+Home to go back to the first page. These work even while the focus is in the search entry.
Forced opening of a preview window. You can use Shift+Click on a result list Preview link to force the creation of a preview window instead of a new tab in the existing one.
Closing previews. Entering ^W in a tab will close it (and, for the last tab, close the preview window). Entering Esc will close the preview window and all its tabs.
Printing previews. Entering ^P in a preview window will print the currently displayed text.
Quitting. Entering ^Q almost anywhere will close the application.
You can customize some aspects of the search interface by using the Query configuration entry in the Preferences menu.
There are several tabs in the dialog, dealing with the interface itself, the parameters used for searching and returning results, and what indexes are searched.
Number of results in a result page:
Hide duplicate results: decides if result list entries are shown for identical documents found in different places.
Highlight color for query terms: Terms from the user query are highlighted in the result list samples and the preview window. The color can be chosen here. Any Qt color string should work (ie red, #ff0000). The default is blue.
Result list font: There is quite a lot of information shown in the result list, and you may want to customize the font and/or font size. The rest of the fonts used by Recoll are determined by your generic Qt config (try the qtconfig command).
Result paragraph format string: allows you to change the presentation of each result list entry. This is described in its own section.
Maximum text size highlighted for preview: inserting highlights on search terms inside the text before inserting it in the preview window involves quite a lot of processing, and can be disabled over the given text size to speed up loading.
Use desktop preferences to choose document editor: if this is checked, the xdg-open utility will be used to open files when you click the Open link in the result list, instead of the application defined in mimeview. xdg-open will in turn use your desktop preferences to choose an appropriate application.
Choose editor applications: this will let you choose the command started by the Open links inside the result list, for specific document types.
Display category filter as toolbar...: this will let you choose if the document categories are displayed as a list or a set of buttons.
Auto-start simple search on white space entry: if this is checked, a search will be executed each time you enter a space in the simple search input field. This lets you look at the result list as you enter new terms. This is off by default, you may like it or not...
Start with advanced search dialog open and Start with sort dialog open: If you use these dialogs all the time, checking these entries will get them to open when recoll starts.
Remember sort activation state: if set, Recoll will remember the sort tool state between invocations. It normally starts with sorting disabled.
Prefer HTML to plain text for preview: if set, Recoll will display HTML as such inside the preview window. If this causes problems with the Qt HTML display, you can uncheck it to display the plain text version instead.
Stemming language: stemming obviously depends on the document's language. This listbox will let you choose among the stemming databases which were built during indexing (this is set in the main configuration file), or later added with recollindex -s (see the recollindex manual). Stemming languages which are dynamically added will be deleted at the next indexing pass unless they are also added in the configuration file.
Dynamically add phrase to simple searches: a phrase will be automatically built and added to simple searches when looking for Any terms. This will give a relevance boost to the results where the search terms appear as a phrase (consecutive and in order).
Replace abstracts from documents: this decides if we should synthesize and display an abstract in place of an explicit abstract found within the document itself.
Dynamically build abstracts: this decides if Recoll tries to build document abstracts when displaying the result list. Abstracts are constructed by taking context from the document information, around the search terms. This can slow down result list display significantly for big documents, and you may want to turn it off.
Synthetic abstract size: adjust to taste...
Synthetic abstract context words: how many words should be displayed around each term occurrence.
Query language magic file name suffixes: a list of words which automatically get turned into ext:xxx file name suffix clauses when starting a query language query (ie: doc xls xlsx...). This will save some typing for people who use file types a lot when querying.
External indexes: This panel will let you browse for additional indexes that you may want to search. External indexes are designated by their database directory (ie: /home/someothergui/.recoll/xapiandb, /usr/local/recollglobal/xapiandb).
Once entered, the indexes will appear in the External indexes list, and you can choose which ones you want to use at any moment by checking or unchecking their entries.
Your main database (the one the current configuration indexes to) is always implicitly active. If this is not desirable, you can set up your configuration so that it indexes, for example, an empty directory. An alternative indexer may also need to implement a way of purging the index from stale data.
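The Query language magic file name suffixes option described above amounts to a simple rewrite of the query. The sketch below illustrates it with a hypothetical function name. It splits on white space only, so it does not handle quoted phrases or other query language syntax, which the real implementation has to take into account.

```python
def apply_magic_suffixes(query, suffixes):
    """Turn configured suffix words into ext:xxx clauses."""
    words = query.split()
    return " ".join(
        "ext:" + word if word in suffixes else word for word in words
    )
```

With doc, xls and xlsx configured as suffixes, the query budget xls becomes budget ext:xls.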
The presentation of each result inside the result list can be customized by setting the result list paragraph format inside the User Interface tab of the Query configuration.
This is a Qt HTML string where the following printf-like % substitutions will be performed:
%A. Abstract
%D. Date
%I. Icon image name
%K. Keywords (if any)
%L. Preview and Edit links
%M. Mime type
%N. result Number
%R. Relevance percentage
%S. Size information
%T. Title
%U. Url
In addition to the predefined values above, all strings like %(fieldname) will be replaced by the value of the field named fieldname for this document. Only stored fields can be accessed in this way, the value of indexed but not stored fields is not known at this point in the search process (see field configuration). There are currently very few fields stored by default, apart from the values above (only author), so this feature will need some custom local configuration to be useful. For example, you could look at the fields for the document types of interest (use the right-click menu inside the preview window), and add what you want to the list of stored fields. A candidate example would be the recipient field which is generated by the message filters.
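The substitution mechanism can be sketched as a single regular expression pass over the format string. This is an illustration, not Recoll's implementation: the render_paragraph name and the two dictionaries are hypothetical, and leaving unknown single-letter codes untouched while replacing unknown fields with an empty string is an assumption.

```python
import re

# Matches either %(fieldname) or a single-letter code such as %T.
_TOKEN = re.compile(r"%\((\w+)\)|%([A-Za-z])")


def render_paragraph(format_string, values, stored_fields):
    """Substitute %X codes and %(fieldname) references in a format string.

    values maps single letters (e.g. "T", "U") to their text.
    stored_fields maps stored field names (e.g. "author") to their value.
    """
    def replace(match):
        field_name, letter = match.group(1), match.group(2)
        if field_name is not None:
            return stored_fields.get(field_name, "")
        return values.get(letter, match.group(0))

    return _TOKEN.sub(replace, format_string)
```

For example, rendering the string <b>%T</b> %U %(author) with a title, a URL and a stored author field yields the HTML for one result entry.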
The default value for the paragraph format string is:
<img src="%I" align="left">%R %S %L <b>%T</b><br> %M %D <i>%U</i> %i<br> %A %K
You may, for example, try the following for a more web-like experience:
<u><b><a href="P%N">%T</a></b></u><br> %A<font color=#008000>%U - %S</font> - %L
Or the clean looking:
<img src="%I" align="left">%L <font color="#900000">%R</font> <b>%T</b><br>%S <font color="#808080"><i>%U</i></font> <table bgcolor="#e0e0e0"> <tr><td><div>%A</div></td></tr> </table>%K
Note that the P%N link in the web-like example above makes the title a preview link.
Due to the way the program handles right mouse clicks in the result list, if the custom formatting results in multiple paragraphs per result, right clicks will only work inside the first one.