Sunday, February 14, 2010

Grep Kindle's Find


One of the cornerstones of the Internet is the ability to search, so it comes as no surprise that the Kindle has a find command.  What is surprising is that Kindle's find has a classic case of dissociative identity disorder!  Books, notes, the Amazon Store, the dictionary, Wikipedia, and Google can all be searched from one interface, but all with slightly varying search abilities. Kindicts will obviously want to know everything they can about the Kindle's find; Kindicted is more than happy to oblige.

Overview
Other than within some specific modes (such as adding a note), typing a letter anywhere in the Kindle interface will bring up a find command.  A find command with options can also be opened by selecting "Search" after pressing the "menu" button.  On the home screen, the find command looks like this:


Pressing the return key, pressing in on the 5-way controller, or using the 5-way controller to select the find command will begin the find process.  If the 5-way controller is pressed to the right, additional options appear:


The "Find" command from the main menu appears like this:


The options are self-explanatory, although the find command works slightly differently depending on what is being searched ("my items", "kindle store", etc.).  The differences are described in detail below.

The find command can also be used within a book, in which case, the "search my items" selection changes to just a "find":


The additional options also change slightly, adding an option to search "my items":


At a very basic level, entering text into the search field and selecting search will search for that text in all items, or within an individual book.  Simple, no?  No!  As previously mentioned, find has multiple personalities, and the ability to search using wildcards, symbols, exact text, etc. varies depending on what item the find function is searching within.  Even worse, the find interface changes based on the type of document being searched!  The next few sections examine each find personality in greater detail.  Note that this functionality may change with future system software releases.  Kindicted will amend as necessary.

Before delving into each find, it is useful to note that when a document is loaded into the Kindle, the system software attempts to parse the document and create a searchable index.  If a newly loaded document cannot be searched, a warning may pop up indicating that the document has not yet been indexed.


Once an index has been created for the document, the find command can search it quickly.  Indexing can take anywhere from a few minutes to a few hours, depending on the number of documents to index.

Find #1: The Home Screen and Books/Documents 

As mentioned above, typing (or selecting the Search menu command) from the Home screen opens up a small input area at the bottom of the display.

Type in a word or several words, and the Kindle will return either a list (actually another home screen-like interface), or "No Items", which means that the word (or words) were not found in any searchable book.  A search with at least one match results in a list of matching titles.  By default, the items are listed in decreasing order of relevance.  This is not relevance in the Google sense; it is simply a count of the number of matches.
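For Kindicts who like to see the idea spelled out, here is a minimal Python sketch of ranking by match count.  It is only a model of the behavior described above, not the Kindle's actual code: the function name rank_by_match_count, the sample library, and the whole-word matching are all assumptions made for illustration.

```python
import re

def rank_by_match_count(library, word):
    """Return (title, count) pairs for items containing `word`, most matches first.

    `library` maps a title to its text; both are made-up illustration data.
    """
    # Whole-word, case-insensitive matching is an assumption of this sketch.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    counts = {title: len(pattern.findall(text)) for title, text in library.items()}
    hits = [(title, n) for title, n in counts.items() if n > 0]
    # Sort by descending count; ties fall back to alphabetical title order.
    return sorted(hits, key=lambda item: (-item[1], item[0]))

library = {
    "Moby Dick": "The whale. A whale! Call me Ishmael.",
    "Hamlet": "To be or not to be.",
    "Whale Facts": "A whale is a mammal.",
}
print(rank_by_match_count(library, "whale"))
```

An item with no matches simply drops out of the list, which corresponds to the "No Items" case when nothing matches at all.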


At this point, it is important to point out that PDF files are excluded from the search.  PDFs can be searched individually, but with a 'different' search command (more later).

Assuming that one or more items matched the search, the list appears to be a filtered list of home page items, which it is, but it is not the home page.  To return to the home page, select "Home" and the search results will be cleared.  If, however, you press up on the 5-way controller within the search results, pressing left or right will reveal a list of additional sorting and filtering options.  Pressing left displays a list of document type filters:


Pressing right reveals sorting options:


The additional options are self-explanatory.

If an item in the search results list is highlighted (underlined), pressing in on the 5-way controller opens up a list of search results for that item.

Note that notes are searched, as well as the document text.  Also note that an additional option appears beneath the find text input area: "Close Search Results".  Selecting this (or pressing "Back") will return to the search result list.  Another search can also be initiated, but it does not search the search results: it is an entirely new search on the item that is open.

Note that initiating another search at this point "stacks" the searches.  Selecting "Close Search Results" will display the previous search results, and so on, depending on how many searches were initiated.  Selecting the "Home" button at any point will abandon all searches.

Closing all searches will ultimately lead to the original search of all items, which can be closed by pressing the "Home" button.
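The "stacking" behavior can be pictured as a plain last-in, first-out stack.  The following Python sketch is a toy model under that assumption; the class SearchStack and its method names are hypothetical and only mirror the menu options described above.

```python
class SearchStack:
    """Toy model of stacked searches: each new search is pushed, closing pops."""

    def __init__(self):
        self._stack = []

    def search(self, term):
        # A new search never refines the previous one; it is pushed on top.
        self._stack.append(term)

    def close_search_results(self):
        # Drop the current search and reveal the previous one, if any.
        if self._stack:
            self._stack.pop()
        return self.current()

    def home(self):
        # Pressing Home abandons every stacked search at once.
        self._stack.clear()

    def current(self):
        # The search whose results are on display, or None if there is none.
        return self._stack[-1] if self._stack else None

stack = SearchStack()
stack.search("whale")
stack.search("harpoon")
print(stack.close_search_results())
```

Closing the "harpoon" search brings the "whale" results back, and calling home() at any point empties the stack.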

Searching Within a Document
Whether searching from the home screen, or from within a document, the interface and search 'personality' is essentially the same.  If a document is selected from the search results, or if a search is initiated within a document, a list of search results is displayed for the individual document:


At this point, many Kindicts ask two key questions: what is the syntax for entering words in find, and how does the find command handle multiple words?  The answers are not straightforward, but here goes:

Find Syntax for Documents (except PDF)
Right off the bat, the find command for documents does not have wildcards, grep syntax, etc.  Multiple words can be entered separated by a space, as can e-mail addresses and web sites.  All symbols are summarily ignored (and can even be in the middle of words) except for "." and "@", which are used for searching the aforementioned sites and addresses.  Here are a few observations:
  • Plurals are automatically included in search results.
  • Common small words are ignored ("of", "and", "the", etc.).
  • Searches are case-insensitive.
  • Characters with accents can usually be searched by simply ignoring the accent and entering the underlying character (e.g. search for c if the character is ĉ).  Note that this does not work for characters such as ø.
  • Individual letters can be searched.
  • All surrounding punctuation/symbols are highlighted in the search results, but cannot be specified in the search.
  • Numbers can be searched - leading zeroes are relevant.
  • Symbols between words (such as the "://" in "http://www") can be excluded entirely - the search will still work (provided the url does not continue on a new line after the "http://"). In other words, "httpwww" is equivalent to "http://www".
Again, please keep in mind that these rules are only for searching documents.  PDFs, Wikipedia, Google, and the Amazon store all have different search syntax.
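Several of the observations above can be approximated in a few lines of Python.  This is a sketch of the observed behavior only, not the Kindle's implementation: the function normalize_query and the short STOP_WORDS set are assumptions, and plural handling is left out.  Accent folding here uses Unicode decomposition, which happens to leave ø alone because it has no decomposed form.

```python
import unicodedata

# Assumed subset of the ignored small words; the real list is not documented.
STOP_WORDS = {"of", "and", "the"}

def normalize_query(query):
    """Approximate how a document search term appears to be interpreted."""
    # Case-insensitive, then fold accents by dropping combining marks.
    decomposed = unicodedata.normalize("NFKD", query.lower())
    folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = []
    for raw in folded.split():
        # Keep letters, digits, "." and "@"; silently drop every other symbol.
        word = "".join(ch for ch in raw if ch.isalnum() or ch in ".@")
        if word and word not in STOP_WORDS:
            words.append(word)
    return words

print(normalize_query("The http://www.example.com of Ĉapelo"))
```

The "://" disappears, "the" and "of" are dropped, and Ĉ is reduced to a plain c, while digits such as the leading zeroes in "007" pass through untouched.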

Multiple Word Handling for Documents
Multiple words can be entered in the find input area, each separated by a space.  The find command will attempt to find all words within a "Location", which roughly equates to a sentence, but not exactly.  The easiest way to think of the scope of a multiple word search is that all the words entered have to be "close" to each other - within a sentence or two.

Oddly enough, since the document search is not an exact match, the words in a matching phrase can be in any order.  For example, searching for "the deck stood on the burning boy" will return "the boy stood on the burning deck" as a match.
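The any-order matching can be modeled as a simple set comparison.  The Python sketch below assumes that a "Location" is just a short piece of text and that every significant word must appear in it; the function location_matches and the STOP_WORDS set are hypothetical.

```python
import re

# Assumed subset of the ignored small words; the real list is not documented.
STOP_WORDS = {"of", "and", "the"}

def location_matches(query, location_text):
    """True if every significant query word occurs somewhere in the location."""
    wanted = {w for w in re.findall(r"\w+", query.lower()) if w not in STOP_WORDS}
    present = set(re.findall(r"\w+", location_text.lower()))
    # Sets ignore word order, so any arrangement of the same words matches.
    return wanted <= present

location = "The boy stood on the burning deck."
print(location_matches("the deck stood on the burning boy", location))
print(location_matches("burning ship", location))
```

The first query matches despite the scrambled order; the second does not, because "ship" never appears in the location.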

Back to the Document Search
With the document search syntax in mind, the search results make a bit more sense.  Each location listed contains the search terms; a blank display indicates that no matching words were found.

If a matching location is selected, the location containing the search word will be at the top of the display, with the rest of the document following.  Except for the ability to return to the search results (by selecting "Back") the interface is identical to simply reading a book.  If another search is initiated at this point, the searches are again "stacked" - pressing "Back" will display the prior screen.  Selecting "Home" will clear all stacked searches and results and return to the Home display.

Find #2: PDFs 

Although a search from the Home screen will exclude PDFs, PDFs that do not consist entirely of images can be searched.  Opening a PDF and typing or selecting "Search this Document" from the menu will open a search area at the bottom of the display.  Typing in any text will search for the exact text entered.  This includes all symbols, spaces, etc.  Because the match is on the literal text rather than on whole words, entering a word like "enter" will find "enter", "enters", "entering" and so on.

Instead of search results, the PDF search searches forward from the current location and highlights where the term is found.  Backwards and forwards options within the search area appear, which enable next and previous searching for the words within the document.


There is no ability to enter wildcards or any grep-like syntax.  Words within images are also not searched.  In many ways, PDF search is more powerful than a document search, as exact phrases and symbols are relevant.  Searches are case-insensitive, and multiple words separated by spaces will search for the exact phrase - including the space.  Searches are also not stacked - there is only one search input open at a time.  Pressing right on the 5-way controller will reveal additional options, including an option to search "my items", although the search of "my items" follows the "rules" as indicated in the sections above.
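The PDF find behaves much like a plain case-insensitive substring search with next and previous buttons.  Here is a minimal Python sketch of that idea; the functions find_next and find_previous are hypothetical names, and treating the PDF as one long string is an assumption made for illustration.

```python
def find_next(text, term, start=0):
    """Index of the next case-insensitive occurrence at or after `start`, else -1."""
    return text.lower().find(term.lower(), start)

def find_previous(text, term, start):
    """Index of the last occurrence that begins before `start`, else -1."""
    # An occurrence beginning at start - 1 ends at start - 1 + len(term).
    return text.lower().rfind(term.lower(), 0, start + len(term) - 1)

page = "Enter, then she enters; entering twice."
first = find_next(page, "enter")
second = find_next(page, "enter", first + 1)
print(first, second, find_previous(page, "enter", second))
```

Since the term is matched as literal text, "enter" is found inside "enters" and "entering" as well, and a term containing spaces or symbols has to appear exactly as typed.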

Find #3: Amazon Store

The Amazon store find is a basic search with wildcard capabilities.  Exact phrases within quotes do not work, but adding an "*" to the end of any word or words will result in a search for all words in the titles or descriptions of books beginning with the characters before the "*".  Entering a phrase will search for all words - not the exact phrase, although the search rank will usually return the best match at the top of the list.  If there is only one match, the item will open, and "Buy" will be highlighted.  Be careful not to press in on the 5-way, or you will have to act fast to cancel the order.
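The trailing "*" can be thought of as a "begins with" test on each word.  The Python sketch below models that under stated assumptions: the functions term_matches and title_matches are hypothetical, only titles are checked, and Amazon's real ranking is not modeled at all.

```python
def term_matches(term, word):
    """Compare one search term with one word, honoring a trailing "*"."""
    if term.endswith("*"):
        # Everything before the "*" must be the beginning of the word.
        return word.lower().startswith(term[:-1].lower())
    return word.lower() == term.lower()

def title_matches(query, title):
    """True if every term in the query matches some word in the title."""
    words = title.split()
    return all(any(term_matches(t, w) for w in words) for t in query.split())

print(title_matches("kind* guide", "Kindle User Guide"))
print(title_matches("kind*", "Nook User Guide"))
```

Every term has to match some word, but the words do not need to be adjacent or in order, which is consistent with phrases being treated as a bag of words rather than an exact phrase.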

Find #4: Dictionary

Searching the dictionary is straightforward.  The results will commence at the closest word matching the text entered.  All other characters are ignored.  Selecting "Back" will return to the previous display.  Searching the dictionary within a book in this manner is useful for looking up words then returning to the book.

Find #5: Wikipedia

A Wikipedia find operates as one would expect.  Search items in quotes are treated as exact phrases, and the "*" can be used as a wildcard character, resulting in a list of all Wikipedia entries that begin with the letters before the "*".  Multiple words can be entered, but only the titles of Wikipedia entries are searched - not the text within the entry.  Again, selecting "Back" will return to the previous display.

Find #6: Google

For US Kindle owners, a Google search operates as one would anticipate.  The Kindle simply passes the text and symbols entered to Google.

International Kindle owners are out of luck, as web browsing is not allowed.

Find #7: Go to Web

Again, for US Kindle owners, this will simply open the URL entered in the find box.  This is confusing, as it is not so much a find as an "open web site" option.

Again, international Kindle owners will have to use another device, as this option is not available.

Conclusion
The Kindle find command could really use some psychotherapy!  A central find option that returned all off-web results (my items, PDFs, and the dictionary) would greatly assist in actually finding a phrase.  Also, an option for using grep-like syntax and wildcards would appeal to Kindicts who love to geek-out on their searches.

In the meantime, Kindicted hopes that this article cleared up some of the finer details of the current find command.

As always, corrections, comments, and additions are appreciated.

11 comments:

  1. Just got my Kindle and I am disappointed that you cannot search for an exact phrase in a document. For example, it is not surprising that the words 'to', 'be', 'or' and 'not' appear quite a lot in the works of Shakespeare and in many different combinations! Searching for quotations in any work would have its uses though...especially for study. Mind you, I was impressed that I was able to get a quick definition for 'fardel'.

  2. Nice work. However, you probably concentrated on how it works in the English language, in particular the part about "Plurals are automatically included in search results". This is true only in English. The part about the accents is also not 100% correct. As far as I checked, the only special letters it finds are the German ä=a, ö=o, ü=u, ß=ss and the French letter é. Besides the unsearchable special letters used in many European languages (Spanish, Portuguese, French, Danish, Swedish etc.), it of course cannot search anything in languages not using Latin letters (Cyrillic, Greek, Chinese, Korean).

  3. I typed the word mindfulness into Kindle Search in a particular Kindle book I am reading. The search came up with every use of mind as well as mindfulness. There seems to be no way to restrict the search to only the word mindfulness. Can you advise?
