Friday, February 26, 2010

Kindle Text-to-Speech Dissected: Part 2 - TTS History

Many Kindicts love the text-to-speech (TTS) feature of the Kindle2 and DX, which reads books aloud in a user-selectable voice.  The ability for TTS to read a book's text is dependent on whether or not the book is TTS 'enabled', which is an interesting subject on its own, but not the focus of this article - the second in a series on TTS technologies.  Below, Kindicted presents a very brief history of the 'speech' portion of TTS.

History is littered with 'odd' individuals who were fixated on the technology du jour - distant evolutionary cousins of today's uber-geeks.  In that vein, every few hundred years or so, a historical figure became obsessed with building a machine or apparatus that could mimic the human voice.  The motive behind such early obsessions is not entirely clear; suffice it to say that since the primary means of human communication is speech, a talking machine would net a profit if put to some practical use.  Regardless of the motivation, these individuals made strides in the analysis of human speech, and occasionally the yardsticks of knowledge were moved forward a bit.

Tube-Tied
Early machines consisted of user-modifiable tubes and bellows to produce vowel sounds.  The subsequent addition of a mechanical tongue and lips enabled consonant sounds to be produced (along with a likely side effect: lonely inventors more adept at kissing).  The advent of the telephone raised new interest in the study of human speech, and by the 1930s, Homer Dudley, an engineer at Bell Labs, developed an electromechanical (i.e. non-digital) speech synthesizer dubbed VODER (for Voice Operating DEmonstratoR) based on research by fellow Bell scientists, led by Harvey Fletcher.  The techniques used for speech synthesis in VODER are still used in today's synthesis hardware - albeit with many refinements.  Note that this voice synthesis is separate from voice encoding (VOCODER), which was originally invented as a means of coding speech for transmission through phone lines, but was subsequently used by musicians as an interesting vocal effect.

Hooked on Phonemes
Post-1950, speech research focused on the phonetic elements of speech.  A phoneme is the smallest unit of sound that can be separately distinguished between sequential utterances (e.g. the 't' sound in 'sat' or 'test').  The production of phonemes during speech produces energy (in the form of sound pressure waves) that can be recorded and analyzed.  If a general model of each phoneme produced through human speech is recorded, an electronic representation of a language can be recorded and labeled; the English language contains 37 to 47 phonetic elements.  Playback of the recorded phonemes in the right sequence, and at the proper speed, produces crude synthesized speech.  Early systems that produced speech in this manner were barely intelligible.  Humans are very sensitive to even minute variations in speech, which makes clear speech synthesis quite difficult.

The human auditory system is the original sound transcoder, transforming sound pressure waves to electrical signals to be processed by the brain.  No fewer than five distinct areas of the brain stem and brain participate in the detection and recognition of sound.  People are particularly sensitive to their own language, and can even detect (at better than chance levels) whether or not an unseen speaker is smiling while they speak.  This extraordinary sensitivity is one of the reasons why people are so adept at detecting when a computer is speaking or controlling speech.  So far, no one has been able to develop a non-scripted speech system that is indistinguishable from a human speaker - but that does not mean that companies haven't tried.

Please answer the Diphone
In order to raise the quality of computerized speech, researchers moved away from phonemes and towards diphones.  Simply put, a diphone is the sound that spans the transition between two phonemes: from a point halfway into the first phoneme to a point halfway into the second.  Diphones (and sometimes half-syllables and triphones) are important in producing a natural-sounding synthesized voice, since the sound of a phoneme is modified slightly by the sound of the next phoneme.  There are 1,400 diphones in the English language, which correspond to 'allowable' combinations of phonemes.  Strictly speaking, the number of diphones should be the square of the number of phonemes, but many phoneme combinations never appear in spoken language.  By using the diphone approach, researchers were able to greatly increase the intelligibility of a synthesized voice, since there were many more realistic phoneme combinations to choose from when constructing the sound.
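To make the idea concrete, here is a minimal Python sketch of how a phoneme sequence breaks down into diphones, along with the 'square of the number of phonemes' upper bound.  The phoneme symbols, the "_" silence marker, and the function names are illustrative assumptions, not the notation of any real TTS system.

```python
def to_diphones(phonemes):
    """Split a phoneme sequence into diphones (adjacent phoneme pairs).

    Silence ("_") is padded at both ends so that the first and last
    phonemes also get a transition unit.
    """
    padded = ["_"] + list(phonemes) + ["_"]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def max_diphones(phoneme_count):
    """Upper bound on diphones if every ordered phoneme pair were allowed."""
    return phoneme_count * phoneme_count


# "sat" written as three phonemes (symbols are made up for illustration).
print(to_diphones(["s", "ae", "t"]))  # ['_-s', 's-ae', 'ae-t', 't-_']

# Upper end of the 37-to-47 phoneme range quoted earlier.
print(max_diphones(47))  # 2209
```

A concatenation-based system would look up a recorded snippet for each of these pairs and play them back to back; that lookup is not modeled here.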

Live versus Lip Synch
In the 1980s, a battle royale of sorts was unfolding among speech researchers.  In one corner were the fundamentalists: researchers who believed that the purest and most flexible sound came from rule-based speech synthesis.  This included programs that modeled airflow, tongue position, lips, etc. - sort of like a digital representation of the early tube-based apparatus, coupled with an extensive database of rules describing how different phonemes are paired.  In the other corner were the concatenation-based zealots.  This group believed that the key to realistic speech was to build words out of a pre-recorded database of diphones (from a human speaker).  Since the Kindle's TTS uses a concatenation-based system, this series of articles will not cover rule-based speech synthesis in any great detail; suffice it to say that there are pros and cons to each approach, and neither has clearly 'won'.

In the first installment of this series, Tom Glynn (the TTS voice of the Kindle) indicated that his diphone recording sessions consisted of reading phrases.  These phrases were selected to cover the 1,400 diphones; the recorded segments were sent through another application that analyzed Tom's speech and automatically converted the speech to diphone segments, which were stored in a database.  These speech segments are dynamically selected and concatenated to create speech, but how does the computer know which segments to concatenate?  The interpretation of text is the subject of a future article in this series.

By the mid-1980s, the majority of the TTS technology in use today had been developed; efforts have since concentrated on interpreting text, applying proper pronunciation and emphasis (a.k.a. prosody), and non-English language support.  In 1987, a corporation was founded with the vision of bringing computerized speech to the masses.  The corporation met its goal, but at a high cost to the owners.  Next week's installment examines the corporate roots of TTS: from university-funded initiatives in the 1960s, to the market leaders today (and the acquisition of over 30 companies in between).

Friday, February 19, 2010

Kindle Text-to-Speech Dissected: Part 1 - Tom Glynn Interview

Here’s an interesting scenario: you’re listening to your child read a story to you from when they were 6 years old.  Your child is now 35, so this must be a recording, right?  But the book your child is reading was published only last year, and you are playing it for your 5-year-old grandchild!  Sounds impossible?  Not if your child’s voice was recorded specifically for playback in a text-to-speech (TTS) system.  Although TTS uses a computer or someone else’s voice today, in the near future, TTS recording will enable the capture and playback of voices for everyone.  But how does a TTS system actually work?
In this multi-part series, Kindicted will examine the history, technology, and people behind TTS, which includes everyone from childhood prodigies to internationally famous criminals.  But first, a lighter look at computerized speech, including a recent interview with the default male voice of the Kindle – Tom Glynn.
The ‘human’ computer
In a large city, computerized and computer-controlled speech systems are encountered on a daily basis: subway, transit, GPS, and reservation systems, automated call attendants, cell phones, personal digital assistants, ebook readers, and so on.  For systems with a fixed number of words and phrases, envisioning the system is straightforward.  The computer simply plays back the appropriate previously recorded text based on input criteria.  For TTS systems, such as the Kindle, that use a human voice rather than a computerized (or synthesized) one, 1,400 individual snippets of English speech have to be recorded, labeled, and dynamically arranged for playback in order for the device to convert text to speech.
The man behind Kindle’s TTS voice
In the case of Amazon’s Kindle device, Nuance Communications supplied the software and voices to convert text to speech.  You can currently choose from a male or female voice, although Nuance’s website lists dozens of voices in many languages.  In February of 2009, it was discovered that the male voice behind the default Kindle TTS is an experienced singer/songwriter and broadcaster: Tom Glynn.  A year has passed since Tom’s Kindle ‘discovery’; he has a new album out, and Amazon has sold millions of Kindles.  Kindicted recently had an opportunity to catch up with Tom.
As an added bonus, this interview is available in mobi format here.  Simply download the mobi file, transfer it to your Kindle, and play the interview using the default male voice (Tom’s).  In some sense, Tom will be reading the interview aloud using his own voice!


The interview

Kindicted: You are an accomplished singer and songwriter; when did you realize that you had vocal and musical talent?
Tom: I realized it pretty young. My parents picked up on it and started me on piano lessons at age ten. From there, I taught myself how to play by ear and picked up the guitar around age 14. I was obsessed with playing piano or guitar every night through high school. I always loved music and had a pretty finely tuned ear for details like harmony, chord structure, and rhythm from an early age.
Kindicted: Broadcasting was a part of your career; was that to support your music, or to enhance it?
Tom: It was essentially a way to support myself, but I had a love of broadcasting from an early age. I loved performing impressions growing up, and I paid close attention to the nuances of the way people spoke. But yes, broadcasting and music definitely enhance each other. Inflection, pacing, and other elements of spoken words are certainly helped by being musical, as well as being able to remember the pitch of something I say and duplicate it many times for consistency.

Kindicted: In hindsight, do you feel that being a radio personality was critical to being able to use your voice talents for computerized speech?
Tom: Absolutely. A radio background gives you the experience you need to know how to capture people’s attention and communicate information in a compelling way. It also helps you develop a style and feel that’s your own.

Kindicted: Did you have to seek out work, or did someone hear your voice and decide that it would be perfect for a speech system?
Tom: Like most people in broadcasting, I had to work hard for a number of years to seek out opportunities. It’s a misconception some people have that having a good voice is all it takes to do voice-overs. There’s a lot more to it than that. Part of it is who you are as a person because your personality is reflected in the work you do. It also requires many, many hours of refining things such as pronunciation and inflection, along with listening to your recorded voice constantly to see if there are subtle improvements you can make to convey a better feel or connect better with the listener. I still do that every day.

Kindicted: Is there a high degree of competition in the voice market?
Tom: Yes, voice-over work is a very competitive industry. I say that not in the sense that I feel like I’m competing with someone, but that there are perhaps a limited number of jobs that are in high demand. Ultimately, you’re competing with yourself to be the best you can be, just like any field, and if you develop a sound and style that’s your own, you’ll do well. If you find a niche, it’s great.


Kindicted: From a philosophical point of view, does it bother you that your voice is being used to utter phrases that you personally would not say or approve of?
Tom: Not really. I did some on-camera work earlier in my career, and I found that to be much more invasive and questionable. I think when someone sees your face, it’s more like a personal endorsement. That’s why you hear a lot of major movie stars doing voice-overs for TV commercials these days that they would never appear on camera for. If they were on camera, it would be as if they were personally endorsing something, but that’s not a problem if it’s just their voice -  even when people recognize their voice. I honestly don’t spend much time thinking about the way my voice is chopped up and used. I’m much more focused on getting it right when I do the actual recordings, and then I let it go. Also, I think people realize that a computerized TTS voice is just a functional tool more than a real person. 

Kindicted: If your voice kept uttering new phrases after your death (a long time from now), do you feel that you have a more modern degree of immortality than actors or musicians, whose body of work is essentially static?
Tom: Hey, you may be right. I never thought of that. Maybe my TTS voice can do my eulogy. 

Kindicted: Have you ever encountered your own voice in an interesting situation? If so, what was that like?
Tom: Oh yes, all the time. I end up having to converse with myself frequently on the phone. It’s also amusing when I’m waiting in line at CVS, and I hear myself say “One pharmacy call” on the loudspeaker. Or the time a group of us were watching a storm bulletin on TV, and it was me giving the emergency forecast as the voice of the National Weather Forecast. There are many surreal moments.

Kindicted: Do people recognize your voice as the voice of a GPS, Kindle, voice prompt, etc.?
Tom: If someone asks me what I do, and I tell them, then they recognize it. But not just out of the clear blue. Even when I’m at CVS and having a conversation with the clerk, they don’t recognize that’s also me on the loudspeaker – and I certainly don’t tell them. That’s another beauty of voice-overs…my anonymity. I’m a quiet, introverted person for the most part despite my voice being all over the place, so not being recognized is fine by me.

Kindicted: You don't own your voice in regards to the plethora of devices and systems that use it - does that bother you?
Tom: Not at all. That’s part of the gig.

Kindicted: Are you made aware when your voice will be used in a new device, or do you usually find out after the fact?
Tom: Usually I know because most of my daily work is not TTS. I’m usually recording actual phrases for specific clients that I’m tailoring my voice and presentation for. But with TTS, I don’t always know where my voice ends up until after the fact. I had no idea I’d end up as the voice of the Kindle when we recorded those phrases. It was a thrill for me because I had already become addicted to my first generation Kindle before the TTS one came out. I’ve been a Kindle addict for quite some time.

Kindicted: If you lost your voice, would you use a computer to speak with your own voice, or would you choose a different one?
Tom: I’d probably enjoy the silence. I talk so much for my job that I prefer to be quiet much of the time. 

Kindicted: Do you like hearing the sound of your own voice?
Tom: Well, I’ve certainly become used to it over the years between singing and speaking. When I hear my voice, I’m usually paying close attention to the details and nuances of what I’m saying. I’m usually asking myself questions like, “How might the way I said that make somebody feel? Was it friendly enough, was it too friendly, was it delivered at a nice brisk pace or was it too rushed?” That’s an example of my internal dialogue. 

Kindicted: The process of recording diphones (snippets of words) seems (on the surface) to be physically and mentally demanding - how do you prepare for the process?
Tom: Yes, the work takes a great degree of focus for long stretches at a time. I burn out after about 3 hours of continual recording because of the level of concentration and the physical demands of making my mouth pronounce everything just right.  It’s important to be incredibly consistent, so I just get myself in a good frame of mind before I record. I can’t think about anything else other than what I’m recording. It really takes full concentration, but I enjoy that. I’m someone who’d much rather work intensely for several hours than work all day at a job that has a bunch of downtime.

Kindicted: How long does the typical recording session take (in total)?
Tom: A job can take anywhere from a few minutes to all day. But generally I try to limit any one job to three or four hours to make sure the client is getting the very best product possible.

Kindicted: How closely did you have to work with the scientists and engineers to pronounce the diphones just right?
Tom: We had recorded several versions together in the past, so we were lucky enough to have a lot of trial and error with TTS going back a number of years. The way we decided to go was to just be myself as if I was speaking normally and things I was saying were not going to eventually be chopped up. I think that helped us end up with a more natural sound with this version of TTS. Certainly it’s not as natural as hearing a real voice speaking, but it has come a long way. I really hope people find it helpful.

Kindicted: Did you have to have any speech training, or work with a linguist?
Tom: No, my speech training was all on the job over the years during broadcasting jobs, and many hours listening to recordings of myself and being hyper-critical. The most important element in learning to be good at voice-overs is not how well you talk, but how well you listen to yourself and others.

Kindicted: Do you use your voice talents for audiobooks?
Tom: I have never done an audiobook. I’ve done many types of narration over the years, but never an audiobook. I do listen to them quite often though, and there are some remarkable voice talents out there who read them. I love listening to their presentations.

Kindicted: Are you in demand for other roles (TV, radio, Internet etc.) based on your voice work?
Tom: I’ve done numerous radio and TV commercials over the years, along with many projects for the Internet, training videos, cartoon characters, corporate presentations, movie trailers, and literally thousands of other projects. Now people mainly know me as the phone voice they speak to when they call Bank of America, United, Apple, CVS, and many more. And my TTS voice is the voice of OnStar’s GPS, the National Weather Service, the Phoenix Airport, and of course, the Kindle. 

Kindicted: The Kindle didn't pronounce ‘Obama’ properly - did you have to record that one?
Tom: I actually read about that on my Kindle when the story came out. No, I didn’t re-record it, so they must have fixed it somehow in the technology. I’m glad they did.


Kindicted: For TTS, are you still asked to record new words, diphones, and phrases, or is your body of work large enough that no additional pronunciation is required?
Tom: I’m sure at some point we’ll record some more phrases, but currently I think we’re all set.

Kindicted: Do you still plan to market your voice, or are you concentrating on other endeavors?
Tom: I’m always open to new projects and ideas. I’m lucky in that I have a lot of clients who rely on me at the present moment, but I’m always up for new challenges. I’m still a musician at heart, and I just released a brand new album called “Blue You’ll Do”, which is available at Amazon, iTunes and tomglynn.com. I’m really happy with the way it turned out, and the reaction so far has been fabulous. This particular album features a unique baritone acoustic guitar, which I bought last year. It has an unusual custom tuning, so it’s half guitar and half bass. I’ve never heard anything like it on a singer-songwriter record. Right now I’m concentrating on promoting that and hopefully getting it into the ears of as many people as possible.

Kindicted: People can still tell that your voice is computer-driven; how long do you feel it will be before a computer-controlled voice will be indistinguishable from a human one?
Tom: That’s a good question. As someone who speaks for a living, I believe there is a human dimension to speech that can never really be replicated by a machine completely. But who knows?

Kindicted: From a personal point of view, do you feel that the ever-increasing use of electronics and electronic communication enriches people's lives, or does it dehumanize to a degree?
Tom: I love technology. Technology allows me to reach millions of people with my music digitally, and it allows me to do my voice-over work from virtually anywhere. Like anything, it has the potential for good and bad in it depending on what it’s used for. But that’s human nature in a nutshell too. I do know what you mean about dehumanizing with all the devices, but hopefully it’s also opening up channels for people to connect in new and beneficial ways too.

Kindicted: Do you ever see a day when computers will be the norm for writing and performing music - including singing?
Tom: Wow, I hope not. I guess to some degree it already is the norm. Singers are made to sound more ‘computerized’ with the Auto-Tune effect. I hope we always value real musicians, singers, and songwriters because that’s really at the core of who we are as human beings.

Kindicted: Tom, thanks for taking the time out to answer a few questions. Best of luck with your new album.





Tuesday, February 16, 2010

Deus ex Z-machine

With all the hype surrounding the upcoming Kindle Development Kit (anyone seen it, BTW?), it's no wonder the Kindle boards are filled with application requests galore.  As with the iPhone, iPod Touch, and iPad, the requested applications run the gamut: from clocks to ports of Doom.  Kindicts obviously have their personal favorites, but there is one application that deserves to be front-and-center: Z-machine.


The Zetetic Z-machine
For those who don't know, a Z-machine is a virtual machine developed by Infocom in 1979 to be able to port their text-based adventure games to virtually any platform.  The "Z" in Z-machine stands for Zork - one of Infocom's original games.  Z-machines have been developed for virtually every platform: 15 desktop OSes, 10 portable OSes, emacs, Java, and JavaScript, and they have been deployed on everything from mainframes to watches.  Z-machines have become a rite of passage of sorts for new devices.

Requests, Anyone?
Since the Kindle is primarily a book-reading device, a Z-machine interpreter seems like a natural fit, since the games are generally considered interactive fiction of sorts.  The one request on the mobileread.com site is from April 2009, and it was met with the conclusion that since a JavaScript version is available, people could just play that.  There is one other thread, but that is for the iLiad reader (which, apparently, has a native Z-machine).

The JavaScript version, while sufficient in the interim, does not take full advantage of the Kindle's display capabilities, nor does it address the Z-machine issue for non-US Kindles, since wireless web access is blocked for all but Wikipedia and the Kindle store.

App Store Woes
If someone does port a Z-machine via the KDK, it may be blocked by Amazon, since it may be seen as an application that enables the skirting of copyright - something which Amazon has no experience with.  Still, the KDK may be the best opportunity to realize a native Z-machine on the Kindle.  If Amazon does not allow the application to be deployed through the app store, then there is little doubt that it will be deployed via the 'alternate' app deployment method that is bound to crop up.

Kindicted fully supports a native Z-machine port, and will be an early adopter and reviewer of what can be considered one of the most natural application fits for the Kindle.

Sunday, February 14, 2010

Grep Kindle's Find


One of the cornerstones of the Internet is the ability to search, so it comes as no surprise that the Kindle has a find command.  What is surprising is that Kindle's find has a classic case of dissociative identity disorder!  Books, notes, the Amazon Store, the dictionary, Wikipedia, and Google can all be searched from one interface, but all with slightly varying search abilities. Kindicts will obviously want to know everything they can about the Kindle's find; Kindicted is more than happy to oblige.

Overview
Other than within some specific modes (such as adding a note), typing a letter anywhere in the Kindle interface will bring up a find command.  A find command with options can also be opened by selecting "Search" after pressing the "menu" button.  On the home screen, the find command looks like this:


Pressing the return key, pressing in on the 5-way controller, or using the 5-way controller to select the find command will begin the find process.  If the 5-way controller is pressed to the right, additional options appear:


The "Find" command from the main menu appears like this:


The options are self-explanatory, although the find command works in a slightly different manner, depending on what is being searched ("my items", "kindle store", etc.).  The differences are described in detail below.

The find command can also be used within a book, in which case, the "search my items" selection changes to just a "find":


The additional options also change slightly to include an option to search "my items":


At a very basic level, entering text into the search field and selecting search will search for that text in all items, or within an individual book.  Simple, no?  No!  As previously mentioned, find has multiple personalities, and the ability to search using wildcards, symbols, exact text, etc. varies depending on what item the find function is searching within.  Even worse, the find interface changes based on the type of document being searched!  The next few sections examine each find personality in greater detail.  Note that this functionality may change with future system software releases.  Kindicted will amend as necessary.

Before delving into each find, it is useful to note that when a document is loaded into the Kindle, the system software attempts to parse the document and create a searchable index.  If a newly loaded document cannot be searched, a warning may pop up indicating that the document has not yet been indexed.


Once an index has been created for the document, the find command can quickly search the document.  Indexing can take anywhere from a few minutes to a few hours - depending on the number of documents to index.

Find #1: The Home Screen and Books/Documents 

As mentioned above, typing (or selecting the Search menu command) from the Home screen opens up a small input area at the bottom of the display.

Type in a word or several words, and the Kindle will return either a list (actually another home screen-like interface), or "No Items", which means that the word (or words) were not found in any searchable book.  A search with at least one match results in a list of matching titles.  By default, the items are listed in decreasing order of relevance.  This is not a Google-style relevance; the Kindle simply counts the number of matches.
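The count-the-matches ordering can be sketched in a few lines of Python.  This is only a guess at the behavior from the outside, not Amazon's implementation; the function names and the tiny in-memory "library" are hypothetical, and the whole-word matching used here ignores the plural handling described later.

```python
import re


def count_matches(text, term):
    """Count case-insensitive whole-word occurrences of term in text."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_by_matches(library, term):
    """Return (title, count) pairs with at least one match, most matches first."""
    hits = [(title, count_matches(text, term)) for title, text in library.items()]
    hits = [hit for hit in hits if hit[1] > 0]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical library: title -> full text.
library = {
    "Book A": "The whale surfaced. The whale dove.",
    "Book B": "A whale appeared once.",
    "Book C": "No sea creatures here.",
}
print(rank_by_matches(library, "whale"))  # [('Book A', 2), ('Book B', 1)]
print(rank_by_matches(library, "squid"))  # [] - the "No Items" case
```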


At this point, it is important to point out that PDF files are excluded from the search.  PDFs can be searched individually, but with a 'different' search command (more later).

Assuming that one or more items matched the search, the list appears to be a filtered list of home page items, which it is, but it is not the home page.  To return to the home page, select "Home" and the search results will be cleared.  If, however, you press up on the 5-way controller within the search results, pressing left or right will reveal a list of additional sorting and filtering options.  Pressing left displays a list of document type filters:


Pressing right reveals sorting options:


The additional options are self-explanatory.

If an item in the search results list is highlighted (underlined), pressing in on the 5-way controller opens up a list of search results for that item.

Note that notes are searched, as well as the document text.  Also note that an additional option appears beneath the find text input area: "Close Search Results".  Selecting this (or pressing "Back") will return to the search result list.  Another search can also be initiated, but it does not search the search results: it is an entirely new search on the item that is open.

Note that initiating another search at this point "stacks" the searches.  Selecting "Close Search Results" will display the previous search results, and so on, depending on how many searches were initiated.  Selecting the "Home" button at any point will abandon all searches.

Closing all searches will ultimately lead to the original search of all items, which can be closed by pressing the "Home" button.
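The stacking behavior described above maps naturally onto a plain stack.  The following Python sketch is a toy model only; the class and method names are made up for illustration and say nothing about how the Kindle software is actually written.

```python
class SearchStack:
    """Toy model of the Kindle's stacked searches (names are hypothetical)."""

    def __init__(self):
        self._queries = []

    def search(self, query):
        """Start a new search on top of any searches already open."""
        self._queries.append(query)

    def current(self):
        """Return the search whose results are on screen, or None."""
        return self._queries[-1] if self._queries else None

    def close_results(self):
        """Model "Close Search Results": drop the top search, show the previous one."""
        if self._queries:
            self._queries.pop()
        return self.current()

    def home(self):
        """Model the "Home" button: abandon every open search."""
        self._queries.clear()


stack = SearchStack()
stack.search("whale")         # search of all items from the Home screen
stack.search("harpoon")       # new search inside an opened item
print(stack.close_results())  # whale
stack.home()
print(stack.current())        # None
```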

Searching Within a Document
Whether searching from the home screen or from within a document, the interface and search 'personality' are essentially the same.  If a document is selected from the search results, or if a search is initiated within a document, a list of search results is displayed for the individual document:


At this point, many Kindicts ask two key questions: what is the syntax for entering words in find, and how does the find command handle multiple words?  The answers are not straightforward, but here goes:

Find Syntax for Documents (except PDF)
Right off the bat, the find command for documents does not have wildcards, grep syntax, etc.  Multiple words can be entered separated by a space, as can e-mail addresses and web sites.  All symbols are summarily ignored (and can even be in the middle of words) except for "." and "@", which are used for searching the aforementioned sites and addresses.  Here are a few observations:
  • Plurals are automatically included in search results.
  • Common small words are ignored ("of", "and", "the", etc.).
  • Searches are case-insensitive.
  • Characters with accents can usually be searched by simply ignoring the accent and entering the underlying character (e.g. search for c if the character is ĉ).  Note that this does not work for characters such as ø.
  • Individual letters can be searched.
  • All surrounding punctuation/symbols are highlighted in the search results, but cannot be specified in the search.
  • Numbers can be searched - leading zeroes are relevant.
  • Symbols between words (such as the "://" in "http://www") can be excluded entirely - the search will still work (provided the URL does not continue on a new line after the "http://").  In other words, "httpwww" is equivalent to "http://www".
Again, please keep in mind that these rules are only for searching documents.  PDFs, Wikipedia, Google, and the Amazon store all have different search syntax.
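Several of the observations above can be summarized as a query 'normalization' step.  The Python sketch below is an assumption about what such a step might look like, based only on the observed behavior; the function names are hypothetical, the stop-word list contains just the three examples given, and plural handling is left out.

```python
import unicodedata

# Only the examples listed above; the real list is unknown.
STOP_WORDS = {"of", "and", "the"}


def strip_accents(text):
    """Drop combining accent marks (e.g. ĉ becomes c); ø has no such mark and is unchanged."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_query(query):
    """Lowercase, strip accents, drop symbols other than "." and "@", drop stop words."""
    terms = []
    for word in strip_accents(query).lower().split():
        kept = "".join(ch for ch in word if ch.isalnum() or ch in ".@")
        if kept and kept not in STOP_WORDS:
            terms.append(kept)
    return terms


print(normalize_query("The Deck"))          # ['deck']
print(normalize_query("http://www"))        # ['httpwww']
print(normalize_query("ĉ"))                 # ['c']
print(normalize_query("007 me@site.com"))   # ['007', 'me@site.com']
```

Note that ø passes through unchanged, which is consistent with the observation that it cannot be found by typing a plain "o".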

Multiple Word Handling for Documents
Multiple words can be entered in the find input area, each separated by a space.  The find command will attempt to find all words within a "Location", which roughly equates to a sentence, but not exactly.  The easiest way to think of the scope of a multiple word search is that all the words entered have to be "close" to each other - within a sentence or two.

Oddly enough, since the document search is not an exact match, the words in a matching phrase can be in any order.  For example, searching for "the deck stood on the burning boy" will return "the boy stood on the burning deck" as a match.
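One way to picture this any-order matching is as a set comparison: a location matches if it contains every search word, regardless of position.  The Python sketch below assumes that model; the function names are hypothetical, each string in the list stands in for one Kindle "Location", and the stop-word list is limited to the examples mentioned earlier.

```python
import re

# Only the examples mentioned earlier; the real list is unknown.
STOP_WORDS = {"of", "and", "the"}


def words(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matching_locations(locations, query):
    """Return indexes of locations that contain every query word, in any order."""
    wanted = set(words(query)) - STOP_WORDS
    if not wanted:
        return []
    return [i for i, loc in enumerate(locations) if wanted <= set(words(loc))]


locations = [
    "The boy stood on the burning deck.",
    "The deck was empty by morning.",
]
print(matching_locations(locations, "the deck stood on the burning boy"))  # [0]
print(matching_locations(locations, "deck"))                               # [0, 1]
```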

Back to the Document Search
With the document search syntax in mind, the search results make a bit more sense.  Each location contains the search terms, or a blank display will indicate that no matching word(s) were found.

If a matching location is selected, the location containing the search word will be at the top of the display, with the rest of the document following.  Except for the ability to return to the search results (by selecting "Back") the interface is identical to simply reading a book.  If another search is initiated at this point, the searches are again "stacked" - pressing "Back" will display the prior screen.  Selecting "Home" will clear all stacked searches and results and return to the Home display.

Find #2: PDFs 

Although a search from the Home screen will exclude PDFs, PDFs that do not consist entirely of images can be searched.  Opening a PDF and either typing or selecting "Search this Document" from the menu will open a search area at the bottom of the display.  Typing in any text will search for the exact text entered.  This includes all symbols, spaces, etc.  What this also means is that entering a word like "enter" will find "enter", "enters", "entering" and so on.

Instead of search results, the PDF search searches forward from the current location and highlights where the term is found.  Backwards and forwards options within the search area appear, which enable next and previous searching for the words within the document.


There is no ability to enter wildcards or any grep-like syntax.  Words within images are also not searched.  In many ways, PDF search is more powerful than a document search, as exact phrases and symbols are relevant.  Searches are case-insensitive, and multiple words separated by spaces will search for the exact phrase - including the space.  Searches are also not stacked - there is only one search input open at a time.  Pressing right on the 5-way controller will reveal additional options, including an option to search "my items", although the search of "my items" follows the "rules" as indicated in the sections above.
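
The PDF behaviour looks like a plain case-insensitive substring search that moves forward or backward from the current position. The Python sketch below shows that idea under the assumption that the page text is available as one string; the function names are invented for illustration and say nothing about how the Kindle's PDF reader is really built.

```python
def find_next(text, phrase, start=0):
    """Index of the first case-insensitive match at or after start, or -1.
    Assumes lower-casing does not change the text length (true for
    ordinary English text)."""
    return text.lower().find(phrase.lower(), start)

def find_previous(text, phrase, before):
    """Index of the last case-insensitive match that ends at or before
    the given position, or -1."""
    return text.lower().rfind(phrase.lower(), 0, before)
```

Since the phrase is matched character for character, "enter" is found inside "enters" and "entering", and a phrase containing a space only matches where that exact space appears.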

Find #3: Amazon Store

The Amazon store find is a basic search with wildcard capabilities.  Exact phrases within quotes do not work, but adding an "*" to the end of any word or words will result in a search for all words in the titles or descriptions of books beginning with the characters before the "*".  Entering a phrase will search for all words - not the exact phrase, although the search rank will usually return the best match at the top of the list.  If there is only one match, the item will open, and "Buy" will be highlighted.  Be careful not to press on the 5-way, or you will have to act fast to cancel the order.
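
The trailing "*" behaviour can be sketched in a few lines of Python. This is a guess at the matching rule only (every search word must match some word in the title, and a word ending in "*" matches by prefix); the store's ranking is not modelled, and the assumption that a word without "*" must match in full is Kindicted's, not Amazon's.

```python
def term_matches(term, word):
    """A term ending in '*' matches any word starting with the characters
    before the '*'; otherwise the whole word must match (an assumption)."""
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term

def title_matches(query, title):
    """True when every query term matches at least one word in the title,
    in any order. An empty query matches nothing."""
    terms = query.lower().split()
    title_words = title.lower().split()
    return bool(terms) and all(
        any(term_matches(t, w) for w in title_words) for t in terms
    )
```

For example, title_matches("kind* guide", "Kindle User Guide") is true in this sketch even though the words are neither adjacent nor complete.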

Find #4: Dictionary

Searching the dictionary is straightforward.  The results will commence at the closest word matching the text entered.  All other characters are ignored.  Selecting "Back" will return to the previous display.  Searching the dictionary within a book in this manner is useful for looking up words then returning to the book.

Find #5: Wikipedia

A Wikipedia find operates as one would expect.  Search items in quotes are treated as exact phrases, and the "*" can be used as a wildcard character, resulting in a list of all Wikipedia entries that begin with the letters before the "*".  Multiple words can be entered, but only the titles of Wikipedia entries are searched - not the text within the entry.  Again, selecting "Back" will return to the previous display.

Find #6: Google

For US Kindle owners, a Google search operates as one would anticipate.  The Kindle simply passes the text and symbols entered to Google.

International Kindle owners are out of luck, as web browsing is not allowed.

Find #7: Go to Web

Again, for US Kindle owners, this will simply open the URL entered in the find box.  This is confusing, as it is not so much a find as an "open web site" option.

Again, international Kindle owners will have to use another device, as this option is not available.

Conclusion
The Kindle find command could really use some psychotherapy!  A central find option that returned all off-web results (my items, PDFs, and the dictionary) would greatly assist in actually finding a phrase.  Also, an option for using grep-like syntax and wildcards would appeal to Kindicts who love to geek-out on their searches.

In the meantime, Kindicted hopes that this article cleared up some of the finer details of the current find command.

As always, corrections, comments, and additions are appreciated.

Friday, February 12, 2010

Free Bonus With eBook Purchase - Act Now!

Kindicts certainly love their Kindles, and who doesn't love books anyway? Well, (and this is really ironic) authors are starting to dislike books. Here are a few reasons why:

Shrinking margins (ROI)
One author (who has had moderate sales success) reported that he was making the equivalent of $2.75 an hour over the course of two years (the time to write and publicize his book).  Motivation for authors to be excited at the prospect of a book is declining, and the lack of a paycheque certainly plays a factor.



Stolen books
eBooks are more convenient to steal than their physical counterparts.  While it may be true that the vast majority of stolen books would have never generated sales, there is a percentage of people who would have purchased the book had it not been so convenient to steal.  This results in lower sales, and unhappy authors.


More publicity
As book sales shrink, the need to perform book tours rises, which means time away from home, grueling travel schedules, etc.  Only a minority of authors had touring in mind when they were sequestered in their basement working on their novel.


Pressure from publishers and retailers
Authors face pressure on numerous fronts: themselves, their families, and the publishers and retailers.  The latter two would like authors to pump out best sellers like a machine to keep the 'wheels turning'.  Missed deadlines, poor quality manuscripts, and so on, all weigh on authors, which does not make for a pleasant writing experience.

General lack of control
If authors wish their book to be sold, they must essentially give up rights to a publisher, or to a retailer such as Amazon.

What could anyone possibly do about the current situation?  Well, here are a few ideas that may revitalize the book industry:


Interactive books
Many Kindicts may recall early interactive computer books as children; Broderbund had an entire interactive books spinoff: Living Books.  These never caught on for 'adult' books - until now.
The Kindle (and other similar devices) can support interactive books.  When a book on the European economy has a graph that ranks GDP on a number of factors, interactive sliders could portray 'what if' scenarios. Books with maps could have levels of zoom with annotations - the possibilities are endless.  Authors could add another dimension of value with more dynamic, interactive content.

Contests
Giveaways and contests are still very much a common advertising technique. Food corporations, credit card companies, retailers, etc. all use giveaways as a means of promoting their product.  Electronic books should not be any different.  Book retailers could have a giveaway for every 1,000 books sold, and in-book contests (for legal owners) could result in free books, trips, etc.

Discount and specialty seller co-operation
Not all books are sold through Amazon, although it's convenient for researching titles, which seems to make it essential for author exposure.  If all the cut-rate and specialty booksellers pooled their resources, they could set up a network of research and selling that would enable prospective buyers to find and research authors who have opted out of Amazon.  Even individual authors can self-promote, but it's an uphill climb that relies on luck.  Smaller book publishers could even let authors retain rights to their books, and just keep a cut of the sales.

Free bonus with purchase
Many Kindicts will remember purchasing albums (LPs, records) and receiving a bonus inside - a poster, or other paraphernalia. Kiss and their Kiss Army were particularly effective at self-promotion, branding, and customer loyalty.  Why can't publishers or authors offer a free gift with eBook purchase?  How many times have you heard, "limited quantities - act now!"  That's because it still works, and it would work with books as well - especially if the giveaways were seen as desirable or collectible.  A Kindle skin with the author's autograph would undoubtedly be appreciated by Kindicts.

Books are dead! Long live books!
The industry is rapidly shifting, but traditional companies (such as large publishers) are choosing to rely on technology companies to chart their future course rather than investing in innovation and reinventing and defining the modern book experience.  Booksellers have the hindsight of the music industry to learn from, but it appears that all the book industry has learnt is how to raise the white technology flag, and bite the hand that writes.

There is hope; small publishers with great new selling ideas are appearing all over the web.  The book industry will change yet again; hopefully, the fine art of book writing is seen as less about the corporation, and more about the writers and purchasers.

Breaking News: Kindle MD Prototype

One of the advantages of hosting a site such as Kindicted is that sometimes, corporate representatives send out advance products for some early PR.  This must have been the case when a mysterious box labeled "Lab126" showed up at the office door.  Inside the box was what can only be described as a whimsical, avant-garde and refreshing change to electronics as we know it.  After unpacking the box, the "Kindle MD" (it does not say that anywhere on the unit) was "booted".

Looks
Esthetically, the bold colors signify a departure from the drab canvas that makes the previous Kindle line fade into the background.  The new look is sure to appeal to the sub-18 demographic.

The boot icon does seem vaguely familiar, but with so many electronic devices, who can keep track of logos and such?  The progress bar proceeded to complete rather quickly, and without the noticeable redraw 'lag' present in the current eInk devices.

Of note is the bold new orange stylus, which harkens back to the Apple Newton days.  This one says, "Look at me; I'm a proud Kindle user!" The prototype did not have handwriting recognition capabilities, but that must be forthcoming in a production model.  The stylus lanyard is appreciated, although we can't be sure if it is for electrical signals, or just so users don't misplace the stylus.


Inputs
Amazon truly "one-upped" Apple this time, as there are absolutely no ports on this model.  Everything must be wireless, but connectivity seemed to be absent in the prototype.  Whispernet was so quiet, it could not be heard (or seen) - even when listening really, really close to the device.  A quick scan with several routers did not pick up any wireless activity.  The only logical conclusion is that the incredible brains behind eInk finally figured out a way to connect wirelessly using the quantum spin of electrons.  Incredible!


PDF and EPUB Support!
The built-in user guide is identical to the existing Kindle DX guide, but there was a pre-loaded EPUB file, and (and this is the best part) - zoom actually worked!  Kindicts knew this was coming, but to actually see software work the way everyone knows it can almost brings one to tears.

Finally, Kindle users will be able to overcome the last Sony hurdle (well, except for touchscreens), and be able to withdraw eBooks from their local library.  It took real guts for Amazon to give up their lucrative lock on book buying.  Kudos!


Stamps
One of the most useful features is the ability to quickly markup the document with the stylus.  If that wasn't enough, Amazon decided to include a handy emoticon stamp drawer, which allows users to quickly rate documents with a simple, but effective code of geometric shapes.  Amazing!





New Display
The eInk technology has to be seen to be believed.  It appears to be comprised of carbon nanotubes, which, when excited by the underlying matrix (or the stylus or stamps), move to the surface by (and this is only a guess) a quantum tunnelling effect.  What's even more remarkable is that there is no power source other than ambient light.  The unit worked flawlessly even in the lowest light levels.  The display is also the size of a full 8.5 x 11 sheet of Letter paper.  European users will have to wait for an A4 version.

New Screensavers
The Kindle MD was left alone for an hour over lunch, and upon return, the screensaver had kicked in - and what a surprise!  Instead of authors, the prototype Kindle has icons of computing.  Here are three that were captured.

Screensaver 1: A Devilishly Handsome Steve Jobs


Screensaver 2: Bill Gates' Arizona Mugshot


Screensaver 3: An Angelic Looking Bezos

Conclusion
Amazon's new Kindle (if that's what this is) is a truly "fanatical and evolutionary" device that is poised to once again rewrite history.  From the new larger display, to solar and quantum technology, the device really feels like one is holding the future of electronics and computing in their hand.

Hats off to Jeff Bezos and Lab126!

Feel free to leave comments or ask questions, but please note that Kindicted is bound by a strict non-disclosure agreement.

PS
The team is hard at work disassembling the unit and will post pics of the internals as soon as possible.

Thursday, February 11, 2010

1979: Apple's Missed eInk Opportunity

Kindicted readers love their eInk displays; after all, the display places the Kindle and iPad squarely in different markets.  Could there be some sort of long lost connection between the Kindle and iPad that makes them techno distant cousins?  Possibly.  Let's work our way backwards from 2010, all the way back to where the first ePaper was torn from its proverbial ePad.

1997-present: eInk
Some readers may know that the eInk display in the Kindle was developed at E Ink Corporation, an offshoot of MIT.  The original development team behind eInk represented some of the brightest people around.  With funding ($16M from IBM), patents and inventions followed quickly, with some of the original eInk technology used for advertising.  In 1997, a groundbreaking paper called The Last Book was published in the IBM Systems Journal.

At one point, the paper describes a blank book with many eInk-based sheets.  On the spine of the book, a display and interface would allow the user to 'fill' the book with any book in their electronic library.  Even connectivity to the Internet was conceived for the eInk-based book.  So pulp-based books have been dying since 1997, right?  Not really, since one has to look a bit farther back to find the real culprit, and the company that was researching electronic paper seemed to have a lock on physical paper: Xerox.

1974-2005: Gyricon Kubla Khan
The basic idea behind the eInk display stemmed from research in the early 1970s by Nicholas K. Sheridon, a researcher with Xerox.  The Gyricon (Greek for rotating image) worked by rotating tiny half-white, half-black (or is it half-black, half-white?) beads immersed in oil using an electric field.  Strangely, the display technology was developed as a replacement for a CRT that didn't seem bright enough.  The CRT in question was connected to a famous Xerox computer: the Xerox Alto.

As you may have read, the Xerox Alto impressed a young hippie visiting Xerox's Palo Alto Research Center (PARC) in 1979.  Obviously, that hippie was Steve Jobs, and he and Woz had started Apple a couple of years earlier.  The Lisa and Macintosh computers were modelled after the Alto, which led to NeXT, OSX, iPods, iPhones, and finally iPads.

Mr. Sheridon was dissuaded by a Xerox suit, who informed him that Xerox does not make displays, so that was the end of the gyricon display technology.  In 1989, Mr. Sheridon realized that his gyricon technology could be used to develop ePaper.  The ePaper arm of Xerox lasted until 2005, when it was disbanded.

Conclusion
If the executive at Xerox had funded the ePaper display in the mid-1970s, a young Steve Jobs, as part of his fact-finding mission, could have also been inspired by a gyricon display as part of the Xerox Alto, which could have led to ePaper displays in the 1980s, and so on.  Steve Jobs would have killed books long ago with his MacBook, NeXTBook, and iBook devices!

It seems that in the technology field, there are many missed opportunities, and many happy accidents.  Regardless, the kindicted users of today are using a "magical and revolutionary" display technology that took 35 years to come to fruition.