iSpeakIt: Exporting Sound to iTunes from Text with the System Voice

By Graham K. Rogers


I recently covered Speech Preferences in OS X Leopard. While looking at this, I used a utility I had installed a while back but had not used regularly: iSpeakIt. As it has now been updated with additional features, it is worth re-examining.

Michael Zapp at ZappTek has created an application that uses the system voice (selected in Speech preferences) and creates a sound file from text. That in itself is simple enough: OS X already lets us have highlighted text spoken in any native application via the "Services" menu. Where iSpeakIt scores is in adding to that facility with both input and (particularly) output methods.
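For readers who want to see the basic idea outside iSpeakIt, OS X ships a command-line tool, say, that speaks text with a system voice and can write the result to a sound file. The following is a minimal sketch, not a description of how iSpeakIt works internally. The function names build_say_command and speak_to_file are hypothetical, chosen for illustration.

```python
import shutil
import subprocess


def build_say_command(text, out_path, voice=None):
    """Build the argument list for the OS X `say` tool.

    -v selects a voice (for example "Alex"); -o writes a sound file
    instead of playing the speech through the speakers.
    """
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]
    cmd += ["-o", out_path, text]
    return cmd


def speak_to_file(text, out_path, voice=None):
    """Run `say` to create the sound file; only works on a Mac."""
    if shutil.which("say") is None:
        raise RuntimeError("the 'say' command was not found on this system")
    subprocess.run(build_say_command(text, out_path, voice), check=True)
```

On a Mac, speak_to_file("Hello from the system voice", "hello.aiff", voice="Alex") should leave an AIFF file in the current folder. If no voice is given, the one selected in Speech preferences is used.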


A basic panel, with seven icons on the toolbar, appears when iSpeakIt is running. These are split into three sections separated by faint dotted lines. The first two groups are for output; there are then three kinds of input: information downloaded from Google, downloaded RSS feeds, and a document. The last accepts a number of formats, including .DOC and .RTF files, as well as text from a web page. It might sometimes be useful to edit such input: with a web page, for example, the text of links might be included. A user can also paste text directly into a panel ready for use.


The RSS feed panel opens with a section where feeds are listed. There is a plus sign (+) beneath to enable a user to add more, with a minus sign (-) for deletion. Pressing the download button will fetch the most recent additions (all entries can be selected). Options include downloading the full article rather than the summary. Podcasts can be included (these use RSS feeds too). The panel notes that there is an option in preferences to load any podcasts when iSpeakIt is launched.
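To illustrate what turning a feed into speakable text involves, here is a rough sketch that pulls the title and summary from each item of an RSS 2.0 document that has already been downloaded. It is an assumption about the general approach, not iSpeakIt's own code, and the function name feed_to_text is hypothetical.

```python
import xml.etree.ElementTree as ET


def feed_to_text(rss_xml, limit=None):
    """Return the titles and summaries of an RSS 2.0 feed as plain text.

    rss_xml is the feed document as a string; limit, if given, keeps
    only the first few items (feeds usually list the newest first).
    """
    root = ET.fromstring(rss_xml)
    items = root.findall("./channel/item")
    if limit is not None:
        items = items[:limit]

    parts = []
    for item in items:
        title = (item.findtext("title") or "").strip()
        summary = (item.findtext("description") or "").strip()
        # Speak the title, then the summary when the feed provides one.
        parts.append(f"{title}. {summary}" if summary else title)

    # A blank line between entries keeps them as separate paragraphs.
    return "\n\n".join(parts)
```

The resulting text could then be pasted into a panel or handed to a speech tool. Real feeds often carry HTML markup inside the summary, which would need stripping before conversion.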

The simple preferences panel has two sections. First is a check box for automatic iTunes transfer. If this is active and the next section (Load on Launch, which has several input choices) is also selected, the entire process can be automatic when it is coupled with a launch of the program at startup. Plugging in the iPod for a sync would then have the file ready to go first thing in the morning.

The three other input methods each use Google sources: News, Weather and Driving directions. The last of these loads a map panel, the same as in Safari or on the iPod touch. Text directions are meant to be converted to a sound file, but I was unable to make this part of the utility create sound output, although the directions themselves appeared. The other input methods worked instantly, every time.


The weather forecast selection opens with a small box in which we type a city name and can select metric units (also marked "non-US"). iSpeakIt remembers the last choice. A seven-day outline of weather predictions is loaded into the main panel and this is converted to sound.


The News panel allows choice of news types (Sports, World). A button allows selection from one of eight countries. A series of summaries is downloaded for conversion. A further box in the panel allows for full articles to be downloaded. This obviously takes more time and the sound files are larger.

The export sections allow for two processes. The export to iTunes puts a file directly into that application, where it is ready for an iPod. There are five formats: AAC, AIFF, Apple Lossless, MP3 and WAV. A number of choices may be made, including how the file is split, which is useful if the text is long. A summary of the news takes a few seconds to export and produces a file just over 2 minutes long. The other icon exports .AIFF files. These can be used in applications such as GarageBand.
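Splitting long text before conversion can be pictured with a short sketch. This one groups whole paragraphs into chunks that stay under a character limit, so that each chunk could become its own sound file. The rule used here (paragraph boundaries and a character count) is an assumption for illustration. The options iSpeakIt actually offers may work differently, and the name split_text is hypothetical.

```python
def split_text(text, max_chars=2000):
    """Group paragraphs into chunks of at most max_chars characters.

    Paragraphs are never cut: one that is longer than max_chars
    simply becomes a chunk on its own.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    chunks = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate

    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be exported as a separate track, which makes a long article easier to navigate on an iPod.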

When I was listening to a test of the weather forecast I realised I had heard what appeared to be an intake of breath. When I played it again it was indeed a "virtual" breath at the end of some paragraphs, but not all. The first few breaks did not have this; then just five or six in the middle did.

Another file created from Google news sources again produced this apparent breath after some paragraphs, which adds an unusual -- almost eerie -- level of reality to the computer-generated voice.


I ran the same text through OS X itself to see whether the breath was part of the system or an extra added by iSpeakIt. It comes from the OS X system voice (as selected in Speech Preferences); more specifically, from the voice named Alex. I tried other voices and none exhibited this trait.

The pauses, breaths and other human-like delivery features are subtle tricks that add realism and help a user pay closer attention to the information being delivered. iSpeakIt adds convenience for those with limited time, who can load information onto an iPod and listen on the go. It is also excellent listening material for English learners.

See also the earlier article on Speech Preferences.

