The blog of dlaa.me

SayIt, don't spray it! [Free program (and source code) makes it easy to use text-to-speech from your own programs and scripts]

Over the holiday break, I was asked to create a program to "speak" (via text-to-speech) simple sentences provided on the command line. None of the available offerings were "just right" for the customer's scenario, so I was asked if I knew of any other options... Although I hadn't used it before, I figured the .NET System.Speech assembly/classes would make solving this task pretty easy, so I decided to give it a quick try.

And about three minutes later, I was done [ :) ]:

using System.Speech.Synthesis;

class Program
{
    static void Main(string[] args)
    {
        (new SpeechSynthesizer()).Speak(string.Join(" ", args));
    }
}

Because that was so easy, it felt like cheating; I decided to add support for a few more features to round out the offering and turn the whole thing into a free tool and blog post. The program I've written is called SayIt, a windowless .NET 4 application that speaks arbitrary text from the command line (or from a file) while allowing the user to make simple customizations (such as gender and volume).

Here's the "documentation" that shows up when SayIt is run without any parameters:

Use the command line to tell SayIt what to say and how to say it.

SayIt.exe
    [--Gender Male|Female|Neutral]
    [--Volume Silent|ExtraSoft|Soft|Medium|Loud|ExtraLoud|Default]
    [--Rate ExtraSlow|Slow|Medium|Fast|ExtraFast]
    [--Text <Text.txt>]
    [--SSML <SSML.xml>]
    [<Text to say>]

Examples:
    SayIt.exe Hello world
    SayIt.exe --Gender Female --Text C:\My\File.txt

Version 2011-01-04
http://blogs.msdn.com/b/delay/
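
The Gender, Volume, and Rate values in the usage text share their names with the VoiceGender, PromptVolume, and PromptRate enumerations in System.Speech.Synthesis. One plausible way to wire such options up is sketched below. This is an illustration under that assumption, not SayIt's actual source (which is in the download); the SayItOptionsSketch class, the ParseOption helper, and the hard-coded values are hypothetical.

```csharp
using System;
using System.Speech.Synthesis;

public static class SayItOptionsSketch
{
    // Hypothetical helper: converts a command-line value such as "Female" or
    // "extraloud" to the matching enumeration member (case-insensitive).
    public static T ParseOption<T>(string value) where T : struct
    {
        return (T)Enum.Parse(typeof(T), value, true);
    }

    public static void Main()
    {
        // Hard-coded stand-ins for values that would come from the command line.
        VoiceGender gender = ParseOption<VoiceGender>("Female");
        PromptVolume volume = ParseOption<PromptVolume>("Loud");
        PromptRate rate = ParseOption<PromptRate>("Slow");

        using (var synthesizer = new SpeechSynthesizer())
        {
            // Gender is only a hint; which voice is used depends on what is installed.
            synthesizer.SelectVoiceByHints(gender);

            // Volume and rate apply to the text between StartStyle and EndStyle.
            var prompt = new PromptBuilder();
            prompt.StartStyle(new PromptStyle { Volume = volume, Rate = rate });
            prompt.AppendText("Hello world");
            prompt.EndStyle();

            synthesizer.Speak(prompt);
        }
    }
}
```

Enum.Parse throws an ArgumentException for an unrecognized name, so a real tool would likely catch that and print the usage text instead.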

For fun, I created the following icon (if you don't recognize it, here's a hint):

[Image: SayIt icon]

When you run SayIt with the proper parameters (combine them in any order!), SayIt uses the Windows text-to-speech engine to speak the text using the default output device. It does this without creating or showing a window, so other programs can make use of SayIt without distracting the user. Alternatively, SayIt can be called from batch files to provide status updates for custom scripts and the like. Simple text can be passed directly on the command line, while more complicated (or lengthy) text can be passed in a file (via the --Text option).
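
For the "other programs" case, a minimal sketch of launching SayIt from C# with System.Diagnostics.Process might look like the following. The SayItLauncher class, the BuildStartInfo helper, and the C:\Tools\SayIt.exe path are made up for illustration. Waiting for exit assumes SayIt exits once it has finished speaking.

```csharp
using System.Diagnostics;

public static class SayItLauncher
{
    // Hypothetical helper: builds the start info for a single SayIt invocation.
    public static ProcessStartInfo BuildStartInfo(string sayItPath, string text)
    {
        return new ProcessStartInfo
        {
            FileName = sayItPath,
            Arguments = text,         // plain words only; quotes would need escaping
            UseShellExecute = false,  // start the executable directly, not via the shell
            CreateNoWindow = true     // do not create a console window for the child
        };
    }

    public static void Main()
    {
        // Assumed install location; adjust to wherever SayIt.exe actually lives.
        ProcessStartInfo startInfo = BuildStartInfo(@"C:\Tools\SayIt.exe", "Build completed");

        using (Process process = Process.Start(startInfo))
        {
            // Optional: block until SayIt exits.
            process.WaitForExit();
        }
    }
}
```

Dropping the WaitForExit call lets the calling program carry on while the text is being spoken.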

Here are a few ideas to get you started:

SayIt Build completed
SayIt Processing file 12 of 25
SayIt You've got mail!
SayIt I'm sorry, Dave. I'm afraid I can't do that.
SayIt Barbra Streisand

Aside from the simple gender, volume, and rate customizations that can be done on the command line, the most likely tweak is fine-tuning the pronunciation of a particular word or words. (Though most sentences are quite understandable by default, some words come out a little garbled!) Fortunately, there's a W3C standard for customizing pronunciation and SayIt supports that standard (via the --SSML parameter): Speech Synthesis Markup Language.

SSML is a simple, XML-based syntax for controlling the behavior of text-to-speech applications like SayIt. I won't go into the gory details here (the SSML specification has everything you need), but I will highlight the phoneme element in the form of a simple sample that's part of the small SayIt test suite:

<!-- IPA pronunciations based on http://en.wiktionary.org/wiki/tomato -->
<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
  You say <phoneme alphabet="ipa" ph="təˈmaɪto">tomato</phoneme>.
  I say <phoneme alphabet="ipa" ph="təˈmeɪto">tomato</phoneme>.
</speak>
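
On the System.Speech side, SSML like the sample above can be handed to SpeechSynthesizer.SpeakSsml. The sketch below shows one way a file passed via an option like --SSML could be spoken; it is not taken from SayIt's source. The SsmlSketch class and the LooksLikeSsml pre-check are hypothetical additions, and the file path is assumed to arrive as the first argument.

```csharp
using System;
using System.IO;
using System.Speech.Synthesis;
using System.Xml;
using System.Xml.Linq;

public static class SsmlSketch
{
    static readonly XNamespace Ssml = "http://www.w3.org/2001/10/synthesis";

    // Hypothetical pre-check: true when the text parses as XML whose root
    // element is <speak> in the SSML namespace.
    public static bool LooksLikeSsml(string text)
    {
        try
        {
            return XDocument.Parse(text).Root.Name == Ssml + "speak";
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path of an SSML file.
        string ssml = File.ReadAllText(args[0]);
        if (!LooksLikeSsml(ssml))
        {
            Console.Error.WriteLine("Not an SSML document: " + args[0]);
            return;
        }

        using (var synthesizer = new SpeechSynthesizer())
        {
            // SpeakSsml takes the complete document, <speak> element included.
            synthesizer.SpeakSsml(ssml);
        }
    }
}
```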

Although technology hasn't come quite as far as Arthur C. Clarke envisioned in 2001: A Space Odyssey, it's still pretty amazing what the common household computer is capable of. Speech - and other forms of "natural input" - can be a great way to personalize the computing experience, and the ease with which SayIt can be incorporated into existing programs and scripts should make it a natural fit for many scenarios. Instead of using an inefficient polling approach to find out when your tasks finish, let the computer tell you - literally! :)

[Click here to download the .NET 4 SayIt application, complete source code, and simple tests.]

Notes:

  • The default English install of Windows 7 comes with only one "voice", Microsoft Anna, which is female. Therefore, even if you specifically request a male voice (via --Gender Male or SSML), you'll probably still hear Anna. This is not a bug in SayIt. :)
  • SayIt exposes the same volume settings that the System.Speech assembly does (via the PromptVolume enumeration), but you probably won't be able to increase the speech volume because the default setting is "ExtraLoud". This is not a bug in SayIt, either.
  • The speech engine is pretty good about spelling out acronyms like you'd want, but if you ever need to force the issue, just prefix the relevant word/acronym with the '-' character:
    SayIt Register your car at the -DOT.