Introduction
I have always been fascinated by the Read Out Loud option in Adobe Reader. I found that Adobe Reader uses the Windows speech engine, which has shipped with almost all versions of Windows. You can also use this engine programmatically. It offers many features, such as speech recognition and text-to-speech (TTS). With speech recognition, you can interact with your PC through voice commands rather than GUI commands. This example shows how to use the TTS feature of the speech engine.
Background
Windows XP ships with the Text-to-Speech engine. You can verify this by opening Control Panel -> Speech -> Text To Speech. If the engine is not installed on your version of Windows, you can download it from Microsoft as part of the Speech SDK 5.1. If you want to use TTS in a web browser, you can use the ActiveX control provided by Microsoft by calling new ActiveXObject("Sapi.SpVoice") in your JavaScript.
A Little About SAPI
The Microsoft Speech API (SAPI) contains many interfaces and classes for managing speech. For TTS, the base class is SpVoice. Some of its important properties are:
- Voice: An object of type SpObjectToken that identifies the active voice; pick one from the ISpeechObjectTokens collection returned by GetVoices()
- Volume: An integer from 0 to 100 that specifies the loudness of the voice
- AudioOutputStream: Specifies the stream for audio output. If you want to save the audio to a file, use SAPI's SpFileStream
- SynchronousSpeakTimeout: Milliseconds after which the voice’s synchronous Speak and SpeakStream calls will time out
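As a rough illustration of the properties above, here is a minimal console sketch that sets Volume and SynchronousSpeakTimeout before speaking. It assumes the project already references SAPI (namespace SpeechLib). ClampVolume is a hypothetical helper of my own, not part of SAPI, and the 10-second timeout is an arbitrary value chosen for the example.

```vbnet
Imports System
Imports SpeechLib ' COM interop reference to SAPI, as used throughout this article

Module VoiceSettingsSketch
    ' Hypothetical helper (not part of SAPI): keep the value inside
    ' the 0..100 range that SpVoice.Volume accepts.
    Function ClampVolume(ByVal requested As Integer) As Integer
        Return Math.Max(0, Math.Min(100, requested))
    End Function

    Sub Main()
        Dim voice As New SpVoice()
        voice.Volume = ClampVolume(130)        ' out-of-range request is stored as 100
        voice.SynchronousSpeakTimeout = 10000  ' timeout for synchronous Speak calls, in milliseconds
        voice.Speak("Volume and timeout are set.", SpeechVoiceSpeakFlags.SVSFDefault)
    End Sub
End Module
```

Clamping first is only a convenience for values that come from user input; a control whose range is already 0 to 100 can be assigned directly.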
Methods
- GetVoices(): Returns all available voices. I use this to populate the voice ComboBox
- Speak(): Renders the given text as audio on the output stream (speaker or file)
- Pause(): Pauses the audio output
- Resume(): Resumes the audio output
- WaitUntilDone(): Blocks the caller until an asynchronous Speak call finishes or the specified timeout elapses
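The methods above can be combined roughly as follows. This is a minimal console sketch, assuming a reference to SAPI (namespace SpeechLib) and at least one installed voice; the sleep durations and the 10-second wait are arbitrary values picked for illustration.

```vbnet
Imports System
Imports System.Threading
Imports SpeechLib ' COM interop reference to SAPI

Module AsyncSpeakSketch
    Sub Main()
        Dim voice As New SpVoice()

        ' SVSFlagsAsync makes Speak return immediately while the audio
        ' keeps playing in the background.
        voice.Speak("This sentence is spoken asynchronously.", _
                    SpeechVoiceSpeakFlags.SVSFlagsAsync)

        Thread.Sleep(500)    ' let some audio play (arbitrary delay)
        voice.Pause()        ' suspend the output
        Thread.Sleep(1000)   ' stay paused for a moment (arbitrary delay)
        voice.Resume()       ' continue where the voice left off

        ' WaitUntilDone takes a timeout in milliseconds and returns True
        ' if the voice finished speaking before the timeout elapsed.
        Dim finished As Boolean = voice.WaitUntilDone(10000)
        Console.WriteLine("Finished speaking: " & finished)
    End Sub
End Module
```

Each Pause() call should be matched by a Resume() call; otherwise the voice stays paused.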
Using the Code
To start using SAPI in your .NET application, first add a reference to SAPI. If SAPI does not appear on the COM tab of the Add Reference dialog, browse to sapi.dll in C:\Program Files\Common Files\Microsoft Shared\Speech. The following code generates audio from the text entered. Note that I assign the Voice property based on the voice selected in the ComboBox. In Form1_Load, I fill the ComboBox with all available voices (see the next code section).
    Private Sub btnSpeak_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles btnSpeak.Click
        Me.Cursor = Cursors.WaitCursor
        Dim oVoice As New SpeechLib.SpVoice
        ' Use the voice selected in the ComboBox and the chosen volume
        oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
        oVoice.Volume = trVolume.Value
        ' SVSFDefault speaks synchronously, so this call blocks until the text has been spoken
        oVoice.Speak(txtSpeach.Text, _
            SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
        oVoice = Nothing
        Me.Cursor = Cursors.Default
    End Sub
To find all available voices and bind them to the voice ComboBox, call the GetVoices method on an SpVoice object. For each voice in the returned list, you can use the GetDescription method to get the voice name; for example, LH Michael.
    Private Sub Form1_Load(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles MyBase.Load
        Dim x As New SpeechLib.SpVoice
        Dim arrVoices As SpeechLib.ISpeechObjectTokens = x.GetVoices
        Dim arrLst As New ArrayList
        ' Collect the description of each installed voice
        For i As Integer = 0 To arrVoices.Count - 1
            arrLst.Add(arrVoices.Item(i).GetDescription)
        Next
        cmbVoices.DataSource = arrLst
    End Sub
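If you would rather pick a voice by name than by ComboBox index, something along these lines may work. It is only a sketch: FindVoiceIndex is a hypothetical helper of my own, the search text "Michael" is just the example voice mentioned above, and the code assumes a reference to SAPI (namespace SpeechLib).

```vbnet
Imports System
Imports System.Collections.Generic
Imports SpeechLib ' COM interop reference to SAPI

Module VoiceLookupSketch
    ' Hypothetical helper (not part of SAPI): index of the first description
    ' that contains the given fragment, ignoring case, or -1 if none matches.
    Function FindVoiceIndex(ByVal descriptions As IList(Of String), _
                            ByVal fragment As String) As Integer
        For i As Integer = 0 To descriptions.Count - 1
            If descriptions(i).IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return i
            End If
        Next
        Return -1
    End Function

    Sub Main()
        Dim voice As New SpVoice()
        Dim tokens As ISpeechObjectTokens = voice.GetVoices()

        ' Collect the description of each installed voice
        Dim names As New List(Of String)
        For i As Integer = 0 To tokens.Count - 1
            names.Add(tokens.Item(i).GetDescription())
        Next

        ' Switch voices only if a match was found; otherwise keep the default voice
        Dim index As Integer = FindVoiceIndex(names, "Michael")
        If index >= 0 Then
            voice.Voice = tokens.Item(index)
        End If
        voice.Speak("Using " & voice.Voice.GetDescription(), _
                    SpeechVoiceSpeakFlags.SVSFDefault)
    End Sub
End Module
```

Falling back to the default voice when nothing matches avoids an out-of-range index on machines where the named voice is not installed.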
To save the audio output to a file, create an SpFileStream and assign it to the AudioOutputStream property of the voice.
    If SaveFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
        Dim oVoice As New SpeechLib.SpVoice
        Dim cpFileStream As New SpeechLib.SpFileStream
        ' Create (or overwrite) the target file
        cpFileStream.Open(SaveFileDialog1.FileName, _
            SpeechLib.SpeechStreamFileMode.SSFMCreateForWrite, False)
        ' Send the voice's output to the file instead of the speakers
        oVoice.AudioOutputStream = cpFileStream
        oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
        oVoice.Volume = trVolume.Value
        oVoice.Speak(txtSpeach.Text, _
            SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
        oVoice = Nothing
        cpFileStream.Close()
        cpFileStream = Nothing
    End If
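One possible refinement, outside a form, is to wrap the Speak call in Try/Finally so that the file stream is closed even if speaking fails. The sketch below assumes a reference to SAPI (namespace SpeechLib); SpeakToFile and the file name sapi-demo.wav are hypothetical names chosen for this example.

```vbnet
Imports System
Imports System.IO
Imports SpeechLib ' COM interop reference to SAPI

Module SaveToWavSketch
    ' Hypothetical helper: speak the text into a wave file at the given path.
    Sub SpeakToFile(ByVal text As String, ByVal filePath As String)
        Dim voice As New SpVoice()
        Dim stream As New SpFileStream()
        ' Create (or overwrite) the target file
        stream.Open(filePath, SpeechStreamFileMode.SSFMCreateForWrite, False)
        Try
            ' Send the voice's output to the file instead of the speakers
            voice.AudioOutputStream = stream
            voice.Speak(text, SpeechVoiceSpeakFlags.SVSFDefault)
        Finally
            ' Always close the stream so the file is released
            stream.Close()
        End Try
    End Sub

    Sub Main()
        ' Example target in the temp folder (hypothetical file name)
        Dim target As String = Path.Combine(Path.GetTempPath(), "sapi-demo.wav")
        SpeakToFile("Saved to a wave file.", target)
        Console.WriteLine("Wrote " & target)
    End Sub
End Module
```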
References
Because this example covers only TTS, readers interested in speech recognition and grammars can refer to http://www.codeproject.com/cs/media/tambiSR.asp. For more details on the Speech SDK, please refer to http://www.microsoft.com/speech/techinfo/apioverview/.