Text to Speech Using Windows SAPI


Introduction

I have always been fascinated by the Read Out Loud option in Adobe Reader. I found that Adobe Reader uses the Windows Speech engine, which ships with almost all versions of the Windows OS. You can also use this engine programmatically. The Speech engine offers many features, such as speech recognition and text to speech (TTS). With speech recognition, you can interact with your PC by using voice commands rather than GUI commands. In this example, I show how to use the TTS feature of the Speech engine.

Background

Windows XP shipped with the Text-to-Speech engine. You can verify this by clicking Control Panel -> Speech -> Text to Speech. If the engine is not installed on your OS version, you can download it from Microsoft: Speech SDK 5.1. If you want to use the TTS feature in a web browser, you can use the ActiveX control provided by Microsoft by creating it with new ActiveXObject("Sapi.SpVoice") in your JavaScript.

A Little About SAPI

The Microsoft Speech API (SAPI) contains many interfaces and classes for managing speech. For TTS, the main class is SpVoice. Some of its important properties are:


  • Voice: An object of type SpObjectToken that identifies the voice to use. Available tokens come from the ISpeechObjectTokens collection returned by GetVoices

  • Volume: An integer from 0 to 100 that specifies the volume of the voice

  • AudioOutputStream: Specifies the stream for audio output. To save the audio to a file, use SAPI's SpFileStream

  • SynchronousSpeakTimeout: Milliseconds after which the voice’s synchronous Speak and SpeakStream calls will time out

Methods


  • GetVoices(): Returns all available voices. I have used this to populate the voice type ComboBox

  • Speak(): Speaks the given text on the output stream (speaker or file)

  • Pause(): Pauses the audio output

  • Resume(): Resumes the audio output

  • WaitUntilDone(): Blocks the caller until a voice that is speaking asynchronously has finished, or until the given timeout elapses
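The methods above are easiest to see together in an asynchronous call. The following is a minimal sketch rather than part of the sample application: it assumes a console project with a COM reference to SAPI (the SpeechLib interop assembly), and the module name, the sample sentence, and the sleep and timeout values are arbitrary choices for illustration.

```vbnet
Imports System.Threading
Imports SpeechLib

Module AsyncSpeechSketch
    Sub Main()
        Dim voice As New SpVoice()

        ' SVSFlagsAsync makes Speak return immediately;
        ' the audio keeps playing in the background.
        voice.Speak("This sentence is spoken asynchronously.", _
            SpeechVoiceSpeakFlags.SVSFlagsAsync)

        ' Let it talk briefly, pause for a second, then carry on.
        ' (The delays are arbitrary illustration values.)
        Thread.Sleep(500)
        voice.Pause()
        Thread.Sleep(1000)
        voice.Resume()

        ' Block until the voice has finished or 10 seconds have passed.
        ' WaitUntilDone returns False if the timeout expired first.
        Dim finished As Boolean = voice.WaitUntilDone(10000)
        If Not finished Then
            Console.WriteLine("Timed out before speech completed.")
        End If
    End Sub
End Module
```

With the default flag (SVSFDefault), Speak itself blocks until the text has been spoken, so WaitUntilDone is only needed after an asynchronous call like the one in this sketch.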

Using the Code

To start with SAPI in your .NET application, you first have to add a reference to SAPI. If SAPI does not appear on the COM tab of Add Reference, browse to sapi.dll in C:\Program Files\Common Files\Microsoft Shared\Speech. Following is the code that generates audio based on the text entered. Note that I assign the Voice property based on the voice type selected in the ComboBox. In the form's Load event, I fill the ComboBox with all available voices (see the next code section).

Private Sub btnSpeak_Click(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) Handles btnSpeak.Click
   Me.Cursor = Cursors.WaitCursor
   Try
      Dim oVoice As New SpeechLib.SpVoice

      ' Use the voice selected in the ComboBox and the volume from the slider
      oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
      oVoice.Volume = trVolume.Value
      ' SVSFDefault speaks synchronously: the call returns when speech ends
      oVoice.Speak(txtSpeach.Text, _
         SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
   Finally
      ' Restore the cursor even if Speak throws an exception
      Me.Cursor = Cursors.Default
   End Try
End Sub

Find all available voices and bind them to the voice ComboBox by calling the GetVoices method on an SpVoice object. GetVoices returns a collection of voice tokens; call the GetDescription method on each token to get the voice name, for example, LH Michael.

Private Sub Form1_Load(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) Handles MyBase.Load
   Dim x As New SpeechLib.SpVoice
   Dim arrVoices As SpeechLib.ISpeechObjectTokens = x.GetVoices
   Dim arrLst As New ArrayList
   ' Collect the display name of every installed voice
   For i As Integer = 0 To arrVoices.Count - 1
      arrLst.Add(arrVoices.Item(i).GetDescription)
   Next
   ' List order matches GetVoices order, so SelectedIndex maps to a voice
   cmbVoices.DataSource = arrLst
End Sub

To save the audio output to a file, create an SpFileStream, open it for writing, and assign it to the AudioOutputStream property of the voice.

If SaveFileDialog1.ShowDialog = Windows.Forms.DialogResult.OK Then
   Dim oVoice As New SpeechLib.SpVoice
   Dim cpFileStream As New SpeechLib.SpFileStream
   ' Create (or overwrite) the chosen file for writing
   cpFileStream.Open(SaveFileDialog1.FileName, _
      SpeechLib.SpeechStreamFileMode.SSFMCreateForWrite, False)
   Try
      ' Send the audio to the file stream instead of the speakers
      oVoice.AudioOutputStream = cpFileStream
      oVoice.Voice = oVoice.GetVoices.Item(cmbVoices.SelectedIndex)
      oVoice.Volume = trVolume.Value
      oVoice.Speak(txtSpeach.Text, _
         SpeechLib.SpeechVoiceSpeakFlags.SVSFDefault)
   Finally
      ' Always close the stream so the file is released
      cpFileStream.Close()
   End Try
End If

References

Because this example covers only TTS, those who are interested in speech recognition and grammars can refer to http://www.codeproject.com/cs/media/tambiSR.asp. For more details on the Speech SDK, please refer to http://www.microsoft.com/speech/techinfo/apioverview/.
