In this article, I’ll show you how to convert text into speech and integrate the resulting audio into your application.
Prerequisites
- Retrieve a Client ID and Secret from your Google Project and use these credentials to set up OAuth2.0 for Google on your Simplifier instance. Then, create a new login method and add it in the Connector endpoint settings (recommended).
- Or: Retrieve the API Key from your Google Project and add it to each Connector Call, as the input parameter ‘APIKey’.
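To make the difference between the two options concrete, here is a minimal sketch of how a request to the Google Cloud Text-to-Speech REST endpoint could be authenticated either way. The helper name `buildSynthesizeRequest` and its parameter names are made up for illustration, and MP3 is an assumed output encoding. The Simplifier Connector handles this internally and may do it differently.

```javascript
// Hypothetical helper (not part of Simplifier): builds the URL and request
// options for the Google Cloud Text-to-Speech REST method text:synthesize.
const ENDPOINT = 'https://texttospeech.googleapis.com/v1/text:synthesize';

function buildSynthesizeRequest({ text, languageCode, apiKey, accessToken }) {
  if (!apiKey && !accessToken) {
    throw new Error('Provide either an API key or an OAuth 2.0 access token');
  }
  const headers = { 'Content-Type': 'application/json' };
  let url = ENDPOINT;
  if (accessToken) {
    // OAuth 2.0: the access token travels in the Authorization header.
    headers.Authorization = 'Bearer ' + accessToken;
  } else {
    // API key: appended to the URL as the "key" query parameter.
    url += '?key=' + encodeURIComponent(apiKey);
  }
  const body = JSON.stringify({
    input: { text },
    voice: { languageCode },
    audioConfig: { audioEncoding: 'MP3' }, // assumed encoding for this sketch
  });
  return { url, options: { method: 'POST', headers, body } };
}
```

The returned `url` and `options` could be passed to `fetch(url, options)`. With OAuth the credentials stay out of the URL, which is one reason the first option is the recommended one.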
Step 1 – Test the function ‘synthesizeText’
If you chose OAuth2.0 as the login method for the Connector, log in to Simplifier with your Google account.
Test the Connector call ‘synthesizeText’. The two mandatory parameters of this function are ‘text’ and ‘languageCode’; all other parameters are optional.
Enter some English text and the language code ‘en-US’, then test. As a result, you should see the parameter ‘audioContent’, which contains the synthesized audio as a base64-encoded string.
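As a quick sanity check on the result, the base64 string can be decoded back into raw bytes. The sketch below assumes a Node.js environment and a result object with an ‘audioContent’ property, as described above. The helper name `decodeAudioContent` is invented for illustration.

```javascript
// Hypothetical helper: turns the base64 string from 'audioContent'
// into raw bytes (a Node.js Buffer).
function decodeAudioContent(result) {
  if (!result || typeof result.audioContent !== 'string' || result.audioContent === '') {
    throw new Error("Result does not contain a non-empty 'audioContent' string");
  }
  return Buffer.from(result.audioContent, 'base64');
}

// Example with a tiny placeholder payload instead of real audio data.
const sampleBytes = decodeAudioContent({ audioContent: 'SGVsbG8=' });
console.log(sampleBytes.length); // 5
```

A real response is much larger. Writing the buffer to a file with `fs.writeFileSync` should give a playable file, provided the file extension matches the audio encoding the Connector returns.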
With the parameters ‘voiceName’ and ‘gender’, you can specify how the audio of your text should sound. Execute the Connector call ‘getVoices’ to see a list of all available voices and their gender.
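To pick a voice programmatically, the list could be filtered by language and gender. The sketch below assumes that each entry has the fields `name`, `languageCodes`, and `ssmlGender`, which is the shape returned by the Google Cloud Text-to-Speech `voices.list` method. The output of the Connector call ‘getVoices’ may be structured differently, and the helper name `findVoices` is hypothetical.

```javascript
// Hypothetical helper: returns the names of all voices that support the
// given language code and, if a gender is passed, match that gender.
// Assumed entry shape: { name, languageCodes: [...], ssmlGender }.
function findVoices(voices, languageCode, gender) {
  return voices
    .filter((voice) => voice.languageCodes.includes(languageCode))
    .filter((voice) => !gender || voice.ssmlGender === gender)
    .map((voice) => voice.name);
}
```

A name returned by this helper would then go into the parameter ‘voiceName’ of ‘synthesizeText’.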
Step 2 – Use the Audio in an App
The audio is generated, but how can we listen to the text or even download the audio?
To do so, create a new Simplifier application. Run the login wizard and select the login mechanism ‘oAuth’. Then, select the Google OAuth authentication mechanism on your Simplifier instance (or create one, as stated in the Prerequisites section).
In the UI Designer, switch to the screen that is displayed after the login process and add the widget ‘ui_core_HTML’. In the widget’s property ‘content’, you can add any HTML code. In our case, we want to display an HTML audio player, so enter the following code:
<audio controls style='width:500px' id='myaudio' src=''></audio>
In the Process Designer, create a new story and execute the Connector call ‘synthesizeText’. For the input parameter ‘text’, you can use a constant for testing purposes, or you can add a TextArea to your application where the user enters the text that should be converted to speech.
Store the output parameter ‘audioContent’ in a Global Variable.
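Once the base64 string is available, one way to make it playable is to wrap it in a data URI and assign it to the `src` attribute of the audio element with the ID ‘myaudio’. The sketch below assumes the audio is MP3 encoded and that the script runs in the browser. The function names and the `audioContent` variable are illustrative and not part of Simplifier's API.

```javascript
// Builds a data URI that an HTML <audio> element can play directly.
// Assumption: the audio is MP3; adjust the MIME type for other encodings.
function toAudioDataUri(base64Audio, mimeType = 'audio/mpeg') {
  return 'data:' + mimeType + ';base64,' + base64Audio;
}

// Assigns the audio to the element with the given ID.
// 'doc' is passed in so the function also works outside a browser (e.g. in tests).
function setAudioSource(doc, elementId, base64Audio) {
  const audio = doc.getElementById(elementId);
  if (!audio) {
    throw new Error('No element with ID ' + elementId);
  }
  audio.src = toAudioDataUri(base64Audio);
  return audio;
}

// In the browser, with the value taken from the Global Variable:
// setAudioSource(document, 'myaudio', audioContent);
```

The same data URI could also serve as the `href` of a link with a `download` attribute, which would cover the download case.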