Python Program to Convert Speech to Text with Output

Speech recognition is one of the most exciting and useful applications of machine learning, and with the growing popularity of virtual assistants and chatbots, demand for this technology is only going to increase. In this article we’ll learn how to convert speech to text with output using Python and the SpeechRecognition library.

Step 1: Installing required libraries

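To get started we need to install the required libraries. We’ll be using the SpeechRecognition library to convert speech to text. The gtts library is a text-to-speech library (it turns text into spoken audio), so it is not needed for recognition itself; it is installed here because the code imports it. Recording from a microphone with SpeechRecognition also requires the PyAudio package, which must be installed separately. To install the libraries we’ll use pip. In a new Colab notebook we can run the following code: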

!pip install gtts
!pip install SpeechRecognition
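
Before moving on, it can help to confirm that the packages are actually importable. The following is an optional sketch, not part of the original tutorial: the helper name missing_packages is hypothetical, and it relies only on importlib.util.find_spec from the standard library. Note that import names differ from pip names: SpeechRecognition is imported as speech_recognition and PyAudio as pyaudio.

```python
from importlib.util import find_spec

def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]

# Import names, not pip names: SpeechRecognition -> speech_recognition, PyAudio -> pyaudio
required = ['gtts', 'speech_recognition', 'pyaudio']
missing = missing_packages(required)
if missing:
    print('Not installed:', ', '.join(missing))
else:
    print('All required packages are installed.')
```

If pyaudio is reported as missing, the microphone step later on is likely to fail even though the other two packages are installed.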

Step 2: Importing libraries

Now that we’ve installed the libraries we need to import them into our Python code. We can do this by adding the following lines of code to our notebook:

from gtts import gTTS
import speech_recognition as sr

Step 3: Recording speech

To convert speech to text we first need to record the speech. We can do this using the sr.Recognizer and sr.Microphone classes from the SpeechRecognition library. We can define a function that records speech using the microphone of our device. Here’s the code for the function:

def record_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Speak now...')
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    return audio
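
As written, r.listen waits indefinitely for someone to start speaking. A variation that gives up after a few seconds might look like the sketch below. It assumes the timeout and phrase_time_limit parameters of Recognizer.listen and the sr.WaitTimeoutError exception from SpeechRecognition; the default values of 5, 10 and 1 seconds are arbitrary choices for illustration. The import sits inside the function only so that the definition loads on machines where the library is not installed.

```python
def record_audio(timeout=5, phrase_time_limit=10, calibration_seconds=1):
    """Record one phrase from the default microphone, or return None on silence."""
    import speech_recognition as sr  # needs SpeechRecognition and PyAudio

    r = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise first so the energy threshold is calibrated
        r.adjust_for_ambient_noise(source, duration=calibration_seconds)
        print('Speak now...')
        try:
            # timeout: seconds to wait for speech to start
            # phrase_time_limit: maximum length of the recorded phrase
            return r.listen(source, timeout=timeout,
                            phrase_time_limit=phrase_time_limit)
        except sr.WaitTimeoutError:
            # Nobody started speaking within `timeout` seconds
            return None

# Example call: audio = record_audio(timeout=3)
```

Printing the prompt after the calibration step means the speaker is not talking while the noise level is being measured.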

Step 4: Converting speech to text

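Once we’ve recorded the speech we can use the recognize_google method of the sr.Recognizer class to convert it to text; this method sends the audio to Google’s Web Speech API, so it needs an internet connection. We can define a function that takes the recorded audio as an argument and returns the text. Here’s the code for the function: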

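def speech_to_text(audio):
    r = sr.Recognizer()
    try:
        # Send the recorded audio to the Google Web Speech API
        text = r.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        # The speech could not be understood
        return 'Sorry, I could not understand what you said.'
    except sr.RequestError as e:
        # The recognition service could not be reached or rejected the request
        return f'Could not request results from the speech service: {e}'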
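
The same recognizer can also transcribe a recorded file, which is useful where no microphone is available. The sketch below assumes the sr.AudioFile class and the Recognizer.record method from SpeechRecognition, plus the language parameter of recognize_google; the function name transcribe_file and the file name in the example comment are hypothetical. The import is placed inside the function only so that the definition loads without the library installed.

```python
def transcribe_file(path, language='en-US'):
    """Transcribe a WAV, AIFF or FLAC file; return None if nothing was understood."""
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.AudioFile(path) as source:
        # Read the whole file into an AudioData object
        audio = r.record(source)
    try:
        return r.recognize_google(audio, language=language)
    except sr.UnknownValueError:
        # The audio contained no intelligible speech
        return None
    # sr.RequestError (network or API failure) is left to propagate to the caller

# Example (hypothetical file name): print(transcribe_file('recording.wav'))
```

Returning None instead of an apology string lets the caller tell a failed recognition apart from a real transcript.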

Step 5: Putting it all together

Now that we’ve defined our functions we can put them together to convert speech to text. Here’s the complete code:

from gtts import gTTS
import speech_recognition as sr

def record_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Speak now...')
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    return audio

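def speech_to_text(audio):
    r = sr.Recognizer()
    try:
        # Send the recorded audio to the Google Web Speech API
        text = r.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        # The speech could not be understood
        return 'Sorry, I could not understand what you said.'
    except sr.RequestError as e:
        # The recognition service could not be reached or rejected the request
        return f'Could not request results from the speech service: {e}'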

audio = record_audio()
text = speech_to_text(audio)
print('You said:', text)
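
The script above handles a single phrase and then exits. One possible extension, sketched here under the assumption that we want to keep listening until a chosen stop word is spoken, is a small loop around the two functions. The name transcribe_until and the max_turns limit are hypothetical; the recording and transcription functions are passed in as arguments, so the loop itself does not depend on any audio hardware.

```python
def transcribe_until(stop_word, record, transcribe, max_turns=10):
    """Collect transcripts until stop_word is heard or max_turns is reached.

    record and transcribe are callables such as record_audio and
    speech_to_text; passing them in keeps the loop easy to test.
    """
    transcripts = []
    for _ in range(max_turns):
        text = transcribe(record())
        # Compare whole words, ignoring case, so 'Stop' matches 'stop'
        if stop_word.lower() in text.lower().split():
            break
        transcripts.append(text)
    return transcripts

# With the functions defined above:
# lines = transcribe_until('stop', record_audio, speech_to_text)
```

Because speech_to_text returns its error message as an ordinary string, a failed recognition would be stored like any other transcript in this sketch.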

Step 6: Running the code

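To run the code we can simply click on the “Run” button in Colab or press “Ctrl+Enter” on our keyboard. We’ll be prompted to speak and our speech will be converted to text. Note that sr.Microphone reads from an audio input device on the machine that runs the code; a hosted Colab runtime has no such device, so the recording step may fail there and the script may need to be run on a local machine with a microphone.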

Conclusion

In this article we learned how to convert speech to text using Python and the SpeechRecognition library. We installed the required libraries, imported them into our code, recorded the speech using the microphone of our device, and used the recognize_google method to convert it to text. We put it all together to create a simple application that can recognize speech and convert it to text. With this knowledge, you can now explore more advanced speech recognition applications and build your own voice-activated projects.

Complete code implementation

Here is the complete code implementation for converting speech to text using Python and the SpeechRecognition library:

# Import required libraries
from gtts import gTTS
import speech_recognition as sr

# Define function to record audio
def record_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('Speak now...')
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    return audio

# Define function to convert speech to text
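def speech_to_text(audio):
    r = sr.Recognizer()
    try:
        # Send the recorded audio to the Google Web Speech API
        text = r.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        # The speech could not be understood
        return 'Sorry, I could not understand what you said.'
    except sr.RequestError as e:
        # The recognition service could not be reached or rejected the request
        return f'Could not request results from the speech service: {e}'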

# Record audio
audio = record_audio()

# Convert speech to text
text = speech_to_text(audio)

# Print the converted text
print('You said:', text)

To run this code, simply copy and paste it into a new Colab notebook and click the “Run” button or press “Ctrl+Enter” on your keyboard. You’ll be prompted to speak and your speech will be converted to text.

Here are some useful links related to speech-to-text conversion in Python and the libraries used above:

gtts documentation: https://gtts.readthedocs.io/en/latest/
pyaudio documentation: https://people.csail.mit.edu/hubert/pyaudio/docs/
Tutorial on speech recognition using Python: https://realpython.com/python-speech-recognition/
Tutorial on text-to-speech using Python and gtts: https://www.geeksforgeeks.org/convert-text-speech-python-gtts/
Stack Overflow thread on “No module named ‘gtts’”: https://stackoverflow.com/questions/52243811/module-not-found-error-gtts
GitHub repository for pyaudio: https://github.com/spatialaudio/pyaudio

These resources should provide you with all the information you need to get started with speech-to-text conversion using Python.
