Sunday, May 26, 2024

How To Convert ChatGPT Into An Advanced Voice Assistant

ChatGPT needs no introduction. You ask it a question and it replies in a flash, but the answer comes as text. What if you could talk with it, just as you do with a voice assistant like Siri?

It’s no secret that ChatGPT has revolutionised the world of AI. Unlike other AI bots, it is able to understand the context of a conversation and respond, and it makes you feel like you are chatting with a human and not a machine.

But as it is still a kind of chatbot, you need to type a question and you get the answer in the form of text. That’s not as exciting as talking to a bot.

This thought gave me the idea of programming ChatGPT so that it could be used as a voice assistant, which I call VoiceGPT. I began by using natural language processing (NLP), specifically speech recognition, to convert the spoken question into text, and then sent that text to the ChatGPT engine as a query through the API. After getting an intelligent reply from ChatGPT, I again used NLP, this time text-to-speech, to convert the reply into a human voice.
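The flow described above can be sketched as three interchangeable stages. This is only a minimal illustration, not the project's code: `run_turn` and its three arguments are hypothetical names, and simple stand-in functions are used so that the sketch runs without a microphone or an API key.

```python
from typing import Callable

def run_turn(listen: Callable[[], str],
             ask: Callable[[str], str],
             speak: Callable[[str], None]) -> str:
    """Run one voice-assistant turn: speech in, text reply, speech out."""
    query = listen()      # speech-to-text stage
    reply = ask(query)    # chatbot stage (ChatGPT in this project)
    speak(reply)          # text-to-speech stage
    return reply

# Stand-ins so the sketch runs anywhere; real stages would wrap a speech
# recogniser, the OpenAI API, and a text-to-speech engine.
spoken = []
reply = run_turn(
    listen=lambda: "what is a resistor",
    ask=lambda q: "You asked: " + q,
    speak=spoken.append,
)
print(reply)
```

Keeping the stages separate means any one of them can be swapped (for example, a different speech recogniser) without touching the other two.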

Fig. 1: VoiceGPT working principle

I needed a good NLP tool for this, and OpenAI itself provides one: Whisper. But due to limited time and space, I ended up using Google’s speech services instead (Google Speech Recognition for input and Google Text-to-Speech for output).

A step-by-step guide to making VoiceGPT

We need to begin by combining the NLP services (speech-to-text and text-to-speech) with ChatGPT. For this we need a machine that can call the OpenAI API, pass on the query gathered by speech recognition, and convert the answer given by ChatGPT into a human voice.

You can use any laptop, but I chose the Raspberry Pi to run all this. For capturing the voice for recognition, I attached the voice bonnet; a USB microphone can also be used with Raspberry Pi. However, if you are using a laptop to run the VoiceGPT code, there is no need for a USB microphone; you can use the laptop’s inbuilt microphone.

We now need to create an account and log into ChatGPT (see Fig. 2).

Fig. 2: ChatGPT login page

Next, we need to get the API key for doing research and experimenting with the ChatGPT code, as shown in Fig. 3.

Fig. 3: Getting the OpenAI API menu
Fig. 4: ChatGPT API keys

You can create the API key using the right-corner option for API in your OpenAI account (Fig. 4).

After generating the OpenAI API key, copy it and save it. We need it later in our code for developing VoiceGPT.
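One way to save the key without pasting it into the source file is to keep it in an environment variable and read it at start-up. The sketch below assumes the variable is called `OPENAI_API_KEY` (a common convention); `load_api_key` is a hypothetical helper, not part of the VoiceGPT code.

```python
import os

def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored in an environment variable, or raise."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key

# Demonstration with a fake environment and a fake key;
# a real run would simply call load_api_key().
demo_key = load_api_key({"OPENAI_API_KEY": "sk-example"})
print(demo_key[:3] + "...")
```

The returned value can then be assigned wherever the code expects the key, so the key itself never appears in a file you might share.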

Now we need to install the OpenAI Python package on the system where we are going to run VoiceGPT. Here you can use a computer with any Linux version installed. I used Raspberry Pi for it.

Next, open the terminal and install the OpenAI package and the other Python modules that help us with natural language processing. Here you can use Whisper from OpenAI or any other NLP module. I used Google NLP and combined it with ChatGPT.

Fig. 5: Cloning OpenAI ChatGPT code

You can install these modules using the following commands. After that, you can either create your own custom chat content in OpenAI or use simple chatting in the playground. There, you can also set the temperature, frequency penalty, and other parameters for your VoiceGPT assistant.

sudo pip3 install openai
sudo pip3 install SpeechRecognition
sudo pip3 install gTTS

Refer to Fig. 5 and Fig. 6 to see how to clone the OpenAI ChatGPT and do the setup.

Fig. 6: Raspberry Pi ChatGPT setup

Next, set the temperature, frequency, and chat model, as shown in Fig. 7.

Fig. 7: Setting temperature, frequency, and chat model in ChatGPT
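The same settings can also be passed from code as request parameters. The sketch below only builds and checks a dictionary of keyword arguments; `completion_settings` is a hypothetical helper and the default values are illustrative assumptions, while the ranges follow the OpenAI API's documented limits for `temperature` (0 to 2) and `frequency_penalty` (-2 to 2).

```python
def completion_settings(model, temperature=0.7, frequency_penalty=0.0,
                        max_tokens=1000):
    """Build keyword arguments for a completion request, with range checks."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2 and 2")
    return {
        "model": model,
        "temperature": temperature,              # higher = more varied replies
        "frequency_penalty": frequency_penalty,  # higher = less repetition
        "max_tokens": max_tokens,                # upper limit on reply length
    }

settings = completion_settings("text-ada-001", temperature=0.2)
print(settings)
```

A low temperature tends to suit a voice assistant that should give short, consistent answers; a higher one makes replies more varied.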

Programming ChatGPT to be used as VoiceGPT

First, we need to import the OpenAI Python module in the code so that we can experiment with ChatGPT. Next, we import the modules for NLP. After that, we import pygame to play the audio file that contains the reply converted into a human voice.

Next, we need to set the ChatGPT model (see Fig. 8). Here, we can choose from model names like Davinci, Ada, etc. Each model has its own strengths, and the cost of using these models varies. But no worries: at the time of writing, developers get a US$18 credit to develop and experiment with OpenAI.
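Because the cost depends on the model and on the number of tokens used, a rough estimate can be worked out from the token count that the API reports for each request. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, and the model names and prices are placeholders, not real OpenAI prices.

```python
def estimate_cost(total_tokens, price_per_1k_tokens):
    """Estimate the cost in US$ of a request from its token usage."""
    return total_tokens / 1000 * price_per_1k_tokens

# PLACEHOLDER names and prices for illustration only;
# check the current OpenAI price list before relying on any figure.
example_prices = {"cheap-model": 0.001, "capable-model": 0.02}

for name, price in example_prices.items():
    print(name, estimate_cost(1500, price))
```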

Next, we need to set the API key in the code. With that done, we create the function that connects to ChatGPT, sends the query, and gets the response.

import time
import speech_recognition as sr
import openai
import pygame
from gtts import gTTS

# model_to_use = "text-davinci-003"  # most capable
model_to_use = "text-ada-001"  # lowest token cost
r = sr.Recognizer()
openai.api_key = "******Your Key"

# Uses the legacy Completions interface (openai package versions before 1.0)
def chatGPT(query):
    response = openai.Completion.create(
        model=model_to_use,
        prompt=query,
        max_tokens=1000
    )
    # Return the reply text and the number of tokens the request used
    return str.strip(response['choices'][0]['text']), response['usage']['total_tokens']

After that, we create the main function and add a while loop to it. Here, we capture the voice continuously, use the NLP model to extract what was said, and save it as a query. Then we send this query to ChatGPT and receive the response from it.

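def main():
    print('LED is ON while button is pressed (Ctrl-C for exit).')
    pygame.mixer.init()
    while True:
        # Capture one spoken question from the microphone
        with sr.Microphone() as source:
            r.adjust_for_ambient_noise(source)
            print("Say something!")
            audio = r.listen(source)
        print("Recognizing Now....")
        command = str(r.recognize_google(audio))
        print("Google Speech Recognition thinks you said " + command)
        # Send the recognised text to ChatGPT
        (res, usage) = chatGPT(command)
        # Convert the reply to speech, save it, and play it back
        tts = gTTS(text=res, lang='en')
        tts.save("good.mp3")
        pygame.mixer.music.load("good.mp3")
        pygame.mixer.music.play()
        while pygame.mixer.music.get_busy():
            time.sleep(0.1)

if __name__ == '__main__':
    main()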
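In a loop like this, one unrecognised sentence would stop the whole program, because the recogniser raises an exception when it cannot understand the audio or cannot reach the service. A possible guard is sketched below; `safe_transcribe` and `fake_transcribe` are hypothetical names, and a stand-in recogniser is used so that the sketch runs without a microphone.

```python
def safe_transcribe(transcribe, audio, recoverable=(Exception,)):
    """Return the recognised text, or None when recognition fails."""
    try:
        text = transcribe(audio)
    except recoverable as err:
        print("Could not recognise speech:", err)
        return None
    return text.strip() or None

# Stand-in recogniser for demonstration. With SpeechRecognition you would
# pass r.recognize_google as `transcribe` and
# recoverable=(sr.UnknownValueError, sr.RequestError).
def fake_transcribe(audio):
    if not audio:
        raise ValueError("no speech detected")
    return audio.decode()

print(safe_transcribe(fake_transcribe, b"hello"))
print(safe_transcribe(fake_transcribe, b""))
```

When the function returns None, the loop can simply skip the ChatGPT call and listen again.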

After this, we again use NLP to convert the reply from ChatGPT into a human voice, and then we play that voice. The whole sequence runs continuously in a loop, which makes it feel like a real conversation between two humans.

VoiceGPT gives you the option of customising and selecting models. It allows you to choose from GPT models like Ada, Davinci, or Babbage. It uses a free speech-recognition service, which can be swapped for an offline speech-recognition engine such as Sphinx.
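Switching the recognition service can be as small as calling a different method on the recogniser. The sketch below assumes the SpeechRecognition naming pattern, where `recognize_google` is the online service and `recognize_sphinx` is the offline one (the latter needs the pocketsphinx package installed); `transcribe_with` and `FakeRecognizer` are hypothetical names used so that the example runs without audio hardware.

```python
def transcribe_with(recognizer, audio, backend="google"):
    """Call recognizer.recognize_<backend>(audio), e.g. google or sphinx."""
    method = getattr(recognizer, "recognize_" + backend, None)
    if method is None:
        raise ValueError("unsupported backend: " + backend)
    return method(audio)

class FakeRecognizer:
    """Stand-in that mimics two method names of sr.Recognizer."""
    def recognize_google(self, audio):
        return "online: " + audio

    def recognize_sphinx(self, audio):
        return "offline: " + audio

rec = FakeRecognizer()
print(transcribe_with(rec, "hello", backend="sphinx"))
```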

Fig. 8: List of ChatGPT models

Testing VoiceGPT

To test the VoiceGPT, run the code in Python, and it will tell you to ask a question or start a conversation. You can ask whatever you want; it recognises your voice, transfers the query to ChatGPT, and then replies to you in a human voice.

So now you can talk to ChatGPT just like you do with Google Assistant, Alexa, or Siri. Enjoy your conversation with VoiceGPT!

Download source code

Author’s note. This is the first version of VoiceGPT. I am still experimenting with it and you will get all the new updates very soon on

Ashwini Kumar Sinha is an IoT and AI enthusiast tech journalist at EFY

Ashwini Sinha
A tech journalist at EFY, with hands-on expertise in electronics DIY. He has an extraordinary passion for AI, IoT, and electronics. Holder of two design records and two times winner of US-China Makers Award.


  1. “How To Convert ChatGPT Into An Advanced Voice Assistant”
    This very informative article by Shree Ashwini Kumar Sinha explains in depth various aspects of programming to make more use of ChatGPT.
    Ever since its release, I have been experimenting with ChatGPT on the “other languages” aspect. Through continuous chats, I am checking the response quality and depth of ChatGPT by conversing deeply in those languages. Other than English and some European languages, the database and language model in other languages are too shallow. Although they assure me that my feedback is accepted and the database is updated, the update is not reflected in subsequent conversations.
    This idea seems very innovative and novel in that we can have a vocal conversation through ChatGPT.
    Thanks a lot to the author.

