First, we need to install the required modules and libraries for the project. Since the smart glass is based on real-time video processing, OCR lets us extract text from the pages placed in front of it, including pages that contain images. We therefore need a module that can capture images from the camera, and a speech-synthesis module to convert the text in those captured images into speech.
Next, install the required packages with the commands below (the pip package for OpenCV is opencv-python, and PyTesseract needs the Tesseract OCR engine itself):

sudo apt-get install espeak python-espeak
sudo apt-get install tesseract-ocr
sudo pip3 install opencv-python
sudo pip3 install pytesseract
Import the required libraries in the code, then set the path where the video frames are to be saved for text extraction.
Create a while loop in the code that captures real-time video from the camera. Using cv2, save each frame (OpenCV delivers frames in BGR format) to the path set earlier. Then call PyTesseract, which opens the saved video frame, processes the image, and extracts any text from it. Finally, eSpeak converts that text into audio and reads it aloud.
For clearer voice output that does not sound robotic, you can use paid or other text-to-speech services such as gTTS or IBM Watson Text to Speech.
Fix the camera onto the eyeglass frame and run the code. Place a book in front of the camera and hold it steady for a few minutes; the device will automatically start reading the book aloud. To hear it, connect earphones to the Raspberry Pi's TRRS headphone jack, or use an amplified speaker. You can also connect a Bluetooth earphone.
Download Source Code
Ashwini Kumar Sinha is an electronics hobbyist and tech journalist at EFY