Speak Clearly, Write Exactly: Using Whisper in Windows
OpenAI’s Whisper is a new AI-powered solution that can turn your voice into text. Best of all, it comes at zero cost.
However, there’s a catch: it’s harder to install and use than your average Windows utility, especially if you want your Nvidia GPU’s Tensor Cores to give it a nice boost.
Don’t fret, though. That’s why we’re here! Read on to find out how to install and use Whisper and, if you own an Nvidia GPU, how to have Whisper take advantage of it.
What Is OpenAI’s Whisper?
ChatGPT is all the rage nowadays, and we already saw how you can use ChatGPT by OpenAI. And yet, it’s not the only interesting project by OpenAI.
Powered by deep learning and neural networks, Whisper is a natural language processing system that can “understand” speech and transcribe it into text. But it’s also its own thing, standing apart from similar solutions in several ways:
- Whisper is an AI solution “trained” on natural language. So, it’s better at understanding “normal” human speech than older solutions.
- Whisper doesn’t come with an interface, nor can it record audio. It can only take existing audio files and output text files.
- Since it’s good at “making sense of language”, Whisper also has the superpower of automatic translation in a single step.
- Whisper is not an online service and can work entirely offline.
- If you have a relatively modern Nvidia GPU (GTX 970 or newer), Whisper can run in “hardware accelerated mode” to boost its speed.
- There’s no requirement to register, purchase a license, or buy a subscription.
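Since Whisper works on existing audio files from the command line, a call to it can be sketched as a list of arguments. The following is a minimal Python sketch, not the only way to invoke Whisper: the helper name `build_whisper_command` and the file name `recording.wav` are hypothetical, and it assumes the `whisper` command is available on your Path. The `--model`, `--task`, and `--output_format` options are the ones exposed by Whisper's command-line tool.

```python
def build_whisper_command(audio_file, model="base", translate=False,
                          output_format="txt"):
    """Build the argument list for a Whisper command-line call.

    build_whisper_command is a hypothetical helper written for this
    example; it only assembles the command and does not run it.
    """
    # "transcribe" keeps the spoken language, "translate" outputs English.
    task = "translate" if translate else "transcribe"
    return [
        "whisper", audio_file,
        "--model", model,
        "--task", task,
        "--output_format", output_format,
    ]


if __name__ == "__main__":
    # "recording.wav" is a placeholder name for an existing audio file.
    print(" ".join(build_whisper_command("recording.wav", translate=True)))
```

To actually run the command, the returned list could be passed to `subprocess.run`. Keeping the arguments in a list avoids quoting problems with file names that contain spaces.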
Why Are AMD GPUs Not Supported?
For GPUs to be useful for more than graphics, they’d have to act as fully programmable processors. That’s why Nvidia created CUDA, officially deemed “a parallel computing platform and programming model”. To learn more about CUDA and related hardware (“CUDA cores”), read our article on what are CUDA cores and how they improve PC gaming.
CUDA is proprietary Nvidia technology, only compatible with Nvidia GPUs. The closest alternatives for AMD’s hardware are OpenCL and ROCm (the Radeon Open Compute platform). To learn more about how each company’s solutions compare, check our article on AMD Compute Units vs. Nvidia CUDA Cores.
Compared to the alternatives, CUDA is considered more mature, performant, and easier to use. Thus, most developers only target CUDA, which, in turn, means that their software only takes advantage of the hardware features on Nvidia GPUs. And that includes Whisper.
How to Download and Install Whisper
Unfortunately, Whisper is not a standalone app you can download, install, and run. It relies on other software, which must also be installed.
For Windows, to keep this guide simple, we’ll use Chocolatey extensively for installing most of the necessary software parts. Check our guide on the quickest way to install Windows software for more info on Chocolatey.
For Linux and Macs, the installation process (excluding the Windows path variable, and easy-to-use batch files we’ll create) should be similar.
- To install and use Whisper, you must have Python and its PIP tool installed and added to the Windows “Path” variable. For info on that, check our article on how to install Python PIP on Windows, Mac, and Linux.
- Install FFMPEG through Chocolatey with this command:
choco install ffmpeg
Also, install its Python version with:
pip3 install python-ffmpeg
- Finally, install Whisper from its GitHub page with:
pip3 install git+https://github.com/openai/whisper.git
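Once the steps above are done, a quick check can confirm that everything landed where it should. The following is a minimal Python sketch, assuming the tools are meant to be reachable through the Path variable; the function name `check_prerequisites` is made up for this example.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        # Command-line tools are looked up on the Path, like the shell does.
        "pip3": shutil.which("pip3") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # find_spec locates an importable package without importing it.
        "whisper": importlib.util.find_spec("whisper") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If an entry shows up as missing right after installing it, opening a new terminal window is worth trying first, since a terminal that was already open may not have picked up the updated Path.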
Getting Whisper’s CUDA-Enabled Version
Although a default Whisper installation doesn’t use Nvidia GPUs, the torch package it relies on offers a CUDA-accelerated version. Using this instead of the “plain” version can help Whisper complete its transcriptions much faster with the help of your Nvidia GPU.
To have Whisper use the CUDA cores of your Nvidia GPU:
- If you already have the “vanilla” version of torch installed, uninstall and purge remnants of it with:
pip3 uninstall torch
Once it’s done, follow it up with:
pip cache purge
- Install torch’s CUDA-enabled version with:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
- To check if Whisper can use your Nvidia GPU, use:
whisper --help | findstr -i pytorch
You should see **(default: cuda)** instead of **(default: cpu)**.
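The same check can be made from Python by asking torch directly. The following is a minimal sketch; the function name `detect_torch_device` is made up for this example, and it assumes that Whisper picks its default device based on whether torch reports CUDA as available.

```python
def detect_torch_device():
    """Return "cuda", "cpu", or "torch not installed"."""
    try:
        # Imported here so the script still runs when torch is absent.
        import torch
    except ImportError:
        return "torch not installed"
    # True only when a CUDA-enabled torch build can see a usable Nvidia GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Detected device: {detect_torch_device()}")
```

A result of "cpu" on a machine with an Nvidia GPU usually suggests that the “plain” version of torch is still the one installed.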
What to Do if Torch Fails to Install
If you encounter the “no version found” error while installing torch, you may need to install an older version of Python parallel to your current one.
Use this command to do that:
choco install python --version OLDER_VERSION --side-by-side
Replace “OLDER_VERSION” with a version, like 3.10.
Then, use the path of the secondary version for all “generic” Whisper commands (e.g., “c:\Python310\Scripts\pip.exe” rather than just “pip”).
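To avoid retyping those long paths, they can be built from the folder of the secondary Python installation. The following is a minimal sketch, assuming the tools live in a “Scripts” subfolder as in the example above; the function name `secondary_tool` is hypothetical, and “c:\Python310” is only the example folder from the text, so yours may differ.

```python
from pathlib import PureWindowsPath


def secondary_tool(install_dir, tool="pip.exe"):
    """Return the full path of a tool inside a Python install's Scripts folder."""
    # PureWindowsPath joins with backslashes on any operating system.
    return str(PureWindowsPath(install_dir) / "Scripts" / tool)


if __name__ == "__main__":
    # "c:\Python310" is an assumed install folder, used for illustration.
    print(secondary_tool(r"c:\Python310"))
    print(secondary_tool(r"c:\Python310", "whisper.exe"))
```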
How to Record Your Voice
You can use any sound-recording app to turn your voice into a WAV or MP3 file. Windows includes such an app; for more info on that, see how to use the Windows 10 Voice Recorder app.
For a more full-featured option, try Audacity. Learn how to record with it in our guide on how to use Audacity to record audio on Windows and Mac.
- Title: Speak Clearly, Write Exactly: Using Whisper in Windows
- Author: David
- Created at : 2024-08-16 00:31:49
- Updated at : 2024-08-17 00:31:49
- Link: https://win11.techidaily.com/speak-clearly-write-exactly-using-whisper-in-windows/
- License: This work is licensed under CC BY-NC-SA 4.0.