The Parts
Voice control is made up of a number of parts, for example:
- a microphone, which senses sound
- a speech recognition engine, which transforms the sound into text. Examples include the one built into Windows, Dragon, and Conformer.
- an API that takes the recognized speech and simulates keyboard and mouse input. One example is Talon Voice, which was written in Python.
- a set of instructions that say what to do for which input. This might be included in the API, but it is not included in Talon Voice, where it is called the Talon configuration set.
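To make the parts a bit more concrete, here is a toy sketch in Python of how they hand off to each other. Every name in it (recognize, COMMANDS, dispatch, handle) is made up for illustration and none of it is Talon's real code. The "audio" is just a string, so there is no real microphone or speech engine involved.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the speech recognition engine.
# A real engine turns audio into text; here the "audio" is already a string.
def recognize(audio: str) -> str:
    return audio.strip().lower()

# Hypothetical stand-in for the configuration set:
# spoken phrase -> key combination to simulate.
COMMANDS: Dict[str, str] = {"ditto": "ctrl-`"}

# Hypothetical stand-in for the API layer: decide which keys to simulate.
def dispatch(phrase: str, commands: Dict[str, str]) -> Optional[str]:
    return commands.get(phrase)

# Chain the parts together: sound -> text -> key combination (or None).
def handle(audio: str) -> Optional[str]:
    return dispatch(recognize(audio), COMMANDS)
```

Calling handle("Ditto") looks up the phrase and returns the key combination, while an unknown phrase returns None, which is roughly the "nothing happens" case.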
This post focuses on Talon Voice and how it works, not on how to set it up, since setup guides are widely available online.
First Use
So where to begin? Other than the tutorials mentioned in my other blog post, looking at the logs and making a custom command are good places to start!
The logs are available from the Talon menu.
The simplest command would be of this form, and would go in a .talon file in the user folder, which can also be opened from the Talon menu.
ditto:key(ctrl-`)
This custom command is the first one that I made. It is the shortcut to open the clipboard if you use the clipboard program called Ditto.
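If it helps to see the shape of that one-line command, here is a small Python sketch that splits it into the spoken phrase and the action. This is only my own illustration, not how Talon parses things. Real .talon files allow more than this (for example multi-line command bodies), and the function name parse_rule is made up.

```python
from typing import Tuple

def parse_rule(line: str) -> Tuple[str, str]:
    """Split a one-line rule like 'ditto:key(ctrl-`)' into (phrase, action).

    Hypothetical helper: it only understands the single-line form shown above.
    """
    # Split on the first colon only, so colons inside the action are kept.
    phrase, sep, action = line.partition(":")
    if not sep or not phrase.strip() or not action.strip():
        raise ValueError(f"not a 'phrase: action' rule: {line!r}")
    return phrase.strip(), action.strip()
```

So the part before the colon is what you say, and the part after it is what happens.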
How It Works
Talon Voice is closed source, which is fair enough as it was invented by one person. A lot of custom scripts are open source though! Basically, Talon takes the scripts in the user folder and brings them to life. On a Windows machine, Talon itself is placed in the app folder.
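Since Talon itself is closed source, I can't show how it really loads things. As a rough picture of the first step, though, here is a Python sketch that just finds the .talon and .py files under a folder. The function name find_scripts is made up, and the folder you pass in is whatever your user folder happens to be.

```python
from pathlib import Path
from typing import List

# File types that user scripts come in: .talon command files and Python files.
SCRIPT_SUFFIXES = {".talon", ".py"}

def find_scripts(user_dir: Path) -> List[Path]:
    """Return every .talon and .py file under user_dir, in sorted order.

    Hypothetical helper: this only discovers files, it does not run them.
    """
    return sorted(
        path
        for path in user_dir.rglob("*")
        if path.is_file() and path.suffix in SCRIPT_SUFFIXES
    )
```

Everything else in the folder, such as text notes or images, is simply skipped in this sketch.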