
“Computer, compute to the last digit the value of pi” - Spock (Star Trek)

Okay, we obviously can’t compute the last digit of pi, but we can do other cool things with NSSpeechRecognizer. One potential damper is that NSSpeechRecognizer under Tiger does not have dictation capabilities yet (Leopard might, but I have no idea). It is still useful, though, and extremely easy to use from our own programs.

First, make a simple Cocoa application with a button. To get a quick start in Cocoa, you can take either the gentle introduction (Become An Xcoder) or the quick tutorial (Currency Converter).

While in Interface Builder, create a subclass of NSObject, instantiate it, and call it SpeechController (so that it works with the source code I provide). After you place the button, create an action for SpeechController called buttonPressed (IB should automatically put in the colon after buttonPressed) and connect the button to it. Go ahead and create the files for SpeechController (the default location should be fine). Then make your SpeechController.h and SpeechController.m match the following code (warning: it has been a while since I ran this code). At any rate, take a look at the disclaimers following the code. The code is licensed under the GNU General Public License (GPLv2).

SpeechController.h

/* Copyright(c) 2007 Chinmoy Gavini*/

#import <Cocoa/Cocoa.h>

@interface SpeechController : NSObject
{
    NSSpeechRecognizer *recognizer;
    NSArray *cmds;
    int next;
}
- (id)init;
- (IBAction)buttonPressed:(id)sender;
@end

SpeechController.m

/*Copyright (c) 2007 Chinmoy Gavini
*/
#import "SpeechController.h"

@implementation SpeechController

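- (id)init {
    self = [super init];
    if (self) {
        NSLog(@"init");
        next = 0;
        recognizer = [[NSSpeechRecognizer alloc] init];
        [recognizer setDelegate:self];
        // alloc/init so that we own the array; arrayWithObjects: returns an
        // autoreleased object that is not safe to keep in an instance variable.
        cmds = [[NSArray alloc] initWithObjects:@"next", nil];
        [recognizer setCommands:cmds];
    }
    return self;
}

// Assumes manual retain/release (no garbage collection).
- (void)dealloc {
    [recognizer setDelegate:nil];
    [recognizer release];
    [cmds release];
    [super dealloc];
}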

- (void)speechRecognizer:(NSSpeechRecognizer *)sender
     didRecognizeCommand:(id)command {
    // NSWorkspace stuff from Scott Stevenson's article
    // http://theocacao.com/document.page/183
    NSWorkspace *ws = [NSWorkspace sharedWorkspace];
    // Names of all currently launched applications
    NSArray *apps = [ws valueForKeyPath:@"launchedApplications.NSApplicationName"];
    if ([command isEqual:@"next"] && [apps count] > 0) {
        // Wrap around before indexing so objectAtIndex: never goes out of range
        if (next >= (int)[apps count]) {
            next = 0;
        }
        // Launching an application that is already running brings it to the front
        [ws launchApplication:[apps objectAtIndex:next]];
        next++;
    }
}

- (IBAction)buttonPressed:(id)sender
{
    [recognizer startListening];
}

@end
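
As a side note, the index arithmetic for cycling through a list can be tried out on its own, away from Cocoa. Here is a minimal sketch in plain C (which also compiles as Objective-C). The name next_index is made up for this illustration and is not part of Cocoa or of the code above; it just assumes the list has at least one entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index that follows `current` in a list
 * of `count` items, wrapping back to 0 after the last item. The result is
 * always a valid index, even if `current` is stale because the list shrank. */
size_t next_index(size_t current, size_t count)
{
    assert(count > 0);              /* an empty list has no valid index */
    return (current + 1) % count;
}
```

With three applications open, next_index(2, 3) gives 0, so the cycle starts over at the first application instead of running off the end of the array.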

I don’t remember the exact details of what this does (I am not near my Mac right now), but basically it cycles through the list of open applications each time you say “next” (although not as conveniently as you might think, because the focus has to be on this app before it can switch to another app). I’ll probably add code to minimize windows later.

The way NSSpeechRecognizer works is that, once the hard part that we don’t have to worry about is done (making sense of the signal coming through the microphone, etc.), it hands the result to its delegate by calling speechRecognizer:didRecognizeCommand:. All the programmer (you, me) has to do is follow his or her side of the “contract”, i.e., implement

- (void)speechRecognizer:(NSSpeechRecognizer *)sender
     didRecognizeCommand:(id)command

in our application.
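
If delegation is new to you, here is a toy model of that contract in plain C. It is only an illustration under my own assumptions: toy_recognizer, toy_recognizer_hear, command_handler and count_next are hypothetical names, and the real NSSpeechRecognizer does the listening itself and sends an Objective-C message to its delegate rather than calling a function pointer. The one idea it is meant to show is that the recognizer owns the matching and only calls our code back for phrases in the command list.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type standing in for
 * speechRecognizer:didRecognizeCommand: */
typedef void (*command_handler)(const char *command, void *context);

/* Toy stand-in for a recognizer: a NULL-terminated command list
 * plus the delegate callback to notify. */
struct toy_recognizer {
    const char *const *commands;
    command_handler delegate;
    void *context;                  /* passed through to the delegate */
};

/* Simulates the recognizer "hearing" a phrase. The delegate is called only
 * when the phrase is one of the registered commands.
 * Returns 1 if the delegate was called, 0 otherwise. */
int toy_recognizer_hear(const struct toy_recognizer *r, const char *phrase)
{
    if (r == NULL || r->commands == NULL || r->delegate == NULL || phrase == NULL)
        return 0;
    for (size_t i = 0; r->commands[i] != NULL; i++) {
        if (strcmp(r->commands[i], phrase) == 0) {
            r->delegate(r->commands[i], r->context);
            return 1;
        }
    }
    return 0;
}

/* Example delegate: counts how many times "next" was recognized.
 * `context` must point to an int. */
void count_next(const char *command, void *context)
{
    if (strcmp(command, "next") == 0)
        (*(int *)context)++;
}
```

To try it, fill in a struct toy_recognizer with a command list containing "next", count_next as the delegate and a pointer to an int counter as the context. Feeding it "next" bumps the counter, while any other phrase is ignored before it ever reaches the delegate.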

 NO WARRANTY

  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.  EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.  SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.

  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.