Creating Custom Voice Commands for Hello-Lucy...?

Doctor0ctoroc

@sdetweil Thanks for the breakdown. So does the pocketsphink library have a particular set of full words already defined (ie no new words can be added on our end) and does it compare specific combinations of phonetic symbols to that library to determine what words are being spoken or does the engine strictly recognize phonetic symbols and it can recognize any word as long as that word is associated with a locally defined string of phonetic symbols?

In other words, does the library already have the word “history” and it is defined by specific combinations of phonetic symbols so when it detects any of those combinations it ‘hears’ the word “history” or does it simply ‘listen’ for symbols in the phonetic alphabet and upon hearing a combination defined anywhere (in the .dic file, for example), it can ‘hear’ the word? Based on the fact that “Jarvis” is a word already recognized by the Hello-Lucy module, I would think that is a unique enough word that it wouldn’t be included in a full library of all words that sphinx can recognize and that in order to recognize when a user says “Jarvis”, it would have to work by picking up the combination of phonetic symbols that represents the word (in the .dic file it appears as JH AA R V AH S).

If this is the case, then as long as the tools on the sphnx website are used to generate the content in the .dic and .lm files (I assume the .lm file works in conjunction with the .dic file in terms of recognizing phonetic symbols), new words should be able to be added to those files for use with Hello-Lucy. Then, it would just be a matter of adding the code to the relevant .js and .json files included with Hello-Lucy to determine the commands associated with those words and certain combinations of them with any other words defined in the .lm and .dic files. Looking at the code for Hello-Lucy’s functionality, it appears that the words.json and sentences.json files define full words and phrases used in commands, then the primary .js file issues commands based on those.

Am I on the right track or am I missing something?

sdetweil

@Doctor0ctoroc the tool uses a compiler of sorts to build the dictionary and language model
this is an unlimited vocabulary voice reco engine.

you can make is a limited vocabulary type by changing the dictionary

see the lmtool info
http://www.speech.cs.cmu.edu/tools/FAQ.html

Doctor0ctoroc

@sdetweil Okay cool - so it does seem like the tool generates pronunciations (strings of phonetic symbols) based on a dictionary of actual words but can also create pronunciations for new words (like proper names, eg Jarvis) so long as they are not too complicated (or over 35 characters) - however, it seems that it is less reliable to create pronunciations based on uncommon words than it is to create pronunciations for existing words. And the tool is also used to create complete sets of words to be referenced locally, if I’m not mistaken. This would mean that I can theoretically create a text file including all of the new words I want to add and upload it to both the lmtool and lextool, then I would add the output content to the .lm and .dic files included with Hello-Lucy. I think…

I may just create copies of all the related files in Hello-Lucy and do some experimenting with the option to revert back to the copied files.

sdetweil

@Doctor0ctoroc yep, you got it

Doctor0ctoroc

@sdetweil said in Creating Custom Voice Commands for Hello-Lucy...?:

@Doctor0ctoroc yep, you got it

Yes! So now that I got a handle on that, I need to figure out how the .js and .json files utilize the local sphinx library.

@Mykle1 - can you lend a hand here? From the looks of it, I believe that the words.json and sentences.json files contain a reference list of all of the words and phrases used in the checkCommands.json file, and they’re referenced by the node_helper.js and Hello-Lucy.js files to implement the hide/show commands, yes? Something like that? A basic hierarchy should suffice to point me int he right direction.

sdetweil

@Doctor0ctoroc its builds the library from the sentences and words files…
then calls lmtool to generate the lm & dic files

Doctor0ctoroc

@sdetweil So are the .lm and .dic files generated in real time? Like, does whatever is added to the words.json and sentences.json files propagate into the .lm and .dic files or are you saying that both the words and sentences files are the basis for generating the .lm and .dic files through the sphinx tool?

sdetweil

@Doctor0ctoroc on module startup

sdetweil

@Doctor0ctoroc module sends a message to node_helper “START”
and then u can read the code in node_helper

Doctor0ctoroc

@sdetweil Ah, that’s fantastic. That would explain why when I changed “Hello Lucy” to “Hey Jarvis” in the Hello-Lucy.js and config file, it was added to the .dic and .lm file…I thought it was included in there from the get go (assuming the code was written to include an alternative, ‘familiar’ AI name that users might want) but all this time, it was my change of the code that put it in there - and it totally works when I say “Hey Jarvis” instead of “Hello Lucy”!

So there’s no need to even edit the .lm or .dic files directly then?