Posts made by alexyak

alexyak

I’ve managed to create the VoiceControl module based on Snowboy Hotword detection: https://snowboy.kitt.ai/

alexyak

OK, I’ve tested pocketsphinx.js. It’s very unreliable with the minimal settings: too many false positives. Installing and configuring native pocketsphinx is too convoluted and pulls in a lot of dependencies.

Actually, I got the best and most consistent results with https://snowboy.kitt.ai. Their website lets you create multiple keyword models and include them with the app. I’ve got this working pretty well in MM. I should be able to publish the voice control module based on Snowboy pretty soon.
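One way to picture how several keyword models could drive different actions is a small lookup from the detected keyword to a handler. This is only a rough sketch: `KeywordRouter`, the keyword names, and the handler results are all made up for illustration, and it does not use the Snowboy API at all. The detector is assumed to hand over the name of whichever keyword it heard.

```typescript
// Hypothetical handler type: returns a description of the action taken.
type KeywordHandler = () => string;

// Hypothetical router: maps a detected keyword name to an action.
class KeywordRouter {
  private handlers: Map<string, KeywordHandler> = new Map();

  // Register one handler per keyword model (names are case-insensitive).
  register(keyword: string, handler: KeywordHandler): void {
    this.handlers.set(keyword.toLowerCase(), handler);
  }

  // Run the handler for a detected keyword; undefined if none is registered.
  dispatch(keyword: string): string | undefined {
    const handler = this.handlers.get(keyword.toLowerCase());
    return handler ? handler() : undefined;
  }
}

// Example wiring with made-up keywords.
const router = new KeywordRouter();
router.register("mirror", () => "wake up the display");
router.register("weather", () => "show the forecast");

console.log(router.dispatch("Weather")); // prints "show the forecast"
console.log(router.dispatch("unknown")); // prints undefined
```

In a real module the `dispatch` call would sit inside whatever callback the hotword detector provides.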

alexyak

Actually, I’ve started looking at this: https://syl22-00.github.io/pocketsphinx.js/ I still need to test its performance on the Raspberry Pi.

alexyak

I think using https://snowboy.kitt.ai is the best approach right now for detecting a hot word and then continuing with the Google one (annyang). That’s the same way the guy at http://docs.smart-mirror.io has done it.

alexyak

Another option is to use a cloud service from Microsoft: https://www.microsoft.com/cognitive-services/en-us/face-api. It should be enough to have just one good picture of your student uploaded to the server. I’ve got a basic module working that uses these APIs, but it’s not yet at a stage where I can publish it for everybody.

-Alex

alexyak

You don’t have to run the code in Electron every time. I would just run “node index” from the /serveronly folder and start Chrome with dev tools on my dev box. Also, Visual Studio Code is a very good dev tool and runs on Windows, Linux and OS X. You can configure it to debug node.js as well as the Chrome client (via a plugin) locally, or even run a remote debugging session.

alexyak

It looks like this guy has figured it out:

http://docs.smart-mirror.io

alexyak

@tyho You will quickly run out of quota if it listens continuously.

alexyak

Besides being able to recognize speech, you also need something that starts the recognition with annyang. There should be an always-listening component that reacts to a keyword and then starts annyang.
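The always-listening idea can be sketched as a tiny two-state gate: wait for the keyword, then hand over to the recognizer until the command is done. This is a minimal sketch under assumptions, not a finished module. `CommandRecognizer`, `HotwordGate`, and `FakeRecognizer` are hypothetical names; in practice the recognizer would wrap annyang, and `onHotword` would be called from the keyword detector’s callback.

```typescript
// Hypothetical interface standing in for a speech recognizer such as annyang.
interface CommandRecognizer {
  start(): void;
  stop(): void;
}

type GateState = "waiting-for-hotword" | "listening-for-command";

// Hypothetical gate: only runs the recognizer after the keyword was heard.
class HotwordGate {
  private state: GateState = "waiting-for-hotword";
  private recognizer: CommandRecognizer;

  constructor(recognizer: CommandRecognizer) {
    this.recognizer = recognizer;
  }

  // Call this from the always-listening keyword detector.
  onHotword(): void {
    if (this.state === "listening-for-command") return; // ignore repeated keywords
    this.state = "listening-for-command";
    this.recognizer.start();
  }

  // Call this once a command was handled or the listening window ran out.
  onCommandFinished(): void {
    if (this.state === "waiting-for-hotword") return; // nothing to stop
    this.recognizer.stop();
    this.state = "waiting-for-hotword";
  }

  getState(): GateState {
    return this.state;
  }
}

// Fake recognizer that only records calls, so the sketch runs without a microphone.
class FakeRecognizer implements CommandRecognizer {
  calls: string[] = [];
  start(): void { this.calls.push("start"); }
  stop(): void { this.calls.push("stop"); }
}

const fake = new FakeRecognizer();
const gate = new HotwordGate(fake);
gate.onHotword();          // keyword heard: recognizer starts
gate.onHotword();          // second keyword while listening: ignored
gate.onCommandFinished();  // command done: recognizer stops
console.log(fake.calls.join(",")); // prints "start,stop"
```

Keeping the recognizer off until the keyword fires also means the cloud recognition only runs for short bursts instead of continuously.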