I’ve managed to create the VoiceControl module based on Snowboy Hotword detection: https://snowboy.kitt.ai/
Posts made by alexyak
-
VoiceControl module
-
RE: Voice/motion control
OK, I’ve tested pocketsphinx.js. It’s very unreliable with the minimal settings: too many false positives. The native pocketsphinx installation and configuration is too convoluted and pulls in a lot of dependencies.
Actually, I was able to get the best and most consistent results with https://snowboy.kitt.ai. You can create multiple keyword models on their website and include them with the app. I’ve got this working pretty well in the MM. I should be able to publish the voice control module based on Snowboy pretty soon.
-
RE: Voice/motion control
Actually, I’ve started looking at this: https://syl22-00.github.io/pocketsphinx.js/ I still need to test its performance on the Raspberry Pi.
-
RE: Voice/motion control
I think using https://snowboy.kitt.ai is the best approach right now for detecting a hot word and then continuing with the Google one (annyang). That’s the same way the guy at http://docs.smart-mirror.io has done it.
-
RE: Facial Rec. Attendance/Roll Taker
Another option is to use a cloud service from Microsoft: https://www.microsoft.com/cognitive-services/en-us/face-api. It should be enough to have just one good picture of each student uploaded to the server. I’ve got a basic module working that uses these APIs, but it’s not at a stage where I can publish it for everybody yet.
-Alex
-
RE: Debugging
You don’t have to run the code in Electron every time. I would just run “node index” from the /serveronly folder and open Chrome with dev tools on my dev box. Also, Visual Studio Code is a very good dev tool and runs on Windows, Linux and OS X. You can configure it to debug the Node.js side as well as the Chrome client (via a plugin) locally, or even run a remote debugging session.
-
RE: Voice/motion control
@tyho You will quickly run out of quota if it listens continuously.
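To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it is a made-up assumption (10 seconds of audio per request, 50 spoken commands a day), not a real quota or pricing value from any service, and `requestsPerDay` is just an illustrative helper.

```typescript
// Illustrative estimate only: the figures are assumptions, not real quota values.
function requestsPerDay(listeningSecondsPerDay: number, secondsPerRequest: number): number {
  // Each request is assumed to cover a fixed-length chunk of audio.
  return Math.ceil(listeningSecondsPerDay / secondsPerRequest);
}

const SECONDS_PER_DAY = 24 * 60 * 60;

// Listening all day, one request per 10 seconds of audio (assumed chunk size).
const continuousRequests = requestsPerDay(SECONDS_PER_DAY, 10);

// Listening only after a hot word: 50 commands a day, 10 seconds each (assumed).
const gatedRequests = requestsPerDay(50 * 10, 10);
```

Under these assumptions continuous listening sends 8640 requests a day, against 50 when the cloud recognizer only runs after a hot word.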
-
RE: Voice/motion control
Besides the ability to recognize speech, you also need something to start the recognition with annyang. There should be an always-listening component that reacts to a keyword and starts annyang.
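A minimal sketch of that gating idea, assuming a hot word detector that reports the detected word through a callback. The `Recognizer` interface and the `HotwordGate` class are hypothetical names for illustration; they are not the actual Snowboy or annyang APIs, which you would wrap behind them.

```typescript
// Hypothetical wrapper around the speech recognizer (e.g. annyang); not its real API.
interface Recognizer {
  start(): void;
  stop(): void;
}

type GateState = "waiting" | "recognizing";

// Keeps the recognizer off until the configured hot word is heard.
class HotwordGate {
  private state: GateState = "waiting";
  private readonly hotword: string;
  private readonly recognizer: Recognizer;

  constructor(hotword: string, recognizer: Recognizer) {
    this.hotword = hotword;
    this.recognizer = recognizer;
  }

  // Call this from the always-listening detector's callback.
  onHotword(detected: string): void {
    if (this.state === "waiting" && detected === this.hotword) {
      this.state = "recognizing";
      this.recognizer.start();
    }
  }

  // Call this once a command was handled or the session timed out.
  onSessionEnd(): void {
    if (this.state === "recognizing") {
      this.recognizer.stop();
      this.state = "waiting";
    }
  }

  current(): GateState {
    return this.state;
  }
}

// Usage with a fake recognizer that only records which calls were made.
const calls: string[] = [];
const fakeRecognizer: Recognizer = {
  start: () => { calls.push("start"); },
  stop: () => { calls.push("stop"); },
};

const gate = new HotwordGate("mirror", fakeRecognizer);
gate.onHotword("hello");  // ignored: not the configured hot word
gate.onHotword("mirror"); // starts recognition
gate.onHotword("mirror"); // ignored: already recognizing
gate.onSessionEnd();      // stops recognition and goes back to waiting
```

The point of the two-state design is that the recognizer only runs between a hot word and the end of a session, so repeated detections during a session do not restart it.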