Better speech recognition engine?

A Former User · Sep 20, 2017, 7:53 PM

https://www.youtube.com/watch?v=RMlNQksWZxo
I’m building another voice commander and Google Assistant implementation (with gRPC, not Python). It is not yet released.
I’m using Google Cloud Speech for speech recognition, but it is somewhat difficult to test with my pronunciation (I’m not a native English speaker).
Also, Google Cloud Speech is very cheap but not free, which could be a burden for some people.

So I’m looking for an alternative. Is there any good speech recognition engine besides Google Cloud Speech?

I’ve considered these:
IBM Watson, Annyang, PocketSphinx, Bing Speech, and wit.ai (is it still available?). But taking budget into account, I think Google Cloud Speech is better than the others. Does anyone have an opinion on this?

Mykle1 · Sep 21, 2017, 4:01 AM

@Sean

Nice work Sean. I don’t have any input but I do have a question. In the video, your mirror is running on a Pi? A tinker board? PC?

A Former User · Sep 21, 2017, 4:01 AM

@Mykle1 ATB (ASUS Tinker Board).

lavolp3 · Sep 21, 2017, 9:08 AM

I keep reading that Alexa is still far superior to the Google service.
Have you tried the modules provided here?
I can’t watch the video yet, but the picture you’re showing looks very promising.
What hardware (mic and speaker) are you using?

I installed Alexa once on a Pi but haven’t done a lot with it. I will install it again in the near future, but presumably not on the mirror.

A Former User · Sep 21, 2017, 9:38 AM

@lavolp3 My module will combine two modes: assistant and command. Speech recognition (STT) is required for command mode, and Alexa cannot help with that.
Anyway, the real final solution for command mode might be Actions on Google or an Alexa skill.
But currently there are several problems that make those hard to use, so I turned to an STT solution.

lavolp3 · Sep 21, 2017, 9:46 AM

Now you have my full attention! :-)
Sounds exciting! Good luck with development!

Do you know Jasper?
“Jasper is an open source platform for developing always-on, voice-controlled applications”
https://jasperproject.github.io/
Maybe worth a look?

E.g.:
“Julius is a high-performance open source speech recognition engine. It does not need an active internet connection. Please note that you will need to train your own acoustic model, which is a very complex task that we do not provide support for. Regular users are most likely better suited with one of the other STT engines listed here.”
(from the documentation)