@soldatino okay, I think it’s breaking because you are using a different username. I will try to add a config option to set the file path in the coming days; for now, just check your node helper of MMM-Chat.js at line 21 and let me know if that helps. You will have to set it to your correct absolute path. Cheers.
-
RE: Chatgpt+google STT + elevenlabs api
@soldatino hi, could you try this: go to the MMM-Chat directory, where there is a file called transcript.py; around lines 34-36 you will find this:
# start recording
frames = sd.rec(int(RATE * RECORD_SECONDS), samplerate=RATE, channels=CHANNELS, blocking=True)
Change it to:
# start recording
print("say something...")
frames = sd.rec(int(RATE * RECORD_SECONDS), samplerate=RATE, channels=CHANNELS, blocking=True)
This adds a console log printing “say something”. Now go to the end of the file and comment out the following line:
# delete saved audio file
os.remove(WAVE_OUTPUT_FILENAME)
It should look like this now
# delete saved audio file
#os.remove(WAVE_OUTPUT_FILENAME)
And now run the script in the following way
python3 transcript.py
It will print “say something …” on the console; try speaking into the mic and wait for it to complete. If everything goes well, it will print the transcript on the console. If it does not, you will find a wav file in the directory; try playing it and see if the audio is audible. If the audio is not clear, you might have to change the rate to whichever rate gives the best audio. Let me know if you have any more questions.
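If you want to narrow down the best rate quickly, here is a minimal standalone sketch, assuming the same sounddevice setup as transcript.py (the three rates and the test_*.wav file names are only examples, not part of the module):

import sounddevice as sd
from scipy.io.wavfile import write

CHANNELS = 1
RECORD_SECONDS = 4

# record a short clip at each candidate rate and save it,
# then play the files back and keep the rate that sounds cleanest
for rate in (16000, 44100, 48000):
    print(f"say something... (recording at {rate} Hz)")
    frames = sd.rec(int(rate * RECORD_SECONDS), samplerate=rate, channels=CHANNELS, blocking=True)
    write(f"test_{rate}.wav", rate, frames)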
-
Drunk AI
Recently I came across this awesome module MMM-OpenAI (I posted earlier with a short demo and a YouTube link for the full demo), and I decided to do something fun. I was not expecting it to turn out this way. I have detailed the step-by-step instructions on GitHub to set this up. I am not from a technical background, so the code is not really polished and there may be issues, but I tried my best to be as thorough in the instructions as possible. I’m sure someone else will come up with a better way to implement this. Here is the link: Drunk AI repo
At the moment it’s quite slow, and it will get faster with more enhancements. Detailed instructions:
Drunk-AI
This is an integration of OpenAI ChatGPT in MagicMirror.
Demo below
Watch the full demo here: https://youtu.be/eDcc7zqAFwE
This is the initial version. I am not from a technical background, so if the code and scripts are a little off the charts, or there is a way to enhance them, feel free to send a PR. I admit this is quite messy and a little long to set up, but in the end it’s worth it.
Download the shortcut linked below and configure it with the API key from MMM-Remote-Control.
More info here: https://github.com/Jopyth/MMM-Remote-Control.git
iOS shortcut link: https://www.icloud.com/shortcuts/8a0e7600808d45eb9616dae8105653ef
Edit the downloaded shortcut with your MagicMirror IP address and replace apikey in the text with your API key from MMM-Remote-Control.
You need to run this shortcut every time you want to interact with ChatGPT, or you can use the default Telegram commands, but those are without voice input; hotword detection will be added soon.
When you run this shortcut, the mirror will update with a Jarvis animation and text saying “say something”. Speak into the microphone after you see “say something”, and if the audio was captured successfully, it will update with the spoken text. The listening duration is set to 4 seconds by default; you can adjust this by editing transcript.py in the MMM-Chat module mentioned below.
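Under the hood, the shortcut essentially just fires an HTTP request at MMM-Remote-Control’s REST API. A rough sketch of the same call in Python, assuming the /api/notification endpoint from the MMM-Remote-Control docs (the IP address, API key, and notification name below are placeholders; use whichever notification your setup listens for):

import requests

MIRROR_IP = "192.168.1.50"                # placeholder: your MagicMirror's address
API_KEY = "your-remote-control-api-key"   # placeholder: key from MMM-Remote-Control
NOTIFICATION = "SHOW_ALERT"               # placeholder: notification your modules react to

# ask MMM-Remote-Control to broadcast the notification on the mirror
url = f"http://{MIRROR_IP}:8080/api/notification/{NOTIFICATION}"
resp = requests.get(url, params={"apiKey": API_KEY})
print(resp.status_code, resp.text)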
Install and configure MMM-Chat from here: https://github.com/sdmydbr9/MMM-Chat
Install MMM-NotificationTrigger: https://github.com/MMRIZE/MMM-NotificationTrigger.git
Add the following lines to the config
{
  module: 'MMM-NotificationTrigger',
  config: {
    triggers: [
      {
        trigger: 'SHOW_ALERT',
        fires: [
          {
            fire: 'MY_COMMAND',
            exec: (payload) => `python3 /home/pi/MagicMirror/modules/MMM-11-TTS/main.py "${payload.message}"`
          }
        ]
      }
    ]
  }
},
Install MMM-OpenAI from here: https://github.com/MMRIZE/MMM-OpenAI.git
Add the following in your config
{
  module: "MMM-OpenAI",
  position: 'top_right',
  config: {
    defaultChatInstruction: "Your name is Marvin and you are a paranoid android that reluctantly answers questions with sarcastic responses.",
    stealth: true, // <- This is needed to hide default module view.
    postProcessing: function (handler, responseObj) {
      if (responseObj.error) return // When the error happens, just do nothing.
      let method = responseObj.options.method
      let alertPayload = {
        title: responseObj.request.prompt,
        imageUrl: (method === 'IMAGE') ? responseObj.response.data[0].url : null,
        message: (method === 'TEXT') ? responseObj.response.choices[0].text
          : ((method === 'CHAT') ? responseObj.response.choices[0].message.content : null),
        timer: 2 * 1000
      }
      handler.sendNotification('SHOW_ALERT', alertPayload)
    }
  }
},
Clone the following repository into your modules folder and install it according to the instructions: https://github.com/sdmydbr9/MMM-11-TTS
Edit main.py and add your API key and voice ID. A voice ID is already set by default, but you can add any voice ID; refer to the ElevenLabs API docs for more details.
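For orientation, the ElevenLabs side of a script like main.py boils down to one HTTP call. This is only a hedged sketch, not the actual main.py: the key, voice ID, and output file name are placeholders, and the endpoint is the v1 text-to-speech route from the ElevenLabs API docs. The text to speak arrives as the first command-line argument, since MMM-NotificationTrigger invokes the script as python3 main.py "<message>":

import sys
import requests

API_KEY = "your-elevenlabs-api-key"   # placeholder
VOICE_ID = "your-voice-id"            # placeholder: any voice ID from your account

text = sys.argv[1]  # the "${payload.message}" passed in by MMM-NotificationTrigger

url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
headers = {"xi-api-key": API_KEY, "Content-Type": "application/json"}
resp = requests.post(url, headers=headers, json={"text": text})

# the API returns audio bytes (mpeg by default); save them for playback
with open("output.mp3", "wb") as f:
    f.write(resp.content)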
Disclaimer: even though the quality of their voice output is far superior to Google TTS, the character limit is very tight: 10,000 characters per month per account. I hope they will offer more in the future. Or you can opt for a paid account and get around 30,000 characters, as well as voice-cloning features: clone any voice you want, for example the voice of Jarvis, and transform your MagicMirror into Jarvis. Just add the voice ID in main.py and you’re good to go.
Disclaimer 2: the above module works in my tests, but it is not very efficient, since the script first downloads the audio from the API request, converts it using ffmpeg, and then plays the audio as output.
Depends on ffmpeg. Install it using apt if you don’t have it installed.
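A rough sketch of that download-convert-play flow with subprocess (the file names are placeholders, and aplay is assumed as the player on a Pi; swap in whatever player you use):

import subprocess

# convert the downloaded mp3 to wav with ffmpeg (-y overwrites any old file),
# then play the result through the default ALSA device
subprocess.run(["ffmpeg", "-y", "-i", "output.mp3", "output.wav"], check=True)
subprocess.run(["aplay", "output.wav"], check=True)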
It may be possible to implement this whole thing in a single module; I hope someone will try to make it into a single, less messy module.
-
RE: Chatgpt+google STT + elevenlabs api
@Rags here are the detailed instructions for how I did it. It’s not polished, but I tried my best to be thorough.
-
Chatgpt+google STT + elevenlabs api
Added voice to the ChatGPT model and made an interactive chatbot with voice input and voice output, using Google STT for transcription and the ElevenLabs API for TTS.
This is not final yet, and I’m hoping someone will polish it; this is just a proof of concept, and I will post the complete code once it’s better. Here is the link to the video: chatgpt
Todo:
1. Add hotword detection. Right now it depends on a notification from the iOS shortcuts.
2. Add offline STT; Google STT, though more accurate, takes more time (a sketch of the Google STT step follows after this list).
3. Optimisation of the process. Right now it takes time to complete a request.
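As a rough sketch of the Google STT step, assuming the google-cloud-speech Python package with credentials already set up (the wav file name and the 44100 Hz rate are placeholders; match them to your recording settings):

from google.cloud import speech

client = speech.SpeechClient()  # needs GOOGLE_APPLICATION_CREDENTIALS to be set

with open("output.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=44100,
    language_code="en-US",
)

# synchronous recognition is fine for short clips like these
response = client.recognize(config=config, audio=audio)
for result in response.results:
    print(result.alternatives[0].transcript)
-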
RE: MMM-OpenAI
@MMRIZE I think the Hotword module is not available anymore, and the clap module might work, but it will not look as aesthetic as activating with the hotword…
-
RE: MMM-OpenAI
@sdetweil I just meant it as a similar approach, I don’t actually want to implement the whole assistant module.
-
RE: MMM-OpenAI
@MMRIZE a similar idea has crossed my mind: using a hotword detector, like the one used in the Google Assistant module, to activate a listener and ask for voice input; then, using the input provided, get the transcript of the audio using the Whisper API or something similar; pass the transcript on as content in the OPENAI_REQUEST notification, and then SHOW_ALERT, or just read it out loud using MMM-GoogleTTS. Just a rough sketch. I’m sure someone out here will come up with more convincing and better ideas.
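To make the rough sketch a little more concrete, the transcription step could look something like this; a sketch only, assuming the pre-1.0 openai Python package’s Whisper endpoint and a pre-recorded question.wav (both placeholders), with the notification wiring left to whichever module receives the text:

import openai

openai.api_key = "your-openai-api-key"  # placeholder

# transcribe a recorded clip with the Whisper API
with open("question.wav", "rb") as audio_file:
    result = openai.Audio.transcribe("whisper-1", audio_file)

prompt = result["text"]
# next, pass `prompt` on as the content of an OPENAI_REQUEST notification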
-
RE: Notifications help
@BKeyport I did that, and it’s definitely my code. I can see the SHOW_ALERT notification being sent, but my module is not able to capture it. I tried different code and have hit a mental block now.