Strange issue
-
@richland007
I re-read your first post and your problem is more about restarting than actual crashing.
Of course a restart can be caused by a crash …
MMM-WatchDog tends to restart MM, so I too would suggest removing this module.
In principle all it does is send a ping and restart MM using pm2 when a timeout occurs.
I also would expect to see some lines in the pm2 logs on WatchDog.
Can you run:
pi@MagicPi:~ $ grep -i watchdog /home/pi/.pm2/logs/mm-out.log
pi@MagicPi:~ $ grep -i watchdog /home/pi/.pm2/logs/mm-error.log
Another good package to install is sysstat:
pi@MagicPi:~ $ sudo apt-get install sysstat
This allows for commands like iostat and sar.
vmstat 10 10
iostat 10 10
sar 10 10
-
@evroom OK, so here is what's written in that top.txt file:
pi@SmartMirror:~ $ tail -F top.txt
834 pi 20 0 155396 17884 8104 S 0.0 1.9 23:20.77 lxpanel
582 root 20 0 211836 13576 2440 S 0.0 1.4 418:48.14 Xorg
457 alexapi 20 0 218108 10768 4772 S 47.4 1.1 878:20.67 python
721 pi 20 0 122032 10600 4792 S 0.0 1.1 8:34.40 PM2 v3.0.3+
835 pi 20 0 154056 5184 4016 S 0.0 0.5 0:43.57 pcmanfm
18753 pi 20 0 47692 4896 3996 S 0.0 0.5 0:03.94 lxterminal
829 pi 20 0 53536 4432 3556 S 0.0 0.5 0:13.22 openbox
100 root 20 0 35220 3140 2980 S 0.0 0.3 2:16.35 systemd-jo+
7236 pi 20 0 8112 3140 2712 R 15.8 0.3 0:00.07 top
=============================================
Sat Nov 17 11:25:01 CST 2018
top - 11:25:01 up 3 days, 15:28, 1 user, load average: 0.56, 2.88, 4.35
Tasks: 135 total, 1 running, 89 sleeping, 0 stopped, 1 zombie
%Cpu(s): 21.4 us, 5.5 sy, 0.0 ni, 69.4 id, 2.3 wa, 0.0 hi, 1.5 si, 0.0 st
KiB Mem : 949452 total, 493928 free, 157456 used, 298068 buff/cache
KiB Swap: 949440 total, 777124 free, 172316 used. 727276 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
582 root 20 0 218840 24540 12364 S 10.5 2.6 419:19.31 Xorg
502 root 20 0 45188 23400 14452 S 36.8 2.5 27:20.07 vncserver-+
834 pi 20 0 155396 21680 10616 S 0.0 2.3 23:23.35 lxpanel
7558 pi 20 0 77052 21140 18404 S 0.0 2.2 0:03.37 leafpad
835 pi 20 0 154156 16420 13232 S 0.0 1.7 0:46.57 pcmanfm
721 pi 20 0 122032 14004 7360 S 0.0 1.5 8:34.81 PM2 v3.0.3+
457 alexapi 20 0 218108 10772 4772 S 36.8 1.1 880:07.93 python
636 root 20 0 15416 9020 8820 S 0.0 1.0 3:55.73 vncagent
18753 pi 20 0 47692 8484 7164 S 0.0 0.9 0:04.22 lxterminal
829 pi 20 0 53724 5888 4220 S 0.0 0.6 0:13.49 openbox
953 pi 20 0 27236 5312 4812 S 0.0 0.6 0:03.05 vncserverui
873 pi 20 0 27816 4328 3676 S 0.0 0.5 0:40.96 vncserverui
804 pi 20 0 39468 3288 3000 S 0.0 0.3 0:02.79 gvfsd
100 root 20 0 35220 3196 3036 S 0.0 0.3 2:16.50 systemd-jo+
7603 pi 20 0 8112 3180 2752 R 10.5 0.3 0:00.05 top
top - 11:25:01 up 3 days, 15:28, 1 user, load average: 0.56, 2.88, 4.35
Tasks: 135 total, 1 running, 89 sleeping, 0 stopped, 1 zombie
%Cpu(s): 21.4 us, 5.5 sy, 0.0 ni, 69.4 id, 2.3 wa, 0.0 hi, 1.5 si, 0.0 st
KiB Mem : 949452 total, 493776 free, 157600 used, 298076 buff/cache
KiB Swap: 949440 total, 777124 free, 172316 used. 727128 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
582 root 20 0 218840 24540 12364 S 11.1 2.6 419:19.33 Xorg
502 root 20 0 45188 23400 14452 S 50.0 2.5 27:20.16 vncserver-+
834 pi 20 0 155396 21680 10616 S 0.0 2.3 23:23.36 lxpanel
7558 pi 20 0 77052 21140 18404 S 0.0 2.2 0:03.37 leafpad
835 pi 20 0 154156 16420 13232 S 0.0 1.7 0:46.57 pcmanfm
721 pi 20 0 122032 14004 7360 S 0.0 1.5 8:34.81 PM2 v3.0.3+
457 alexapi 20 0 218108 10772 4772 S 61.1 1.1 880:08.05 python
636 root 20 0 15416 9020 8820 S 0.0 1.0 3:55.73 vncagent
18753 pi 20 0 47692 8484 7164 S 0.0 0.9 0:04.24 lxterminal
829 pi 20 0 53724 5888 4220 S 0.0 0.6 0:13.49 openbox
953 pi 20 0 27236 5312 4812 S 0.0 0.6 0:03.05 vncserverui
873 pi 20 0 27816 4328 3676 S 0.0 0.5 0:40.96 vncserverui
804 pi 20 0 39468 3288 3000 S 0.0 0.3 0:02.79 gvfsd
100 root 20 0 35220 3196 3036 S 0.0 0.3 2:16.50 systemd-jo+
7608 pi 20 0 8112 3180 2752 R 22.2 0.3 0:00.07 top
=============================================
I use the command
tail -F top.txt
right? I don't know how to read the above file, so let me know if you see anything strange. I increased the swap file a while back so that OpenCV (for facial recognition) would install faster, but I do not use a USB drive. I could if I have to.
As for taking out the watchdog: will commenting it out of config.js do the trick, or do I have to uninstall it from the modules?
Will pm2 still restart MM if it crashes without the watchdog? (It does if you press ctrl+q.)
D
-
Okay, this calls for some more basics.
Print the whole file:
$ cat top.txt
Load the file in an editor:
$ nano top.txt
$ vi top.txt
I am old-school, so I use vi, but nano is more Word-like.
To show the last 50 lines:
$ tail -50 top.txt
To show the first 50 lines:
$ head -50 top.txt
Where of course 50 is just an example.
To show text as it is appended:
$ tail -f top.txt
To show text as it is appended and re-open the file when necessary (useful for rotating log files):
$ tail -F top.txt
On disabling a module see the next reply.
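As a further aid for reading a long top.txt, the summary line of each snapshot can be pulled out on its own. This is only a sketch: the function name top_summaries is made up for illustration, and it assumes every snapshot contains the usual "load average" header line that top prints.

```shell
# Print only the "top - ... load average: ..." header line of each snapshot,
# so that load spikes can be spotted without scrolling through process tables.
# top_summaries is a hypothetical helper name, not part of any package.
top_summaries() {
    # $1 = path to the file holding the collected top output
    grep 'load average' "$1"
}

# Usage:
# top_summaries top.txt | tail -20
```

Each printed line carries the snapshot time, the uptime and the three load averages, which is usually enough to see when the system became busy.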
-
To disable MMM-WatchDog, edit the config.js file.
Locate:
module: 'MMM-WatchDog',
Change it to:
module: 'MMM-WatchDog',
disabled: true,
and restart MM:
$ pm2 restart mm
Use
pm2 list
to check if your application name actually is mm, or something else.
Later you can enable it again by updating config.js like this:
disabled: false,
This works for all modules.
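To double-check the edit before restarting, the module's entry can be printed together with the lines that follow it. A minimal sketch: show_module_entry is a hypothetical helper, the config path in the usage line is an assumption about a default install, and the -A option requires GNU or BSD grep.

```shell
# Print the line that names a module in config.js plus the next three lines,
# with line numbers, so you can see whether "disabled: true" is set.
# show_module_entry is a hypothetical helper name.
show_module_entry() {
    # $1 = module name, $2 = path to config.js
    grep -n -F -A 3 -e "$1" "$2"
}

# Usage (path is an assumption; adjust it to your installation):
# show_module_entry MMM-WatchDog ~/MagicMirror/config/config.js
```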
When I test modules I normally leave the config lines in place and disable a module in this fashion.
-
@richland007
I cannot say much about pm2 and restarting.
Nor about using ctrl+q, as I only access my MM via ssh.
Can you send the output of the following commands:
$ uname -a
$ swapon -s
$ free -h
$ cat /etc/dphys-swapfile | egrep -v '#|^$'
$ sudo service dphys-swapfile status
I just learned that swap is handled as a service.
Being old-school, this is something new to me :-)
-
@richland007
Concerning your top.txt, was that the output at the moment MM was restarted?
I do not see obvious memory issues.
Only that vncserver and python are CPU hungry, but within limits.
And I do not see any electron processes.
-
@oceank The BIGGEST problem, which people either don’t want to admit or don’t realize, is that the Pi wasn’t made to be pounded on this hard.
It’s a simple computer. When you start adding constant pulls for data, scrolling text, etc., it overheats and overloads the Pi.
-
@evroom I apologise for the delay in answering; I was at work during the weekend.
Here is the output from
pi@MagicPi:~ $ grep -i watchdog /home/pi/.pm2/logs/mm-out.log
pi@MagicPi:~ $ grep -i watchdog /home/pi/.pm2/logs/mm-error.log
A bunch of these, almost every 10-15 minutes:
Fri Nov 09 2018 09:52:50 GMT-0600 (CST) - WatchDog: Heartbeat timeout. Frontend might have crashed. Exit now.
Fri Nov 09 2018 10:06:24 GMT-0600 (CST) - WatchDog: Heartbeat timeout. Frontend might have crashed. Exit now.
WatchDog started. Maximum timeout: 20s.
Module helper loaded: MMM-WatchDog
Connecting socket for: MMM-WatchDog
Starting module helper: MMM-WatchDog
WatchDog started. Maximum timeout: 20s.
Module helper loaded: MMM-WatchDog
Connecting socket for: MMM-WatchDog
Starting module helper: MMM-WatchDog
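To get a feel for how often the watchdog fires, the timeout entries can be counted per day. This is a rough sketch, assuming every entry begins with a timestamp in the format shown above (weekday, month, day, year); watchdog_timeouts_per_day is a made-up name, and which of the two pm2 logs holds these lines is not known here, so the usage runs it on both.

```shell
# Count "Heartbeat timeout" entries per calendar day in a pm2 log file.
# Assumes lines look like:
# "Fri Nov 09 2018 09:52:50 GMT-0600 (CST) - WatchDog: Heartbeat timeout. ..."
# watchdog_timeouts_per_day is a hypothetical helper name.
watchdog_timeouts_per_day() {
    # $1 = path to a pm2 log file
    grep 'WatchDog: Heartbeat timeout' "$1" |
        awk '{ count[$2 " " $3 " " $4]++ }
             END { for (day in count) print day ": " count[day] }' |
        sort
}

# Usage (log paths taken from the grep commands above):
# watchdog_timeouts_per_day /home/pi/.pm2/logs/mm-out.log
# watchdog_timeouts_per_day /home/pi/.pm2/logs/mm-error.log
```

The sort is alphabetical on the month name, so across several months the days will not come out in calendar order.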
Here is some sysstat output from the following commands
vmstat 10 10
iostat 10 10
sar 10 10
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 63744 209036 34364 567792 0 6 82 12 778 312 18 3 78 1 0
0 0 63744 209036 34372 567824 0 0 0 2 2431 278 4 0 96 0 0
avg-cpu: %user %nice %system %iowait %steal %idle
17.75 0.00 3.32 0.57 0.00 78.36
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
mmcblk0 13.75 311.58 45.87 821628 120969
zram0 1.77 0.95 6.13 2504 16164
zram1 1.73 0.78 6.12 2068 16132
zram2 1.70 0.74 6.08 1952 16028
zram3 1.74 0.94 6.02 2480 15880
06:44:46 PM all 3.17 0.00 0.08 0.00 0.00 96.76
06:44:56 PM all 4.12 0.00 0.25 0.03 0.00 95.60
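One note on reading the vmstat numbers: the first data row reports averages since boot, so only the later rows reflect the sampling interval. A small sketch that averages the idle column (id, the 15th field in the layout above) while skipping that first row; vmstat_avg_idle is a hypothetical name and the field position assumes the 17-column layout shown here.

```shell
# Average the "id" (idle %) column of vmstat output read from stdin,
# ignoring the first data row, which holds averages since boot.
# vmstat_avg_idle is a hypothetical helper name; field 15 is assumed to be
# "id", as in the 17-column header shown above.
vmstat_avg_idle() {
    awk '$1 ~ /^[0-9]+$/ {
             rows++
             if (rows > 1) { sum += $15; samples++ }
         }
         END { if (samples) print sum / samples }'
}

# Usage:
# vmstat 10 10 | vmstat_avg_idle
```

Fed the two rows posted above, it skips the since-boot row and reports 96.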
-
@evroom here is the output of the rest of the commands you asked me to run :)
pi@SmartMirror:~ $ uname -a
Linux SmartMirror 4.14.52-v7+ #1123 SMP Wed Jun 27 17:35:49 BST 2018 armv7l GNU/Linux
pi@SmartMirror:~ $ swapon -s
Filename Type Size Used Priority
/dev/zram0 partition 237360 15596 5
/dev/zram1 partition 237360 15636 5
/dev/zram2 partition 237360 15612 5
/dev/zram3 partition 237360 15364 5
pi@SmartMirror:~ $ free -h
      total used free shared buff/cache available
Mem:  927M 135M 166M 16M 624M 718M
Swap: 927M 60M 866M
pi@SmartMirror:~ $ cat /etc/dphys-swapfile | egrep -v ‘#|^$’
bash: ^$’: command not found
pi@SmartMirror:~ $ sudo service dphys-swapfile status
● dphys-swapfile.service - LSB: Autogenerate and use a swap file
Loaded: loaded (/etc/init.d/dphys-swapfile; generated; vendor preset: enabled
Active: active (exited) since Sun 2018-11-18 17:49:07 CST; 1h 36min ago
Docs: man:systemd-sysv-generator(8)
Process: 358 ExecStart=/etc/init.d/dphys-swapfile start (code=exited, status=0
CGroup: /system.slice/dphys-swapfile.service
Nov 18 17:49:06 SmartMirror systemd[1]: Starting LSB: Autogenerate and use a swa
Nov 18 17:49:06 SmartMirror dphys-swapfile[358]: Starting dphys-swapfile swapfil
Nov 18 17:49:07 SmartMirror dphys-swapfile[358]: want /var/swap=100MByte, checki
Nov 18 17:49:07 SmartMirror dphys-swapfile[358]: done.
Nov 18 17:49:07 SmartMirror systemd[1]: Started LSB: Autogenerate and use a swap
lines 1-12/12 (END)
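The swapon -s listing can be totalled to compare it with the swap line that top and free report. A minimal sketch, assuming the column order shown above (Size in the third field, Used in the fourth, both in KiB); swap_totals is a made-up helper name.

```shell
# Sum the Size and Used columns of "swapon -s" output read from stdin.
# Prints "<total size> <total used>" in KiB.
# swap_totals is a hypothetical helper name; the header line is skipped.
swap_totals() {
    awk 'NR > 1 { size += $3; used += $4 }
         END { print size, used }'
}

# Usage:
# swapon -s | swap_totals
```

For the four zram devices listed above this gives 949440 KiB total and 62208 KiB used.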
As for the top.txt file, neither the first 50 nor the last 50 lines say much about MagicMirror, but I am going to start it again and tail top.txt once it crashes.
Here is the output anyway for the last and first 50 lines of the top.txt file:
pi@SmartMirror:~ $ tail -50 top.txt
848 pi 20 0 27796 8972 8052 S 0.0 0.9 0:01.61 vncserverui
809 pi 20 0 42752 8256 8116 S 0.0 0.9 0:00.09 lxpolkit
890 pi 20 0 74324 7768 7596 S 0.0 0.8 0:00.15 gvfs-udisk+
=============================================
Sun Nov 18 19:25:01 CST 2018
top - 19:25:01 up 1:25, 1 user, load average: 0.39, 0.39, 0.37
Tasks: 134 total, 1 running, 88 sleeping, 0 stopped, 1 zombie
%Cpu(s): 12.5 us, 1.5 sy, 0.0 ni, 85.3 id, 0.3 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 949452 total, 169780 free, 140140 used, 639532 buff/cache
KiB Swap: 949440 total, 887232 free, 62208 used. 734220 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
553 root 20 0 214532 54136 35564 S 0.0 5.7 3:10.64 Xorg
719 pi 20 0 120620 28044 15252 S 0.0 3.0 0:06.59 PM2 v3.0.3+
814 pi 20 0 144716 27976 24560 S 0.0 2.9 0:12.95 pcmanfm
455 alexapi 20 0 191228 27236 13820 S 5.6 2.9 21:37.81 python
481 root 20 0 45136 22548 13788 S 0.0 2.4 3:25.82 vncserver-+
1840 pi 20 0 77044 21280 18496 S 0.0 2.2 0:08.20 leafpad
812 pi 20 0 138632 20484 18116 S 0.0 2.2 0:21.17 lxpanel
1656 pi 20 0 49308 20328 16408 S 0.0 2.1 0:40.59 lxterminal
868 pi 20 0 27820 14160 12840 S 0.0 1.5 0:00.35 vncserverui
806 pi 20 0 53684 13080 10916 S 0.0 1.4 0:02.33 openbox
557 root 20 0 15416 10640 10312 S 5.6 1.1 0:29.86 vncagent
673 pi 20 0 52180 10472 10140 S 0.0 1.1 0:00.85 lxsession
848 pi 20 0 27796 8972 8052 S 0.0 0.9 0:01.61 vncserverui
809 pi 20 0 42752 8256 8116 S 0.0 0.9 0:00.09 lxpolkit
890 pi 20 0 74324 7768 7596 S 0.0 0.8 0:00.15 gvfs-udisk+
top - 19:25:01 up 1:25, 1 user, load average: 0.39, 0.39, 0.37
Tasks: 134 total, 1 running, 88 sleeping, 0 stopped, 1 zombie
%Cpu(s): 12.5 us, 1.5 sy, 0.0 ni, 85.3 id, 0.3 wa, 0.0 hi, 0.3 si, 0.0 st
KiB Mem : 949452 total, 170032 free, 139888 used, 639532 buff/cache
KiB Swap: 949440 total, 887232 free, 62208 used. 734472 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
553 root 20 0 214532 54136 35564 S 0.0 5.7 3:10.64 Xorg
719 pi 20 0 120620 28044 15252 S 0.0 3.0 0:06.59 PM2 v3.0.3+
814 pi 20 0 144716 27976 24560 S 0.0 2.9 0:12.96 pcmanfm
455 alexapi 20 0 191228 27236 13820 S 0.0 2.9 21:37.81 python
481 root 20 0 45136 22548 13788 S 0.0 2.4 3:25.82 vncserver-+
1840 pi 20 0 77044 21280 18496 S 0.0 2.2 0:08.20 leafpad
812 pi 20 0 138632 20484 18116 S 0.0 2.2 0:21.17 lxpanel
1656 pi 20 0 49308 20328 16408 S 0.0 2.1 0:40.59 lxterminal
868 pi 20 0 27820 14160 12840 S 0.0 1.5 0:00.35 vncserverui
806 pi 20 0 53684 13080 10916 S 0.0 1.4 0:02.33 openbox
557 root 20 0 15416 10640 10312 S 0.0 1.1 0:29.86 vncagent
673 pi 20 0 52180 10472 10140 S 0.0 1.1 0:00.85 lxsession
848 pi 20 0 27796 8972 8052 S 0.0 0.9 0:01.61 vncserverui
809 pi 20 0 42752 8256 8116 S 0.0 0.9 0:00.09 lxpolkit
890 pi 20 0 74324 7768 7596 S 0.0 0.8 0:00.15 gvfs-udisk+
=============================================
pi@SmartMirror:~ $ head -50 top.txt
Fri Nov 16 22:55:01 CST 2018
top - 22:55:02 up 3 days, 2:58, 1 user, load average: 0.18, 0.31, 0.21
Tasks: 132 total, 1 running, 88 sleeping, 0 stopped, 1 zombie
%Cpu(s): 18.3 us, 4.2 sy, 0.0 ni, 74.3 id, 2.0 wa, 0.0 hi, 1.2 si, 0.0 st
KiB Mem : 949452 total, 276412 free, 169288 used, 503752 buff/cache
KiB Swap: 949440 total, 825760 free, 123680 used. 687104 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
582 root 20 0 219092 34684 17284 S 0.0 3.7 302:37.78 Xorg
457 alexapi 20 0 210684 27312 13416 S 0.0 2.9 799:59.04 python
834 pi 20 0 152648 26352 11692 S 0.0 2.8 19:11.56 lxpanel
502 root 20 0 45188 23764 14620 S 0.0 2.5 26:30.86 vncserver-+
835 pi 20 0 154056 17252 14084 S 0.0 1.8 0:33.17 pcmanfm
721 pi 20 0 127960 16648 7700 S 0.0 1.8 6:21.25 PM2 v3.0.3+
2011 pi 20 0 48208 12352 10328 S 0.0 1.3 0:10.66 lxterminal
100 root 20 0 35220 9068 8840 S 0.0 1.0 2:10.19 systemd-jo+
636 root 20 0 15416 9048 8840 S 0.0 1.0 3:51.37 vncagent
829 pi 20 0 53536 6128 4664 S 0.0 0.6 0:09.74 openbox
953 pi 20 0 27236 4904 4428 S 0.0 0.5 0:02.30 vncserverui
873 pi 20 0 27816 4280 3616 S 0.0 0.5 0:32.84 vncserverui
1 root 20 0 27024 3680 2972 S 0.0 0.4 0:12.38 systemd
598 root 20 0 7272 3464 3192 S 0.0 0.4 0:01.09 bluetoothd
979 raspoti+ 20 0 18924 3432 3036 S 0.0 0.4 1:10.74 librespot
top - 22:55:02 up 3 days, 2:58, 1 user, load average: 0.18, 0.31, 0.21
Tasks: 132 total, 1 running, 88 sleeping, 0 stopped, 1 zombie
%Cpu(s): 18.3 us, 4.2 sy, 0.0 ni, 74.3 id, 2.0 wa, 0.0 hi, 1.2 si, 0.0 st
KiB Mem : 949452 total, 276684 free, 168980 used, 503788 buff/cache
KiB Swap: 949440 total, 825760 free, 123680 used. 687416 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
582 root 20 0 219092 34684 17284 S 0.0 3.7 302:37.78 Xorg
457 alexapi 20 0 210684 27312 13416 S 5.9 2.9 799:59.05 python
834 pi 20 0 152648 26352 11692 S 0.0 2.8 19:11.56 lxpanel
502 root 20 0 45188 23764 14620 S 0.0 2.5 26:30.86 vncserver-+
835 pi 20 0 154056 17252 14084 S 0.0 1.8 0:33.18 pcmanfm
721 pi 20 0 127960 16648 7700 S 0.0 1.8 6:21.25 PM2 v3.0.3+
2011 pi 20 0 48208 12352 10328 S 0.0 1.3 0:10.66 lxterminal
100 root 20 0 35220 9068 8840 S 0.0 1.0 2:10.19 systemd-jo+
636 root 20 0 15416 9048 8840 S 0.0 1.0 3:51.37 vncagent
829 pi 20 0 53536 6128 4664 S 0.0 0.6 0:09.74 openbox
953 pi 20 0 27236 4904 4428 S 0.0 0.5 0:02.30 vncserverui
873 pi 20 0 27816 4280 3616 S 0.0 0.5 0:32.84 vncserverui
1 root 20 0 27024 3680 2972 S 0.0 0.4 0:12.38 systemd
598 root 20 0 7272 3464 3192 S 0.0 0.4 0:01.09 bluetoothd
979 raspoti+ 20 0 18924 3432 3036 S 0.0 0.4 1:10.74 librespot
=============================================
Fri Nov 16 23:00:01 CST 2018
top - 23:00:01 up 3 days, 3:03, 1 user, load average: 3.02, 1.49, 0.69
Tasks: 138 total, 4 running, 91 sleeping, 0 stopped, 1 zombie
%Cpu(s): 18.4 us, 4.2 sy, 0.0 ni, 74.3 id, 2.0 wa, 0.0 hi, 1.2 si, 0.0 st
Thank you
D
-
It’s holding up better. I haven’t seen it restart in the past few hours, so I think MMM-WatchDog was the problem.