Core dump causing black screen
-
Does anyone have an idea how to debug a core dump caused by a module or MM itself?
What I am seeing is that occasionally my mirror goes black. I originally thought it was the screensaver kicking in, but I now believe it is due to a core dump occurring.
From the pm2 logs I see this:
ATTENTION: default value of option force_s3tc_enable overridden by environment.
[16831:0413/111826.964220:FATAL:memory.cc(22)] Out of memory. size=79556608
ATTENTION: default value of option force_s3tc_enable overridden by environment.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
[11331:0415/142933.736368:FATAL:memory.cc(22)] Out of memory. size=120422400
ATTENTION: default value of option force_s3tc_enable overridden by environment.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
getrandom indicates that the entropy pool has not been initialized. Rather than continue with poor entropy, this process will block until entropy is available.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
[1700:0415/233524.137869:FATAL:memory.cc(22)] Out of memory. size=96002048
ATTENTION: default value of option force_s3tc_enable overridden by environment.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
ATTENTION: default value of option force_s3tc_enable overridden by environment.
[6522:0416/101829.273887:FATAL:memory.cc(22)] Out of memory. size=155766784
ATTENTION: default value of option force_s3tc_enable overridden by environment.
I am not concerned about the force_s3tc_enable messages, but notice that there are several out of memory errors. Based on the timestamps of the generated core files, I can correlate each of these out of memory errors to within a minute or two of a core file.
So, for the log above I have core dumps from:
4-13-2019 @ 11:18am
4-15-2019 @ 2:28pm
4-15-2019 @ 11:35pm
4-16-2019 @ 10:18am
There does not look to be a pattern here, so I am not sure where to go other than disabling modules one by one and waiting.
Given that I have the core file, is there a way to read it and determine at least which module caused the error? That would help me narrow it down.
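For anyone wanting to do the same correlation, here is a minimal Python sketch that pulls the timestamp and requested size out of each "Out of memory" line so they can be compared against core file times. It assumes the bracketed prefix is pid:MMDD/HHMMSS.microseconds, which is how the timestamps are read in this post. The year is not in the log, so 2019 is passed in as an assumption, and parse_oom_events and events_near are made-up helper names.

```python
import re
from datetime import datetime

# Assumed line shape, taken from the pm2 log in this thread:
# [16831:0413/111826.964220:FATAL:memory.cc(22)] Out of memory. size=79556608
OOM_RE = re.compile(
    r"\[(?P<pid>\d+):(?P<month>\d{2})(?P<day>\d{2})/"
    r"(?P<hour>\d{2})(?P<minute>\d{2})(?P<second>\d{2})\.\d+"
    r":FATAL:memory\.cc\(\d+\)\] Out of memory\. size=(?P<size>\d+)"
)


def parse_oom_events(log_text, year=2019):
    """Return a list of (timestamp, pid, size_in_bytes), one per OOM line."""
    events = []
    for m in OOM_RE.finditer(log_text):
        ts = datetime(
            year, int(m["month"]), int(m["day"]),
            int(m["hour"]), int(m["minute"]), int(m["second"]),
        )
        events.append((ts, int(m["pid"]), int(m["size"])))
    return events


def events_near(events, core_time, window_seconds=120):
    """Return the events whose timestamp is within the window of core_time."""
    return [
        e for e in events
        if abs((e[0] - core_time).total_seconds()) <= window_seconds
    ]


sample = (
    "ATTENTION: default value of option force_s3tc_enable overridden by environment.\n"
    "[16831:0413/111826.964220:FATAL:memory.cc(22)] Out of memory. size=79556608\n"
    "[11331:0415/142933.736368:FATAL:memory.cc(22)] Out of memory. size=120422400\n"
)

for ts, pid, size in parse_oom_events(sample):
    print(ts.isoformat(), "pid", pid, f"{size / 2**20:.1f} MiB requested")
```

This only narrows down when the failures happened and how large the failed allocation was. It does not say which module was allocating.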
-
@mlcampbe said in Core dump causing black screen:
[6522:0416/101829.273887:FATAL:memory.cc(22)] Out of memory. size=155766784
it is out of memory…
-
Yeah, I realize it is out of memory, but which module is causing that? I have a core dump, and I should be able to use gdb to see the stack trace to try to figure it out.
-
@mlcampbe no idea… I just caused a core dump on mine by comparing a whole structure to a string variable… (not on purpose) it didn’t tell me which module caused it
-
@mlcampbe - It definitely sounds like there is a memory leak somewhere. Personally, I blame Electron. Basically out of spite, not for any technical reason.
I would recommend that you add a cron job that will reboot your mirror at some absurd hour in the morning. It should reset the memory allocation and prevent OOM errors. It’s not a fix, but it should get you close enough.
-
Yeah that is a good idea and I might go that way as a last resort if I can’t identify which module is the culprit.
This morning I discovered the MMM-Logging module which prints date/time info into the logs and I am hoping that might help me match the coredump time to which module was doing something.
I also discovered a potential issue with the MMM-DailyBibleVerse module. I see that it has a getScripts function that loads the jquery-3.1.1.min.js file from its module directory. I am not sure whether that is getting loaded over and over and eventually exhausting memory, but for now I have removed that module for testing.
-
@mlcampbe getScripts is only called once per module
-
@mlcampbe see this https://www.javascriptjanuary.com/blog/nodejs-postmortem-debugging-for-fun-and-production
You need ulimit -c set to some non-zero value (it is the maximum core file size, not a number of core dumps).
It is 0 by default.
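As a rough way to see what the ulimit -c setting currently is, the same limit can be read from Python through the standard resource module (Unix only). This is just a sketch: core_dump_limit and describe are made-up helper names, and the values reported are for the Python process itself, which inherits them from the shell it was started in. A MagicMirror process launched by pm2 may be running with different limits.

```python
import resource


def core_dump_limit():
    """Return the (soft, hard) core file size limits for this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def describe(limit):
    """Turn a raw limit value into a readable string."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    if limit == 0:
        return "0 (core dumps disabled)"
    return f"{limit} bytes"


soft, hard = core_dump_limit()
print("soft core limit:", describe(soft))
print("hard core limit:", describe(hard))
```

The soft limit is the one that decides whether a core file gets written. It can be raised up to the hard limit without extra privileges.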
-
@sdetweil I am already getting core dumps generated, so I have the file. I had seen the nodejs postmortem debugging article and I can get a stack trace from the core file, but so far I have not been able to identify from it which module was active at the time.
More research suggests this may be related to MMM-Wallpaper, which I am using. I found https://github.com/kolbyjack/MMM-Wallpaper/issues/3, which seems to match my symptoms exactly. I’m removing that module and will test for a few days to see what happens.