Help with puppeteer, cheerio and JSON
-
Can someone please help me solve some beginner problems with the development of my module?
The module collects data from http://www.ri-info.net/Radovi.aspx about possible works and outages of water and electricity services, by street and place, for the area of the city of Rijeka, Croatia.
The module works, but I need to implement a few more things to make it more complete, and the DOM must be rewritten.
The data on the website is generated dynamically, so I use puppeteer and cheerio to retrieve the data and pack it into JSON.
First problem and doubts:
The plan was to use the website's search to match the street and place set in the module configuration, as in config.sample:
{
  module: "MMM-Radovi-RI",
  position: "top_right",
  config: {
    streetName: "Labinska",
    placeName: "Rijeka"
  }
},
I managed to pass config.streetName and config.placeName to node_helper, and puppeteer finds the street, but for some streets the selector contains more than one entry.
I don't know how to combine the street and place into one label automatically. It should look like this: "LABINSKA, Rijeka".
The selector works with a manually typed label:
try {
  // The text "LABINSKA, Rijeka" should be combined from the payload
  // (config.streetName and config.placeName).
  const option = (await page.$x(
    '//*[@id="cphContent_cphSadrzaj_ddlOdaberiUlicu"]/option[text() = "LABINSKA, Rijeka"]'
  ))[0];
  // Read the option's value and select it in the dropdown.
  const value = await (await option.getProperty('value')).jsonValue();
  await page.select('#cphContent_cphSadrzaj_ddlOdaberiUlicu', value);
} catch (error) {
  console.log("The element didn't appear.");
}
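One way to build the option label from the configuration is to join the two values before constructing the XPath. This is only a sketch: the helper names `buildOptionLabel` and `buildOptionXPath` are made up for illustration, and it assumes the site always lists the street in upper case, followed by a comma, a space and the place, as in "LABINSKA, Rijeka".

```javascript
// Hypothetical helpers, not part of the module.
// Assumption: the dropdown text is "<STREET IN UPPER CASE>, <Place>".
function buildOptionLabel(streetName, placeName) {
  return `${streetName.trim().toUpperCase()}, ${placeName.trim()}`;
}

// Builds the XPath string used with page.$x() in the snippet above.
// Assumption: the label contains no double quotes.
function buildOptionXPath(label) {
  return `//*[@id="cphContent_cphSadrzaj_ddlOdaberiUlicu"]/option[text() = "${label}"]`;
}

const label = buildOptionLabel("Labinska", "Rijeka");
console.log(label); // LABINSKA, Rijeka
console.log(buildOptionXPath(label));
```

In node_helper, the two arguments would come from the payload (config.streetName and config.placeName) instead of the literals used here.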
The big problem is that the website's search does not work properly for some streets (no data is displayed), but if I select everything, the street and its work data are there.
Maybe I should ignore the first problem, collect all the data in JSON, and then filter and display it using the config selection.
Second problem:
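If the site's own search is unreliable, the matching can be done locally after selecting everything. The sketch below assumes each collected entry has a `street` field such as "LABINSKA, Rijeka"; the helper name `filterByConfig` is hypothetical, and the comparison ignores upper and lower case.

```javascript
// Hypothetical helper: keeps only the entries for the configured street and place.
// Assumption: entry.street looks like "LABINSKA, Rijeka".
function filterByConfig(entries, config) {
  const wanted = `${config.streetName.trim()}, ${config.placeName.trim()}`.toLowerCase();
  return entries.filter((entry) => entry.street.trim().toLowerCase() === wanted);
}

const allEntries = [
  { street: "BUJSKA, Rijeka", streetNmbr: "30,32,34,36", startTime: "09:00", endTime: "11:00", endWork: "07.12.20" },
  { street: "LABINSKA, Rijeka", streetNmbr: "18,20", startTime: "09:00", endTime: "11:00", endWork: "07.12.20" }
];

// Logs only the LABINSKA entry.
console.log(filterByConfig(allEntries, { streetName: "Labinska", placeName: "Rijeka" }));
```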
Because the table is generated automatically, I have a problem collecting all the data.
I'm using tr class="tableRow" as a reference to get the place, street, start time, end time and end date of the work (interruption).
I don't know how to use tr class="datum" as the starting point of the first day and attach all the remaining data to it.
The code is still in progress:
const $ = cheerio.load(html);
var waterList = [];
// First matching tbody, or undefined when the water table is missing.
var el = $('#cphContent_cphSadrzaj_upVoda table tbody ')[0];
if (el) {
  // Each data row is the previous sibling of a tr.tableRow.
  $('#cphContent_cphSadrzaj_upVoda table tbody .tableRow').prev().each(function (index, elem) {
    const $elem = $(elem);
    var street = $elem.find('span[id*="Label2"]').text();
    var streetNmbr = $elem.find('span[id*="Label7"]').text();
    var startTime = $elem.find('span[id*="Label8"]').text();
    var endTime = $elem.find('span[id*="Label9"]').text();
    var endWork = $elem.find('span[id*="Label10"]').text();
    var water = { street, streetNmbr, startTime, endTime, endWork };
    waterList.push(water);
  });
} else {
  console.log('No new entries!');
}
var waterWork = JSON.parse(JSON.stringify(waterList));
self.sendSocketNotification('ELECTRIC_POWER_DISCONNECTED', waterWork);
Table sample from the HTML:
<tr class="tableRow"> </tr><table id="radovi" class="tabla"> <tbody><tr> <th></th> <th class="ulica">Ulica </th> <th class="kucnibroj">Kućni broj </th> <th>od </th> <th>do </th> <th>Povremeno </th> <th>Kraj </th> </tr> <tr class="datum"> <th> <div style="font-size: 11px;"> 08.12.2020 //this date to use as starting date and make array for object </div> </th> </tr> <tr> <td> <a id="cphContent_cphSadrzaj_lvVoda_ListView1_0_LinkButton1_0" title="Dodatne informacije" href="javascript:__doPostBack('ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl0$LinkButton1','')"><img id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Image1_0" src="resources/images/plus.png" style="height:16px;"></a> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label2_0" style="color:Black;">DOVIČIĆI, Viškovo</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label7_0" style="color:Black;">19-20 ; 39-42 </span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label8_0" style="color:Black;">08:30</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label9_0" style="color:Black;">14:30</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CheckBox1_0" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl0$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label10_0" style="color:Black;">08.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_0_ManagingPanel_0" style="visibility: visible; height: auto;"> Navedenim područjem kružiti će autocisterna </div></div> <input type="hidden" 
name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl0$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CollapsiblePanelExtender1_ClientState_0" value="true"> </td> </tr> <tr> <td> <a id="cphContent_cphSadrzaj_lvVoda_ListView1_0_LinkButton1_1" title="Dodatne informacije" href="javascript:__doPostBack('ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl1$LinkButton1','')"><img id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Image1_1" src="resources/images/plus.png" style="height:16px;"></a> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label2_1" style="color:Black;">FERENCI, Viškovo</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label7_1" style="color:Black;">9-55</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label8_1" style="color:Black;">08:30</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label9_1" style="color:Black;">14:30</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CheckBox1_1" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl1$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label10_1" style="color:Black;">08.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_0_ManagingPanel_1" style="visibility: visible; height: auto;"> Navedenim područjem kružiti će autocisterna </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl1$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CollapsiblePanelExtender1_ClientState_1" value="true"> </td> </tr> <tr> <td> 
</td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label2_2" style="color:Black;">ISTARSKA, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label7_2" style="color:Black;">64</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label8_2" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label9_2" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CheckBox1_2" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl2$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label10_2" style="color:Black;">08.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_0_ManagingPanel_2" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl2$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CollapsiblePanelExtender1_ClientState_2" value="true"> </td> </tr> <tr> <td> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label2_3" style="color:Black;">KASTAVSKA, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label7_3" style="color:Black;"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label8_3" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label9_3" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CheckBox1_3" type="checkbox" 
name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl3$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label10_3" style="color:Black;">08.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_0_ManagingPanel_3" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl3$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CollapsiblePanelExtender1_ClientState_3" value="true"> </td> </tr> <tr> <td> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label2_4" style="color:Black;">MATE BALOTE, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label7_4" style="color:Black;">3-14</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label8_4" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label9_4" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CheckBox1_4" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl4$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_0_Label10_4" style="color:Black;">08.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_0_ManagingPanel_4" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" 
name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl0$ListView1$ctrl4$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_0_CollapsiblePanelExtender1_ClientState_4" value="true"> </td> </tr> <tr class="datum"> <th> <div style="font-size: 11px;"> 09.12.2020 // second date and all rest data after </div> </th> </tr> <tr> <td> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label2_0" style="color:Black;">ISTARSKA, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label7_0" style="color:Black;">64</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label8_0" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label9_0" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CheckBox1_0" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl0$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label10_0" style="color:Black;">09.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_1_ManagingPanel_0" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl0$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CollapsiblePanelExtender1_ClientState_0" value="true"> </td> </tr> <tr> <td> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label2_1" style="color:Black;">KASTAVSKA, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label7_1" 
style="color:Black;"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label8_1" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label9_1" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CheckBox1_1" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl1$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label10_1" style="color:Black;">09.12.20</span> </td> </tr><tr class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_1_ManagingPanel_1" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl1$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CollapsiblePanelExtender1_ClientState_1" value="true"> </td> </tr> <tr> <td> </td> <td class="ulica"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label2_2" style="color:Black;">MATE BALOTE, Rijeka</span> </td> <td class="kucnibroj"> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label7_2" style="color:Black;">3-14</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label8_2" style="color:Black;">10:00</span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label9_2" style="color:Black;">14:00</span> </td> <td> <span class="aspNetDisabled"><input id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CheckBox1_2" type="checkbox" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl2$CheckBox1" disabled="disabled"></span> </td> <td> <span id="cphContent_cphSadrzaj_lvVoda_ListView1_1_Label10_2" style="color:Black;">09.12.20</span> </td> </tr><tr 
class="tableRow"> </tr> <tr> <td class="tableNapomena" colspan="6"> <div id="" style="border: 0px; margin: 0px; padding: 0px; overflow-y: hidden; visibility: visible; height: 0px; display: none;" class=""><div id="cphContent_cphSadrzaj_lvVoda_ListView1_1_ManagingPanel_2" style="visibility: visible; height: auto;"> </div></div> <input type="hidden" name="ctl00$ctl00$cphContent$cphSadrzaj$lvVoda$ctrl1$ListView1$ctrl2$CollapsiblePanelExtender1_ClientState" id="cphContent_cphSadrzaj_lvVoda_ListView1_1_CollapsiblePanelExtender1_ClientState_2" value="true"> </td> </tr> </tbody></table>
How do I get the date and all the objects under that date until the next date, and so on?
First date: <tr> <th> <div> 08.12.2020 </div> </th> </tr>
I'm getting the data as:
[
  { street: 'BUJSKA, Rijeka', streetNmbr: '30,32,34,36', startTime: '09:00', endTime: '11:00', endWork: '07.12.20' },
  { street: 'LABINSKA, Rijeka', streetNmbr: '18,20', startTime: '09:00', endTime: '11:00', endWork: '07.12.20' },
  ......
]
but I would like to add the starting date, like this:
[
  {
    startDate: '07.12.20',
    data: [
      { street: 'BUJSKA, Rijeka', streetNmbr: '30,32,34,36', startTime: '09:00', endTime: '11:00', endWork: '07.12.20' },
      { street: 'LABINSKA, Rijeka', streetNmbr: '18,20', startTime: '09:00', endTime: '11:00', endWork: '08.12.20' }
    ]
  },
  {
    startDate: '08.12.20',
    data: [
      { street: 'LABINSKA, Rijeka', streetNmbr: '18,20', startTime: '09:00', endTime: '11:00', endWork: '08.12.20' }
    ]
  }
]
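The grouping described here amounts to walking the rows in document order and starting a new group whenever a date row appears. The sketch below shows only that step on plain objects. The `rows` input format and the function name `groupRowsByDate` are assumptions made for illustration, and extracting the rows from the HTML with cheerio is left out.

```javascript
// Hypothetical input: rows in document order. A row with a `date` field stands
// for a <tr class="datum">; any other row is one scraped entry.
function groupRowsByDate(rows) {
  const groups = [];
  let current = null;
  for (const row of rows) {
    if (row.date !== undefined) {
      // A date row opens a new group.
      current = { startDate: row.date, data: [] };
      groups.push(current);
    } else if (current) {
      // An entry row belongs to the most recent date.
      current.data.push(row);
    }
    // Entries that appear before the first date row are skipped.
  }
  return groups;
}

const sampleRows = [
  { date: "08.12.2020" },
  { street: "DOVIČIĆI, Viškovo", streetNmbr: "19-20 ; 39-42", startTime: "08:30", endTime: "14:30", endWork: "08.12.20" },
  { street: "FERENCI, Viškovo", streetNmbr: "9-55", startTime: "08:30", endTime: "14:30", endWork: "08.12.20" },
  { date: "09.12.2020" },
  { street: "ISTARSKA, Rijeka", streetNmbr: "64", startTime: "10:00", endTime: "14:00", endWork: "09.12.20" }
];

console.log(JSON.stringify(groupRowsByDate(sampleRows), null, 2));
```

With the sample rows, the result has two groups: the first with two entries under 08.12.2020 and the second with one entry under 09.12.2020.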
I was unable to upload images to explain it better.
Sorry, English is not my native language and I have just started learning JS.
-
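@lolo Ignore the class= for the moment. That will just make it look prettier with colors etc.
A table is a list of rows, which contain a list of columns.
A table is table
A row is tr
A column is td
Your data is an array [] of objects {}.
You can use a for loop to process your data.
This is a rough sketch (array_name, element_name and table are placeholders), I am on my phone:
for (const item of array_name) {
  const row = document.createElement("tr");  // create a row
  const td = document.createElement("td");   // create a cell
  td.textContent = item.element_name;        // fill it from the item
  row.appendChild(td);                       // etc. for the other fields
  table.appendChild(row);                    // table is the table element created earlier
}
-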
Thanks for the reply.
I believe I didn't explain it well.
This is the table from the website (screenshot).
The data I managed to collect comes from the previous sibling of each tr with class tableRow.
What I wish to get is the tr with class datum as the first date, and then the street, street number and times (all in spans) as data objects in an array, until the next tr datum, and then the same again.
Each span has a different id, but the ids share a common part (the label number).
I can't simply loop over every element (td or tr) with for or each, because plenty of them are useless.
And here I'm stuck. I tried different approaches, but then all the data was duplicated or I got empty objects in the arrays.
Example from the website.
-
@lolo But you already have the date in the object. Sort them maybe, but cheerio should return the list in order.
-
Yes, I'm getting a date, but this one is endWork (kraj): when the interruption will end, not when it starts. I know, maybe I should use that.
-
I think I managed to sort it out.
const $ = cheerio.load(html);
var waterList = [];
var el = $('#cphContent_cphSadrzaj_upVoda table tbody ');
// A cheerio selection is always truthy, so check its length instead.
if (el.length) {
  // Find the spans whose text equals the configured place, then go up to their row (span -> td -> tr).
  var elementWater = $('#cphContent_cphSadrzaj_upVoda table tbody span').filter(function () {
    return $(this).text() === self.place;
  }).parent().parent();
  $(elementWater).each(function (jex, edate) {
    const $edate = $(edate);
    // The start date is read from the datum row found via the previous sibling.
    var startDate = $edate.prev().find('tr[class="datum"]>th>div').text().trim();
    var street = $edate.find('span[id*="Label2"]').text();
    var streetNmbr = $edate.find('span[id*="Label7"]').text();
    var startTime = $edate.find('span[id*="Label8"]').text();
    var endTime = $edate.find('span[id*="Label9"]').text();
    var endWork = $edate.find('span[id*="Label10"]').text();
    var water = { startDate, street, streetNmbr, startTime, endTime, endWork };
    waterList.push(water);
    console.log(water);
  });
} else {
  console.log('No new entries!');
}
var waterWork = JSON.parse(JSON.stringify(waterList));
self.sendSocketNotification('ELECTRIC_POWER_DISCONNECTED', waterWork);
Now I can get all the data filtered by config.place:
{
  startDate: '09.12.2020',
  street: 'ISTARSKA, Rijeka',
  streetNmbr: '64',
  startTime: '10:00',
  endTime: '14:00',
  endWork: '09.12.20'
}
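Flat entries like this one can be turned into the nested shape asked for earlier in the thread by grouping on startDate. This is a minimal sketch, assuming every entry carries a startDate; the function name `groupByStartDate` is hypothetical.

```javascript
// Hypothetical helper: groups flat entries by startDate, keeping the order in
// which the dates first appear. The startDate field moves to the group level.
function groupByStartDate(entries) {
  const byDate = new Map();
  for (const { startDate, ...rest } of entries) {
    if (!byDate.has(startDate)) {
      byDate.set(startDate, []);
    }
    byDate.get(startDate).push(rest);
  }
  return Array.from(byDate, ([startDate, data]) => ({ startDate, data }));
}

const flatEntries = [
  { startDate: "08.12.2020", street: "ISTARSKA, Rijeka", streetNmbr: "64", startTime: "10:00", endTime: "14:00", endWork: "08.12.20" },
  { startDate: "09.12.2020", street: "ISTARSKA, Rijeka", streetNmbr: "64", startTime: "10:00", endTime: "14:00", endWork: "09.12.20" },
  { startDate: "09.12.2020", street: "KASTAVSKA, Rijeka", streetNmbr: "", startTime: "10:00", endTime: "14:00", endWork: "09.12.20" }
];

console.log(JSON.stringify(groupByStartDate(flatEntries), null, 2));
```

A Map is used because it remembers insertion order, so the groups come out in the same order as the dates on the page.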
-
@lolo cool
-
This is the output of my module. Any suggestions about styling?
I know there are more things that would make it better, but I have to learn them first.
-
@lolo I cannot help with the look. Totally useless here.