
Help with Puppeteer, cheerio and JSON



  • Can someone please help me solve some beginner problems with the development of my module?
    The module collects data from http://www.ri-info.net/Radovi.aspx about planned works and shutdowns of water and electricity installations, by street and place, for the area of the city of Rijeka, Croatia.
    The module works, but I need to implement a few more things to make it complete, and the DOM must be rewritten.
    The data on the website is generated dynamically, so I use Puppeteer and cheerio to retrieve the data and pack it into JSON.

    First problem and doubts:
    The plan was to use the website's search to match the street and place set in the config.

    Sample configuration:

    {
        module: "MMM-Radovi-RI",
        position: "top_right",
        config: {
            streetName: "Labinska",
            placeName: "Rijeka"
        }
    },
    

    I managed to pass config.streetName and config.placeName to node_helper, and Puppeteer finds the street, but for some streets the selector contains multiple streets.
    I don't know how to build the street and place combination automatically. It should look like this:

    "LABINSKA, Rijeka"
    

    The selector works with manually typed text:

        try {
            // Find the <option> whose text matches "STREET, Place".
            // The text "LABINSKA, Rijeka" should be combined from the payload
            // (config.streetName and config.placeName) instead of typed by hand.
            const option = (await page.$x(
                '//*[@id="cphContent_cphSadrzaj_ddlOdaberiUlicu"]/option[text() = "LABINSKA, Rijeka"]'
            ))[0];
            const value = await (await option.getProperty('value')).jsonValue();
            await page.select('#cphContent_cphSadrzaj_ddlOdaberiUlicu', value);
        } catch (error) {
            console.log("The element didn't appear.");
        }
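
    One possible way to combine the two config values into the option text, as a minimal sketch. The helper names buildOptionLabel and buildOptionXPath are hypothetical, and the sketch assumes the site always writes the street in upper case followed by ", " and the place, as in "LABINSKA, Rijeka":

    ```javascript
    // Hypothetical helper: builds the dropdown label "STREET, Place" from the config.
    // Assumes the street is shown in upper case and the place as typed.
    function buildOptionLabel(streetName, placeName) {
      return `${streetName.trim().toUpperCase()}, ${placeName.trim()}`;
    }

    // Hypothetical helper: builds the XPath for the matching <option>.
    // Assumes the label contains no double quotes.
    function buildOptionXPath(streetName, placeName) {
      const label = buildOptionLabel(streetName, placeName);
      return `//*[@id="cphContent_cphSadrzaj_ddlOdaberiUlicu"]/option[text() = "${label}"]`;
    }

    console.log(buildOptionXPath("Labinska", "Rijeka"));
    ```

    The returned string could then be passed to page.$x in place of the hand-typed XPath.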
    

    The big problem is that the website's search does not work properly for some streets (no data is displayed), but if I select everything, the street and job data are there.
    Maybe I should ignore the first problem, collect all the data in JSON, and then sort and display it according to the config selection.
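
    A rough sketch of that second approach: collect everything, then filter by the config. The function name filterByConfig is hypothetical, and the sketch assumes each entry has a street field in the form "STREET, Place":

    ```javascript
    // Hypothetical helper: keeps only entries whose "street" field
    // ("STREET, Place") matches the configured street and place.
    // Comparison ignores case; an empty config value matches everything.
    function filterByConfig(entries, config) {
      const street = (config.streetName || "").trim().toUpperCase();
      const place = (config.placeName || "").trim().toUpperCase();
      return entries.filter((entry) => {
        const [entryStreet = "", entryPlace = ""] = entry.street
          .split(",")
          .map((part) => part.trim().toUpperCase());
        return (!street || entryStreet === street) && (!place || entryPlace === place);
      });
    }

    const sample = [
      { street: "BUJSKA, Rijeka", streetNmbr: "30,32,34,36" },
      { street: "LABINSKA, Rijeka", streetNmbr: "18,20" },
      { street: "FERENCI, Viškovo", streetNmbr: "9-55" },
    ];
    console.log(filterByConfig(sample, { streetName: "Labinska", placeName: "Rijeka" }));
    ```

    This avoids depending on the website's own search, at the cost of loading the full table each time.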

    Second problem:
    Because the table is generated automatically, I have a problem collecting all the data.
    I'm using tr class="tableRow" as a reference to get the place, street, start time, end time and end date of the work (interruption).
    I don't know how to use tr class="datum" as the starting point for each day and add all the remaining data under it.

    The code is still in progress:

    const $ = cheerio.load(html);
    var waterList = [];
    var el = $('#cphContent_cphSadrzaj_upVoda table tbody')[0];
    if (el) {
        // The data is read from the previous sibling of each .tableRow, hence .prev().
        $('#cphContent_cphSadrzaj_upVoda table tbody .tableRow').prev().each(function (index, elem) {
            const $elem = $(elem);
            // The span ids differ per row but share these "LabelN" parts.
            var street = $elem.find('span[id*="Label2"]').text();
            var streetNmbr = $elem.find('span[id*="Label7"]').text();
            var startTime = $elem.find('span[id*="Label8"]').text();
            var endTime = $elem.find('span[id*="Label9"]').text();
            var endWork = $elem.find('span[id*="Label10"]').text();

            var water = {
                street,
                streetNmbr,
                startTime,
                endTime,
                endWork
            };
            waterList.push(water);
        });
    } else {
        console.log('No new entries!');
    }
    // waterList holds only plain strings, so it can be sent as is.
    self.sendSocketNotification('ELECTRIC_POWER_DISCONNECTED', waterList);
    

    Table sample from the HTML:

    <tr>
                                                    </tr><table>
                                            <tbody><tr>
                                                <th></th>
                                                <th>Ulica
                                                </th>
                                                <th>Kućni broj
                                                </th>
                                                <th>od
                                                </th>
                                                <th>do
                                                </th>
                                                <th>Povremeno
                                                </th>
                                                <th>Kraj
                                                </th>
                                            </tr>
                                            
                                        <tr>
                                            <th>
                                                <div>
                                                    08.12.2020   // use this date as the start date and create an array for its objects
                                                </div>
                                            </th>
                                        </tr>
                                        
                                                
                                                <tr>
                                                    <td>
                                                        <a><img src="resources/images/plus.png" /></a>
                                                        
                                                    </td>
                                                    <td>
                                                        DOVIČIĆI, Viškovo
                                                    </td>
                                                    <td>    
                                                        19-20 ; 39-42 
                                                    </td>
                                                    <td>
                                                        08:30
                                                    </td>
                                                    <td>
                                                        14:30
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        08.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            Navedenim područjem kružiti će autocisterna
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        <a><img src="resources/images/plus.png" /></a>
                                                        
                                                    </td>
                                                    <td>
                                                        FERENCI, Viškovo
                                                    </td>
                                                    <td>
                                                        9-55
                                                    </td>
                                                    <td>
                                                        08:30
                                                    </td>
                                                    <td>
                                                        14:30
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        08.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            Navedenim područjem kružiti će autocisterna
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        ISTARSKA, Rijeka
                                                    </td>
                                                    <td>
                                                        64
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        08.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        KASTAVSKA, Rijeka
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        08.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        MATE BALOTE, Rijeka
                                                    </td>
                                                    <td>
                                                        3-14
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        08.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                            
                                    
                                        <tr>
                                            <th>
                                                <div>
                                                    09.12.2020     // second date, followed by all the data that belongs to it
                                                </div>
                                            </th>
                                        </tr>
                                        
                                                
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        ISTARSKA, Rijeka
                                                    </td>
                                                    <td>
                                                        64
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        09.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        KASTAVSKA, Rijeka
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        09.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                                <tr>
                                                    <td>
                                                        
                                                        
                                                    </td>
                                                    <td>
                                                        MATE BALOTE, Rijeka
                                                    </td>
                                                    <td>
                                                        3-14
                                                    </td>
                                                    <td>
                                                        10:00
                                                    </td>
                                                    <td>
                                                        14:00
                                                    </td>
                                                    <td>
                                                        
                                                    </td>
                                                    <td>
                                                        09.12.20
                                                    </td>
                                                    </tr><tr>
                                                    </tr>
                                                
                                                <tr>
                                                    <td>
                                                        <div><div>
    		
                                                            
                                                        
    	</div></div>
                                                        
                                                    </td>
                                                </tr>
                                            
                                            
                                    
                                        </tbody></table>
    

    How can I get each date and all the objects that belong to it, up to the next date, and so on?
    First date:

                                        <tr>
                                            <th>
                                                <div>
                                                    08.12.2020
                                                </div>
                                            </th>
                                        </tr>
    

    I'm getting the data as:

    [
      {
        street: 'BUJSKA, Rijeka',
        streetNmbr: '30,32,34,36',
        startTime: '09:00',
        endTime: '11:00',
        endWork: '07.12.20'
      },
      {
        street: 'LABINSKA, Rijeka',
        streetNmbr: '18,20',
        startTime: '09:00',
        endTime: '11:00',
        endWork: '07.12.20'
      },
      ......
    ]
    

    but I would like to add the start date, like this:

    [
      {
        startDate: '07.12.20',
        data: [
          {
            street: 'BUJSKA, Rijeka',
            streetNmbr: '30,32,34,36',
            startTime: '09:00',
            endTime: '11:00',
            endWork: '07.12.20'
          },
          {
            street: 'LABINSKA, Rijeka',
            streetNmbr: '18,20',
            startTime: '09:00',
            endTime: '11:00',
            endWork: '08.12.20'
          }
        ]
      },
      {
        startDate: '08.12.20',
        data: [
          {
            street: 'LABINSKA, Rijeka',
            streetNmbr: '18,20',
            startTime: '09:00',
            endTime: '11:00',
            endWork: '08.12.20'
          }
        ]
      }
    ]
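
    A minimal sketch of the grouping step on its own, without cheerio. It assumes every tr has first been turned, in document order, into a small descriptor: { date } for a tr class="datum" row and { entry } for a data row. Those descriptor shapes and the name groupByDate are hypothetical; with cheerio, the descriptors might be produced by walking all tr elements with .each and checking $(tr).hasClass('datum').

    ```javascript
    // Hypothetical input: one descriptor per <tr>, in document order.
    //   { date: "08.12.2020" }         for a tr class="datum" header row
    //   { entry: { street: "..." } }   for a data row
    // Rows that are neither (empty or note rows) are skipped.
    function groupByDate(rows) {
      const groups = [];
      let current = null;
      for (const row of rows) {
        if (row.date) {
          // A date header starts a new group.
          current = { startDate: row.date, data: [] };
          groups.push(current);
        } else if (row.entry && current) {
          // A data row belongs to the most recent date header.
          current.data.push(row.entry);
        }
      }
      return groups;
    }

    const rows = [
      { date: "08.12.2020" },
      { entry: { street: "ISTARSKA, Rijeka", startTime: "10:00" } },
      {},
      { entry: { street: "KASTAVSKA, Rijeka", startTime: "10:00" } },
      { date: "09.12.2020" },
      { entry: { street: "ISTARSKA, Rijeka", startTime: "10:00" } },
    ];
    console.log(JSON.stringify(groupByDate(rows), null, 2));
    ```

    Keeping a "current group" variable while walking the rows in order is what ties each data row to the date header above it.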
    

    I was unable to upload images to explain it better.
    Sorry, English is not my native language and I'm just starting to learn JS.



  • @lolo Ignore the class= for the moment; that will just make it look prettier with colors etc.

    A table is a list of rows, each of which contains a list of columns:
    A table is table
    A row is tr
    A column is td

    Your data is an array []
    of objects {}

    You can use a for loop to process your data.
    This is pseudocode, I am on my phone:
    for (let item of array_name) {
      create row
      create td = item.element_name
      etc.
    }
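
    Written out, the pseudocode above could look roughly like this. It is only a sketch: the function name buildTable and its parameters are hypothetical, and doc stands for the browser's document object (for example inside a module's getDom).

    ```javascript
    // Builds a <table> with one <tr> per item and one <td> per listed field.
    // "doc" is the document object; "fields" lists the item keys to show.
    function buildTable(doc, items, fields) {
      const table = doc.createElement("table");
      for (const item of items) {
        const row = doc.createElement("tr");
        for (const field of fields) {
          const cell = doc.createElement("td");
          cell.textContent = item[field];
          row.appendChild(cell);
        }
        table.appendChild(row);
      }
      return table;
    }

    // Possible use in a browser context:
    // const table = buildTable(document, waterWork, ["street", "streetNmbr", "startTime", "endTime"]);
    ```

    Passing the field list in explicitly keeps the column order under your control instead of relying on the order of the object's keys.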



  • Thanks for the reply.
    I believe I didn't explain it well.
    This is the table from the website (screenshot):
    table.jpg
    The data I managed to collect comes from the previous sibling of tr class tableRow.
    What I want to get is tr class datum as the first date, and then the street, street number and times (all in spans) as data objects in an array, until the next tr datum, and then the same again.
    Each span has a different id, but they all share the same number in part of the id.
    I can't simply loop over every element (td or tr), as plenty of them are useless.
    And here I'm stuck. I tried different approaches, but then either all the data was duplicated or I got empty objects in the arrays.
    Example from the website:
    date.PNG
    Capture.PNG



  • @lolo But you already have the date in the object. Sort them maybe, but cheerio should return the list in order.
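
    If sorting does turn out to be needed, a sketch like the following might do. The names parseDate and sortByEndWork are hypothetical, and it assumes dates look like "DD.MM.YY" or "DD.MM.YYYY", with two-digit years belonging to the 2000s:

    ```javascript
    // Parses "DD.MM.YY" or "DD.MM.YYYY" into a UTC timestamp (midnight).
    // Assumption: two-digit years are in the 2000s.
    function parseDate(text) {
      const [day, month, year] = text.trim().split(".").map(Number);
      const fullYear = year < 100 ? 2000 + year : year;
      return Date.UTC(fullYear, month - 1, day);
    }

    // Returns a new array sorted by the endWork date, earliest first.
    function sortByEndWork(entries) {
      return [...entries].sort((a, b) => parseDate(a.endWork) - parseDate(b.endWork));
    }

    const unsorted = [
      { street: "ISTARSKA, Rijeka", endWork: "09.12.20" },
      { street: "BUJSKA, Rijeka", endWork: "07.12.20" },
      { street: "LABINSKA, Rijeka", endWork: "08.12.2020" },
    ];
    console.log(sortByEndWork(unsorted).map((e) => e.street));
    ```

    Comparing the strings directly would not work, because the day comes first in this date format.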



  • Yes, I'm getting a date, but that one is endWork (kraj): when the interruption will end, not when it starts. I know, maybe I should use that.



  • I think I managed to sort it out.

    const $ = cheerio.load(html);

    var waterList = [];
    var el = $('#cphContent_cphSadrzaj_upVoda table tbody');

    // A cheerio selection is always truthy, so check its length instead.
    if (el.length) {
        // Find the spans whose text equals the configured place,
        // then go up two levels from each span.
        var elementWater = $('#cphContent_cphSadrzaj_upVoda table tbody span').filter(function () {
            return $(this).text() === self.place;
        }).parent().parent();

        elementWater.each(function (jex, edate) {
            const $edate = $(edate);
            var startDate = $edate.prev().find('tr[class="datum"]>th>div').text().trim();
            var street = $edate.find('span[id*="Label2"]').text();
            var streetNmbr = $edate.find('span[id*="Label7"]').text();
            var startTime = $edate.find('span[id*="Label8"]').text();
            var endTime = $edate.find('span[id*="Label9"]').text();
            var endWork = $edate.find('span[id*="Label10"]').text();

            var water = {
                startDate,
                street,
                streetNmbr,
                startTime,
                endTime,
                endWork
            };

            waterList.push(water);
            console.log(water);
        });
    } else {
        console.log('No new entries!');
    }
    // waterList holds only plain strings, so it can be sent as is.
    self.sendSocketNotification('ELECTRIC_POWER_DISCONNECTED', waterList);
    

    Now I can get all the data filtered by config.place:

     {
      startDate: '09.12.2020',
      street: 'ISTARSKA, Rijeka',
      streetNmbr: '64',
      startTime: '10:00',
      endTime: '14:00',
      endWork: '09.12.20'
    }
    
    


  • @lolo cool



  • This is the output of my module. Any suggestions about styling?
    srdoči.PNG
    šmrika.PNG
    brnelići.PNG
    croatian
    brnelići2.PNG

    I know there are more things that could make it better, but I have to learn them first.



  • @lolo I cannot help with the look. Totally useless here.

