MagicMirror Forum

Parse HTML String

  • alihallo, Aug 29, 2016, 7:54 PM

    Thank you for your answer, but I couldn't figure out how to use the npm request module to parse the HTML code.
    I found another way to solve the issue, though:

    https://github.com/cheeriojs/cheerio

    This way I could get the data out of the html code like this:

    var options = {url: URL};
    request(options, (error, response, body) => {
        // response is undefined when the request itself fails, so check error first
        if (!error && response.statusCode === 200) {
            this.sendSocketNotification("DATA", this.parseData(body));
    
    
    ...
    
    
    var $ = cheerio.load(body, {
       normalizeWhitespace: true,
       xmlMode: false
    });
    		
    $('div[class=data_1]').find('p').each(function (index, element) {
    	data_array.push($(element).text());
    });
    

    This way I could solve my problem.
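
    The request callback above only looks at the status code. As a minimal sketch, the decision about which notification to send can be pulled into a small function that also covers a failed request. The name `toNotification` and its return shape are assumptions made for this illustration; they are not part of MagicMirror or the request module. The "ERROR" notification name is borrowed from a later post in this thread.

```javascript
// Hypothetical helper: maps the arguments of a request callback to the
// socket notification that should be sent. `parse` is whatever function
// turns the HTML body into data (for example this.parseData).
function toNotification(error, response, body, parse) {
  if (error) {
    // The request itself failed (DNS, timeout, ...); response is undefined.
    return { name: "ERROR", payload: error.message };
  }
  if (response.statusCode !== 200) {
    // The server answered, but not with the page we wanted.
    return { name: "ERROR", payload: response.statusCode };
  }
  return { name: "DATA", payload: parse(body) };
}

// Example with stand-in values instead of a real HTTP request.
console.log(toNotification(null, { statusCode: 200 }, "<p>hi</p>", (b) => b.length));
console.log(toNotification(new Error("timeout"), undefined, undefined, (b) => b));
```

    Inside the callback this could be used as `var n = toNotification(error, response, body, (b) => this.parseData(b));` followed by `this.sendSocketNotification(n.name, n.payload);`.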

    Best regards,
    alihallo

    • strawberry 3.141 (Project Sponsor, Module Developer), Aug 29, 2016, 8:18 PM, last edited Aug 29, 2016, 8:19 PM

      I think it's weird that this works, because you're looking for the attribute class = data_1, but it's an id.

      The CSS selector for an id is #, and when you put p after it, it looks for paragraphs inside the element with the id data_1.

      When you replace

      $('div[class=data_1]').find('p').each(function (index, element) {
      	data_array.push($(element).text());
      });
      

      with

      data_array.push($('#data_1 p').text());
      

      does it still work? I'm not sure whether it returns the element itself when only one occurrence is found, or an array either way.

      Please create a github issue if you need help, so I can keep track

      • ianperrin (in reply to @alihallo), Aug 29, 2016, 9:26 PM, last edited Aug 29, 2016, 9:35 PM

        @alihallo

        If your input HTML is fairly simple, you may be able to avoid the cheerio library entirely, provided the code runs where a DOM is available (in the browser, not in node_helper.js):

        // an array to hold the data from the file
        var data_array = [];
        // body is an HTML string, so parse it into a document first
        var doc = new DOMParser().parseFromString(body, "text/html");
        // Get all p tag elements inside div tag elements with an id that starts with 'data_'
        var data_tags = doc.querySelectorAll('div[id^=data_] p');
        // Loop through data tags and add content to data array
        for (var i = 0; i < data_tags.length; i++) {
            data_array.push(data_tags[i].innerHTML);
        }
        

        Of course, the more complex your input file is, the more you might benefit from cheerio.
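
        In node_helper.js there is no DOM, so for very simple input a pair of regular expressions can stand in for a parser. The sketch below is only an illustration under strong assumptions: the divs are not nested, the id attribute is quoted, and `idPrefix` contains only letters, digits, and underscores. The function name `collectParagraphs` and the sample HTML are made up for this example. For anything less regular, a real parser such as cheerio is the safer choice.

```javascript
// Hypothetical helper: collects the text of every <p> inside each <div>
// whose id starts with `idPrefix`.
// Assumes the divs are not nested and idPrefix has no regex metacharacters.
function collectParagraphs(html, idPrefix) {
  const results = [];
  // Match each <div ... id="<prefix>..."> ... </div> block (non-greedy).
  const divPattern = new RegExp(
    "<div\\b[^>]*\\sid=[\"']" + idPrefix + "[^\"']*[\"'][^>]*>([\\s\\S]*?)</div>",
    "gi"
  );
  const paragraphPattern = /<p\b[^>]*>([\s\S]*?)<\/p>/gi;
  for (const div of html.matchAll(divPattern)) {
    for (const p of div[1].matchAll(paragraphPattern)) {
      // Strip inline tags such as <b> and trim surrounding whitespace.
      results.push(p[1].replace(/<[^>]*>/g, "").trim());
    }
  }
  return results;
}

// Made-up sample input: only the first div has an id starting with "data_".
const sample =
  '<div id="data_1"><p>first</p><p>second <b>bold</b></p></div>' +
  '<div id="other"><p>ignored</p></div>';
console.log(collectParagraphs(sample, "data_"));
```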

        "Live as if you were to die tomorrow. Learn as if you were to live forever." - Mahatma Gandhi

        • Plati, Sep 11, 2016, 9:33 AM, last edited by paviro Sep 11, 2016, 11:11 AM

          I want to create a module that reads a value from a website, taken from the element with a given id.

          Example:

          Website code:

          <b class="b2 nieb" title="Kurs EUR na żywo" id="EURPLN">4.33320</b>
          

          I want to display the value of the element with id="EURPLN".

          In the config I want to set which site to fetch the data from and which ID to read.

          Example:

          defaults: {
          		url: "http://domain.com/",
          		findID: "EURPLN"
          }
          

          How can I do this?


          Note from admin: Please use Markdown on code snippets!

          • alihallo, Sep 11, 2016, 12:29 PM

            @strawberry-3-141 You are absolutely right, I gave a bad example. The real HTML is a bit more complicated, and I mixed the real code with this example.
            @ianperrin My input HTML is more complicated, but thanks for your answer, good to know!

            @Plati It helped me a lot to look at other modules. You should use a node_helper.js; there you can create a function that fetches the HTML of the website.

            // Fetches the HTML from the given URL and forwards the parsed result
            getTheData: function(theURLtoCatch) {
               var options = {url: theURLtoCatch};
               request(options, (error, response, body) => {
                  if (error) {
                     // the request itself failed, so response is undefined
                     console.log("Error getting Data " + error);
                     this.sendSocketNotification("ERROR", error.message);
                  } else if (response.statusCode === 200) {
                     this.sendSocketNotification("DATA", this.parseHTML(body));
                  } else {
                     console.log("Error getting Data " + response.statusCode);
                     this.sendSocketNotification("ERROR", response.statusCode);
                  }
               });
            },
            
            parseHTML: function(dataBody) {
               // use something like the examples ianperrin and strawberry showed
            
            }
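
            For the single-element case Plati describes, the empty parseHTML placeholder above could be filled in roughly as follows. This is a sketch under assumptions, not a robust parser: it expects a quoted id attribute and an element without nested tags of the same name. The helper name `extractTextById` is made up for this example; the sample markup is the line quoted earlier in the thread. With cheerio loaded, an id selector would be the sturdier route.

```javascript
// Hypothetical helper: returns the text inside the first element whose id
// attribute equals `id`, or null if no such element is found.
// Assumes a quoted id attribute and no nested tags of the same name.
function extractTextById(html, id) {
  // Escape regex metacharacters so the id is matched literally.
  const safeId = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    "<([a-zA-Z][a-zA-Z0-9]*)\\b[^>]*\\sid=[\"']" + safeId + "[\"'][^>]*>([\\s\\S]*?)</\\1>",
    "i"
  );
  const match = pattern.exec(html);
  if (!match) {
    return null;
  }
  // Drop any nested tags and trim surrounding whitespace.
  return match[2].replace(/<[^>]*>/g, "").trim();
}

// The markup quoted earlier in this thread.
const html = '<b class="b2 nieb" title="Kurs EUR na żywo" id="EURPLN">4.33320</b>';
console.log(extractTextById(html, "EURPLN")); // prints "4.33320"
```

            In the node helper, parseHTML could then return `extractTextById(dataBody, findID)`, where `findID` is the value from the module config passed along by the module.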
            