How to read and parse HTML in Node.js?
I would recommend using Cheerio. It implements a subset of jQuery's core API for use in Node.js.
const cheerio = require('cheerio');
const html = "<p>In practice, it is usually a bad idea to modify global variables inside the function scope since it often is the cause of confusion and weird errors that are hard to debug.<br />If you want to modify a global variable via a function, it is recommended to pass it as an argument and reassign the return-value.<br />For example:</p>";
const $ = cheerio.load(html);
const paragraph = $('p').html(); // Inner HTML of the paragraph; you can manipulate this in any way you like
// ...do the same for any other element you need
You should check out Cheerio and read its documentation. I find it really neat!
Edit: for the new part of your question
You can iterate over every matching element and push it into an array of objects like this:
var jsonObject = []; // An array of objects that will hold everything
$('p').each(function () { // Loop over each paragraph
  // Take the content of the paragraph and put it into an object
  jsonObject.push({ "paragraph": $(this).html() }); // Add it to the main array
});
So the resulting array of JSON objects should look something like this:
[
  {
    "paragraph": "text"
  },
  {
    "paragraph": "text 2"
  },
  {
    "paragraph": "text 3"
  }
]
I believe you should also read up on JSON and how it works.
Kucka Prozova
Updated on June 04, 2022
Comments
-
Kucka Prozova almost 2 years
I have a simple project and need some help with it. I need to read an HTML file and convert it to JSON format, with the matches separated into code and text. How can I achieve this?
For example, I have these two HTML tags:
<p>In practice, it is usually a bad idea to modify global variables inside the function scope since it often is the cause of confusion and weird errors that are hard to debug.<br /> If you want to modify a global variable via a function, it is recommended to pass it as an argument and reassign the return-value.<br /> For example:</p>
<pre><code class="{python} language-{python}">a_var = 2
def a_func(some_var):
    return 2**3
a_var = a_func(a_var)
print(a_var)
</code></pre>
My code:
const fs = require('fs')
const showdown = require('showdown')
var read = fs.readFileSync('./test.md', 'utf8')

function importer(mdFile) {
  var result = []
  let json = {}
  var converter = new showdown.Converter()
  var text = mdFile
  var html = converter.makeHtml(text);
  for (var i = 0; i < html.length; i++) {
    htmlRead = html[i]
    if(html == html.match(/<p>(.*?)<\/p>/g))
      json.text = html.match(/<p>(.*?)<\/p>/g)
    if(html == html.match(/<pre>(.*?)<\/pre>/g))
      json.code = html.match(/<pre>(.*?)<\/pre>/g)
  }
  return html
}

console.log(importer(read))
How do I get these matches in my code?
New code: I write all the p tags into the same JSON object. How do I write each p tag into a separate JSON block?
$('html').each(function(){
  if ($('p').text != undefined) {
    json.code = $('p').text()
    json.language = "Text"
  }
})
-
Kucka Prozova over 5 years
Yeah, that's exactly what I did. But I have a question: I write all the p tags into the same JSON object. How do I write each p tag into a separate JSON block? I updated the question.
-
Ayo Reis over 2 years
Does anyone know a JS-only alternative?