Is there a JavaScript library to convert HTML to Markdown?
Solution 1
I've started a project to do this:
https://github.com/domchristie/turndown
It's still in its early stages, so has not been heavily tested, but it's a start.
Feedback/contributions welcome.
Solution 2
I have also collaborated on a project on GitHub that does this. At the moment, it is only tested in the browser.
I have done a lot of testing on the web and added a ton of unit tests. It is still not perfect, but it works nicely. Feedback is welcome, and I will be happy to receive pull requests or fix any defects you find.
Solution 3
Theoretically, you can convert it back. You'd have to write your own DOM traversal code and convert the HTML back to Markdown.
Generally Markdown is thought to be the human readable/writable source of the information that is converted to HTML for further markup and styling.
HTML can be much more complex than Markdown and can be arbitrarily nested and partitioned into tags. This is why it is questionable whether a general-purpose converter can reliably convert HTML back to Markdown. Just imagine all the whitespace and paragraphs going bye-bye and possibly causing a terrible mess for the human eye.
My suggestion is: unless you generate the originating HTML yourself and know what it consists of, don't convert it back to Markdown. Keep the Markdown version at all times and convert to HTML when needed.
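For the case where you do control the HTML, the DOM traversal mentioned above could look roughly like the sketch below. It is only an illustration, not a complete converter: the function name toMarkdown and the helpers text and el are hypothetical, it handles just a handful of tags, and it assumes each node exposes nodeType, nodeName, nodeValue, childNodes and getAttribute (which browser DOM nodes do). The helpers build plain objects with that shape so the example also runs in Node without a DOM.

```javascript
// Minimal sketch of an HTML -> Markdown tree walk (illustrative only).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical function name; recursively converts one node to Markdown.
function toMarkdown(node) {
  if (node.nodeType === TEXT_NODE) {
    // Collapse whitespace runs, as HTML rendering does outside <pre>.
    return node.nodeValue.replace(/\s+/g, ' ');
  }
  if (node.nodeType !== ELEMENT_NODE) return '';

  // Convert children first, then wrap the result according to the tag.
  const inner = Array.from(node.childNodes).map(toMarkdown).join('');
  switch (node.nodeName.toLowerCase()) {
    case 'h1': return '# ' + inner.trim() + '\n\n';
    case 'h2': return '## ' + inner.trim() + '\n\n';
    case 'p': return inner.trim() + '\n\n';
    case 'em': case 'i': return '*' + inner + '*';
    case 'strong': case 'b': return '**' + inner + '**';
    case 'code': return '`' + inner + '`';
    case 'a': return '[' + inner + '](' + (node.getAttribute('href') || '') + ')';
    default: return inner; // unsupported tag: keep the text, drop the tag
  }
}

// Hypothetical helpers that mimic the small part of the DOM used above.
function text(value) {
  return { nodeType: TEXT_NODE, nodeValue: value, childNodes: [] };
}
function el(name, attrs, ...children) {
  return {
    nodeType: ELEMENT_NODE,
    nodeName: name.toUpperCase(),
    childNodes: children,
    getAttribute: (key) => (key in attrs ? attrs[key] : null),
  };
}

const tree = el('div', {},
  el('h2', {}, text('Title')),
  el('p', {},
    text('Some '), el('strong', {}, text('bold')), text(' and a '),
    el('a', { href: 'https://example.com' }, text('link')), text('.')));

const markdown = toMarkdown(tree).trim();
console.log(markdown);
```

In a browser, the same function could presumably be called on document.body or on a node parsed with DOMParser. Lists, blockquotes, images, code blocks and escaping of Markdown special characters are all left out, and those are where most of the difficulty described above lies.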
Comments
-
Ethan over 4 years
There is showdown.js to convert Markdown to HTML, and PHP Markdown to convert Markdown to and from HTML. My question is: is there a JavaScript library to convert HTML to Markdown?
-
Dean Harding about 14 years
Correct me if I'm wrong, but I don't think either of those libraries converts HTML to Markdown. I don't think it's possible, in general, since the Markdown->HTML conversion is lossy (that is, data is lost in the conversion that would be required to convert back again).
-
Ethan about 14 years
Why do you think that markdown->HTML is lossy? I think HTML->markdown is lossy, because every markdown syntax has its HTML equivalent, but not vice versa.
-
Nick Craver about 14 years
@Ethan - When whatever it was went through conversion from Markdown to HTML in the first place, it lost data, e.g. extra returns. There's no way to restore that data completely accurately; it's gone. You'll notice SO stores both the original text and the HTML version of each post. This is one of the reasons.
-
Ethan about 14 years
@Nick: Also, some HTML tags have more than one Markdown equivalent; for example, <h2> can be either ## or ----. But what I am looking for is something that can convert HTML to "standard" Markdown, i.e., stripping out extra returns and unsupported HTML tags, using ---- for all headings, and so on.
-
Justin Johnson about 14 years
Complete and accurate restoration is not part of the OP. Lossiness of markdown to HTML is irrelevant unless the OP specifies. At any rate, whitespace is lost when rendering HTML as it is, unless of course, it's in a <pre> tag.
-
Shikiryu over 13 years
Apparently, after much research, it doesn't exist. I should do a "DIY answer" :)
-
Marco Demaio almost 13 years
showdown.js is gone as well as WMD! :(
-
jonschlinkert over 7 years
I created github.com/breakdance/breakdance to do this. Every other solution I found leaves too much junk in the resulting HTML. IMHO, if you're converting to Markdown, you probably aren't interested in keeping tags that don't work in Markdown.
-
-
Jordan Reiter almost 13 years
One case where you'd want to convert HTML -> Markdown is in a WYSIWYG editor. Most of them provide the text in HTML or XHTML. It'd be nice to convert that into Markdown for storage.
-
aleemb almost 12 years
cool project, thanks very much for sharing.
-
Phoenix over 11 years
Of all the html-to-markdown converters I've looked at, this one fit my needs best. The rest moved the links to the bottom, like annotations.
-
GaryBishop about 11 years
Not constructive my A$$! This is great!
-
Paul Verest almost 11 years
This is the same as Dom Christie's answer
-
tilgovi almost 11 years
Paul, while the projects have the same goal they are separate implementations. This is not the same answer. It was helpful to me to be able to look at both and compare them.
-
fer about 9 years
Lovely stuff! It would be great if it eventually supported GitHub Flavored Markdown too.
-
Dom Christie about 9 years
@fer feel free to try: github.com/domchristie/to-markdown/tree/gfm I can’t say when it’s going to be merged, but it’d definitely help to have some testing done :)
-
fer about 9 years
good! i will have a look and see if i can contribute... great job Dom!
-
Admin over 7 years
Can the library be used on the server side? I installed this package with npm install on Node.js, but I can't get any effect when I use it. The result is unchanged.
-
Dom Christie over 7 years
Yes, the library should work on the server side (Node version 4+), and is available on npm. If you are having problems using it, please raise an issue on the GitHub repository. Thanks.
-
stackovermat over 5 years
The link is broken and the project has been deprecated. The new project with the same goal can be found here: Turndown
-
Alex G over 3 years
Thank you so much for creating this project! Exactly what I am looking for!