Remove only some HTML tags in C#


Solution 1

Use Regex:

// IgnoreCase so that <div>, <DIV>, </div> and </DIV> are all removed
var result = Regex.Replace(html, @"</?DIV>", "", RegexOptions.IgnoreCase);

UPDATED

As you mentioned, you want to keep the B tag. With the code below, the regex removes all tags except B:

var html = "<DIV><B> xpto </B></DIV>";
var remainTag = "B";
// Match any tag that neither starts nor ends with the tag name to keep
var pattern = String.Format("(</?(?!{0})[^<>]*(?<!{0})>)", remainTag);
var result = Regex.Replace(html, pattern, "");
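
The lookahead pattern above also keeps any tag whose name merely starts or ends with B (for example <BR> or <BODY>). As a minimal sketch of a stricter variant, the helper below captures the tag name and compares it against a keep-list. The names TagFilter and KeepOnly are hypothetical and not part of the original answer, and a regex like this is only suitable for simple, well-formed fragments.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagFilter
{
    // Hypothetical helper: strips every tag whose name is not in keepTags.
    public static string KeepOnly(string html, params string[] keepTags)
    {
        // Case-insensitive set so that "B" also keeps <b> and </b>.
        var keep = new HashSet<string>(keepTags, StringComparer.OrdinalIgnoreCase);

        // Group 1 captures the tag name of an opening or closing tag.
        return Regex.Replace(
            html,
            @"</?\s*([A-Za-z][A-Za-z0-9]*)[^<>]*>",
            m => keep.Contains(m.Groups[1].Value) ? m.Value : string.Empty);
    }
}
```

Calling TagFilter.KeepOnly("<DIV><B> xpto </B></DIV>", "B") returns "<B> xpto </B>".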

Solution 2

Use Html Agility Pack:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<html>yourHtml</html>");

// "//div" is an XPath expression that selects div nodes anywhere in the HTML.
// Note: SelectNodes returns null when nothing matches.
foreach (var item in doc.DocumentNode.SelectNodes("//div"))
{
    var content = item.InnerHtml; // your div content
}

If you want only the B tags:

// Html Agility Pack lowercases element names, so the XPath uses "//b"
foreach (var item in doc.DocumentNode.SelectNodes("//b"))
{
    var bTag = item.OuterHtml; // your B tag and its content
}
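
The loops above only read the matching nodes. As a rough sketch of actually removing the div tags while keeping their content, the helper below unwraps each div with RemoveChild and its keepGrandChildren flag. HtmlDocument comes from the HtmlAgilityPack NuGet package (namespace HtmlAgilityPack); the names DivUnwrapper and RemoveDivs are hypothetical, and the exact casing of the serialized output is not guaranteed here.

```csharp
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class DivUnwrapper
{
    // Hypothetical helper: removes every div element but keeps its children.
    public static string RemoveDivs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches.
        var divs = doc.DocumentNode.SelectNodes("//div");
        if (divs == null) return html;

        // Copy to a list first, since the tree is modified inside the loop.
        foreach (var div in divs.ToList())
        {
            // keepGrandChildren: true re-attaches the div's children to its parent.
            div.ParentNode.RemoveChild(div, true);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

For the string in the question, DivUnwrapper.RemoveDivs("<DIV><B> xpto </B></DIV>") should leave only the B element and its text.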

Solution 3

If you are only removing div tags, this pattern matches the div tags along with any attributes they may have.

var html =
  "<DIV><B> xpto <div text='abc'/></B></DIV><b>Other text <div>test</div>";

var pattern = @"(</?DIV(.*?)/?>)";

// Replace any match with an empty string
var result = Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);

Result

<B> xpto </B><b>Other text test
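
The same idea can be extended to several tag names at once. The sketch below builds the alternation from a list of names; TagStripper and RemoveTags are hypothetical names, and the \b word boundary is an added assumption so that a name like p does not also match <pre>.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Hypothetical helper: removes opening, closing and self-closing tags
    // for each listed name, including any attributes, but keeps the content.
    public static string RemoveTags(string html, params string[] tags)
    {
        if (tags.Length == 0) return html;

        // Escape each name and join them into an alternation: div|p|li
        var names = string.Join("|", tags.Select(t => Regex.Escape(t)));

        // \b stops "p" from matching "pre"; [^<>]* skips over attributes.
        var pattern = @"</?\s*(?:" + names + @")\b[^<>]*>";

        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

With the html string above, TagStripper.RemoveTags(html, "div") gives the same result, and further names can be passed as extra arguments, for example RemoveTags(html, "p", "a", "li").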

Solution 4

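You can use a regular expression. Square brackets would form a character class (which also matches single-letter tags such as <b>), so the tag names go in a group instead:

<\s*/?\s*(body|html)\s*>

In C#:

var result = Regex.Replace(html, @"<\s*/?\s*(body|html)\s*>", "");

This removes tags such as:

<html>
<body>
< / html>
< / body>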

Author: r-magalhaes

Outsystems Tech Lead and Power Bi Accountable ; .net Enthusiast; Portista Carago; Wine lover; Father of one

Updated on September 15, 2022

Comments

  • r-magalhaes
    r-magalhaes over 1 year

    I have a string:

    string html = "<DIV><B> xpto </B></DIV>";

    and I need to remove the <DIV> and </DIV> tags, giving the result: <B> xpto </B>

    Only <DIV> and </DIV> should be removed, without stripping the other HTML tags, so that <B> xpto </B> is preserved.

  • r-magalhaes
    r-magalhaes over 11 years
    How do I transform my string into an HtmlDocument? Which library do I need in order to use HtmlDocument? Thanks.
  • Fandango68
    Fandango68 over 7 years
    @CasperLeonNielsen Why does everyone refer to that old chestnut, the OOHH REGEX IS EVIL - DO NOT USE post?! Seriously... not everything has to go through HTML Agility Pack!
  • Fandango68
    Fandango68 over 7 years
    This is a good answer, but what if I want to pass it a set of tags, like <p>, <a>, <li>, etc.? Say I want to create a function that I can pass a list of tags into (a list of strings, that is)?
  • Anirudha
    Anirudha over 7 years
    @Fernando68 Since it's an XPath expression, you can combine multiple node sets with the union operator, like //p | //a | //li
  • Casper Leon Nielsen
    Casper Leon Nielsen over 7 years
    that post is so funny it hurts. and true. thanks for reminding me :)