How to convert NSString HTML markup to plain text NSString?
Solution 1
You can do it by parsing the HTML with the NSScanner class:
- (NSString *)flattenHTML:(NSString *)html {
    NSScanner *theScanner = [NSScanner scannerWithString:html];
    NSString *text = nil;

    while (![theScanner isAtEnd]) {
        // Skip ahead to the start of the next tag.
        [theScanner scanUpToString:@"<" intoString:NULL];
        // Capture the tag from "<" up to, but not including, ">".
        if ([theScanner scanUpToString:@">" intoString:&text]) {
            // Remove every occurrence of that tag (plus its closing ">").
            html = [html stringByReplacingOccurrencesOfString:[NSString stringWithFormat:@"%@>", text] withString:@""];
        }
    }

    // Trim leading and trailing whitespace and newlines.
    html = [html stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return html;
}
Hope this helps.
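As a comment further down points out, flattenHTML: leaves character entities such as encoded single quotes untouched, and it also re-scans the whole string once per tag. The following is only a sketch of one possible variation, not the answerer's method: it copies the text between tags in a single pass and then decodes a small, hand-picked set of entities. The function name PlainTextFromHTML and the entity list are assumptions made for illustration; real feeds may need a longer list.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): strips tags in one
// pass, then decodes a few common character entities.
static NSString *PlainTextFromHTML(NSString *html) {
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil; // keep the spaces between words
    NSMutableString *result = [NSMutableString string];

    while (![scanner isAtEnd]) {
        NSString *chunk = nil;
        // Copy the text that precedes the next tag, if there is any.
        if ([scanner scanUpToString:@"<" intoString:&chunk]) {
            [result appendString:chunk];
        }
        // Skip the tag itself, including its closing ">".
        if ([scanner scanString:@"<" intoString:NULL]) {
            [scanner scanUpToString:@">" intoString:NULL];
            [scanner scanString:@">" intoString:NULL];
        }
    }

    // Assumed entity list; "&amp;" is decoded last so that text such as
    // "&amp;lt;" is not decoded twice.
    NSArray<NSString *> *encoded = @[@"&lt;", @"&gt;", @"&quot;", @"&#39;", @"&apos;", @"&nbsp;", @"&amp;"];
    NSArray<NSString *> *decoded = @[@"<", @">", @"\"", @"'", @"'", @" ", @"&"];
    for (NSUInteger i = 0; i < encoded.count; i++) {
        [result replaceOccurrencesOfString:encoded[i]
                                withString:decoded[i]
                                   options:NSLiteralSearch
                                     range:NSMakeRange(0, result.length)];
    }

    return [result stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}

// Usage:
// PlainTextFromHTML(@"<p>Tom &amp; Jerry&#39;s <b>show</b></p>")
// returns @"Tom & Jerry's show"
```

Like flattenHTML:, this drops all markup, so it will not preserve links or heading structure.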
Solution 2
If you are using a UIWebView, it is easier to extract the text from the HTML with JavaScript:
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.getElementsByTagName('article')[0].innerText;"]; // extract the contents by tag
fullArticle = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerText"]; // extract text inside body part of HTML
Author: Frames84
Updated on July 22, 2022
Comments
-
Frames84 almost 2 years
Been searching the net for an example of how to convert HTML string markup into plain text.
I get my information from a feed which contains HTML, and I then display this information in a text view. Does UITextView have a property to convert HTML, or do I have to do it in code? I tried:
NSString *str = [NSString stringWithCString:self.fullText encoding:NSUTF8StringEncoding];
but it doesn't seem to work. Anyone got any ideas?
-
Frames84 about 14 years
Doesn't deal with single quotes, but for everything else it works fine.
-
Frames84 about 14 years
Would this method keep the formatting? What I want is to display the formatted HTML as plain text, so keep links, <h1>, <p>, etc. How do other apps do this?
-
Frames84 about 14 years
UIWebView displays a web page inside an app? I need a control or method that keeps the HTML formatting without displaying the markup. My output contains the markup, whereas I want it to keep the style but not show the HTML.
-
Madhup Singh Yadav about 14 years
If you have single quotes and you don't want to show them, just replace their occurrences with an empty string.
-
chatur over 12 years
Hi @Madhup, please have a look at this question: stackoverflow.com/questions/8148291/… and advise.
-
Neil over 11 years
NSXMLParser will not parse normal HTML. It fails on HTML-only characters.