Convert HTML output into plain text using PHP
Solution 1
Use PHP's strip_tags().
If strip_tags() is not working for you, you can use a regex to extract the information you want.
Try PHP's preg_match() (or preg_match_all() to collect every cell) with /(<td>.*?<\/td>)/ as the pattern.
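As a rough illustration of the regex route, here is a minimal sketch that runs on the table from the question. It makes a few assumptions that are not in the answer itself: it uses preg_match_all() rather than preg_match() so that every cell is collected, it moves the capture group inside the <td> tags so only the cell text is kept, and it assumes the cells always alternate label, value. The variable names are illustrative.

```php
<?php
// Sample markup copied from the question; in practice it would come from file_get_contents().
$html = '<table> <tr> <td>Name:</td><td> John Dela Cruz</td> </tr> '
      . '<tr> <td>Age:</td><td>15</td> </tr> '
      . '<tr> <td>Location:</td><td> SomewhereIn, Asia</td> </tr> </table>';

// Capture the contents of every <td>; "s" lets "." span newlines, "i" ignores tag case.
preg_match_all('/<td[^>]*>(.*?)<\/td>/si', $html, $matches);

// $matches[1] holds only the captured cell contents, without the tags.
$cells = array_map('trim', $matches[1]);

// Assumption: cells alternate label, value, so pair them up two at a time.
$lines = [];
foreach (array_chunk($cells, 2) as $pair) {
    $lines[] = implode(' ', $pair);
}

$text = implode("\n", $lines);
echo $text, "\n";
```

A regex like this is only dependable on small, predictable markup such as this table; nested tables or unusual markup would need a real parser.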
Solution 2
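Have a look at simplexml_load_file():
http://www.php.net/manual/en/function.simplexml-load-file.php
It will allow you to load the data into an object (SimpleXMLElement) and traverse that object like a tree. Note that it parses XML, so it only works directly when the page is well-formed XHTML; ordinary HTML can be parsed with DOMDocument::loadHTMLFile() first and then traversed the same way.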
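The tree-traversal idea can be sketched as follows. This is not exactly what the answer proposes: because the sample page is HTML rather than well-formed XML (its <link> tag is never closed), the sketch parses it with DOMDocument::loadHTML() and then converts the result with simplexml_import_dom(), which is a swapped-in approach. The markup is inlined from the question so the example is self-contained, and the variable names are illustrative.

```php
<?php
// The page from the question, inlined; in practice use file_get_contents() on the URL.
$html = <<<'HTML'
<html> <head> <title>Profiles - GuestBook</title> <link rel="stylesheet" type="text/css" href="css/style.css"> </head>
<body> <!-- Some Divs --> <div id="profile-wrapper"> <h2>Profile</h2>
<table> <tr> <td>Name:</td><td> John Dela Cruz</td> </tr> <tr> <td>Age:</td><td>15</td> </tr>
<tr> <td>Location:</td><td> SomewhereIn, Asia</td> </tr> </table> </div> </body> </html>
HTML;

// Parse as HTML (tolerant of unclosed tags) and silence parser warnings.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Convert to a SimpleXMLElement so the document can be walked like a tree.
$xml = simplexml_import_dom($dom);
$matches = $xml->xpath('//div[@id="profile-wrapper"]');
$wrapper = $matches[0];

// First line is the heading, then one line per table row.
$headings = $wrapper->xpath('.//h2');
$lines = [trim((string) $headings[0])];
foreach ($wrapper->xpath('.//tr') as $row) {
    $cells = [];
    foreach ($row->td as $td) {
        $cells[] = trim((string) $td);
    }
    $lines[] = implode(' ', $cells);
}

$text = implode("\n", $lines);
echo $text, "\n";
```

Selecting the div by its id first also addresses the follow-up in the question: only the content of profile-wrapper is read, and the head section is ignored.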
Solution 3
Try the PHP function strip_tags().
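A minimal sketch of the strip_tags() approach, applied to the profile block from the question. Two extra steps are assumptions added here, not part of the answer: a space is inserted before every tag so that adjacent cells such as <td>Age:</td><td>15</td> do not run together as "Age:15", and runs of whitespace are collapsed afterwards. Only the div is passed in, because running strip_tags() on the whole page would also keep the text of the <title> element.

```php
<?php
// Profile block copied from the question; in practice it would be cut out of the fetched page.
$html = '<div id="profile-wrapper"> <h2>Profile</h2> <table> '
      . '<tr> <td>Name:</td><td> John Dela Cruz</td> </tr> '
      . '<tr> <td>Age:</td><td>15</td> </tr> '
      . '<tr> <td>Location:</td><td> SomewhereIn, Asia</td> </tr> </table> </div>';

// Put a space before each tag so text from neighbouring elements stays separated.
$spaced = str_replace('<', ' <', $html);

// Drop every tag, then turn entities such as &amp; back into characters.
$text = html_entity_decode(strip_tags($spaced), ENT_QUOTES, 'UTF-8');

// Collapse runs of whitespace into single spaces and trim the ends.
$text = trim(preg_replace('/\s+/', ' ', $text));

echo $text, "\n";
```

This yields the profile text on a single line; keeping one line per table row would require handling the rows separately.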
Solution 4
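Try this one:
<?php
// Load the page source ("your_file" is a placeholder for a path or URL).
$data = file_get_contents("your_file");
// Match every <div>; "s" lets "." span newlines, "i" ignores tag case.
preg_match_all('|<div[^>]*?>(.*?)</div>|si', $data, $result);
// $result[0][0] is the first <div> with its tags; $result[1][0] is only its inner content.
print_r($result[0][0]);
?>
I have tried this and it seems to work for me; I hope it works for you too.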
Dan
Updated on July 15, 2022
Dan almost 2 years
I'm trying to convert my sample HTML output into plain text, but I don't know how. I use file_get_contents(), but the page I'm trying to convert comes back looking almost the same as the original.
$raw = "http://localhost/guestbook/profiles.php";
$file_converted = file_get_contents($raw);
echo $file_converted;
profiles.php
<html>
<head>
  <title>Profiles - GuestBook</title>
  <link rel="stylesheet" type="text/css" href="css/style.css">
</head>
<body>
<!-- Some Divs -->
<div id="profile-wrapper">
  <h2>Profile</h2>
  <table>
    <tr>
      <td>Name:</td><td> John Dela Cruz</td>
    </tr>
    <tr>
      <td>Age:</td><td>15</td>
    </tr>
    <tr>
      <td>Location:</td><td> SomewhereIn, Asia</td>
    </tr>
  </table>
</div>
</body>
</html>
Basically, I am trying to echo out something like this (plain text, no styles):
Profile
Name: John Dela Cruz
Age: 15
Location: SomewhereIn, Asia
but I don't know how. :-( Please help me, guys. Thank you in advance.
EDIT: Since I am only after the content of the page, whether it is styled or plain text, is there a way to select only the part shown below using file_get_contents()?
<h2>Profile</h2>
<table>
  <tr>
    <td>Name:</td><td> John Dela Cruz</td>
  </tr>
  <tr>
    <td>Age:</td><td>15</td>
  </tr>
  <tr>
    <td>Location:</td><td> SomewhereIn, Asia</td>
  </tr>
</table>