Storing HTML in MySQL

14,147

Solution 1

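Use &amp;amp; instead of &amp; in the stored HTML (that is, write the ampersand as the entity "&amp;amp;" rather than as a bare "&amp;").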

Solution 2

One option is the PHP function htmlentities().
It converts the special characters in your input into HTML entities, so a bare & becomes &amp;. Be aware that it also converts < and >, so when the result is output the browser shows the tags as literal text instead of interpreting them as markup.
For example:

$mything = "<b>BOLD & BOLD</b>";
// The bare & in this string is what the validator complains about.
// Convert the special characters to entities:
$mynewthing = htmlentities($mything);

Now insert $mynewthing into your database.
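To make the effect concrete, here is a minimal sketch of what htmlentities() actually returns for that string. The variable $restored is only illustrative; html_entity_decode() is the standard PHP function that reverses the conversion.

```php
<?php
$mything = "<b>BOLD & BOLD</b>";

// Every special character is converted, including the angle brackets.
$mynewthing = htmlentities($mything);
echo $mynewthing, "\n";
// prints: &lt;b&gt;BOLD &amp; BOLD&lt;/b&gt;

// html_entity_decode() turns the entities back into the original characters.
$restored = html_entity_decode($mynewthing);
echo $restored, "\n";
// prints: <b>BOLD & BOLD</b>
```

Echoing $mynewthing into a page shows the tags as text, which matches what the question describes; echoing $restored brings back the original bare &.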

Solution 3

htmlentities is basically a superset of htmlspecialchars, and htmlspecialchars also replaces < and >, so neither will leave your tags intact.

Actually, what you are trying to do is to fix invalid HTML code, and I think this needs an ad-hoc solution:

$row['details'] = preg_replace("/&(?![#0-9a-z]+;)/i", "&amp;", $row['details']);

This is not a perfect solution, since it will fail for strings like: someone&son; (with a trailing ;), but at least it won't break existing HTML entities.
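A quick way to see what the regex does and does not catch is to run it over a few sample strings. This is only a sketch: the helper name escape_bare_ampersands() and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper wrapping the regex from this answer.
function escape_bare_ampersands(string $html): string
{
    // Replace & only when it is NOT followed by something that looks like
    // an entity: letters, digits or '#', then a semicolon.
    return preg_replace('/&(?![#0-9a-z]+;)/i', '&amp;', $html);
}

$samples = [
    'PHP & MySQL',      // bare ampersand: gets escaped
    'PHP &amp; MySQL',  // already an entity: left alone
    'caf&#233;',        // numeric entity: left alone
    'someone&son; etc', // looks like an entity, so it is (wrongly) left alone
];

foreach ($samples as $s) {
    echo escape_bare_ampersands($s), "\n";
}
```

The first sample comes out as PHP &amp;amp; MySQL-style escaped text (that is, "PHP &amp; MySQL"), the second and third are unchanged, and the last one shows the known failure case.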

However, if you have decision power over how the data is stored, please enforce that the HTML code stored in the database is correct.
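One way to enforce this at write time, assuming a bare & is the only validity problem you care about, is to check the content just before it is saved. The function name has_bare_ampersand() is hypothetical, and the check reuses the heuristic regex above, so it shares its limits; it is not a full HTML validator.

```php
<?php
// Hypothetical pre-save check: true if the HTML contains an & that does
// not look like the start of an entity.
function has_bare_ampersand(string $html): bool
{
    return preg_match('/&(?![#0-9a-z]+;)/i', $html) === 1;
}

$details = '<p>Professional Freelance PHP & MySQL developer</p>';

if (has_bare_ampersand($details)) {
    // Fix (or reject) the content before the INSERT/UPDATE runs.
    $details = preg_replace('/&(?![#0-9a-z]+;)/i', '&amp;', $details);
}

echo $details, "\n";
```

With the data cleaned once on the way in, the output side can stay a plain echo $row['details'];.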

Author by

MAX POWER

Updated on June 04, 2022

Comments

  • MAX POWER
    MAX POWER almost 2 years

    I'm storing HTML and text data in my database table in its raw form - however I am having a slight problem in getting it to output correctly. Here is some sample data stored in the table AS IS:

    <p>Professional Freelance PHP & MySQL developer based in Manchester.
    <br />Providing an unbeatable service at a competitive price.</p>
    

    To output this data I do:

    echo $row['details'];
    

    And this outputs the data correctly, however when I do a W3C validator check it says:

    character "&" is the first character of a delimiter but occurred as data
    

    So I tried using htmlentities and htmlspecialchars, but this just causes the HTML tags to be output as text on the page.

    What is the correct way of doing this?

  • Luc M
    Luc M almost 11 years
    Simplest way to solve this problem. The data is HTML anyway.
  • MAX POWER
    MAX POWER almost 11 years
    I think you're correct.. since I'm storing HTML data then it should be safe enough to store the HTML entities.
  • urraka
    urraka almost 11 years
    This could break valid html. Consider a case where you have valid html like ... &amp; ..., it would replace it with ... &amp;amp; ...
  • gd1
    gd1 almost 11 years
    Yes, but I think it can be made even better by using regex.
  • Luc M
    Luc M almost 11 years
    Now, with the space, you have a problem when the string is something like Someone&son's
  • gd1
    gd1 almost 11 years
    I updated my answer with a regular expression. There are still cases in which it will fail, but making it "perfect" requires actually recognizing and skipping individual HTML entities. Feel free to improve it or to provide alternative (non-regexp) solutions.
  • Madara's Ghost
    Madara's Ghost almost 11 years
    Since he has control over the content, the actual best solution would be to write the & as &amp; beforehand. This is good practice with HTML in general, so there's no reason the HTML shouldn't be stored as valid in the database to begin with. (The HTML content itself probably shouldn't be stored in the database at all, but that's a different story.)