Storing HTML in SQL Server

58,262

Solution 1

VARCHAR(MAX) if it's all going to be ASCII-based, say for basic HTML templates

NVARCHAR(MAX) if the HTML could contain any content

NVARCHAR will roughly double your storage use, since it takes two bytes per character where VARCHAR typically takes one. HTML markup itself does not require NVARCHAR; only the content between the tags might, depending on the language of the text.
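
As a rough illustration of the doubling, the Python sketch below compares byte counts for the same HTML string in a single-byte encoding and in UTF-16. It assumes VARCHAR is backed by a single-byte code page (Windows-1252 is used here as a stand-in) and that NVARCHAR stores UTF-16 code units. The function name `storage_bytes` is hypothetical, and the numbers ignore SQL Server's per-row overhead.

```python
def storage_bytes(html: str) -> dict:
    """Approximate data bytes for the same text as VARCHAR vs NVARCHAR.

    Assumption: VARCHAR uses a single-byte code page (cp1252 as a stand-in)
    and NVARCHAR uses UTF-16LE. Row and page overhead are ignored.
    """
    return {
        "varchar": len(html.encode("cp1252")),
        "nvarchar": len(html.encode("utf-16-le")),
    }


template = "<div><span>[PLACEHOLDER]</span></div>"
sizes = storage_bytes(template)
print(sizes)
```

For an ASCII-only template like this one, the UTF-16 size comes out at exactly twice the single-byte size.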

Edit:

Many years on from giving this answer, I almost always use NVARCHAR now if there is any content between the tags. Unicode is popular...

I only use VARCHAR if just storing simple HTML templates, e.g. tags and placeholders:
<div><span>[PLACEHOLDER]</span></div>

Make the call based on your use case.
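
One way to make that call is to check whether the HTML you intend to store contains anything outside ASCII. The sketch below is a minimal Python helper under that assumption. `suggest_column_type` is a hypothetical name, not part of any library, and it treats pure ASCII as the only safe case for VARCHAR regardless of the column's collation.

```python
def suggest_column_type(html: str) -> str:
    """Suggest a SQL Server column type for an HTML string.

    Assumption: only pure-ASCII text is considered safe for VARCHAR;
    anything else is sent to NVARCHAR to avoid code-page data loss.
    """
    return "VARCHAR(MAX)" if html.isascii() else "NVARCHAR(MAX)"


# A tags-and-placeholders template versus content with non-ASCII text.
print(suggest_column_type("<div><span>[PLACEHOLDER]</span></div>"))
print(suggest_column_type("<p>Gr\u00fc\u00dfe</p>"))
```

In practice you would run a check like this over a representative sample of the content, not a single string.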

Solution 2

Put it in an NVARCHAR(MAX) (or smaller).
HTML is no different from other text.

Author by

Petrus Theron


Updated on June 02, 2020

Comments

  • Petrus Theron
    Petrus Theron almost 4 years

    What data type should I use to store HTML content in SQL Server 2008?

    It's for dynamic content for a CMS.

  • TomTom
    TomTom about 13 years
    Ah - no. That is a good question, but standard HTML would require that content "based on language" to encode special chars ;) No Unicode in HTML, sorry. Normal ASCII set only.
  • Dave Sumter
    Dave Sumter about 13 years
    Damn, and I've been storing HTML in VARCHAR all these years.. ;-)
  • TomTom
    TomTom about 13 years
    Because HTML has only ASCII characters, as I say. All special language characters must be encoded ;) So, the db only sees ASCII.... OR it is not HTML ;)
  • Petrus Theron
    Petrus Theron about 13 years
    Chosen as answer because valid HTML should not contain (unencoded) Unicode and my HTML content is guaranteed to be valid. Hence, VARCHAR(MAX).
  • Kenny Evitt
    Kenny Evitt about 12 years
    According to the Wikipedia article Unicode and HTML the HTML standard extended the document character set from ISO-8859-1 to ISO 10646 and asserts (parenthetically) that that character set "... is basically equivalent to Unicode".
  • Jodrell
    Jodrell about 10 years
    Upvote @KennyEvitt, so the type needs to be NVarChar unless you will only store HTML documents with an "external character encoding" or "charset" which is a subset of ASCII. In the latter case, all graphemes represented in the document will be ASCII characters so VarChar will be safe.
  • Kushan Randima
    Kushan Randima over 5 years
    Can you please explain why it depends on the language? Is it related to Unicode?
  • SLaks
    SLaks over 5 years
    @KushanRandima: What are you talking about?
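
The disagreement in the comments comes down to whether non-ASCII characters are written as character references or stored literally. As a sketch of the first approach, Python's built-in `xmlcharrefreplace` error handler rewrites every non-ASCII character as a numeric character reference, leaving an ASCII-only string that would fit in VARCHAR. The helper name `to_ascii_html` is hypothetical. HTML also permits literal Unicode characters, so this is a choice, not a requirement.

```python
def to_ascii_html(html: str) -> str:
    """Replace each non-ASCII character with a numeric character reference."""
    return html.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_ascii_html("<p>caf\u00e9</p>")
print(encoded)  # prints: <p>caf&#233;</p>
```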