SQL Server silently truncates varchar's in stored procedures

47,145

Solution 1

It just is.

I've never noticed a problem though because one of my checks would be to ensure my parameters match my table column lengths. In the client code too. Personally, I'd expect SQL to never see data that is too long. If I did see truncated data, it'd be bleeding obvious what caused it.

If you do feel the need for varchar(max) beware a massive performance issue because of datatype precedence. varchar(max) has higher precedence than varchar(n) (longest is highest). So in this type of query you'll get a scan not a seek and every varchar(100) value is CAST to varchar(max)

UPDATE ...WHERE varchar100column = @varcharmaxvalue

Edit:

There is an open Microsoft Connect item regarding this issue.

And it's probably worthy of inclusion in Erland Sommarkog's Strict settings (and matching Connect item).

Edit 2, after Martins comment:

DECLARE @sql VARCHAR(MAX), @nsql nVARCHAR(MAX);
SELECT @sql = 'B', @nsql = 'B'; 
SELECT 
   LEN(@sql), 
   LEN(@nsql), 
   DATALENGTH(@sql), 
   DATALENGTH(@nsql)
;

DECLARE @t table(c varchar(8000));
INSERT INTO @t values (replicate('A', 7500));

SELECT LEN(c) from @t;
SELECT 
   LEN(@sql + c), 
   LEN(@nsql + c), 
   DATALENGTH(@sql + c), 
   DATALENGTH(@nsql + c) 
FROM @t;

Solution 2

Thanks, as always, to StackOverflow for eliciting this kind of in-depth discussion. I have recently been scouring through my Stored Procedures to make them more robust using a standard approach to transactions and try/catch blocks. I disagree with Joe Stefanelli that "My suggestion would be to make the application side responsible", and fully agree with Jez: "Having SQL Server verify the string length would be much preferable". The whole point for me of using stored procedures is that they are written in a language native to the database and should act as a last line of defence. On the application side the difference between 255 and 256 is just a meangingless number but within the database environment, a field with a maximum size of 255 will simply not accept 256 characters. The application validation mechanisms should reflect the backend db as best they can, but maintenance is hard so I want the database to give me good feedback if the application mistakenly allows unsuitable data. That's why I'm using a database instead of a bunch of text files with CSV or JSON or whatever.

I was puzzled why one of my SPs threw the 8152 error and another silently truncated. I finally twigged: The SP which threw the 8152 error had a parameter which allowed one character more than the related table column. The table column was set to nvarchar(255) but the parameter was nvarchar(256). So, wouldn't my "mistake" address gbn's concern: "massive performance issue"? Instead of using max, perhaps we could consistently set the table column size to, say, 255 and the SP parameter to just one character longer, say 256. This solves the silent truncation problem and doesn't incur any performance penalty. Presumably there is some other disadvantage that I haven't thought of, but it seems a good compromise to me.

Update: I'm afraid this technique is not consistent. Further testing reveals that I can sometimes trigger the 8152 error and sometimes the data is silently truncated. I would be very grateful if someone could help me find a more reliable way of dealing with this.

Update 2: Please see Pyitoechito's answer on this page.

Solution 3

Update: I'm afraid this technique is not consistent. Further testing reveals that I can sometimes trigger the 8152 error and sometimes the data is silently truncated. I would be very grateful if someone could help me find a more reliable way of dealing with this.

This is probably occurring because the 256th character in the string is white-space. VARCHARs will truncate trailing white-space on insertion and just generate a warning. So your stored procedure is silently truncating your strings to 256 characters, and your insertion is truncating the trailing white-space (with a warning). It will produce an error when said character is not white-space.

Perhaps a solution would be to make the stored procedure's VARCHAR a suitable length to catch a non-white-space character. VARCHAR(512) would probably be safe enough.

Solution 4

The same behavior can be seen here:

declare @testStringField [nvarchar](5)
set @testStringField = N'string which is too long'
select @testStringField

My suggestion would be to make the application side responsible for validating the input before calling the stored procedure.

Solution 5

One solution would be to:

  1. Change all incoming parameters to be varchar(max)
  2. Have sp private variable of the correct datalength (simply copy and paste all in parameters and add "int" at the end
  3. Declare a table variable with the column names the same as variable names
  4. Insert into the table a row where each variable goes into the column with the same name
  5. Select from the table into internal variables

This way your modifications to the existing code are going to be very minimal like in the sample below.

This is the original code:

create procedure spTest
(
    @p1 varchar(2),
    @p2 varchar(3)
)

This is the new code:

create procedure spTest
(
    @p1 varchar(max),
    @p2 varchar(max)
)
declare @p1Int varchar(2), @p2Int varchar(3)
declare @test table (p1 varchar(2), p2 varchar(3)
insert into @test (p1,p2) varlues (@p1, @p2)
select @p1Int=p1, @p2Int=p2 from @test

Note that if the length of the incoming parameters is going to be greater than the limit instead of silently chopping off the string SQL Server will throw off an error.

Share:
47,145
Jez
Author by

Jez

Long-time coder, with some interest in French and Philosophy. I sometimes hang out in the English Language & Usage chatroom. Check out my Firefox addons! :-)

Updated on December 19, 2020

Comments

  • Jez
    Jez over 3 years

    According to this forum discussion, SQL Server (I'm using 2005 but I gather this also applies to 2000 and 2008) silently truncates any varchars you specify as stored procedure parameters to the length of the varchar, even if inserting that string directly using an INSERT would actually cause an error. eg. If I create this table:

    CREATE TABLE testTable(
        [testStringField] [nvarchar](5) NOT NULL
    )
    

    then when I execute the following:

    INSERT INTO testTable(testStringField) VALUES(N'string which is too long')
    

    I get an error:

    String or binary data would be truncated.
    The statement has been terminated.
    

    Great. Data integrity preserved, and the caller knows about it. Now let's define a stored procedure to insert that:

    CREATE PROCEDURE spTestTableInsert
        @testStringField [nvarchar](5)
    AS
        INSERT INTO testTable(testStringField) VALUES(@testStringField)
    GO
    

    and execute it:

    EXEC spTestTableInsert @testStringField = N'string which is too long'
    

    No errors, 1 row affected. A row is inserted into the table, with testStringField as 'strin'. SQL Server silently truncated the stored procedure's varchar parameter.

    Now, this behaviour might be convenient at times but I gather there is NO WAY to turn it off. This is extremely annoying, as I want the thing to error if I pass too long a string to the stored procedure. There seem to be 2 ways to deal with this.

    First, declare the stored proc's @testStringField parameter as size 6, and check whether its length is over 5. This seems like a bit of a hack and involves irritating amounts of boilerplate code.

    Second, just declare ALL stored procedure varchar parameters to be varchar(max), and then let the INSERT statement within the stored procedure fail.

    The latter seems to work fine, so my question is: is it a good idea to use varchar(max) ALWAYS for strings in SQL Server stored procedures, if I actually want the stored proc to fail when too long a string is passed? Could it even be best practice? The silent truncation that can't be disabled seems stupid to me.

  • Jez
    Jez over 13 years
    Well yeah, that's obviously what I'll do, but as it's a stored procedure I don't think it's a good idea to assume that my code would be the only code calling it. Having SQL Server verify the string length would be much preferable.
  • Matthew
    Matthew over 13 years
    I know I've seen this behavior with the SET command, but I hadn't noticed this in the sp parameters... Then again I have only used R2. Do you know if this still applies? (Sorry, cannot test as I do not have access to a server right now)
  • Jez
    Jez over 13 years
    If you're gonna rely on all client-side code checking the length of varchar's, why bother setting a size for varchar fields at all?
  • ZygD
    ZygD over 13 years
    varchar(max) is a LOB type: there are limitations. Also, my parameters match my table lengths. And, I don't really care that much: my design and tables and parameters are what they are.
  • Matthew
    Matthew over 13 years
    @Jez, a large part of DB design is ease of upkeep and search performance. As gbn noted if you can avoid using (MAX) on searched fields you probably should. That being said, it's also a good idea generally to make the field the largest you could ever consider using it.
  • Jez
    Jez over 13 years
    @gbn: Yes, but surely a major (if not THE major) reason for typing fields in a DBMS is to ensure database integrity? For that reason, if SQL Server started doing unavoidable string truncation for varchar's in an INSERT statement, I think (although I'm not sure given some of the other stuff SQL Server gets away with) that there would be widespread condemnation of this behaviour. But, who uses raw INSERT statements anymore? The overwhelming best practice is to insert via a stored proc. So assuming you must do the latter, SQL server is - effectively - silently truncating strings in an INSERT.
  • ZygD
    ZygD over 13 years
    @Jez: many folk use ORMs, others have no idea and use raw INSERTs. As I said, it just is. Sybase at least has the same behaviour.
  • Jez
    Jez over 13 years
    Well I could accept your answer if you amended it to say something about how this is probably irritating and bad behaviour that should be changed by Microsoft in a future release. :-)
  • ZygD
    ZygD over 13 years
    @Jez: I try to fight battles I can win ;-) I've added some suggestions though but it really doesn't bother me day to day.
  • araqnid
    araqnid over 13 years
    "The overwhelming best practice is to insert via a stored proc" - just because lots of people do it, doesn't mean it's a good idea.
  • Jez
    Jez over 13 years
    So, you're telling me it's not a good idea? You're advocating inline SQL?
  • Jez
    Jez over 13 years
    I've accepted this answer. However, I do think this is a bug MS should fix. SQL:2008 seems to suggest that ...
  • Jez
    Jez over 13 years
    ... it violates the standard: "With two exceptions, a character string expression is assignable only to sites of a character string type whose character set is the same. The exceptions are as specified in Subclause 4.2.8, “Universal character sets”, and such other cases as may be implementation-defined. If a store assignment would result in the loss of non-<space> characters due to truncation, then an exception condition is raised. If a retrieval assignment or evaluation of a <cast specification> would result in the loss of characters due to truncation, then a warning condition is raised." ...
  • Jez
    Jez over 13 years
    I'd say that passing data to a stored procedure as a parameter is a 'store assignment', but maybe there is something even more specific about SPs in the spec that says this explicitly. It's too massive for me to read in any more detail. :-)
  • Jez
    Jez over 13 years
    Oh, and just 1 more thing in case I hadn't said enough. I just checked and MySQL truncates varchar's that are too lnog by default (not surprising); but, if you first run "SET sql_mode='TRADITIONAL'", it gives you an ERROR when you try to pass a string to an SP which is too long! So SQL Server is actually behind MySQL on this one. ;-)
  • ZygD
    ZygD over 13 years
    @Jez: thank you. We use SPs mostly, I was just saying to appease the "who needs SQL" crowd...
  • Martin Smith
    Martin Smith about 13 years
    Are you sure on the data type precedence thing? If so why does this get converted to nvarchar(4000) and truncated instead of nvarchar(max)? DECLARE @sql NVARCHAR(MAX) = ''; declare @t table(c varchar(8000)) insert into @t values (replicate('A',8000)) select LEN(@sql + c) from @t Also I see a seek here declare @v varchar(max) = 'a';declare @foo table(v varchar(10) primary key);SELECT * FROM @foo WHERE v = @v
  • ZygD
    ZygD about 13 years
    @martin: 1. replicate truncates at 8000 bytes if first param is not max 2. LEN(@sql + c) is cast correctly. See update
  • Martin Smith
    Martin Smith about 13 years
    @gbn - but I'm not trying to use replicate to create a string > 8000 bytes. It creates a varchar(8000) string. I'm saying why is this correctly implicitly cast to nvarchar but not to max if max has higher precedence?
  • ZygD
    ZygD about 13 years
    @martin: c is CAST to nvarchar(max) from varchar(8000). This changes a 8000 ascii chars to 4000 unicode chars. Now I've really updated answer. Update: 4000 chars are lost from the end of c
  • Martin Smith
    Martin Smith about 13 years
    @gbn - Actually RE: the seek having looked at the plan more closely it looks like it gets converted to a range seek not just a straight forward eguality seek Seek Keys[1]: Start: v > Scalar Operator([Expr1004]), End: v < Scalar Operator([Expr1005]) so yes I can see it is treated differently and this conversion is happenning but it doesn't result in a scan in 2008. I guess the varchar(4000-8000) to nvarchar(max) via nvarchar(4000) is just a bit of a gotcha to be aware of.
  • Martin Smith
    Martin Smith about 13 years
    And probably particularly a gotcha in the context I first noticed it which was where the OP was concatenating a string literal that happened to be between 4000 and 8000 characters to an nvarchar(max) variable and had neglected to use the N prefix
  • Tim Goodman
    Tim Goodman almost 11 years
    @Pyitoechito mentions in a separate answer that the silent truncation probably happens when it's only truncating whitespace. (I figured it'd be good to add a comment on this answer.)
  • DavidHyogo
    DavidHyogo almost 11 years
    Good tip. Thanks for the advice.
  • Elaskanator
    Elaskanator over 4 years
  • Richardissimo
    Richardissimo over 3 years
    There is a Broken link: "open Microsoft Connect item"... is it this page? feedback.azure.com/forums/908035-sql-server/suggestions/…
  • ZygD
    ZygD over 3 years
    @Richardissimo MS shut connect down, and this answer is old
  • Richardissimo
    Richardissimo over 3 years
    @gbn I'm aware of that, and I made the effort to try to find where the article in question had gone, and I posted the link to the one I found in my comment. I wasn't sure whether the one I found was the correct one, so I didn't want to just edit the answer.