Best way to write union query when dealing with NULL and Empty String values

25,299

Solution 1

I don't think it would make any difference in performance, but NULLIF is another way to write this and, IMHO, looks a little cleaner.

Select  
    NULLIF(Column1, '') as [Column1],
    NULLIF(Column2, '') as [Column2]
From TableA

UNION

Select 
    NULLIF(Column1, '') as [Column1],
    NULLIF(Column2, '') as [Column2]
From TableB

Solution 2

A Case should perform fine, but IsNull is more natural in this situation. And if you're searching for distinct rows, doing a union instead of a union all will accomplish that (thanks to Jeffrey L Whitledge for pointing this out):

select  IsNull(col1, '')
,       IsNull(col2, '')
from    TableA
union
select  IsNull(col1, '')
,       IsNull(col2, '')
from    TableB

Solution 3

Use UNION to remove duplicates - it's slower than UNION ALL for this functionality:

SELECT CASE 
         WHEN LEN(LTRIM(RTRIM(column1))) = 0 THEN NULL
         ELSE column1
       END AS column1,
       CASE 
         WHEN LEN(LTRIM(RTRIM(column2))) = 0 THEN NULL
         ELSE column2
       END AS column2
  FROM TableA
UNION 
SELECT CASE 
         WHEN LEN(LTRIM(RTRIM(column1))) = 0 THEN NULL
         ELSE column1
       END,
       CASE 
         WHEN LEN(LTRIM(RTRIM(column2))) = 0 THEN NULL
         ELSE column2
       END 
  FROM TableB

I changed the logic to return NULL if the column value contains any number of spaces and no actual content.

CASE expressions are ANSI, and more customizable than NULLIF/etc syntax.

Share:
25,299
dretzlaff17
Author by

dretzlaff17

.NET Software Engineer

Updated on August 20, 2020

Comments

  • dretzlaff17
    dretzlaff17 almost 4 years

    I have to write a query that performs a union between two tables with similar data. The results need to be distinct. The problem I have is that some fields that should be the same are not when it comes to empty values. Some are indicated as null, and some have empty string values. My question is, is there a better way to perform the following query? (without fixing the actual data to ensure proper defaults are set, etc) Will using the Case When be a big performance hit?

    Select  
        When Column1 = '' Then NULL Else Column1 as [Column1],
        When Column2 = '' Then NULL Else Column2 as [Column2]
    From TableA
    
    UNION ALL
    
    Select 
        When Column1 = '' Then NULL Else Column1 as [Column1],
        When Column2 = '' Then NULL Else Column2 as [Column2]
    From TableB