Case Sensitive join TSQL

13,294

The problem is that in the second row of your results "John's Shop" shouldn't have matched "John's shop"?

You can use a case sensitive collation.

This is probably best achieved by altering the collation of the columns involved to allow index use but you can also do it at run time with an explicit COLLATE clause as below.

SELECT DISTINCT TenantID,
                Name,
                Name2,
                TenantNumber,
                Cashname
FROM   Tenants
       LEFT OUTER JOIN tCash
         ON TenantNumber = CashNumber
            AND tMoney.CashName = Tenants.Name2 COLLATE Latin1_General_100_CS_AS 

The comments about joining on id instead of name are likely correct though and would negate the need to do this at all.

Share:
13,294

Related videos on Youtube

Geo
Author by

Geo

Analyst focused in BI and Analytics!

Updated on June 04, 2022

Comments

  • Geo
    Geo almost 2 years

    I am at a bit of a standstill here. I have a simple left outer join to a table that is returning an ID.

    My code is as

    Select distinct TenantID
    ,Name
    ,Name2
    ,TenantNumber
    ,Cashname 
    From Tenants 
    LEFT OUTER JOIN tCash 
    on TenantNumber = CashNumber 
    and tMoney.CashName = Tenants.Name2
    

    My result set is as follows:

    **TenantID | Name      | Name2          | TenantNo | CashName**
    
    100      |MyShop     | John's shop    | 12345    |John's shop
    
    999      |MyShop     | John's Shop    | 12345    |John's shop
    

    My Issue: for all intents and purposes, "John's shop" IS different from "John's Shop" - I am correctly joining my money table on the TenantNo and then on Name2, but name 2 is different by Case.

    Question: Is there any way to differentiate a join based on case sensitivity? I would not want to use UPPER or LOWER due to the fact that it would ruin the case on reporting.

    Thanks!

    Adding Table information below, please assume all columns are trimmed of whitespace.

    tMoney

    CashNumnbr |  CashName
    102504       Bill's Place     
    
    102374       Tom's Shop      
    
    12345        John's Shop
    
    12345        John's shop
    

    Tenants

       TenantID | Name     | Name2            |TenantNumber 
    
      1         |MyShop    | John's Shop     | 12345
    
      2         |MyShop    | John's shop     | 12345
    
      3         |Shoppee   |  Bill's Place   | 102504        
    
      4         | Shop2    | Toms Shop      | 102374        
    

    Since I want to join to get the correct TenantID for an AR report, I would want to make sure I am always bringing in the correct tenant. If the case is different, is there anything I can write to differentiate a situation like John's Shop?

    • dfundako
      dfundako almost 8 years
      Try trimming leading and trailing white space or confirming the length/datalength of both values
    • Geo
      Geo almost 8 years
      I am trimming both Name2 and the TenantNumber. It will still produce 2 rows as shown above.
    • dfundako
      dfundako almost 8 years
      please provide data from both tCash and Tenants
    • Kritner
      Kritner almost 8 years
      yucky. If you can help it, you should probably change your tCash column to use the tenantId, rather than the name of the shop
    • Geo
      Geo almost 8 years
      @Kritner the issue is i'm trying to return a distinct tenantID. In the world of tenants and leases, unfortunately situations like this can happen. It is crucial to get the right, case-sensitive tenant. Otherwise, I would if I could :/
    • Martin Smith
      Martin Smith almost 8 years
      @Geo - But instead of storing the tenant's name as a string in tCash you should use a surrogate key such as tenant id. How do you know that the names are unique even case sensitively? What happens if a name changes?
    • Geo
      Geo almost 8 years
      A tenant can have multiple variations of a Name2 and have the same TenantNumber. So for reporting, we want to bring in the right tenantID. For instance, lease 100, 101, 102 have tenants John's Shop, John's shop and J's Shop. We can assume they are all the same owner/tenant so they have = TenantNumber. This could happen in terms of an alias or DBA in the business world. I don't like this rational but it is what it is. Thus, I know that unique names have = Tenant#s but different TenantIDs...and different leases, etc. etc. @MartinSmith
  • CathalMF
    CathalMF almost 8 years
    Thats interesting. I was totally unaware that string comparisons were case insensitive.
  • Martin Smith
    Martin Smith almost 8 years
    @CathalMF - It depends on the collation but if you just accept the defaults when installing it and never set database or column collations explicitly you end up with a case insensitive collation.