Selecting Values From a table as Column Headers

sql sql-server sql-server-2008 tsql pivot

22,357

Solution 1

A version with joins that works regardless of missing rows:

SELECT  
    pd.FileID 
  , p1.Value  AS Name
  , p2.Value  AS Size
  , p3.Value  AS Type
FROM
        ( SELECT DISTINCT FileID
          FROM propertyvalues 
        ) AS pd
    LEFT JOIN
        propertyvalues AS p1
            ON  p1.FileID = pd.FileID 
            AND p1.Property = 'Name'
    LEFT JOIN
        propertyvalues AS p2
            ON  p2.FileID = pd.FileID 
            AND p2.Property = 'Size'
    LEFT JOIN
        propertyvalues AS p3
            ON  p3.FileID = pd.FileID
            AND p3.Property = 'Type' ;

If you have a table where FileID is the primary key, you may replace the DISTINCT subquery with that table.

Regarding efficiency, it depends on a lot of factors. Examples:

Do all FileIDs have rows with Name, Size and Type and no other properties (and your table has a clustered index on (FileID, Property))? Then the MAX(CASE...) version would perform quite well as the whole table would have to be scanned anyway.
Are there (many) more than 3 properties and a lot of FileIDs have no Name, Size and Type, then the JOIN version would work well with an index on (Property, FileID) INCLUDE (Value) as only this index data would be used for the joins.
Not sure how efficient is the PIVOT version.

What I suggest though is to test the various versions with your data and table sizes, in your envirorment (version, disk, memory, settings, ...) before you select which one to use.

Solution 2

You did not specify RDBMS, if you know the number of columns to transform then you can hard-code the values:

select FileId,
  max(case when property = 'Name' then value end) Name,
  max(case when property = 'Size' then value end) Size,
  max(case when property = 'Type' then value end) Type
from yourtable
group by FileId

This is basically a PIVOT function, some RDBMS will have a PIVOT, if you do then you can use the following, PIVOT is available in SQL Server, Oracle:

select *
from 
(
  select FileId, Property, Value
  from yourTable
) x
pivot
(
  max(value)
  for property in ([Name], [Size], [Type])
) p

If you have an unknown number of columns to transform, then you can use a dynamic PIVOT. This gets the list of columns to transform at run-time:

DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT distinct ',' + QUOTENAME(property) 
                    from yourtable
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = 'SELECT ' + @cols + ' from 
             (
                select FileId, Property, Value
                from yourtable
            ) x
            pivot 
            (
                max(value)
                for Property in (' + @cols + ')
            ) p '

execute(@query)

22,357

Author by

Ankit Khatri

Updated on November 16, 2020

Comments

Ankit Khatri over 3 years

I have a table 'propertyvalues' as follows:

ID  FileID  Property  Value

 1   x       Name      1.pdf

 2   x       Size      12567

 3   x       Type      application/pdf

 4   y       Name      2.pdf

 5   y       Size      23576

 6   y       Type      application/pdf

......

and so on

How to write a SQL query on the table above to fetch a result like below

 
FileID  Name     Size      Type

  x     1.pdf    12567     application/pdf

  y     2.pdf    23576     application/pdf

Anthony over 11 years

Too complicated solution for too simple problem
Taryn over 11 years

@Antonio I am not sure why you say it is too complicated. This is much easier than performing multiple joins. If they had 10 fields to transform, then you would have to perform 10 joins.
Anthony over 11 years

Obviously there are just 3 fields. Neither more, nor less.
Taryn over 11 years

but then you still have to perform multiple joins, this method has no joins, they are not needed for this type of query. Not only that, the OP possibly only showed a partial list of fields. Your query will work, but this it not wrong an IMO does not deserve a downvote.
Nikola Markovinović over 11 years

+1 - obviously not over-engineering, also three answers in one.
dezso over 11 years

@Antonio You are terribly wrong. The first solution is even simpler than yours, the others are more generalized but still not complicated.
Anthony over 11 years

The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes. in tables with >1M records the diference is really huge. "PIVOT" approach is good but it is not supported by all the databases and is also slow. I recommend use JOINs because of the speed.
ypercubeᵀᴹ over 11 years

You probably want to put LEFT joins there, not JOIN.
Anthony over 11 years

I mean "INNER JOIN" which is the default
ypercubeᵀᴹ over 11 years

Then your query will not show results for FileIDs that have a Name but not a Size or Type (and vice-versa).
ypercubeᵀᴹ over 11 years

I like Joins but they are not always faster. It depends.
Admin over 11 years

@Antonio "The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes." this is utter nonsense. Using 'CASE' like this may be heavier on the CPU, and may be lighter on i/o, but which is 'faster' will depend on your environment. If I had to guess I'd say that using CASE will usually be faster.
Admin over 11 years

worth noting that your answer and bluefeet's are not equivalent unless certain assumptions are made regarding uniqueness and the presence of nulls
ypercubeᵀᴹ over 11 years

@Jack: Yes, I assumed that (FileID, Property) is unique. I don't think that nulls matters in that case.
Nikola Markovinović over 11 years

@JackDouglas case always won during my (rather limited) testing; the only time join came close to max(case...) was with index on (FileID, Property) and even then it didn't win. 3M records, different indexing.
ypercubeᵀᴹ over 11 years

Devil's advocate on a similar problem: 20 LEFT JOINS lightly faster than SUM(CASE) with GROUP BY
Taryn over 11 years

my preference would be PIVOT, I personally think it is easier than coding for each CASE statement. :)
Admin over 11 years

@ypercube MySQL is MySQL - my comments refer to SQL Server. I'd never hazard a guess about anything on MySQL as it seems to have so many quirks.
Admin over 11 years

+1 from me this is a perfectly valid answer (making certain assumptions about the data) - not sure why it is downvoted
ypercubeᵀᴹ over 11 years

@Jack: An SQL-Server example (tested by the OP, not me so I cannot assert on indexes or distribution): Totalling up ballot results (and not exactly the same case, just a similarity)
Martin Smith over 11 years

@JackDouglas - If optimal indexes are present then I would favour the join version for performance though not maintainability as per my reasoning here
Admin over 11 years

@Martin - interesting. I presume the same would not apply to CASE however as that will just be a full scan and an aggregate from a plan point of view.
aswzen about 7 years

soooo slow..15 row generated on 15 sec
ypercubeᵀᴹ about 7 years

@aswzen what exactly is slow?