Selecting Values From a table as Column Headers
Solution 1
A version with joins that works regardless of missing rows:
SELECT
    pd.FileID
  , p1.Value AS Name
  , p2.Value AS Size
  , p3.Value AS Type
FROM
    ( SELECT DISTINCT FileID
      FROM propertyvalues
    ) AS pd
  LEFT JOIN
    propertyvalues AS p1
      ON  p1.FileID = pd.FileID
      AND p1.Property = 'Name'
  LEFT JOIN
    propertyvalues AS p2
      ON  p2.FileID = pd.FileID
      AND p2.Property = 'Size'
  LEFT JOIN
    propertyvalues AS p3
      ON  p3.FileID = pd.FileID
      AND p3.Property = 'Type' ;
If you have a table where FileID is the primary key, you may replace the DISTINCT subquery with that table.
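To illustrate what "works regardless of missing rows" means, here is a minimal sketch that runs the join query against an in-memory SQLite database from Python. SQLite and the sample data are assumptions made for the demonstration (the file y deliberately has only a Name row); the original answer does not target a specific engine. The same query is run once with LEFT JOIN and once with INNER JOIN to show which FileIDs survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE propertyvalues ("
    "ID INTEGER PRIMARY KEY, FileID TEXT, Property TEXT, Value TEXT)"
)
conn.executemany(
    "INSERT INTO propertyvalues (FileID, Property, Value) VALUES (?, ?, ?)",
    [
        ("x", "Name", "1.pdf"),
        ("x", "Size", "12567"),
        ("x", "Type", "application/pdf"),
        ("y", "Name", "2.pdf"),  # hypothetical: y has no Size or Type rows
    ],
)

# The join type is a placeholder so the same query can be run both ways.
QUERY = """
SELECT pd.FileID, p1.Value AS Name, p2.Value AS Size, p3.Value AS Type
FROM (SELECT DISTINCT FileID FROM propertyvalues) AS pd
{join} propertyvalues AS p1
    ON p1.FileID = pd.FileID AND p1.Property = 'Name'
{join} propertyvalues AS p2
    ON p2.FileID = pd.FileID AND p2.Property = 'Size'
{join} propertyvalues AS p3
    ON p3.FileID = pd.FileID AND p3.Property = 'Type'
ORDER BY pd.FileID
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()

print("LEFT JOIN :", left_rows)   # y is kept, with None for the missing values
print("INNER JOIN:", inner_rows)  # y is dropped entirely
```

With LEFT JOIN the incomplete file still appears, with NULLs for the properties it lacks; with INNER JOIN it disappears from the result.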
Regarding efficiency, it depends on a lot of factors. Examples:
- Do all FileIDs have rows with Name, Size and Type and no other properties (and does your table have a clustered index on (FileID, Property))? Then the MAX(CASE...) version would perform quite well, as the whole table would have to be scanned anyway.
- Are there (many) more than 3 properties, and do a lot of FileIDs have no Name, Size and Type? Then the JOIN version would work well with an index on (Property, FileID) INCLUDE (Value), as only this index data would be used for the joins.
- I am not sure how efficient the PIVOT version is.
What I suggest, though, is to test the various versions with your data and table sizes, in your environment (version, disk, memory, settings, ...) before you select which one to use.
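As a rough sketch of such a test, the script below times the JOIN version and the MAX(CASE...) version on synthetic data. Everything in it is an assumption made for illustration: it uses SQLite from Python rather than your actual RDBMS, the data is generated, and the index named ix_prop_file is only a stand-in for (Property, FileID) INCLUDE (Value), since SQLite has no INCLUDE clause. The timings it prints say nothing about SQL Server or any other engine; the point is the shape of the harness, which first checks that both queries agree and then compares their best run times.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE propertyvalues ("
    "ID INTEGER PRIMARY KEY, FileID TEXT, Property TEXT, Value TEXT)"
)

# Synthetic data: 2000 hypothetical files, each with Name, Size and Type.
data = []
for i in range(2000):
    file_id = f"f{i:05d}"
    data.append((file_id, "Name", f"{i}.pdf"))
    data.append((file_id, "Size", str(1000 + i)))
    data.append((file_id, "Type", "application/pdf"))
conn.executemany(
    "INSERT INTO propertyvalues (FileID, Property, Value) VALUES (?, ?, ?)", data
)
# Stand-in for an index on (Property, FileID) INCLUDE (Value).
conn.execute(
    "CREATE INDEX ix_prop_file ON propertyvalues (Property, FileID, Value)"
)

JOIN_SQL = """
SELECT pd.FileID, p1.Value, p2.Value, p3.Value
FROM (SELECT DISTINCT FileID FROM propertyvalues) AS pd
LEFT JOIN propertyvalues AS p1
    ON p1.FileID = pd.FileID AND p1.Property = 'Name'
LEFT JOIN propertyvalues AS p2
    ON p2.FileID = pd.FileID AND p2.Property = 'Size'
LEFT JOIN propertyvalues AS p3
    ON p3.FileID = pd.FileID AND p3.Property = 'Type'
ORDER BY pd.FileID
"""

CASE_SQL = """
SELECT FileID,
       MAX(CASE WHEN Property = 'Name' THEN Value END),
       MAX(CASE WHEN Property = 'Size' THEN Value END),
       MAX(CASE WHEN Property = 'Type' THEN Value END)
FROM propertyvalues
GROUP BY FileID
ORDER BY FileID
"""


def timed(sql, repeats=5):
    """Run the query several times; return its rows and the best wall time."""
    best = float("inf")
    rows = None
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


join_rows, join_secs = timed(JOIN_SQL)
case_rows, case_secs = timed(CASE_SQL)
same_result = join_rows == case_rows

print(f"same result: {same_result}")
print(f"JOIN version: {join_secs:.4f}s, MAX(CASE) version: {case_secs:.4f}s")
```

Checking that the two result sets match before comparing times matters here, because the versions are only equivalent when (FileID, Property) is unique.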
Solution 2
You did not specify an RDBMS. If you know the number of columns to transform, then you can hard-code the values:
select FileId,
  max(case when property = 'Name' then value end) Name,
  max(case when property = 'Size' then value end) Size,
  max(case when property = 'Type' then value end) Type
from yourtable
group by FileId
This is basically a PIVOT operation. Some RDBMSs have a PIVOT function; if yours does, you can use the following. PIVOT is available in SQL Server and Oracle (the bracketed column list below is SQL Server syntax):
select *
from
(
  select FileId, Property, Value
  from yourTable
) x
pivot
(
  max(value)
  for property in ([Name], [Size], [Type])
) p
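To see why the hard-coded MAX(CASE ...) query above yields one row per FileId, it can help to look at the CASE expressions before and after aggregation. The sketch below does that with an in-memory SQLite database from Python; SQLite is an assumption chosen so the example is runnable anywhere, and it does not cover the PIVOT syntax, which SQLite lacks. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "ID INTEGER PRIMARY KEY, FileId TEXT, Property TEXT, Value TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable (FileId, Property, Value) VALUES (?, ?, ?)",
    [
        ("x", "Name", "1.pdf"),
        ("x", "Size", "12567"),
        ("x", "Type", "application/pdf"),
        ("y", "Name", "2.pdf"),
        ("y", "Size", "23576"),
        ("y", "Type", "application/pdf"),
    ],
)

# Step 1: the CASE expressions alone spread each value into its own column,
# leaving NULL (None in Python) in the other two columns of that row.
spread = conn.execute(
    """
    SELECT FileId,
           CASE WHEN Property = 'Name' THEN Value END AS Name,
           CASE WHEN Property = 'Size' THEN Value END AS Size,
           CASE WHEN Property = 'Type' THEN Value END AS Type
    FROM yourtable
    ORDER BY ID
    """
).fetchall()

# Step 2: GROUP BY plus MAX folds the rows of each FileId into one,
# because MAX ignores NULLs and keeps the single non-NULL value per column.
pivoted = conn.execute(
    """
    SELECT FileId,
           MAX(CASE WHEN Property = 'Name' THEN Value END) AS Name,
           MAX(CASE WHEN Property = 'Size' THEN Value END) AS Size,
           MAX(CASE WHEN Property = 'Type' THEN Value END) AS Type
    FROM yourtable
    GROUP BY FileId
    ORDER BY FileId
    """
).fetchall()

for row in spread:
    print("spread :", row)
for row in pivoted:
    print("pivoted:", row)
```

If a FileId had two rows for the same property, MAX would silently keep only the larger value, which is one of the assumptions behind this approach.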
If you have an unknown number of columns to transform, then you can use a dynamic PIVOT. This gets the list of columns to transform at run-time:
DECLARE @cols AS NVARCHAR(MAX),
        @query AS NVARCHAR(MAX)

-- Build a comma-separated, bracket-quoted list of the distinct property names
select @cols = STUFF((SELECT distinct ',' + QUOTENAME(property)
                      from yourtable
                      FOR XML PATH(''), TYPE
                     ).value('.', 'NVARCHAR(MAX)')
                     ,1,1,'')

-- FileId is selected explicitly so it appears next to the pivoted columns
set @query = 'SELECT FileId, ' + @cols + ' from
             (
                select FileId, Property, Value
                from yourtable
             ) x
             pivot
             (
                max(value)
                for Property in (' + @cols + ')
             ) p '

execute(@query)
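For engines without PIVOT, the same run-time idea can be sketched by generating the MAX(CASE ...) expressions instead. The example below is an illustration under stated assumptions, not the dynamic PIVOT shown above: it uses SQLite from Python, the helper quote_ident is a hypothetical name introduced here, and the extra Owner property is invented to show that new properties become columns without changing the code. Property names are passed as bound parameters and only used in the SQL text as quoted aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "ID INTEGER PRIMARY KEY, FileId TEXT, Property TEXT, Value TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable (FileId, Property, Value) VALUES (?, ?, ?)",
    [
        ("x", "Name", "1.pdf"),
        ("x", "Size", "12567"),
        ("x", "Type", "application/pdf"),
        ("x", "Owner", "alice"),  # hypothetical extra property
        ("y", "Name", "2.pdf"),
        ("y", "Size", "23576"),
        ("y", "Type", "application/pdf"),
    ],
)


def quote_ident(name):
    """Quote a column alias for SQLite by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


# Discover the column list at run time, like @cols in the T-SQL version.
props = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT Property FROM yourtable ORDER BY Property"
    )
]

# One MAX(CASE ...) per property; the property value itself is a bound
# parameter (?), so it never gets spliced into the SQL as a literal.
select_list = ", ".join(
    f"MAX(CASE WHEN Property = ? THEN Value END) AS {quote_ident(p)}"
    for p in props
)
sql = (
    f"SELECT FileId AS FileId, {select_list} "
    "FROM yourtable GROUP BY FileId ORDER BY FileId"
)

cur = conn.execute(sql, props)
columns = [d[0] for d in cur.description]
rows = cur.fetchall()

print(columns)
for row in rows:
    print(row)
```

As with any dynamically built statement, the generated column list depends on the data, so callers cannot rely on a fixed set of output columns.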
Ankit Khatri
Updated on November 16, 2020
Comments
-
Ankit Khatri over 3 years
I have a table 'propertyvalues' as follows:
ID  FileID  Property  Value
1   x       Name      1.pdf
2   x       Size      12567
3   x       Type      application/pdf
4   y       Name      2.pdf
5   y       Size      23576
6   y       Type      application/pdf
......
and so on
How to write a SQL query on the table above to fetch a result like below
FileID  Name   Size   Type
x       1.pdf  12567  application/pdf
y       2.pdf  23576  application/pdf
-
Anthony over 11 years
Too complicated solution for too simple problem
-
Taryn over 11 years
@Antonio I am not sure why you say it is too complicated. This is much easier than performing multiple joins. If they had 10 fields to transform, then you would have to perform 10 joins.
-
Anthony over 11 years
Obviously there are just 3 fields. Neither more, nor less.
-
Taryn over 11 years
But then you still have to perform multiple joins. This method has no joins; they are not needed for this type of query. Not only that, the OP possibly only showed a partial list of fields. Your query will work, but this is not wrong and IMO does not deserve a downvote.
-
Nikola Markovinović over 11 years
+1 - obviously not over-engineering, also three answers in one.
-
dezso over 11 years
@Antonio You are terribly wrong. The first solution is even simpler than yours, the others are more generalized but still not complicated.
-
Anthony over 11 years
The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes. In tables with >1M records the difference is really huge. "PIVOT" approach is good but it is not supported by all the databases and is also slow. I recommend using JOINs because of the speed.
-
ypercubeᵀᴹ over 11 years
You probably want to put LEFT joins there, not JOIN.
-
Anthony over 11 years
I mean "INNER JOIN" which is the default
-
ypercubeᵀᴹ over 11 years
Then your query will not show results for FileIDs that have a Name but not a Size or Type (and vice-versa).
-
ypercubeᵀᴹ over 11 years
I like Joins but they are not always faster. It depends.
-
Admin over 11 years
@Antonio "The solution with "CASE" will ALWAYS work slower than "JOIN" solution because JOIN can use RDBMS indexes." This is utter nonsense. Using 'CASE' like this may be heavier on the CPU, and may be lighter on i/o, but which is 'faster' will depend on your environment. If I had to guess I'd say that using CASE will usually be faster.
-
Admin over 11 years
Worth noting that your answer and bluefeet's are not equivalent unless certain assumptions are made regarding uniqueness and the presence of nulls
-
ypercubeᵀᴹ over 11 years
@Jack: Yes, I assumed that (FileID, Property) is unique. I don't think that nulls matter in that case.
-
Nikola Markovinović over 11 years
@JackDouglas case always won during my (rather limited) testing; the only time join came close to max(case...) was with an index on (FileID, Property), and even then it didn't win. 3M records, different indexing.
-
ypercubeᵀᴹ over 11 years
Devil's advocate on a similar problem: 20 LEFT JOINS lightly faster than SUM(CASE) with GROUP BY
-
Taryn over 11 years
My preference would be PIVOT; I personally think it is easier than coding for each CASE statement. :)
-
Admin over 11 years
@ypercube MySQL is MySQL - my comments refer to SQL Server. I'd never hazard a guess about anything on MySQL as it seems to have so many quirks.
-
Admin over 11 years
+1 from me, this is a perfectly valid answer (making certain assumptions about the data) - not sure why it is downvoted
-
ypercubeᵀᴹ over 11 years
@Jack: An SQL-Server example (tested by the OP, not me so I cannot assert on indexes or distribution): Totalling up ballot results (and not exactly the same case, just a similarity)
-
Martin Smith over 11 years
@JackDouglas - If optimal indexes are present then I would favour the join version for performance though not maintainability as per my reasoning here
-
Admin over 11 years
@Martin - interesting. I presume the same would not apply to CASE however as that will just be a full scan and an aggregate from a plan point of view.
-
aswzen about 7 years
soooo slow.. 15 rows generated in 15 sec
-
ypercubeᵀᴹ about 7 years
@aswzen what exactly is slow?