PostgreSQL case insensitive SELECT on array
Solution 1
One alternative not mentioned is to install the citext extension that comes with PostgreSQL 8.4+ and use an array of citext:
regress=# CREATE EXTENSION citext;
regress=# SELECT 'foo' = ANY( '{"Foo","bar","bAz"}'::citext[] );
?column?
----------
t
(1 row)
If you want to be strictly correct about this and avoid extensions you have to do some pretty ugly subqueries because Pg doesn't have many rich array operations, in particular no functional mapping operations. Something like:
SELECT array_agg(lower(($1)[n])) FROM generate_subscripts($1,1) n;
... where $1 is the array parameter. In your case I think you can cheat a bit because you don't care about preserving the array's order, so you can do something like:
SELECT 'foo' IN (SELECT lower(x) FROM unnest('{"Foo","bar","bAz"}'::text[]) x);
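To try the two subquery forms above without a prepared statement, here is a minimal sketch in which a literal array stands in for the $1 parameter. The alias names (s, arr, n, x) are illustrative only, and the explicit LATERAL join assumes PostgreSQL 9.3 or later.

```sql
-- Lower-case every element while keeping the original element order.
-- The literal array stands in for the $1 parameter of the answer.
SELECT array_agg(lower(s.arr[n]) ORDER BY n) AS lowered
FROM (SELECT '{"Foo","bar","bAz"}'::text[] AS arr) AS s
CROSS JOIN LATERAL generate_subscripts(s.arr, 1) AS n;
-- lowered: {foo,bar,baz}

-- Membership test only: order is irrelevant, so unnest() is enough.
SELECT EXISTS (
    SELECT 1
    FROM unnest('{"Foo","bar","bAz"}'::text[]) AS x
    WHERE lower(x) = 'foo'
) AS found;
-- found: t
```

The EXISTS form behaves like the IN (SELECT ...) query for this purpose; it is shown only as an equivalent spelling, not as a faster one.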
Solution 2
This seems hackish to me, but I think it should work:
SELECT value FROM table WHERE 'foo' = ANY(lower(value::text)::text[])
ilike could have issues if your arrays can have _ or %
Note that what you are doing is converting the text array to a single text string, converting it to lower case, and then back to an array. This should be safe. If this is not sufficient you could use various combinations of string_to_array and array_to_string, but I think the standard textual representations should be safer.
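The round trip described above can be inspected step by step. This is a small sketch on a literal array; the column aliases are made up for illustration.

```sql
-- Step 1: array -> text, step 2: lower(), step 3: text -> array.
SELECT a::text                              AS as_text,
       lower(a::text)                       AS lowered_text,
       lower(a::text)::text[]               AS lowered_array,
       'foo' = ANY (lower(a::text)::text[]) AS matches
FROM (SELECT '{"Foo","bar","bAz"}'::text[] AS a) AS t;
-- as_text: {Foo,bar,bAz}   lowered_text: {foo,bar,baz}   matches: t
```

Because lower() only changes letters, the braces, commas and quotes of the array's text representation are left alone, which is why the cast back to text[] is expected to succeed.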
Update: building on the subquery solution from the other answer, one option would be a simple function:
CREATE OR REPLACE FUNCTION lower(text[]) RETURNS text[] LANGUAGE SQL IMMUTABLE AS
$$
SELECT array_agg(lower(value)) FROM unnest($1) value;
$$;
Then you could do:
SELECT value FROM table WHERE 'foo' = ANY(lower(value));
This might actually be the best approach. You could also create GIN indexes on the output of the function if you want.
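A commenter reports that the function above returns NULL for an empty array, because array_agg yields NULL when there are no input rows. The following is a sketch of one possible variant, not the answer's own code: the names lower_array, items, tags and items_tags_lower_gin are hypothetical. It also shows an expression GIN index; as far as I know such an index is used by array operators like @>, not by the = ANY(...) form, so the query is written with @>.

```sql
-- Hypothetical table standing in for the asker's table.
CREATE TABLE items (id serial PRIMARY KEY, tags text[]);

-- STRICT keeps NULL input as NULL; COALESCE turns the
-- "no rows" result of array_agg into an empty array.
CREATE OR REPLACE FUNCTION lower_array(text[]) RETURNS text[]
LANGUAGE sql IMMUTABLE STRICT AS
$$
    SELECT COALESCE(array_agg(lower(elem)), '{}'::text[])
    FROM unnest($1) AS elem;
$$;

-- Expression index on the lower-cased array.
CREATE INDEX items_tags_lower_gin ON items USING gin (lower_array(tags));

INSERT INTO items (tags) VALUES ('{"Foo","bar","bAz"}'), ('{}');

-- Containment form that the GIN index can support.
SELECT id, tags FROM items WHERE lower_array(tags) @> ARRAY['foo'];
```

On a table this small the planner will most likely pick a sequential scan anyway; the index only pays off on larger tables.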
Solution 3
Another alternative would be with unnest():
WITH tbl AS (SELECT 1 AS id, '{"Foo","bar","bAz"}'::text[] AS value)
SELECT value
FROM (SELECT id, value, unnest(value) AS val FROM tbl) x
WHERE lower(val) = 'foo'
GROUP BY id, value;
I added an id column to get exactly identical results, i.e. a duplicate value in the result if there are duplicates in the base table. Depending on your circumstances, you can probably omit the id from the query, either to collapse duplicates in the results or because there are no dupes to begin with. Also demonstrating a syntax alternative:
SELECT value
FROM (SELECT value, lower(unnest(value)) AS val FROM tbl) x
WHERE val = 'foo'
GROUP BY value;
If array elements are unique within arrays in lower case, you don't even need the GROUP BY, since every value can only match once.
SELECT value
FROM (SELECT value, lower(unnest(value)) AS val FROM tbl) x
WHERE val = 'foo';
'foo' must be lower case, obviously.
Should be fast.
If you want that to be fast with a big table, I would create a functional GIN index, though.
PerryW
Updated on June 05, 2022
Comments
-
PerryW almost 2 years
I'm having problems finding the answer here, on google or in the docs ...
I need to do a case-insensitive select against an array type. So if:
value = {"Foo","bar","bAz"}
I need
SELECT value FROM table WHERE 'foo' = ANY(value)
to match.
I've tried lots of combinations of lower() with no success.
ILIKE instead of = seems to work, but I've always been nervous about LIKE - is that the best way?
-
PerryW about 11 years
So ILIKE is ruled out as pointed out by @Chris Travers below - it's quite likely that a value could legitimately contain an underscore
-
PerryW about 11 years
So it wasn't such a dumb question then @Erwin? :) (first time I've been edited - not complaining, just fascinated)
-
Erwin Brandstetter about 11 years
On the contrary: it's a very interesting question, IMO, and it has attracted a number of interesting answers already. But primarily I edited that bit out, because we try to keep the noise ratio in questions and answers low on SO. Some noise can go into comments. :)
-
Craig Ringer about 11 years
Useful hack. I don't think there are any charsets/locales where lower will transform chars that the array syntax cares about. (BTW, I sometimes wish Pg had array map, filter and fold/foldl/foldr for those cases where PL/PgSQL is overkill but pure SQL is clumsy. Array sort too, actually.)
-
Erwin Brandstetter about 11 years
+1 on citext. May or may not be practical for the OP, but it's the perfect opportunity to mention that extension.
-
PerryW about 11 years
@ErwinBrandstetter is right - I can't install libraries sadly - it's a bit of a black-box instance
-
Chris Travers about 11 years
@craig, agree. Actually sort could be easily done with a plain sql function.
-
Chris Travers about 11 years
Actually, you can wrap the subquery in a function (I just added this to my answer) so that lower(array_of_text) will just work.
-
Chris Travers about 11 years
Also note you can take the subquery and write a lower(text[]) function so it just works.
-
Craig Ringer about 11 years
@ChrisTravers Easily, but not necessarily efficiently. Array access at SQL level isn't super-efficient unfortunately. One day I'll get around to writing them; array access in C doesn't look too hard and things like sorts can be done with generic opclass actions.
-
Chris Travers about 11 years
Actually, maybe it would be worth putting together an extension with these array functions. They could all be sql or pl/pgsql easily enough.
-
Erwin Brandstetter about 11 years
While that works and simplifies the code, it would be slower. You would go back and forth between array and set representation. You could base the functional GIN index I mentioned on it, though. Then you have fast & simple syntax.
-
Lee over 3 years
@ChrisTravers I actually tried your function, but it returns NULL for an empty array. I posted another answer that uses casting to text instead.