In Perl, how do I create a hash whose keys come from a given array?


Solution 1

%hash = map { $_ => 1 } @array;

It's not as short as the "@hash{@array} = ..." solutions, but those ones require the hash and array to already be defined somewhere else, whereas this one can take an anonymous array and return an anonymous hash.

What this does is take each element in the array and pair it up with a "1". When this list of (key, 1, key, 1, key, 1) pairs gets assigned to a hash, the odd-numbered elements become the hash's keys, and the even-numbered elements become the respective values.
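To make the pairing concrete, here is a minimal sketch of the map approach, first with a named array and then with an anonymous array feeding an anonymous hash. The sample data and the variable names ($set, $has_green) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data; any list of strings would do.
my @array = qw(apple banana cherry);

# map emits (key, 1) for every element; list assignment pairs them up.
my %hash = map { $_ => 1 } @array;

# The same expression accepts an anonymous array, and wrapping it in
# braces builds an anonymous hash (a hash reference).
my $set = { map { $_ => 1 } @{ [qw(red green blue)] } };

my $has_green = $set->{green} ? "yes" : "no";
print join(",", sort keys %hash), "\n";   # apple,banana,cherry
print "green in set: $has_green\n";       # green in set: yes
```

Because every value is 1, a plain truthiness test such as if ($hash{apple}) works with this variant.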

Solution 2

 @hash{@array} = (1) x @array;

It's a hash slice, a list of values from the hash, so it gets the list-y @ in front.

From the docs:

If you're confused about why you use an '@' there on a hash slice instead of a '%', think of it like this. The type of bracket (square or curly) governs whether it's an array or a hash being looked at. On the other hand, the leading symbol ('$' or '@') on the array or hash indicates whether you are getting back a singular value (a scalar) or a plural one (a list).
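As a small illustration of the sigil rule quoted above, and of what the right-hand side of Solution 2 evaluates to, here is a sketch with made-up data. The intermediate @ones array exists only to make the repeated list visible.

```perl
use strict;
use warnings;

my @array = qw(a b c);   # hypothetical sample data
my %hash;                # the hash must be declared before slicing it

# The right operand of x is evaluated in scalar context, so @array
# contributes its length (3) and (1) x @array is the list (1, 1, 1).
my @ones = (1) x @array;

# '@' with curly braces: a list of values (plural) taken from a hash.
@hash{@array} = @ones;

# '$' with curly braces: a single value (scalar) from the same hash.
print "b => $hash{b}\n";                     # b => 1
print scalar(@ones), " values assigned\n";   # 3 values assigned
```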

Solution 3

@hash{@keys} = undef;

The syntax here, where you refer to the hash with an @, is a hash slice. We're basically saying that the list ($hash{$keys[0]}, $hash{$keys[1]}, $hash{$keys[2]}, ...) on the left-hand side of the = is an lvalue, and we're assigning to that list, which actually goes into the hash and sets the values for all the named keys. In this case, I only specified one value, so that value goes into $hash{$keys[0]}, and the other hash entries all auto-vivify (come to life) with undefined values. [My original suggestion here was to set the expression = 1, which would've set that one key to 1 and the others to undef. I changed it for consistency, but as we'll see below, the exact values do not matter.]

When you realize that the lvalue, the expression on the left hand side of the =, is a list built out of the hash, then it'll start to make some sense why we're using that @. [Except I think this will change in Perl 6.]
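Here is a short sketch of that behaviour: a single scalar on the right-hand side fills only the first slot of the slice, and the remaining keys are still created with undef. The hash names %with_one and %with_undef and the sample keys are hypothetical.

```perl
use strict;
use warnings;

my @keys = qw(x y z);    # hypothetical sample keys

my (%with_one, %with_undef);

# Only the first slot receives the 1; the rest are created holding undef.
@with_one{@keys} = 1;

# Every key is created, and all of them hold undef.
@with_undef{@keys} = undef;

for my $k (@keys) {
    my $v = defined $with_one{$k} ? $with_one{$k} : 'undef';
    print "$k => $v\n";
}
# x => 1
# y => undef
# z => undef
```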

The idea here is that you are using the hash as a set. What matters is not the value I am assigning; it's just the existence of the keys. So what you want to do is not something like:

if ($hash{$key} == 1) # then key is in the hash

but rather:

if (exists $hash{$key}) # then key is in the set

It's actually more efficient to just run an exists check than to bother with the value in the hash, although to me the important thing here is the concept that you are representing a set with just the keys of the hash. Also, somebody pointed out that by using undef as the value here, we consume less storage space than we would by assigning a value. It also generates less confusion, since the value does not matter: my original solution would have assigned a value only to the first element in the hash and left the others undef, and some other solutions turn cartwheels to build an array of values to go into the hash, which is completely wasted effort.
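To show why the check has to be exists once the values are undef, here is a minimal sketch comparing the two tests. The member names and the %result bookkeeping hash are made up for illustration.

```perl
use strict;
use warnings;

my @members = qw(alice bob);   # hypothetical set members
my %set;
@set{@members} = undef;        # keys exist, every value is undef

my %result;
for my $name (qw(alice carol)) {
    $result{$name} = {
        by_exists => (exists $set{$name} ? 1 : 0),   # set membership
        by_truth  => ($set{$name}        ? 1 : 0),   # value truthiness
    };
    print "$name: exists=$result{$name}{by_exists} truthy=$result{$name}{by_truth}\n";
}
# alice: exists=1 truthy=0
# carol: exists=0 truthy=0
```

The truthiness test cannot tell alice (a member) from carol (not a member), which is the trap the comments on this answer point out.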

Solution 4

Note that if typing if ( exists $hash{ key } ) isn’t too much work for you (which I prefer to use since the matter of interest is really the presence of a key rather than the truthiness of its value), then you can use the short and sweet

@hash{@key} = ();
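A sketch of the empty-list form, assuming made-up keys: it adds keys to a hash that already has entries, and the same slice can be written through a reference when you want an anonymous hash.

```perl
use strict;
use warnings;

my %hash = (north => 1);             # an existing entry, left untouched

# A literal list works as the slice index; new keys are added with undef.
@hash{ qw(south east) } = ();

# For an anonymous hash, slice through the reference instead.
my $set = {};
@{$set}{ @{ ['red', 'green'] } } = ();

print join(",", sort keys %hash), "\n";    # east,north,south
print join(",", sort keys %$set), "\n";    # green,red
```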

Solution 5

I always thought that

foreach my $item (@array) { $hash{$item} = 1 }

was at least nice and readable / maintainable.
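If you want the loop version to take any list and hand back an anonymous hash, one possible way is to wrap it in a small sub. This is only a sketch; array_to_set is a hypothetical helper name, not something from a module.

```perl
use strict;
use warnings;

# array_to_set is a hypothetical helper name, not a built-in.
sub array_to_set {
    my @items = @_;
    my %seen;
    foreach my $item (@items) {
        $seen{$item} = 1;    # repeated items simply overwrite the same key
    }
    return \%seen;           # hand back a reference to the hash
}

my $set = array_to_set(qw(a b a c));
print scalar(keys %$set), " distinct keys\n";   # 3 distinct keys
```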


Author: raldi

Updated on July 05, 2022

Comments

  • raldi
    raldi almost 2 years

    Let's say I have an array, and I know I'm going to be doing a lot of "Does the array contain X?" checks. The efficient way to do this is to turn that array into a hash, where the keys are the array's elements, and then you can just say

    if($hash{X}) { ... }

    Is there an easy way to do this array-to-hash conversion? Ideally, it should be versatile enough to take an anonymous array and return an anonymous hash.

  • Michael Carman
    Michael Carman over 15 years
    Frosty: You have to declare "my %hash" first, then do "@hash{@arr} = 1" (no "my").
  • xdg
    xdg over 15 years
    This only sets $hash{$keys[0]} = 1. The other hash values are undef. See @hash{@array} = (1) x @array instead.
  • ysth
    ysth over 15 years
    = (), not = undef, just for consistency in implicitly using undef for all the values, not just all after the first. (As demonstrated in these comments, it's too easy to see the undef and think it can just be changed to 1 and affect all the hash values.)
  • ysth
    ysth over 15 years
    If doing it multiple times for a large array, that's potentially going to be a lot slower.
  • Jacob
    Jacob over 15 years
    Actually, doing it once is a lot slower. It has to create an object. Then, shortly after, it will destroy that object. This is just an example of what is possible.
  • brian d foy
    brian d foy over 15 years
    It's the "smart match operator" :)
  • raldi
    raldi over 15 years
    Wow, I never heard of (or thought of) that one. Thanks! I'm having trouble understanding how it works. Can you add an explanation? In particular, how can you take a hash named %hash and refer to it with an @ sign?
  • ysth
    ysth over 15 years
    raldi: it's a hash slice, a list of values from the hash, so it gets the list-y @ in front. See perldoc.perl.org/perldata.html#Slices - particularly the last paragraph of the section
  • Dave Cross
    Dave Cross over 15 years
    As the values end up as "undef" here (and probably for not quite the reason you think - as ysth has pointed out) you can't just use the hash in code like "if ($hash{$value})". You'd need "if (exists $hash{$value})".
  • Dave Cross
    Dave Cross over 15 years
    The difference between $_ => 1 and $_,1 is purely stylistic. Personally I prefer => as it seems to indicate the key/value link more explicitly. Your @hash{@array} = 1 solution doesn't work. Only one of the values (the one associated with the first key in @array) gets set to 1.
  • Yaser Har
    Yaser Har over 15 years
    I would've never used it with any check other than if (exists). It's just a set to me.
  • bhollis
    bhollis over 15 years
    It'd be nice if you edited your answer to point out that it needs to be used with exists, that exists is more efficient than checking truthiness by actually loading the hash value, and that undef takes less space than 1.
  • raldi
    raldi over 15 years
    You should add that to your answer!
  • Jacob
    Jacob over 15 years
    If you want to use an anonymous array you can @hash{@{[ ... ]}} = undef;.
  • Susheel Javadi
    Susheel Javadi about 14 years
    Could you explain the RHS as well? Thanks.
  • Tim Ludwinski
    Tim Ludwinski about 14 years
    (list) x $number replicates the list $number times. Using an array in scalar context returns the number of elements, so (1) x @array is a list of 1s with the same length as @array.
  • Jacob
    Jacob about 13 years
    %hash = map{ $_, undef } @keylist
  • Stefan Majewsky
    Stefan Majewsky about 12 years
    The usual idiom for @keys-1 is $#keys.
  • Tamzin Blake
    Tamzin Blake about 12 years
    @StefanMajewsky I haven't seen that one actually used in a while. I stay away from it myself - it's ugly.
  • bobbogo
    bobbogo almost 10 years
    A hashref seems more than a little overblown here.
  • Jim Balter
    Jim Balter over 5 years
    This doesn't answer the question. It also misses the point ... array to hash conversion only happens once ... a total 0.04 seconds (in 2008) added to the run time of the program, whereas lookups happen many times.
  • arclight
    arclight over 5 years
    I attempted to solve the underlying problem not just answer the question. List::MoreUtils may or may not be an appropriate method, depending on the use case. Your use case may have many lookups; others may not. The point is that both array-to-hash conversion and List::MoreUtils solve the underlying problem of determining membership; knowing multiple approaches allows you to choose the best method for your specific use case.
  • eremmel
    eremmel over 5 years
    The nice thing about the @hash{@array} = .... solution is that you extend/update %hash with new key/value pairs
  • PhilHarvey
    PhilHarvey over 4 years
    This technique is twice as fast as the map technique on my computer.
  • soger
    soger over 2 years
    Actually, internally Tie::IxHash keeps both the hash and an array with the key order. It has abysmal performance for huge hashes, so use it only on small hashes.
  • soger
    soger over 2 years
    Yes and surprisingly it's also the best performing one. Here are my results with my personal computer on an array with 10 million values. create: 0.465167 for: 6.015478 map: 17.346142 slice: 7.154768
  • soger
    soger over 2 years
    I just ran a small test on my PC using an array of 10 million values, and this solution was by far the worst performing: create: 0.465167 for: 6.015478 map: 17.346142 slice: 7.154768