How to add index into a dict

python list dictionary indexing list-comprehension

12,056

Solution 1

A simple dictionary comprehension should do the trick:

{key: [index for index, x in enumerate(my_list) if x == key] for key in my_list}

A simple trial:

>>> my_list = ['A', 'B', 'A', 'B']
>>> {key: [index for index, x in enumerate(my_list) if x == key] for key in my_list}
{'A': [0, 2], 'B': [1, 3]}

How It Works

List comprehensions are often used in Python as syntactic sugar for a for loop. Instead of writing

my_list = []
for item in range(10):
    my_list.append(item)

list comprehensions essentially let you condense this series of statements into a single line:

my_list = [item for item in range(10)]

Whenever you see a list comprehension, remember that it is just a condensed version of the three-line loop above. The two are effectively identical; the main benefit of the comprehension is conciseness.

A close relative is the dictionary comprehension. It works like a list comprehension, except that it lets you specify both the keys and the values at the same time.

An example of a dictionary comprehension:

>>> {k: None for k in ["Hello", "Adele"]}
{'Hello': None, 'Adele': None}

In the answer I provide, I have simply used a dictionary comprehension that

Pulls out keys from my_list
Assigns a list of indices for each key from my_list as the corresponding value

Syntactically, it expands out into a fairly complicated program that reads like this:

my_dict = {}
for key in my_list:
    indices = []
    for index, value in enumerate(my_list):
        if value == key:
            indices.append(index)
    my_dict[key] = indices

Here, enumerate is a built-in function that returns an iterator of tuples. The first element of each tuple is an index into the list, and the second element is the value at that index. Wrap the call in list() to see all the pairs at once.

Observe:

>>> list(enumerate(['a', 'b', 'a', 'b']))
[(0, 'a'), (1, 'b'), (2, 'a'), (3, 'b')]

That is the power of enumerate.

Efficiency

As always, premature optimisation is the root of all evil. It is indeed true that this implementation is inefficient: it duplicates work, and runs in quadratic time. The important thing, however, is to ask if it is okay for the specific task you have. For relatively small lists, this is sufficient.

You can look at certain optimisations. @wilinx's way works well. @Rob in the comments suggests iterating over set(my_list), which prevents duplicated work.
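As a minimal sketch of the set-based variant mentioned above, assuming the same my_list as in the trial (index_map is only an illustrative name): iterating over the distinct values means each key's index list is built once rather than once per occurrence, although each key still triggers a full scan of the list.

```python
my_list = ['A', 'B', 'A', 'B']

# Iterate over the distinct values only, so each key's index list is built once.
index_map = {
    key: [index for index, x in enumerate(my_list) if x == key]
    for key in set(my_list)
}

print(index_map == {'A': [0, 2], 'B': [1, 3]})  # True (key order may vary)
```

A set has no guaranteed iteration order, so the keys may come out in any order. If first-seen order matters, iterating over dict.fromkeys(my_list) instead of set(my_list) is one option on Python 3.7+.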

Solution 2

Use enumerate and setdefault:

example = ['a', 'b', 'a', 'b']
mydict = {}
for idx, item in enumerate(example):
    indexes = mydict.setdefault(item, [])
    indexes.append(idx)
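For reuse, the loop above could be wrapped in a small helper. This is only a sketch: the name index_positions is hypothetical and not part of the original answer. Because it walks the list once, it runs in linear time rather than quadratic.

```python
def index_positions(items):
    """Map each distinct item to the list of indices where it occurs."""
    positions = {}
    for idx, item in enumerate(items):
        # setdefault returns the existing list, or inserts and returns a new one
        positions.setdefault(item, []).append(idx)
    return positions


print(index_positions(['a', 'b', 'a', 'b']))  # {'a': [0, 2], 'b': [1, 3]}
```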

Solution 3

Why not use defaultdict from the collections module instead:

>>> from collections import defaultdict
>>> l = ['A', 'B', 'A', 'B']
>>> d = defaultdict(list)
>>> for i, x in enumerate(l):
...     d[x].append(i)
...
>>> d
defaultdict(<class 'list'>, {'A': [0, 2], 'B': [1, 3]})
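One detail to keep in mind with this approach: looking up a missing key on a defaultdict inserts that key as a side effect. If a plain dictionary is wanted as the final result, a sketch like the following (with result as an illustrative name) converts it once the index map is built.

```python
from collections import defaultdict

l = ['A', 'B', 'A', 'B']
d = defaultdict(list)
for i, x in enumerate(l):
    d[x].append(i)

# Convert to a plain dict so later lookups of missing keys raise KeyError
# instead of silently adding empty lists.
result = dict(d)
print(result)         # {'A': [0, 2], 'B': [1, 3]}
print('C' in result)  # False
```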

Author by

Long Vuong

Updated on June 04, 2022

Comments

Long Vuong almost 2 years
For example, given:
```
['A', 'B', 'A', 'B']    
```
I want to have:
```
{'A': [0, 2], 'B': [1, 3]}
```
I tried a loop that goes like this: add the index where the character is found, then replace the character with '' so that the next pass through the loop moves on to the next character.

However, that loop doesn't work for other reasons, and I'm stuck with no idea how to proceed.