DynamoDB: query using more than two attributes

Your data modeling process has to take your data retrieval requirements into consideration. In DynamoDB, you can only query by the hash key, or by the hash key plus the range key.

If querying by primary key is not enough for your requirements, you can certainly have alternate keys by creating secondary indexes (Local or Global).

However, the concatenation of multiple attributes can be used in certain scenarios as your primary key to avoid the cost of maintaining secondary indexes.

If you need to get users by First Name, Last Name and Creation Date, I would suggest including those attributes in the Hash and Range Key, so that no additional indexes are needed.

The Hash Key should contain a value that your application can compute and that, at the same time, provides uniform data access. For example, say that you choose to define your keys as follows:

Hash Key (name): first_name#last_name

Range Key (created) : MM-DD-YYYY-HH-mm-SS-milliseconds

You can always append additional attributes in case the ones mentioned are not enough to make your key unique across the table.
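As a minimal sketch of how an application might compute these two key values, the helpers below build the `first_name#last_name` hash key and the `MM-DD-YYYY-HH-mm-SS-milliseconds` range key. The function names `make_hash_key` and `make_range_key` are hypothetical, and rejecting names that contain `#` is an assumption added here so that the composite key stays unambiguous.

```python
from datetime import datetime

SEPARATOR = "#"  # delimiter from the key layout above


def make_hash_key(first_name, last_name):
    """Build the composite hash key, e.g. 'John#Doe'."""
    for part in (first_name, last_name):
        # Assumption: a name containing the separator would make the key ambiguous.
        if SEPARATOR in part:
            raise ValueError("name part contains the separator: %r" % part)
    return SEPARATOR.join((first_name, last_name))


def make_range_key(created):
    """Format a datetime as MM-DD-YYYY-HH-mm-SS-milliseconds."""
    millis = created.microsecond // 1000
    return created.strftime("%m-%d-%Y-%H-%M-%S") + "-%03d" % millis


hash_key = make_hash_key("John", "Doe")
range_key = make_range_key(datetime(2015, 3, 21, 3, 3, 2, 324000))
print(hash_key, range_key)
```

Because the range key starts with the month and day, a prefix such as `03-21-2015` selects a single day, which is what the query further down relies on.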

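from boto.dynamodb2.fields import HashKey, RangeKey
from boto.dynamodb2.table import Table

# Composite name as the hash key, creation timestamp as the range key.
users = Table.create('users', schema=[
        HashKey('name'),
        RangeKey('created'),
     ], throughput={
        'read': 5,
        'write': 15,
     })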

Adding the user to the table:

with users.batch_write() as batch:
     batch.put_item(data={
         'name': 'John#Doe',
         'first_name': 'John',
         'last_name': 'Doe',
         'created': '03-21-2015-03-03-02-3243',
     })

Your code to find the user John Doe, created on '03-21-2015' should be something like:

name_john_doe = users.query_2(
   name__eq='John#Doe',
   created__beginswith='03-21-2015'
)

for user in name_john_doe:
     print(user['first_name'])
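To make the matching rule of that query concrete, here is a pure-Python stand-in that needs no DynamoDB connection: it keeps items whose hash key equals the given name and whose range key begins with the given prefix. The `query` function and every sample record other than John Doe's are made up for illustration, and this is only a model of the key conditions, not of how DynamoDB executes them.

```python
def query(items, name, created_prefix):
    """Model of the key conditions: name == ... AND created begins with ..."""
    matches = [
        item for item in items
        if item["name"] == name and item["created"].startswith(created_prefix)
    ]
    # Items under one hash key come back ordered by range key.
    return sorted(matches, key=lambda item: item["created"])


# Hypothetical sample data; only the first record comes from the example above.
items = [
    {"name": "John#Doe", "first_name": "John", "last_name": "Doe",
     "created": "03-21-2015-03-03-02-3243"},
    {"name": "John#Doe", "first_name": "John", "last_name": "Doe",
     "created": "04-02-2015-10-15-00-0001"},
    {"name": "Jane#Doe", "first_name": "Jane", "last_name": "Doe",
     "created": "03-21-2015-08-00-00-0000"},
]

same_day = query(items, "John#Doe", "03-21-2015")
by_year = query(items, "John#Doe", "2015")

for user in same_day:
    print(user["first_name"], user["created"])
```

Note that `by_year` comes back empty: a prefix match can only use the leading fields of the range key, so with a month-first format you can select a month or a day, but not a whole year. If you need year-level queries, a year-first layout would be the usual alternative.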

Important Considerations:

i. If your query starts to get too complicated, or the Hash or Range Key becomes too long because it concatenates too many fields, then definitely use Secondary Indexes. That's a good sign that a primary index alone is not enough for your requirements.

ii. I mentioned that the Hash Key should provide uniform data access:

"Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution. A uniform key distribution can help us achieve uniform load distribution assuming the access distribution of keys is not highly skewed." [DYN]

Not only does the Hash Key help to uniquely identify the record, it is also the mechanism that ensures load distribution. The Range Key (when used) indicates which records will mostly be retrieved together, so the storage can also be optimized for that access pattern.

The link below has a complete explanation about the topic:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html#GuidelinesForTables.UniformWorkload
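The effect of the hash key on load distribution can be illustrated with a small simulation. This is a deliberately simplified sketch: it assigns keys to partitions with a SHA-256 digest modulo a fixed partition count, which is not the consistent hashing described in the quote and not DynamoDB's actual algorithm. The partition count of 4 and the generated key names are arbitrary assumptions.

```python
import hashlib
from collections import Counter


def partition_for(hash_key, partitions=4):
    """Map a hash key to a partition number (simplified modulo hashing)."""
    digest = hashlib.sha256(hash_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % partitions


def load_per_partition(requested_keys, partitions=4):
    """Count how many requests land on each partition."""
    return Counter(partition_for(key, partitions) for key in requested_keys)


# 1000 requests spread over many distinct hash keys...
varied_load = load_per_partition("user%d#Doe" % i for i in range(1000))
# ...versus 1000 requests that all hit the same hash key.
skewed_load = load_per_partition(["John#Doe"] * 1000)

print("varied keys:", sorted(varied_load.items()))
print("single hot key:", sorted(skewed_load.items()))
```

With many distinct keys the requests spread over all partitions, while a single hot key sends every request to one partition no matter how the hashing is done. That is the skewed access pattern the quote warns about.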

Author: Juan Pablo

Updated on September 14, 2022

Comments

  • Juan Pablo over 1 year

    In DynamoDB, you need to specify in an index the attributes that can be used for making queries.

    How can I make a query using more than two attributes?

    Example using boto.

    Table.create('users', 
            schema=[
                HashKey('id') # defaults to STRING data_type
            ], throughput={
                'read': 5,
                'write': 15,
            }, global_indexes=[
                GlobalAllIndex('FirstnameTimeIndex', parts=[
                    HashKey('first_name'),
                    RangeKey('creation_date', data_type=NUMBER),
                ],
                throughput={
                    'read': 1,
                    'write': 1,
                }),
                GlobalAllIndex('LastnameTimeIndex', parts=[
                    HashKey('last_name'),
                    RangeKey('creation_date', data_type=NUMBER),
                ],
                throughput={
                    'read': 1,
                    'write': 1,
                })
            ],
            connection=conn)
    

    How can I look for users with first name 'John', last name 'Doe', and created on '3-21-2015' using boto?