"IN" statement in dynamodb

28,985

KeyConditionExpression's cannot use the "IN" operator (see http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#FilteringResults). The idea with KeyConditions/KeyConditionExpression in a query operation is to more efficiently read pages of items from DynamoDB, since items with the same hash key but different range keys are stored contiguously and in sorted order. The IN operator would require extracting small portions of certain pages, which makes the Query operation less efficient, so it is not allowed in KeyConditions. You would want to add that as a FilterExpression instead, which is a convenience parameter to reduce the number of items returned from DynamoDB, but does not impact how the data is read from DynamoDB.

Share:
28,985

Related videos on Youtube

Hadrien Pierre Mazelier
Author by

Hadrien Pierre Mazelier

Updated on July 09, 2022

Comments

  • Hadrien Pierre Mazelier
    Hadrien Pierre Mazelier almost 2 years

    I have a "Users" table, here is a sample :

    {
        username:"haddox",
        formattedPhoneNumber:"676767676",
        verified: 0,
    }
    

    My wish is to retrieve all users whose formattedPhoneNumber is contained in an array of phone numbers (retrieved from my contacts). I created a secondary index, with verified as HASH and formattedPhoneNumber as RANGE. Here is my try :

    var params = {
        TableName: "Users",
        IndexName: "FormattedPhoneSecondaryIndex",
        KeyConditionExpression: "verified  = :v AND formattedPhone IN :n",
        ExpressionAttributeValues: {
            ":v":1,
            ":n": ["672053916", "642117296"]
        },
        ProjectionExpression: "username, formattedPhoneNumber"
    };
    
    
    
    dynamodb.query(params, function(err, data) {
        if (err)
            console.log(JSON.stringify(err, null, 2));
        else
            console.log(JSON.stringify(data, null, 2));
    });
    

    But I get the following error : Invalid KeyConditionExpression: Syntax error; token: \":n\", near: \"IN :n\"",

    Is there something wrong with the IN keyword ? Maybe there is another way to achieve this ?

    • jarmod
      jarmod over 8 years
      Did you try "(verified = :v) AND (formattedPhone IN (:n))"?
    • Hadrien Pierre Mazelier
      Hadrien Pierre Mazelier over 8 years
      Hi! I just tried but it didn't worked..
    • Nirmal Goswami
      Nirmal Goswami over 7 years
      @HadrienPierreMazelier can you paste you working example
  • Hadrien Pierre Mazelier
    Hadrien Pierre Mazelier over 8 years
    Hm ok, thank you! That means it will get all items from db and then filter it right ?
  • Jeffrey Nieh
    Jeffrey Nieh over 8 years
    Yes - so keep the KeyConditionExpression on verified = 1, but add a filter expression for formattedPhoneNumber IN [ ]. That way the request will only fetch the items where verified = 1, then perform the filter.
  • Min Soe
    Min Soe almost 8 years
    scanExpression.filterExpression = @"Id IN (:val)"; scanExpression.expressionAttributeValues = @{@":val":[idList componentsJoinedByString:@","]}; Is that how to use? I have tried putting the array without success, that's why joining the array
  • Nirmal Goswami
    Nirmal Goswami over 7 years
    @MinSoe i have same issue like you.how to use In operator like you
  • cryanbhu
    cryanbhu almost 6 years
    this is merely using the IN inside a FilterExpression, it still reads lots of data and then filters the result before returning, is not actually efficiently reading only rows which satisfy the IN condition.
  • Min Soe
    Min Soe almost 6 years
    hi @cryanbhu could you please suggest? yes, we know it is inefficient but that's the only way we were able to figure out that time.
  • cryanbhu
    cryanbhu over 5 years
    i was just reiterating the top answer's point that there is no way to efficiently do the IN equivalent in DynamoDB. To optimize query efficiency you'd have to rethink the data perhaps split the list items into columns and then have indices on that column. A good understanding of SQL vs noSQL in the decision phase would've helped a ton.