How to query in AWS athena connected through S3 using lambda functions in python

33,125

Solution 1

Yes! You can use boto3 to interact with Athena.

Particularly, you're going to probably want the start_query_execution method.

http://boto3.readthedocs.io/en/latest/reference/services/athena.html#Athena.Client.start_query_execution

Solution 2

Like Chris Pollard said, you can use boto3 to query Athena from a Lambda function.

http://boto3.readthedocs.io/en/latest/reference/services/athena.html

To initialize the Athena client:

import boto3
client = boto3.client('athena')

You will then execute your query:

queryStart = client.start_query_execution(
    QueryString = 'SELECT * FROM myTable',
    QueryExecutionContext = {
        'Database': 'myDatabase'
    }, 
    ResultConfiguration = { 'OutputLocation': 's3://your-bucket/key'}
)

If you want to retrieve the results within Lambda (possibly using a second function, due to time constraints - see docs - also note that you pay per 100ms running time), you would use get_query_execution to determine the status of the query:

queryExecution = client.get_query_execution(QueryExecutionId=queryStart['QueryExecutionId'])

You will need to parse the returned object for the value of the QueryExecution.Status.State field. Continue updating the object using get_query_execution() until the result is Succeeded.

Note: Please don't call get_query_execution() in a continuous loop. Rather, use an exponential backoff algorithm to prevent being throttled by that API. You should use this approach for all API calls.

Then you can use get_query_results() to retrieve the results for processing:

results = client.get_query_results(QueryExecutionId=queryStart['QueryExecutionId'])

Solution 3

You can use boto3 client to query Athena tables.

You can read more about it here: Simple way to query Amazon Athena in python with boto3

Share:
33,125
Vipendra Singh
Author by

Vipendra Singh

Updated on August 05, 2022

Comments

  • Vipendra Singh
    Vipendra Singh almost 2 years

    I have my .csv files saved in the S3 Bucket. I am able to query the data of S3 using AWS Athena. Is there any way we can connect the lambda function to athena and query the data from lambda function. please help

    Thanks