AWS Athena Import CSV file
To reproduce your situation, I did the following:
- Created a text file using your sample data (
gps.txt
) - Uploaded it to an Amazon S3 bucket in its own folder (with no other files in that folder)
- Created a table in Amazon Athena
- Specified the location as the folder name (
s3://my-bucket/gps/
) - Specified 7 columns (since there are 7 string values in your sample file)
- Specified the location as the folder name (
However, since the data has commas within each pair of numbers, I changed the SerDe to OpenCSVSerDe for Processing CSV - Amazon Athena:
CREATE EXTERNAL TABLE IF NOT EXISTS default.gps (
`c1` string,
`c2` string,
`c3` string,
`c4` string,
`c5` string,
`c6` string,
`c7` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ("separatorChar" = ",", "escapeChar" = "\\")
LOCATION 's3://my-bucket/gps/'
TBLPROPERTIES ('has_encrypted_data'='false');
I was then able to successfully query the table. A sample column value is: 41.9100687,-87.8805614
harunB10
Updated on June 20, 2022Comments
-
harunB10 almost 2 years
I have this data stored in S3 as .csv (but it can be any other file format which is the best suitable for my requirement):
"41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417", "41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417", "41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417", "41.9100687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417","41.9810128,-87.8785121","41.9200687,-87.8805614","41.9802511,-87.8803253","41.9806802,-87.8792417",
and I would like to have one coordinate per column:
Like this:
Coordinates:
1. 41.9100687,-87.8805614 2. 41.9802511,-87.8803253 3. 41.9806802,-87.8792417
After importing S3 I choose CSV as data type... And then I add string column.
But instead I get some weird table output. Beside this I tried to import this as plain txt file with comma delimiter.. I get same weird output.
What am I doing wrong here?
EDIT
This
test
column screenshot is query from another but identical example. There should begps_coordinates