How to select a file from AWS S3 by using a wildcard
Solution 1
You may want to add the --exclude flag before your --include filter.
The AWS CLI applies an --include filter on top of the set of files already matched. Since all the files are returned by default, you need to exclude all the files first, and then include the 2015*.xlsx files.
If you want only files matching "201502_nts_*.xlsx", you can run:
aws s3 cp s3://bp-dev/bp_source_input/ C:\Business_Panorama\nts\data\in --recursive --exclude "*" --include "201502_nts_*.xlsx"
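The order matters because the CLI evaluates filters left to right, and the last filter that matches a file wins. Here is a rough simulation of that rule in plain POSIX shell; the `decide` helper and the sample file names are made up for illustration, and no AWS access is needed:

```shell
# Simulate AWS CLI --exclude/--include ordering: every file starts as
# "include", filters are applied left to right, last match wins.
decide() {
  name=$1; shift
  verdict=include                      # default: everything is included
  while [ $# -ge 2 ]; do
    kind=$1; pat=$2; shift 2
    case $name in
      $pat) verdict=${kind#--} ;;      # this filter matches: it now wins
    esac
  done
  echo "$name -> $verdict"
}

decide 201502_nts_act.xlsx --exclude '*' --include '201502_nts_*.xlsx'
decide README.txt          --exclude '*' --include '201502_nts_*.xlsx'
```

The first call prints `201502_nts_act.xlsx -> include` and the second `README.txt -> exclude`; swapping the two filters would exclude everything, which is why --exclude "*" has to come first.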
Solution 2
I had to add quotes around the --exclude * wildcard, so it'd look like:
aws s3 cp s3://bp-dev/bp_source_input/ C:\Business_Panorama\nts\data\in --recursive --exclude "*" --include "201502_nts_*.xlsx"
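The quotes matter because, on Unix-like shells, an unquoted * is expanded by the shell against the current directory before aws ever sees it, so the exclude pattern silently becomes a list of local file names. A small shell-only illustration of this (the `show` helper and the scratch files are hypothetical, no AWS needed):

```shell
# Print each argument on its own line, to show what a command would receive.
show() { printf 'argv: %s\n' "$@"; }

# Work in a scratch directory with two files for the glob to hit.
dir=$(mktemp -d) && cd "$dir"
touch a.txt b.txt

show --exclude *        # the shell expands * into a.txt b.txt
show --exclude "*"      # the literal * reaches the command
```

The unquoted call prints three argv lines (--exclude, a.txt, b.txt), while the quoted call prints two (--exclude and a literal *), which is exactly what aws needs to see.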
Solution 3
After many rounds of checking, and with help from bsnchan, I am able to use the --exclude and --include options in the AWS S3 CLI. Please make sure that you put the spaces in correctly.
To copy a specific file:
aws s3 cp s3://itx-agj-cons-ww-bp-dev/bp_source_input/ C:\Business_Panorama\nts\data\in --recursive --exclude "*" --include "*%mth_cd%_%source%_all.xlsx"
(Note: %mth_cd% and %source% are parameters used in the .bat file.)
To check whether a file exists:
aws s3 ls s3://itx-agj-cons-ww-bp-dev/bp_source_input/ --recursive | FINDSTR "201502_nts_.*.xlsx"
(Note: FINDSTR is for the Windows CLI; on Unix it will be grep.)
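Note that the FINDSTR/grep argument here is a regular expression, not a shell wildcard, so the dot before xlsx is safest written escaped. One way to sanity-check the pattern without touching S3 is to run it against a canned listing (shown with grep; the sample listing below is invented, formatted roughly like aws s3 ls --recursive output):

```shell
# Invented sample of what `aws s3 ls --recursive` might print.
listing='2015-02-01 10:00:00      12345 bp_source_input/201502_nts_act_apac.xlsx
2015-02-01 10:01:00      23456 bp_source_input/201501_nts_act_apac.xlsx
2015-02-01 10:02:00      34567 bp_source_input/readme.txt'

# grep is the Unix counterpart of Windows FINDSTR; \. matches a literal dot.
printf '%s\n' "$listing" | grep '201502_nts_.*\.xlsx'
```

Only the 201502_nts_act_apac.xlsx line survives the filter; the 201501 file and the non-xlsx file are dropped.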
Thanks a lot.
user3858193
Updated on September 29, 2020

Comments
-
user3858193 over 3 years
I have many files in the S3 bucket and I want to copy those files which have a start date of 2012. The command below copies all the files:
aws s3 cp s3://bp-dev/bp_source_input/ C:\Business_Panorama\nts\data\in --recursive --include "201502_nts_*.xlsx"
-
user3858193 about 9 years
Hey, that worked for me. I have one more question: I want to do an ls first to see if the file exists, and then copy. This is throwing an error: aws s3 ls s3://bp-dev/bp_source_input/ --recursive --exclude --include "201502_nts_.xlsx"
-
bsnchan about 9 years
The --exclude and --include filter flags only work for S3 object operations (such as cp, mv, rm); ls is a directory operation. You can run the ls command and pipe it to grep: aws s3 ls s3://bp-dev/bp_source_input/ --recursive | grep 201502_nts_*.xlsx
-
bsnchan about 9 years
grep is a Unix command (I shouldn't have made the assumption that you were on a *nix system). What kind of machine are you running the AWS CLI from?
-
user3858193 about 9 years
It's from Windows. Sorry, it was my bad.
-
bsnchan about 9 years
I think the equivalent for Windows is findstr:
aws s3 ls s3://bp-dev/bp_source_input/ --recursive | findstr 201502_nts_*.xlsx
-
user3858193 about 9 years
aws s3 ls s3://bp-dev/bp_source_input/ --recursive | findstr 201502_nts_* is working fine, but not aws s3 ls s3://bp-dev/bp_source_input/ --recursive | findstr 201502_nts_*xlsx
-
bsnchan about 9 years
My mistake, wildcards in Windows are different. I tried it out on a Windows machine, and findstr 201502_nts_.*.xlsx should work.
-
user3858193 about 9 years
Hi @bsnchan, when I use exclude it is not working. Can you suggest, please?
C:\Users\admin_spanda20>aws s3 cp s3://bp-dev/bp_source_input/in C:\Business_Panorama\nts\data\in --recursive --exclude * --include "201502_nts_.xlsx"
C:\Users\admin_spanda20>
C:\Users\admin_spanda20>aws s3 cp s3://ibp-dev/bp_source_input/in C:\Business_Panorama\nts\data\in --recursive --include "201502_nts_act.xlsx"
download: s3://bp-dev/bp_source_input/in/201502_nts_act_apac.xlsx to ..\..\Business_Panorama\nts\data\in\201502_nts_act_apac.xlsx
-
Pramit over 7 years
I would also advise using the --dryrun flag, as it can be very helpful in avoiding mistakes:
aws s3 rm s3://mybucket/ --profile <profile_name> --exclude "*" --include "file_name_*" --dryrun