How to delete all glacier data?


Solution 1

How to delete Vault (AWS Glacier)

This Gist gives some tips for removing an AWS Glacier vault with the AWS CLI (i.e. https://aws.amazon.com/en/cli/).

Step 1 / Retrieve inventory

$ aws glacier initiate-job --job-parameters "{\"Type\": \"inventory-retrieval\"}" --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION

Wait 3 to 5 hours… :-(

For the next step you need the JobId. Once the inventory retrieval has finished, you can get it with the following command: aws glacier list-jobs --vault-name YOUR_VAULT_NAME --region YOUR_REGION
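
As a minimal sketch, the JobId can be picked out of the list-jobs output programmatically. This assumes the JSON shape the AWS CLI prints for this command (a JobList array whose entries carry JobId, Action, Completed and StatusCode); the helper name finished_inventory_jobs and the sample data are hypothetical.

```python
import json


def finished_inventory_jobs(list_jobs_output):
    """Return the JobIds of inventory-retrieval jobs that completed successfully.

    list_jobs_output is the JSON text printed by `aws glacier list-jobs`.
    """
    jobs = json.loads(list_jobs_output).get("JobList", [])
    return [
        job["JobId"]
        for job in jobs
        if job.get("Action") == "InventoryRetrieval"
        and job.get("Completed") is True
        and job.get("StatusCode") == "Succeeded"
    ]


if __name__ == "__main__":
    # Hypothetical sample output: one job still running, one finished.
    sample = json.dumps({"JobList": [
        {"JobId": "job-a", "Action": "InventoryRetrieval",
         "Completed": False, "StatusCode": "InProgress"},
        {"JobId": "job-b", "Action": "InventoryRetrieval",
         "Completed": True, "StatusCode": "Succeeded"},
    ]})
    print(finished_inventory_jobs(sample))
```

If the list comes back empty, the job is probably still in progress and the command needs to be run again later.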

Step 2 / Get the ArchiveIds

$ aws glacier get-job-output --job-id YOUR_JOB_ID --vault-name YOUR_VAULT_NAME --region YOUR_REGION ./output.json

See: Downloading a Vault Inventory in Amazon Glacier

All the ArchiveIds are in the ./output.json file.
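
Before deleting anything, it can help to check what the inventory actually contains. The following is a rough sketch, assuming the inventory JSON has an ArchiveList array whose entries carry ArchiveId and Size (in bytes); the function names and the sample inventory are hypothetical.

```python
import json


def load_inventory(path):
    """Load a downloaded vault inventory (e.g. ./output.json) as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def summarize_inventory(inventory):
    """Return (archive_ids, total_bytes) for an inventory dict."""
    archives = inventory.get("ArchiveList", [])
    archive_ids = [archive["ArchiveId"] for archive in archives]
    total_bytes = sum(archive.get("Size", 0) for archive in archives)
    return archive_ids, total_bytes


if __name__ == "__main__":
    # Hypothetical inventory with two archives; in practice this would be
    # load_inventory("./output.json").
    sample = {"ArchiveList": [
        {"ArchiveId": "id-1", "Size": 1024},
        {"ArchiveId": "-id-2", "Size": 2048},
    ]}
    ids, total = summarize_inventory(sample)
    print(len(ids), "archives,", total, "bytes")
```

For very large inventories, loading the whole file this way may use a lot of memory, which is the problem the streaming Python variant in Step 3 addresses.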

Step 3 / Delete Archives

Powershell

from @vinyar

$input_file_name = 'output.json'
$vault_name = 'my_vault'
# $account_id = 'AFDKFKEKF9EKALD' #not used. using - instead

# Read the file as a single string, then parse it as JSON
$a = Get-Content -Raw $input_file_name | ConvertFrom-Json

# Delete every archive listed in the inventory
$a.ArchiveList.ArchiveId | ForEach-Object {
    Write-Output "executing: aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -"
    aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -
}

Python

from @robweber

This version uses ijson, which reads the file as a stream instead of loading it into memory all at once. You can install it with pip.

import ijson, subprocess

input_file_name = 'output.json'
vault_name = ''
account_id = ''

f = open(input_file_name)
archive_list = ijson.items(f,'ArchiveList.item')

for archive in archive_list:
    print("Deleting archive " + archive['ArchiveId'])
    command = "aws glacier delete-archive --archive-id='" + archive['ArchiveId'] + "' --vault-name " + vault_name + " --acc$
    subprocess.run(command, shell=True, check=True)

f.close()

PHP

from @Remiii

<?php

$file = './output.json' ;
$accountId = 'YOUR_ACCOUNT_ID' ;
$region = 'YOUR_REGION' ;
$vaultName = 'YOUR_VAULT_NAME' ;

$string = file_get_contents ( $file ) ;
$json = json_decode($string, true ) ;
foreach ( $json [ 'ArchiveList' ] as $jsonArchives )
{
    echo 'Delete Archive: ' . $jsonArchives [ 'ArchiveId' ] . "\n" ;
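    $output = array ( ) ; // exec() appends to $output, so reset it for each archive
    exec ( 'aws glacier delete-archive --archive-id=' . escapeshellarg ( $jsonArchives [ 'ArchiveId' ] ) . ' --vault-name ' . escapeshellarg ( $vaultName ) . ' --account-id ' . escapeshellarg ( $accountId ) . ' --region ' . escapeshellarg ( $region ) , $output ) ;
    echo implode ( "\n" , $output ) . "\n" ; // $output is an array of lines, so join it before printing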
}

Note: After you delete an archive, if you immediately download the vault inventory, it might still include the deleted archive, because Amazon Glacier prepares the vault inventory only about once a day.

See: Deleting an Archive in Amazon Glacier
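
The scripts above all build the delete command as a single shell string. As an alternative sketch, the command can be passed to subprocess as an argument list, so no shell quoting is involved, and the ID can be attached with --archive-id=ID so that an archive ID starting with a dash is not mistaken for an option. The helper names are hypothetical; the default account ID of - follows the PowerShell example.

```python
import subprocess


def delete_archive_argv(archive_id, vault_name, account_id="-", region=None):
    """Build the argument list for `aws glacier delete-archive`."""
    argv = [
        "aws", "glacier", "delete-archive",
        # Keep flag and value in one argument so a leading dash is harmless.
        "--archive-id=" + archive_id,
        "--vault-name", vault_name,
        "--account-id", account_id,
    ]
    if region:
        argv += ["--region", region]
    return argv


def delete_archive(archive_id, vault_name, **kwargs):
    """Run the delete command; raises CalledProcessError if the CLI fails."""
    subprocess.run(delete_archive_argv(archive_id, vault_name, **kwargs), check=True)


if __name__ == "__main__":
    # Only print the command here; call delete_archive() to actually run it.
    print(delete_archive_argv("-example-id", "my_vault", region="us-east-1"))
```

Looping delete_archive over the ArchiveId values from output.json would then mirror what the scripts above do.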

Step 4 / Delete a Vault

$ aws glacier delete-vault --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
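
Because the inventory is only refreshed about once a day, delete-vault may be rejected for a while after the last archive has been deleted. A small sketch of a pre-check, assuming the output of aws glacier describe-vault contains the NumberOfArchives and LastInventoryDate fields; the helper name vault_looks_empty is hypothetical.

```python
import json


def vault_looks_empty(describe_vault_output):
    """Return True if the last inventory reports zero archives.

    describe_vault_output is the JSON text printed by
    `aws glacier describe-vault`. The counts reflect the last inventory,
    so they can lag behind recent deletions.
    """
    vault = json.loads(describe_vault_output)
    has_inventory = vault.get("LastInventoryDate") is not None
    return has_inventory and vault.get("NumberOfArchives", 0) == 0


if __name__ == "__main__":
    # Hypothetical describe-vault output for a vault that still holds data.
    sample = json.dumps({
        "VaultName": "my_vault",
        "NumberOfArchives": 3,
        "LastInventoryDate": "2022-09-18T00:00:00.000Z",
    })
    print(vault_looks_empty(sample))
```

If this returns False right after Step 3, waiting for the next inventory and trying again is likely the simplest option.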

Gist originally by @Remiii

OK, so a few years ago I closed my account and reopened it a few months ago, and guess what: Amazon still has my 3 TB there on my account, and I have now been billed for it for the last few months.

So I came back to this question and found that:

  • mt-aws-glacier is almost impossible to set up on the latest Ubuntu; I then went to 12.04, where awscli is not available, then went to 14.04 and got an error about my signature...
  • The Arq answer is no longer relevant in Arq 5
  • Then I found the above gist and copied it here because it is better for the community
  • Tried CloudBerry and it looks like it should work; I will update here in 4~10 hours

Solution 2

The purge-vault from this project works nicely: https://github.com/vsespb/mt-aws-glacier

Install, then run these commands (replace vault-name with the name of your vault):

mtglacier retrieve-inventory --config glacier.cfg --vault vault-name

Wait about 2 hours, then run:

mtglacier download-inventory --config glacier.cfg --vault vault-name --new-journal vault-name.log
mtglacier purge-vault --config glacier.cfg --vault vault-name --journal vault-name.log

Solution 3

https://github.com/leeroybrun/glacier-vault-remove was created for this exact purpose.

To remove a vault, first install the dependencies:

$ git clone https://github.com/leeroybrun/glacier-vault-remove.git
$ cd glacier-vault-remove
$ python setup.py install

Then create a credentials file, credentials.json in the same directory:

{
  "AWSAccessKeyId": "YOURACCESSKEY",
  "AWSSecretKey":   "YOURSECRETKEY"
}

Then run the script like this:

$ python removeVault.py REGION-NAME VAULT-NAME

Example:

$ python removeVault.py us-east-1 my_vault

Solution 4

If you remove a Glacier-backed folder in Arq it goes into Arq's trash. If you select it in Arq's trash and click "Delete Permanently", Arq will delete all the Glacier archives and attempt to delete the Glacier vault. The vault delete might fail because Amazon has to update its "inventory", which it does once/day. The next day, browse under "Other Backup Sets" in Arq, find that vault, select it and click "Delete" to delete it.

If you have a vault that's not associated with any Arq backups, pick "Legacy Glacier Vaults" from Arq's menu, select the vault, and click the button to delete.

Solution 5

You can use a freeware product like CloudBerry Explorer http://www.cloudberrylab.com/free

Note: Glacier data doesn't become available immediately. You need to wait 24 hours for the global inventory to run on the Amazon side; then click the Get Inventory button and wait another 5 hours to get the inventory for your account.

Thanks

Author by

Hasibul-

Updated on September 18, 2022

Comments

  • Hasibul-
    Hasibul- over 1 year

    I was using a tool on Mac OS X called Arq to backup my data, but i found it so hard to upload all my stuff since I don't and can't have an internet connection that is fast enough for it.

    So I decided to delete all my backups, but whenever I try from the software itself it does nothing.

    I also tried FastGlacier on my other windows machine, it hangs up and takes too much resources.

    I was wondering if there is an easy way to do this.

    P.S. My glacier has ~450 GB in 341907 archives

    • Joe Wicentowski
      Joe Wicentowski over 7 years
      Note to Arq users - see the answer from Arq developer Stefan Reitshamer below. Avoid the headache of setting up mtglacier, and just use the tool built into Arq!
  • Hasibul-
    Hasibul- over 10 years
    I had nothing but glacier on that account, so i just deleted my aws account, will mark it as the correct answer since, i think it would have worked out if i had tried it.
  • Hasibul-
    Hasibul- about 10 years
    Thank you very much for this, but sadly I don't have any glacier storages to test with it, so please if anyone tests it let me know to mark it the correct answer.
  • user3353
    user3353 almost 10 years
    Not really a good answer because this product doesn't run on OSX.
  • Hasibul-
    Hasibul- almost 10 years
    Thanks for the feedback @CamiloNova I have chosen this as best answer based on your feedback ^_^
  • Pradheep T
    Pradheep T almost 10 years
    I connected to Amazon S3 but it doesn't show me anything. Do I have to specify a server other than s3.amazonaws.com to access glacier?
  • Marius
    Marius over 9 years
    Sorry it was a while ago for me now... I can't quite recall how I eventually fixed it... I think it might have been via these command-line tools listed in one of these other posts.
  • Slipp D. Thompson
    Slipp D. Thompson over 8 years
    Glacier is not S3. They're both part of Amazon Web Services and they're both used to store files, but they have different use-cases, payment structures, restrictions and APIs. Because of this, S3 tools don't work with Glacier and Glacier tools don't work with S3 (though that's not to say there aren't tools out there that are both S3- and Glacier-compatible, written with distinct network handlers and app logic for each service).
  • Dan Poltawski
    Dan Poltawski over 8 years
    This script is much slower than mt-aws-glacier at the current time
  • pmagunia
    pmagunia over 8 years
    I had to wait closer to 4 hours to be able to download-inventory
  • Hasibul-
    Hasibul- over 7 years
    I was using many other amazon services and didn't want to lose them, and i guess many use amazon for buying stuff, but it's good to have this written somewhere for people that never used amazon for something else
  • Form
    Form over 7 years
    @ShereefMarzouk Well, when you close your account in the AWS control panel, it's actually your AWS account you're closing, not your Amazon account that you're using to make purchases. So you'll still be able to use the other Amazon services (as long as they're not part of AWS) as usual.
  • gbmhunter
    gbmhunter over 7 years
    This method seems to be much faster compared to glacier-vault-remove. This method was able to remove 350GB of data in a few hours, while glacier-vault-remove was removing only about 30GB of data every 12 hours.
  • Joe Wicentowski
    Joe Wicentowski over 7 years
    I realize this answer is marked as the confirmed solution, but for Arq users like the original poster, Stefan Reitshamer's answer below is the best, hands down. Arq has a built-in tool for deleting Glacier Vaults. No need to mess around with mtglacier. Just read that answer, and you're done.
  • Joe Wicentowski
    Joe Wicentowski over 7 years
    Thanks, Stefan! I struggled for days to figure out how to delete my Arq vaults—failing to install mtglacier on my Mac, creating a dropcloud ubuntu instance to run mtglacier—and this whole time, the solution was right there in Arq.
  • aaronk6
    aaronk6 over 7 years
    Also, it eats a lot of RAM. I’m trying to delete roughly 120.000 archives—at 1142 of 125413 it already uses more than 1 GB of memory (and it’s increasing with each archive).
  • jrgd
    jrgd about 2 years
    if you use --archive-id="XXXX" the random and infrequent error will stop; it's due to dash-starting archive ids that are used by AWS and that breaks Bash