AWS: Mount an S3 Bucket to an EC2 Instance (Later: FTP Tunneling)


Solution 1

I followed these instructions: https://github.com/s3fs-fuse/s3fs-fuse

I guess s3fs-fuse is calling the S3 API in the background too, but it works as I wished.
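For anyone following along, here is a minimal sketch of a package-based install and mount, assuming Amazon Linux 2; the bucket name, mount point, and credentials are placeholders, and the package name may vary by distro:

    # install s3fs-fuse from EPEL (package name may vary by distro)
    sudo amazon-linux-extras install epel -y
    sudo yum install -y s3fs-fuse

    # credentials file in ACCESS_KEY_ID:SECRET_ACCESS_KEY format (placeholders)
    echo 'AKIAEXAMPLE:wJalrEXAMPLEKEY' | sudo tee /etc/passwd-s3fs
    sudo chmod 640 /etc/passwd-s3fs

    # mount the bucket ("my-bucket" and /s3bucket are placeholders)
    sudo mkdir -p /s3bucket
    sudo s3fs my-bucket /s3bucket -o passwd_file=/etc/passwd-s3fs -o allow_other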

Solution 2

One possible solution for mounting S3 on an EC2 instance is to use the file gateway mode of AWS Storage Gateway.

Check out this: https://aws.amazon.com/about-aws/whats-new/2017/02/aws-storage-gateway-supports-running-file-gateway-in-ec2-and-adds-file-share-security-options/

http://docs.aws.amazon.com/storagegateway/latest/userguide/WhatIsStorageGateway.html
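As a very rough sketch of the moving parts, assuming a file gateway has already been deployed and activated (all ARNs, names, and the gateway IP below are placeholders):

    # expose the bucket as an NFS file share on the gateway
    aws storagegateway create-nfs-file-share \
        --client-token my-idempotency-token \
        --gateway-arn arn:aws:storagegateway:eu-west-1:123456789012:gateway/sgw-EXAMPLE \
        --role arn:aws:iam::123456789012:role/StorageGatewayBucketAccess \
        --location-arn arn:aws:s3:::my-bucket

    # then mount the share from the EC2 instance over NFS
    sudo mkdir -p /mnt/s3share
    sudo mount -t nfs -o nolock,hard 10.0.0.5:/my-bucket /mnt/s3share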

Solution 3

Point 1

Whilst the other answerer is correct in saying that S3 is not built for this, it's not true to say a bucket cannot be mounted (I'd seriously consider finding a better way to solve your problem however).

That being said, you can use s3fs-fuse to mount S3 buckets within EC2. There are plenty of good reasons not to do this, detailed here.
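If you do go down the s3fs route and want the mount to survive reboots, here is a sketch of an /etc/fstab entry (bucket name and mount point are placeholders):

    # /etc/fstab -- "my-bucket" and /s3bucket are placeholders
    my-bucket /s3bucket fuse.s3fs _netdev,allow_other,passwd_file=/etc/passwd-s3fs 0 0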

Point 2

From there it's just a case of setting up a standard FTP server, since the mounted bucket now appears to your system like any other file system (mostly).

vsftpd could be a good choice for this; a rough configuration sketch follows below. I'd have a go at both steps and then post separate questions with any specific problems you run into, but this should give you a rough outline to work from. (Well, in reality I'd have a go at neither and use S3 via app code consuming the API, but still.)
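As a rough, unhardened sketch of the FTP half, a minimal /etc/vsftpd/vsftpd.conf pointing local users at the mounted bucket might look like this; /s3bucket and the passive port range are assumptions, and the passive ports would also need opening in the security group:

    # minimal vsftpd sketch; /s3bucket and the port range are placeholders
    anonymous_enable=NO
    local_enable=YES
    write_enable=YES
    chroot_local_user=YES
    allow_writeable_chroot=YES
    local_root=/s3bucket
    pasv_enable=YES
    pasv_min_port=40000
    pasv_max_port=40100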

Comments

  • Timo, over 1 year ago

    What do I want to do?

    Step 1: Mount an S3 bucket to an EC2 instance.

    Step 2: Install an FTP server on the EC2 instance and tunnel FTP requests to files in the bucket.

    What have I done so far?

    • create bucket
    • create a security group with open inbound ports (FTP: 20, 21; SSH: 22; some more)
    • connect to ec2

    And I ran the following commands:

    # download and build s3fs 1.74 from source (run as root)
    wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/s3fs/s3fs-1.74.tar.gz
    tar -xvzf s3fs-1.74.tar.gz
    yum update
    yum install gcc libstdc++-devel gcc-c++ fuse fuse-devel curl-devel libxml2-devel openssl-devel mailcap
    cd s3fs-1.74
    ./configure --prefix=/usr
    make
    make install
    # credentials in ACCESS_KEY_ID:SECRET_ACCESS_KEY format
    vi /etc/passwd-s3fs # set access:secret keys
    chmod 640 /etc/passwd-s3fs
    mkdir /s3bucket
    # note: no "s3fs <bucket> /s3bucket" mount command was run before the cd below
    cd /s3bucket
    

    And cd answers: Transport endpoint is not connected

    Dunno what's wrong. Maybe I am using the wrong user? But currently I only have one user (for test reasons) except for root.

    The next step would be the FTP tunnel, but first I'd like to get this working.

  • Michael - sqlbot, over 7 years ago
    There are concerns with this solution, as I mentioned here, but one of yours is not among them. "The biggest for me is that this is an unsupported use of an API by a third party. Stability can and will be a problem." This isn't accurate. The S3 API is public and documented, and this application is as valid as any other. The problem is actually one of trying to use S3 as something that it isn't -- it's not a filesystem, and there is an inevitable impedance gap that isn't fully possible to bridge.
  • Michael - sqlbot, over 7 years ago
    "(If AWS make a breaking change to the API, the library won't work until it is updated)" is also not a realistic concern. If developing directly against the underlying API weren't supported, it wouldn't be so thoroughly documented. The S3 REST API has not introduced a single breaking change in 10 years. All of the evolution of the service has been completely backwards-compatible. Even object versioning was introduced without breaking existing code that was unaware of versioning.
  • Tom Manterfield, over 7 years ago
    So, a few points: an application being as valid as any other doesn't prevent it from being unable to update in time when the API it is consuming has large changes. Those changes don't need to be breaking from one version to the next; they just need to be breaking before the consuming application can update. I'd say an application that uses an API for an unintended use case has a much bigger chance of falling foul of that kind of breaking change. That being said... I must admit I hadn't realised that the S3 API hasn't changed since 2006, so probably not a major issue here!
  • Tom Manterfield, over 7 years ago
    Actually, it turns out they have had one breaking change; SOAP over HTTP was removed. Your point still stands, I feel; the stability level of the API seems high enough that this shouldn't be the main concern, so I edited my answer to reflect that.
  • Michael - sqlbot, over 7 years ago
    I deliberately said the "REST" API. :) Thanks for hearing me out. +1. As long as you are aware of the issues, this limited application is actually pretty flawless. I use proftpd with s3fs and have been for at least a couple of years. Fully aware of the non-idealness of the setup and ever vigilant in the early days for unexpected behavior, I have never had a problem. This really is a viable use case, though many others would be problematic. You can even create objects using the console or API, set x-amz-meta-uid/-gid/-mode metadata, and s3fs will interpret those as file owner/group/permissions (see the sketch after these comments).
  • Tom Manterfield, over 7 years ago
    Ha, so you did! I must admit, I've been at a few places that have used it without much/any issue. There was a similarly named library that was a different kettle of fish however, which left me with a fairly strong aversion to any of these solutions. I guess the real problem I have with non-standard (or at least, less common) uses of a technology is that you often find available support jumps off a cliff. As you say though, as long as you know what you are doing!
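To illustrate the x-amz-meta-uid/-gid/-mode point from the comments above, here is a sketch using the AWS CLI; the bucket, key, and numeric values are placeholders, and mode is the decimal st_mode (33188 = a regular file with permissions 0644):

    # upload an object whose metadata s3fs will surface as uid 1000,
    # gid 1000, mode 0644 (33188 decimal); names are placeholders
    aws s3 cp report.txt s3://my-bucket/report.txt --metadata uid=1000,gid=1000,mode=33188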