AWS S3: Uploading large file fails with ResetException: Failed to reset the request input stream

Solution 1

This definitely looks like a bug, which I have reported. The solution is to use the other PutObjectRequest constructor, which accepts a File instead of an InputStream:

import java.io.File
import com.amazonaws.services.s3.model.{ObjectMetadata, PutObjectRequest, StorageClass}
import com.amazonaws.services.s3.transfer.Upload

// `tx` is the TransferManager instance defined in the question.
def upload(bucketName: String, keyName: String, file: File, contentLength: Long, contentType: String, serverSideEncryption: Boolean = true, storageClass: StorageClass = StorageClass.ReducedRedundancy): Upload = {
  val metaData = new ObjectMetadata
  metaData.setContentType(contentType)
  metaData.setContentLength(contentLength)

  if (serverSideEncryption) {
    metaData.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION)
  }

  // Use the File-based constructor instead of the InputStream-based one.
  val putRequest = new PutObjectRequest(bucketName, keyName, file)
  putRequest.setStorageClass(storageClass)
  putRequest.getRequestClientOptions.setReadLimit(100000)
  putRequest.setMetadata(metaData)
  tx.upload(putRequest)
}
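
One plausible reason the File-based constructor helps is that a file can be reopened from the start for every attempt, whereas a plain InputStream can only be replayed while its mark is still valid. The sketch below illustrates that idea with JDK classes only. The `ReplayFromFile` object and its `withRetries` helper are hypothetical names, and this is not a description of how the SDK implements its retries.

```scala
import java.io.{File, FileInputStream, IOException, InputStream}
import scala.annotation.tailrec

object ReplayFromFile {
  // Hypothetical helper (not part of the AWS SDK): opens a fresh stream on
  // the file for every attempt, so each retry starts again at byte 0.
  @tailrec
  def withRetries[A](file: File, attemptsLeft: Int)(send: InputStream => A): A = {
    val in = new FileInputStream(file)
    val result: Either[IOException, A] =
      try Right(send(in))
      catch { case e: IOException => Left(e) }
      finally in.close()

    result match {
      case Right(value)                => value
      case Left(_) if attemptsLeft > 1 => withRetries(file, attemptsLeft - 1)(send)
      case Left(e)                     => throw e // out of attempts
    }
  }
}
```

Because every attempt gets its own stream, nothing has to be buffered in memory to make a retry possible, no matter how large the file is.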

Solution 2

I investigated this issue; it was a long story.

The conclusion: pass a system property to the JVM by adding the following option to the java command line:

-Dcom.amazonaws.sdk.s3.defaultStreamBufferSize=YOUR_MAX_PUT_SIZE

See https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1668

This tells AmazonS3Client to use an appropriate maximum size for the buffer it rewinds to re-read the data when a request is retried.
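
The `Resetting to invalid mark` message in the question's stack trace is thrown by `java.io.BufferedInputStream.reset`. The JDK-only sketch below reproduces that behaviour: once more bytes have been read than the mark's read limit (and the internal buffer) can hold, the stream can no longer be rewound. The `MarkResetDemo` object and the sizes used are illustrative assumptions; how the SDK sizes its own buffer is not shown here.

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, IOException}

object MarkResetDemo {
  // Marks the stream with `readLimit`, consumes `bytesToRead` bytes, then
  // tries to rewind. Returns true if reset() succeeded.
  def canRewind(readLimit: Int, bytesToRead: Int): Boolean = {
    val data = new Array[Byte](1024)
    // A deliberately small 16-byte internal buffer, so the read limit
    // (not the buffer size) decides when the mark is dropped.
    val in = new BufferedInputStream(new ByteArrayInputStream(data), 16)
    in.mark(readLimit)

    var consumed = 0
    while (consumed < bytesToRead && in.read() != -1) consumed += 1

    try { in.reset(); true }
    catch { case _: IOException => false } // "Resetting to invalid mark"
  }
}
```

With these sizes, `MarkResetDemo.canRewind(64, 32)` succeeds because the 32 bytes read fit within the 64-byte limit, while `MarkResetDemo.canRewind(16, 32)` fails. That is why the buffer has to be at least as large as the amount of data that may need to be resent.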

Solution 3

S3 doesn't support a PUT request that large.

The largest object that can be uploaded in a single PUT is 5 gigabytes.

http://aws.amazon.com/s3/faqs

Beyond that, you have to use the multipart upload API, which allows each part to be 5GB and the maximum object size to be 5TB. You'd be well-served to use multipart for files smaller than 5GB, too, since multipart supports parallel uploading of the parts.
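
To get a feel for the numbers, here is a small arithmetic sketch. It assumes two further limits taken from the S3 documentation rather than from this answer: at most 10,000 parts per upload and a minimum part size of 5 MiB. It also assumes binary units. The `MultipartMath` object and its members are illustrative names.

```scala
object MultipartMath {
  val MiB: Long = 1024L * 1024L
  val GiB: Long = 1024L * MiB
  val TiB: Long = 1024L * GiB

  // Assumed limits (from the S3 documentation, not from the answer above).
  val MinPartSize: Long = 5L * MiB
  val MaxParts: Long = 10000L

  // Number of parts needed for an object of `objectSize` bytes (rounded up).
  def partCount(objectSize: Long, partSize: Long): Long =
    (objectSize + partSize - 1) / partSize

  // Smallest part size that keeps the upload within MaxParts parts.
  def smallestPartSize(objectSize: Long): Long =
    math.max(MinPartSize, (objectSize + MaxParts - 1) / MaxParts)
}
```

For example, a 10 GiB file split into 5 MiB parts needs `MultipartMath.partCount(10L * MultipartMath.GiB, 5L * MultipartMath.MiB)`, which is 2048 parts, comfortably under the assumed 10,000-part limit.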

Author by

lolski

Updated on June 09, 2022

Comments

  • lolski
    lolski almost 2 years

    Can anyone tell me what is wrong with the following code such that a large file upload (>10GB) always fails with ResetException: Failed to reset the request input stream?

    The failure always happens after a while (i.e. after around 15 minutes), which must mean that the upload process is executing only to fail somewhere in the middle.

    Here's what I've tried to debug the problem:

    1. in.markSupported() == false // checking whether mark is supported on my FileInputStream

      I highly suspect that this is the problem, since the S3 SDK seems to want to do a reset operation at some point during the upload, probably if the connection is lost or if the transfer process encounters some error.

    2. Wrapping my FileInputStream in a BufferedInputStream to enable marking. Now in.markSupported() returns true, meaning that mark support is there. Strangely, the upload still fails with the same kind of error.

    3. Adding putRequest.getRequestClientOptions.setReadLimit(n), with n = 100000 (100 KB) and n = 800000000 (800 MB), but it still throws the same error. I suspect this is because the parameter is used when resetting the stream, which, as stated above, isn't supported on a FileInputStream.

    Interestingly, the same problem doesn't happen on my AWS development account. I assume that is just because the dev account is not under as heavy a load as my production account, so the upload can run smoothly without any failure that would trigger a retry.

    Please have a look at my code below:

    object S3TransferExample {
    // in main class
    def main(args: Array[String]): Unit = {
        ...
        val file = new File("/mnt/10gbfile.zip")
        val in = new FileInputStream(file)
        // val in = new BufferedInputStream(new FileInputStream(file)) --> tried wrapping file inputstream in a buffered input stream, but it didn't help..
        upload("mybucket", "mykey", in, file.length, "application/zip").waitForUploadResult
        ...
    }
    
    val awsCred = new BasicAWSCredentials("access_key", "secret_key")
    val s3Client = new AmazonS3Client(awsCred)
    val tx = new TransferManager(s3Client)
    
    def upload(bucketName: String,  keyName: String,  inputStream: InputStream,  contentLength: Long,  contentType: String,  serverSideEncryption: Boolean = true,  storageClass: StorageClass = StorageClass.ReducedRedundancy ):Upload = {
      val metaData = new ObjectMetadata
      metaData.setContentType(contentType)
      metaData.setContentLength(contentLength)
    
      if(serverSideEncryption) {
        metaData.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION)
      }
    
      val putRequest = new PutObjectRequest(bucketName, keyName, inputStream, metaData)
      putRequest.setStorageClass(storageClass)
      putRequest.getRequestClientOptions.setReadLimit(100000)
    
      tx.upload(putRequest)
     
    }
    }
    

    Here is the complete stack trace:

    Unable to execute HTTP request: mybucket.s3.amazonaws.com failed to respond
    org.apache.http.NoHttpResponseException: mybuckets3.amazonaws.com failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) ~[httpcore-4.3.2.jar:4.3.2]
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) ~[httpcore-4.3.2.jar:4.3.2]
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) ~[httpcore-4.3.2.jar:4.3.2]
        at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66) ~[aws-java-sdk-core-1.9.13.jar:na]
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.3.2.jar:4.3.2]
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.3.4.jar:4.3.4]
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) ~[httpclient-4.3.4.jar:4.3.4]
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:685) [aws-java-sdk-core-1.9.13.jar:na]
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460) [aws-java-sdk-core-1.9.13.jar:na]
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295) [aws-java-sdk-core-1.9.13.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:2799) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:2784) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:259) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:193) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:125) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:129) [aws-java-sdk-s3-1.9.13.jar:na]
        at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50) [aws-java-sdk-s3-1.9.13.jar:na]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_40]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
    com.amazonaws.ResetException: Failed to reset the request input stream;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
      at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:636)
      at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
      at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
      at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710)
      at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:2799)
      at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:2784)
      at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:259)
      at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:193)
      at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:125)
      at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:129)
      at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
      at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
    Caused by: java.io.IOException: Resetting to invalid mark
      at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
      at com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106)
      at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:103)
      at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:139)
      at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:103)
      at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:634) 
    
  • lolski
    lolski almost 9 years
    My upload is done using TransferManager, meaning that multipart upload has been taken care of by the SDK API automatically. We can refer to the multipart upload example docs, which is similar to what I already have: docs.aws.amazon.com/AmazonS3/latest/dev/HLuploadFileJava.html
  • Admin
    Admin almost 9 years
    Could you post a link to that bug report?
  • lolski
    lolski about 6 years
    @bagi No, this does not fix the issue unfortunately.
  • Chris F
    Chris F almost 3 years
    Where would you put this option in a Gradle setting, in the gradle.properties file?