How to get the progress status of the file uploaded to Amazon S3 using Java


Solution 1

I found the answer to my question: the best way to get a true progress status is with the code below.

ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType(mpf.getContentType());
metadata.setContentLength(mpf.getSize());

String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
PutObjectRequest putObjectRequest = new PutObjectRequest(
        Constants.S3_BUCKET_NAME, key, mpf.getInputStream(), metadata)
        .withStorageClass(StorageClass.ReducedRedundancy);

putObjectRequest.setProgressListener(new ProgressListener() {
    @Override
    public void progressChanged(ProgressEvent progressEvent) {
        System.out.println(progressEvent.getBytesTransfered()
                + " >> number of bytes transferred " + new Date());

        // Accumulate the total bytes read so far in the session.
        // 'request' (the HttpServletRequest) and 'size' (the total
        // upload size) come from the enclosing scope.
        Object attr = request.getSession().getAttribute(Constants.TOTAL_BYTE_READ);
        double totalByteRead = attr != null ? (Double) attr : 0;
        totalByteRead += progressEvent.getBytesTransfered();
        request.getSession().setAttribute(Constants.TOTAL_BYTE_READ, totalByteRead);
        System.out.println("total bytes read " + totalByteRead);

        request.getSession().setAttribute(Constants.TOTAL_PROGRESS,
                (totalByteRead / size) * 100);
        System.out.println("percentage completed >>> " + (totalByteRead / size) * 100);

        if (progressEvent.getEventCode() == ProgressEvent.COMPLETED_EVENT_CODE) {
            System.out.println("completed ******");
        }
    }
});
s3Client.putObject(putObjectRequest);

The problem with my previous code was that I was not setting the content length in the metadata, so I was not getting a true progress status. The text below is copied from the PutObjectRequest class API documentation:

Constructs a new PutObjectRequest object to upload a stream of data to the specified bucket and key. After constructing the request, users may optionally specify object metadata or a canned ACL as well.

Content length for the data stream must be specified in the object metadata parameter; Amazon S3 requires it be passed in before the data is uploaded. Failure to specify a content length will cause the entire contents of the input stream to be buffered locally in memory so that the content length can be calculated, which can result in negative performance problems.
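The percentage arithmetic in the listener above can be isolated into a small helper. This is only a sketch with a hypothetical class name (UploadProgressTracker is not part of the AWS SDK); it illustrates why the total content length must be known before the upload starts:

```java
// Hypothetical helper isolating the percentage arithmetic used in the
// listener above. It assumes the total content length is known up front,
// which is exactly why metadata.setContentLength() matters.
public class UploadProgressTracker {
    private final long totalBytes;   // must be known in advance (content length)
    private long bytesTransferred;

    public UploadProgressTracker(long totalBytes) {
        if (totalBytes <= 0) {
            throw new IllegalArgumentException("content length must be positive");
        }
        this.totalBytes = totalBytes;
    }

    /** Record a progress event carrying the bytes moved since the last event. */
    public void addBytes(long delta) {
        bytesTransferred += delta;
    }

    /** Percentage completed, 0.0 to 100.0. */
    public double percentComplete() {
        return (bytesTransferred / (double) totalBytes) * 100.0;
    }
}
```

Without the total, the listener can only report raw byte counts, which is what made the earlier attempts look broken.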

Solution 2

I'm going to assume you are using the AWS SDK for Java.

Your code is working as it should: it shows read being called with 4K read each time. Your idea (from the update) is also correct: the AWS SDK provides ProgressListener as the way to inform the application of upload progress.

The "problem" is that the AWS SDK's implementation buffers more than the ~30K size of your file (I'm going to assume it buffers 64K), so you're not getting any progress reports.

Try uploading a bigger file (say 1 MB) and you'll see both methods give better results; after all, with today's network speeds, reporting progress on a 30K file is hardly worth it.

If you want finer control you could implement the upload yourself using the S3 REST interface (which is what the AWS Java SDK ultimately uses). It is not very difficult, but it is a bit of work. If you go this route, I recommend finding an example of computing the session authorization token rather than writing it yourself (sorry, my search foo is not strong enough for a link to actual sample code right now). However, once you go to all that trouble you'll find that you actually want a 64K buffer on the socket stream to ensure maximum throughput on a fast network (which is probably why the AWS Java SDK behaves as it does).
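The buffering effect described above can be demonstrated without S3 at all. The sketch below (hypothetical names, plain java.io) counts how often the underlying stream is actually read when a 64K BufferedInputStream sits in between; many small reads by the caller collapse into a few large reads underneath, so a progress callback attached at the bottom fires only a handful of times:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferingDemo {
    /** Wrapper that records how often the underlying stream is read. */
    static class CountingInputStream extends FilterInputStream {
        int readCalls = 0;

        CountingInputStream(InputStream in) { super(in); }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            readCalls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Drains 'data' through a BufferedInputStream of the given size,
     * reading 4K at a time, and returns how many times the underlying
     * stream was actually touched.
     */
    public static int countUnderlyingReads(byte[] data, int bufferSize) throws IOException {
        CountingInputStream counting =
                new CountingInputStream(new ByteArrayInputStream(data));
        InputStream buffered = new BufferedInputStream(counting, bufferSize);
        byte[] chunk = new byte[4096];
        while (buffered.read(chunk) != -1) {
            // drain
        }
        return counting.readCalls;
    }
}
```

With a 256K payload, the 64K buffer reduces roughly 64 underlying reads to about 4, which is why a small file can appear to "upload" instantly from the wrapper's point of view.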

Author: Krushna (Passionate Java Programmer/Developer)

Updated on July 24, 2022

Comments

  • Krushna
    Krushna almost 2 years

    I'm uploading multiple files to Amazon S3 using Java.

    The code I'm using is as follows:

    MultipartHttpServletRequest multipartRequest = (MultipartHttpServletRequest) request;
    MultiValueMap<String, MultipartFile> map = multipartRequest.getMultiFileMap();
    try {
        if (map != null) {
            for (String filename : map.keySet()) {
                List<MultipartFile> fileList = map.get(filename);
                incrPercentge = 100 / fileList.size();
                request.getSession().setAttribute("incrPercentge", incrPercentge);
                for (MultipartFile mpf : fileList) {
                    /*
                     * Custom input stream wrapped around the original
                     * input stream to get the progress.
                     */
                    ProgressInputStream inputStream = new ProgressInputStream(
                            "test", mpf.getInputStream(), mpf.getBytes().length);
                    ObjectMetadata metadata = new ObjectMetadata();
                    metadata.setContentType(mpf.getContentType());
                    String key = Util.getLoginUserName() + "/" + mpf.getOriginalFilename();
                    PutObjectRequest putObjectRequest = new PutObjectRequest(
                            Constants.S3_BUCKET_NAME, key, inputStream, metadata)
                            .withStorageClass(StorageClass.ReducedRedundancy);
                    PutObjectResult response = s3Client.putObject(putObjectRequest);
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    

    I had to create the custom input stream to get the number of bytes consumed by Amazon S3. I got that idea from the question here: Upload file or InputStream to S3 with a progress callback

    My ProgressInputStream class code is as follows:

    package com.spectralnetworks.net.util;

    import java.io.IOException;
    import java.io.InputStream;

    import org.apache.commons.vfs.FileContent;
    import org.apache.commons.vfs.FileSystemException;

    public class ProgressInputStream extends InputStream {
        private final long size;
        private long progress;
        private long lastUpdate = 0;
        private final InputStream inputStream;
        private final String name;
        private boolean closed = false;

        public ProgressInputStream(String name, InputStream inputStream, long size) {
            this.size = size;
            this.inputStream = inputStream;
            this.name = name;
        }

        public ProgressInputStream(String name, FileContent content)
                throws FileSystemException {
            this.size = content.getSize();
            this.name = name;
            this.inputStream = content.getInputStream();
        }

        @Override
        public void close() throws IOException {
            super.close();
            if (closed) throw new IOException("already closed");
            closed = true;
        }

        @Override
        public int read() throws IOException {
            // read() returns the byte value (0-255) or -1, not a count,
            // so advance progress by one byte, not by the return value.
            int b = inputStream.read();
            if (b != -1) progress++;
            lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
            return b;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            int count = inputStream.read(b, off, len);
            if (count > 0) progress += count;
            lastUpdate = maybeUpdateDisplay(name, progress, lastUpdate, size);
            return count;
        }

        /**
         * This is research towards showing a progress bar.
         */
        static long maybeUpdateDisplay(String name, long progress, long lastUpdate, long size) {
            /* if (Config.isInUnitTests()) return lastUpdate;
            if (size < B_IN_MB / 10) return lastUpdate;
            if (progress - lastUpdate > 1024 * 10) {
                lastUpdate = progress;
                int hashes = (int) (((double) progress / (double) size) * 40);
                if (hashes > 40) hashes = 40;
                String bar = StringUtils.repeat("#", hashes);
                bar = StringUtils.rightPad(bar, 40);
                System.out.format("%s [%s] %.2fMB/%.2fMB\r",
                        name, bar, progress / B_IN_MB, size / B_IN_MB);
                System.out.flush();
            } */
            System.out.println("name " + name + "  progress " + progress
                    + " lastUpdate " + lastUpdate + " sie " + size);
            return lastUpdate;
        }
    }
    

    But this is not working properly. It prints immediately, all the way up to the file size, as follows:

    name test  progress 4096 lastUpdate 0 sie 30489
    name test  progress 8192 lastUpdate 0 sie 30489
    name test  progress 12288 lastUpdate 0 sie 30489
    name test  progress 16384 lastUpdate 0 sie 30489
    name test  progress 20480 lastUpdate 0 sie 30489
    name test  progress 24576 lastUpdate 0 sie 30489
    name test  progress 28672 lastUpdate 0 sie 30489
    name test  progress 30489 lastUpdate 0 sie 30489
    name test  progress 30489 lastUpdate 0 sie 30489
    

    And the actual uploading takes much longer (more than 10 times the time it takes to print those lines).

    What should I do so that I can get a true upload status?

    • Krushna
      Krushna over 11 years
      Now I've got an idea: use putObjectRequest.setProgressListener(new ProgressListener() { @Override public void progressChanged(ProgressEvent progressEvent) { System.out.println(progressEvent.getBytesTransfered() + ">> number of bytes transferred"); } }); Still I'm not getting a true status.
  • Krushna
    Krushna over 11 years
    Thank you Eli, but I figured out the problem: my server sits on localhost, so data transfers from my browser to my server very fast, but when my server tries to send the file to S3 it takes time (because only then is the file actually leaving my computer). I have debugged the code and found that the AWS SDK makes an HttpClient request to put the file in S3, and that is what takes the time. Is there any way the file can be transferred directly to S3?
  • Eli Algranti
    Eli Algranti over 11 years
    Hi Krushna, now I think I get you: the progress on your browser shows data being uploaded very fast and then at the end it seems to get stuck, doesn't it? If this is the case you did not disable buffering of client requests in the server.
  • Eli Algranti
    Eli Algranti over 11 years
    Continuing my previous comment... You need to disable buffering for the servlet(?) handling the upload. Then your code will be called immediately after web server parses the http header and you can start streaming the file to S3 at the same time the client streams the file to you. Sending the file directly from the browser is not recommended (even if you could somehow bypass the same server sandbox) because you'd have to share your AWS private key with the client.
  • Krushna
    Krushna over 11 years
    My code is written above; I'm using s3Client.putObject(putObjectRequest). Can you give me any idea where I would write the code to disable buffering? The AWS SDK internally reads the whole file and, after completing that, makes an HttpClient request to upload the file to S3.
  • Eli Algranti
    Eli Algranti over 11 years
    Your problem is in mpf.getBytes().length: getBytes() reads the whole file and returns it as a byte array; use getSize() instead. Sorry for misleading you earlier, I got my response and request buffering mixed up. Usually it is response buffering that can be turned off; requests are not usually buffered unless you have a proxy somewhere.
  • Armand
    Armand over 10 years
    where does request come from?
  • Krushna
    Krushna over 7 years
    Here the request comes from the HttpServletRequest; this code was inside the service method of a servlet.
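As a closing note on the ProgressInputStream wrapper in the question: a subtle pitfall in this kind of wrapper is that the no-argument read() returns the byte value (0-255), not a byte count, so progress += count over-counts whenever the byte value is greater than 1. A minimal self-contained version of the corrected accounting (hypothetical class name, plain java.io):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Minimal byte-counting wrapper (assumed name, not from the AWS SDK).
public class CountingProgressStream extends FilterInputStream {
    private long progress = 0;

    public CountingProgressStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        // Single-byte read returns the byte *value*, so advance by one,
        // not by the return value.
        int b = in.read();
        if (b != -1) progress++;
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int count = in.read(buf, off, len);
        if (count > 0) progress += count; // here count really is a byte count
        return count;
    }

    public long getProgress() {
        return progress;
    }
}
```

With this accounting, the progress total always equals the number of bytes actually consumed, regardless of which read overload the SDK calls.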