IllegalStateException in Amazon SDK for Java

In one of our projects we need to communicate with Amazon S3, and we use the SDK for Java provided by Amazon itself. The library relies on Apache HttpComponents for handling HTTP and associated protocols. We noticed a problem with the way the Amazon SDK uses the Apache library.

Race condition in Amazon SDK

Here’s what we found in the project’s logs.

java.lang.IllegalStateException: Content has been consumed
    at org.apache.http.entity.BasicHttpEntity.getContent(BasicHttpEntity.java:84)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:254)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:168)
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:2555)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1044)
    at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:928)
    at com.amazonaws.services.s3.AmazonS3$putObject.call(Unknown Source)
    ...

The stack trace led us through the AmazonS3Client class into AmazonHttpClient, both part of the Amazon SDK for Java. We were using version 1.2.15 of the library, so we looked into its code and found the following in com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:254):

if (entity != null && entity.getContent().markSupported()) {
    entity.getContent().reset();
}

Let’s take a look at the documentation of the method public InputStream getContent() on the entity, which is an instance of org.apache.http.entity.BasicHttpEntity:

getContent

public InputStream getContent()
                    throws IllegalStateException

   Obtains the content, once only.

Returns:
    the content, if this is the first call to this method since setContent has been called
Throws:
    IllegalStateException - if the content has not been provided
See Also:
    HttpEntity.isRepeatable()

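According to this documentation, getContent should only be called once, and the returned stream should be kept in a local variable if we want to reuse it. The snippet from version 1.2.15 calls getContent() twice on the same entity, so at the latest the second call throws the IllegalStateException. That’s our culprit.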
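To see the failure in isolation, here is a minimal sketch. OneShotEntity is a hypothetical stand-in that mimics the once-only contract quoted above, so the example runs without the Apache library; it is not the real BasicHttpEntity. The method resetLikeOldSdk is likewise an illustrative name that repeats the two-call pattern from the 1.2.15 snippet.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class OneShotEntityDemo {

    // Hypothetical stand-in for BasicHttpEntity: hands out its stream only once.
    static class OneShotEntity {
        private final InputStream content;
        private boolean contentObtained = false;

        OneShotEntity(InputStream content) {
            this.content = content;
        }

        InputStream getContent() {
            if (contentObtained) {
                throw new IllegalStateException("Content has been consumed");
            }
            contentObtained = true;
            return content;
        }
    }

    // Same shape as the 1.2.15 code: getContent() is called twice on one entity.
    // Returns "ok" on success, otherwise the message of the IllegalStateException.
    static String resetLikeOldSdk(OneShotEntity entity) {
        try {
            if (entity != null && entity.getContent().markSupported()) {
                entity.getContent().reset(); // second call on the same entity
            }
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        OneShotEntity entity =
                new OneShotEntity(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        // ByteArrayInputStream supports mark/reset, so the second call is reached.
        System.out.println(resetLikeOldSdk(entity)); // prints "Content has been consumed"
    }
}
```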

Fix in the newest version of the Amazon library

The first step for us was to find out if the issue was fixed in the most recent version of the library. And, actually, it was!

In version 1.3.22 of the Amazon SDK, in the AmazonHttpClient class around line 262, we can see:

InputStream content = entity.getContent();
if ( retryCount > 0 ) {
    if ( content.markSupported() ) {
        content.reset();
        content.mark(-1);
    }
}

That’s exactly what we were looking for! entity.getContent() is called once, and the input stream is reused later.
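The effect of reset() and mark(-1) on a retried upload can be sketched with plain java.io classes. This is an illustration under assumptions, not the SDK's code: drain stands in for sending the request body, sendWithRetries is a hypothetical helper, and the stream is a ByteArrayInputStream, whose mark ignores the read-ahead limit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class MarkResetRetryDemo {

    // Reads everything left in the stream; stands in for sending the body.
    static String drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Obtains the stream once, then rewinds it before every retry,
    // following the shape of the 1.3.22 snippet.
    static String sendWithRetries(InputStream content, int attempts) {
        try {
            String lastBody = null;
            for (int retryCount = 0; retryCount < attempts; retryCount++) {
                if (retryCount > 0 && content.markSupported()) {
                    content.reset();   // back to the marked position
                    content.mark(-1);  // mark again for the next retry
                }
                lastBody = drain(content);
            }
            return lastBody;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream content =
                new ByteArrayInputStream("payload".getBytes(StandardCharsets.UTF_8));
        // Every attempt sees the full body because the stream is rewound first.
        System.out.println(sendWithRetries(content, 3)); // prints "payload"
    }
}
```

Without the reset, the second attempt would start at the end of the stream and send an empty body.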

Analyze the source code, read the documentation

And now, since version 1.3.22, we can use the library like this:

AWSCredentials awsCreds = ...

AmazonS3Client client = new AmazonS3Client(awsCreds)

client.putObject(bucket, key, inputStream,
        getObjectMetadata(contentType, contentLength))

client.setObjectAcl(bucket, key, CannedAccessControlList.PublicRead)

The lesson to remember is that analyzing the source code and the documentation together helps hunt down even hard-to-reproduce bugs.