When using the Amazon S3 client, have you come across an occasional exception with a message like one of these?

“The request was aborted: The connection was closed unexpectedly”

“Unable to read data from the transport connection: A blocking operation was interrupted by a call to WSACancelBlockingCall”

“Unable to read data from the transport connection: An established connection was aborted by the software in your host machine”

If you have, then you’re probably returning the response stream directly to the rest of your application with something like this:

    var response = _s3Client.GetObject(request);
    return response.ResponseStream;

However, because the stream comes from the Amazon S3 service and is fed to your code in chunks, the connection to S3 must stay open until all the data has been received. As mentioned in the S3 documentation (which, incidentally, most of us don’t read in great detail), you should wrap the response you get from the GetObject method in a using block.

Depending on what you want to do with the stream, you might have to handle it differently. For instance, if you just want to read the content of a text file as a string, you could do this:

    using (var response = _s3Client.GetObject(request))
    {
        using (var reader = new StreamReader(response.ResponseStream))
        {
            return reader.ReadToEnd();
        }
    }

Alternatively, if you want to return the response stream itself, you’ll need to read the stream in its entirety first and return the loaded copy. Unfortunately, at the time of this writing, the AWSSDK library still hasn’t been migrated to .Net 4 and therefore doesn’t have the very useful CopyTo method added in .Net 4, so you will most likely have to do the heavy lifting yourself and read the data out manually into a memory stream:

    // Requires: using System; using System.IO;
    using (var response = _s3Client.GetObject(request))
    {
        var binaryData = ReadFully(response.ResponseStream);
        return new MemoryStream(binaryData);
    }

    /// <summary>
    /// See Jon Skeet's article on reading binary data:
    /// http://www.yoda.arachsys.com/csharp/readbinary.html
    /// </summary>
    public static byte[] ReadFully(Stream stream, int initialLength = -1)
    {
        // If we've been passed an unhelpful initial length, just
        // use 32K.
        if (initialLength < 1)
        {
            initialLength = 32768;
        }

        byte[] buffer = new byte[initialLength];
        int read = 0;

        int chunk;
        while ((chunk = stream.Read(buffer, read, buffer.Length - read)) > 0)
        {
            read += chunk;

            // If we've reached the end of our buffer, check to see if there's
            // any more information
            if (read == buffer.Length)
            {
                int nextByte = stream.ReadByte();

                // End of stream? If so, we're done
                if (nextByte == -1)
                {
                    return buffer;
                }

                // Nope. Resize the buffer, put in the byte we've just
                // read, and continue
                byte[] newBuffer = new byte[buffer.Length * 2];
                Array.Copy(buffer, newBuffer, buffer.Length);
                newBuffer[read] = (byte)nextByte;
                buffer = newBuffer;
                read++;
            }
        }

        // Buffer is now too big. Shrink it.
        byte[] ret = new byte[read];
        Array.Copy(buffer, ret, read);
        return ret;
    }
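
If your own project can target .NET 4 or later, the manual read loop can probably be replaced with Stream.CopyTo. The following is only a minimal sketch under that assumption; the StreamBuffering class and BufferFully method are hypothetical names made up for illustration, not part of the AWSSDK.

```csharp
using System;
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper: copies everything remaining in 'source' into memory
    // and returns a stream positioned at the start of the copied data.
    public static MemoryStream BufferFully(Stream source)
    {
        if (source == null) throw new ArgumentNullException("source");

        var buffered = new MemoryStream();
        source.CopyTo(buffered);   // Stream.CopyTo is available from .NET 4 onwards
        buffered.Position = 0;     // rewind so the caller reads from the beginning
        return buffered;
    }
}
```

Inside the using block you would then return StreamBuffering.BufferFully(response.ResponseStream) instead of calling ReadFully. Either way the whole object ends up in memory, so this suits small and medium objects better than very large ones.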


4 Responses to “S3 — Use using block to get the stream”

  1. Prinzhorn says:

    Thank you very much.
    I’m now using a MemoryStream and copying the ResponseStream using the custom CopyTo method I wrote a while ago.

    MemoryStream mem = new MemoryStream();
    Tools.CopyStream(response.ResponseStream, mem);
    mem.Position = 0;
    response.ResponseStream.Close();

  2. Shira says:

    Hi,
    Thanks for your comment. It solved my problem.
    But when I try it a number of times, it sometimes fails and sometimes doesn’t.
    (Exception: “The request was aborted: The connection was closed unexpectedly”)
    This happens with large images.
    Can you help me?

  3. theburningmonk says:

    @Shira, when working with S3 (or any AWS service, for that matter), the service has a self-protection mechanism against over-usage and DoS attacks. If you’re making more concurrent requests against S3 than the service is able to handle, you’ll start to see an elevated rate of errors such as the one you described.

    The .Net AWSSDK already has a retry mechanism built in (which you can configure and tweak), but we found it beneficial in some cases to add another layer of retries with exponential backoff.

    Furthermore, the S3 service automatically partitions objects by prefix, and with some simple tricks you can get a lot more out of S3 and potentially cure the errors you get if they’re load-related.

    Have a read of these two posts on sharding:

    http://highscalability.com/blog/2012/3/7/scale-indefinitely-on-s3-with-these-secrets-of-the-s3-master.html

    http://aws.typepad.com/aws/2012/03/amazon-s3-performance-tips-tricks-seattle-hiring-event.html

    Lastly, if you’re working with large files, make sure your timeout setting is suitable (the default timeout for the .Net AWSSDK is 20 mins, I believe).

  4. Shira says:

    Thanks for your reply,
    I will try it.
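
The extra layer of retries with exponential backoff mentioned in the third comment could be sketched roughly as follows. This is an illustration only: the Retry class and WithBackoff method are hypothetical names, and catching every exception is a simplifying assumption. In real code you would probably retry only on errors you know to be transient.

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Hypothetical helper: runs 'action' up to 'maxAttempts' times,
    // doubling the wait between attempts after each failure.
    public static T WithBackoff<T>(Func<T> action, int maxAttempts, TimeSpan initialDelay)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        var delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception)
            {
                // Assumption: any exception is worth retrying.
                // Out of attempts? Let the last exception propagate.
                if (attempt >= maxAttempts) throw;
            }

            Thread.Sleep(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2);   // exponential backoff
        }
    }
}
```

You would wrap the whole download in the helper, passing a delegate that performs the GetObject call and reads the stream to the end inside its using block, so that a failure partway through the read triggers a fresh request.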
