Working with S3 folders using the AWS SDK for .NET

If you’ve been using the S3 client in the AWS SDK for .NET, you might have noticed that there are no methods that let you interact with the folders in a bucket. As it turns out, S3 does not support folders in the conventional sense: everything is still a key-value pair, and tools such as CloudBerry, or indeed the Amazon web console, simply use the ‘/’ characters in the key to indicate a folder structure.

This might seem odd at first, but when you think about it, there is no folder structure on your hard drive either; it’s a logical structure the OS provides to make it easier for us mere mortals to work with.

Back to the topic at hand, what this means is that:

  • if you add an object with key myfolder/ to S3, it’ll be seen as a folder
  • if you add an object with key myfolder/myfile.txt to S3, it’ll be seen as a file myfile.txt inside a myfolder folder; if the myfolder/ object doesn’t exist already, tools will still show the folder because the key implies it, but no separate folder object is created
  • when you make a ListObjects call, both myfolder/ (if that object exists) and myfolder/myfile.txt will be included in the result
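
To make the points above concrete, here is a minimal sketch of how a folder view can be derived purely from key names. It makes no SDK calls and works on plain strings; the names S3KeyFolders, ImpliedFolders and AllFolders are hypothetical helpers made up for this illustration, not part of the AWS SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of the AWS SDK).
public static class S3KeyFolders
{
    // Yields every folder path implied by a key, e.g.
    // "a/b/c.txt" implies "a/" and "a/b/", while "a/" implies just "a/".
    public static IEnumerable<string> ImpliedFolders(string key)
    {
        for (var i = key.IndexOf('/'); i >= 0; i = key.IndexOf('/', i + 1))
        {
            yield return key.Substring(0, i + 1);
        }
    }

    // Distinct folders implied by a flat list of keys, in ordinal order.
    public static List<string> AllFolders(IEnumerable<string> keys)
    {
        return keys.SelectMany(ImpliedFolders)
                   .Distinct()
                   .OrderBy(f => f, StringComparer.Ordinal)
                   .ToList();
    }
}
```

For example, calling S3KeyFolders.AllFolders with the keys myfolder/myfile.txt, top/mid/file.txt and root.txt gives myfolder/, top/ and top/mid/, even though none of those folder objects were ever put into the bucket.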

Creating folders

To create a folder, you just need to add an empty object whose key ends with ‘/’, like this:

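// _client is the S3 client instance held by the containing class
public void CreateFolder(string bucket, string folder)
{
    // a folder is just an object whose key ends with exactly one '/'
    var key = folder.TrimEnd('/') + "/";
    var request = new PutObjectRequest().WithBucketName(bucket).WithKey(key);

    // empty content, the object only acts as a placeholder for the folder
    request.InputStream = new MemoryStream();
    _client.PutObject(request);
}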

Here is a thread on the Amazon forum which covers this technique.

Listing contents of a folder

With the ListObjects method on the S3 client you can specify a prefix. To get the list of objects in a particular folder, simply set the prefix in the request to the path of the folder, including the trailing ‘/’ (e.g. topfolder/middlefolder/):

var request = new ListObjectsRequest().WithBucketName(bucket).WithPrefix(folder);

If you are only interested in the objects (including folders) that are in the top level of your bucket then you’d need to do some filtering on the S3 objects returned in the response. The snippet below assumes the request was made without a prefix, since it treats any key containing ‘/’ as nested, and it only finds folders that exist as explicit folder objects. Something along the lines of:

// get the objects at the TOP LEVEL, i.e. not inside any folders
var objects = response.S3Objects.Where(o => !o.Key.Contains("/"));

// get the folders at the TOP LEVEL only, i.e. keys whose only '/' is the last character
var folders = response.S3Objects
                      .Where(o => o.Key.EndsWith("/") &&
                                  o.Key.IndexOf('/') == o.Key.Length - 1);
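
The same idea can be extended to any folder, not just the bucket root. The following is a rough sketch under a few assumptions: it works on the plain key strings (for example, the Key of each returned S3 object), it makes no SDK calls, and S3FolderListing, DirectFiles and DirectFolders are hypothetical names made up for this illustration. Pass an empty string as the prefix for the bucket root.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only (not part of the AWS SDK).
public static class S3FolderListing
{
    // Keys that sit directly inside `prefix`: they start with the prefix,
    // are not the folder placeholder itself, and have no further '/'.
    public static List<string> DirectFiles(IEnumerable<string> keys, string prefix)
    {
        return keys.Where(k => k.StartsWith(prefix, StringComparison.Ordinal)
                            && k.Length > prefix.Length
                            && k.IndexOf('/', prefix.Length) < 0)
                   .ToList();
    }

    // Immediate subfolders of `prefix`, derived from the key names, so they
    // are found even when no explicit folder object exists.
    public static List<string> DirectFolders(IEnumerable<string> keys, string prefix)
    {
        return keys.Where(k => k.StartsWith(prefix, StringComparison.Ordinal))
                   .Select(k => new { Key = k, Slash = k.IndexOf('/', prefix.Length) })
                   .Where(x => x.Slash >= 0)
                   .Select(x => x.Key.Substring(0, x.Slash + 1))
                   .Distinct()
                   .ToList();
    }
}
```

Given the keys top/, top/a.txt, top/mid/, top/mid/b.txt, top/other/c.txt and root.txt, DirectFiles with prefix top/ returns only top/a.txt, and DirectFolders with prefix top/ returns top/mid/ and top/other/. Note that top/other/ is reported even though it has no folder object of its own.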

9 Comments

  1. Kamal   •  

    Hi,

    That was really useful, but when I list the folders in the root it does not display any result though I have many folders. The response does not display folders as separate objects. It lists only the files within the folder. Why is it so? Do I need to specify delimiter or prefix? I could not find solution to this after searching a lot. Help me to get out of this.

    Thanks,
    Kamal

  2. theburningmonk   •     Author

    Hi Kamal,

    When you make the ListObjects request, to list the top level folders, don’t set the prefix but set the delimiter to ‘/’, then inspect the ‘CommonPrefixes’ property on the response for the folders that are in the top folder.

    To list the contents of a ‘rootfolder’, make the request with the prefix set to the name of the folder plus a forward slash, e.g. ‘rootfolder/’, and set the delimiter to ‘/’. In the response you’ll have the folder object itself as an element with the same key as the prefix you used in the request (provided the folder object exists), plus any subfolders in the ‘CommonPrefixes’ property.

    Hope this helps!
    Yan

  3. Kamal   •  

    Hi Yan,

    Thanks for your reply. That was really helpful. Could you please guide me how to create a folder in Amazon S3 as Cloudberry does?
    I tried many ways but nothing worked. Please give me sample code so that it will be very useful to me. Help me get out of this issue.

    Thanks,
    Kamal

  4. theburningmonk   •     Author

    Hi Kamal,

    All you need is something like this (assuming that you’ve referenced the AWSSDK dll):

    var awsKey = "AWS key for your account here";
    var awsSecret = "AWS secret for your account here";

    // by default, the s3 client will try to use HTTPS to talk to the service
    // if you don’t wanna have to deal with SSL then pass in a config object
    // whose CommunicationProtocol is set to HTTP
    var config = new AmazonS3Config { CommunicationProtocol = Protocol.HTTP };

    // create the client
    var client = Amazon.AWSClientFactory.CreateAmazonS3Client(awsKey, awsSecret, config);

    // make sure the key for the object you put ends with /, this needs to be an empty
    // object which is why in the next line I’m setting the input stream to a brand
    // new MemoryStream
    var request = new PutObjectRequest().WithBucketName("bucketname").WithKey("testfolder/");
    request.InputStream = new MemoryStream();

    // this will create the folder for you, which you can see in Cloudberry
    client.PutObject(request);

  5. AJ   •  

    Great, thanks for this. Was looking for the “prefix” method… that sorted me out a treat :)

  6. Si Hong   •  

    Hi AJ/Yan:
    Below is my code. I get the error “Object reference not set to an instance of an object”. I know I have files in my bucket. Can you tell me what’s wrong here? Thanks…

    StringBuilder output = new StringBuilder();
    ListObjectsRequest GetList = new ListObjectsRequest()
    {
        BucketName = "myBucketName",
        Prefix = "images/",
        Delimiter = "/"
    };
    ListObjectsResponse response1 = s3.ListObjects(GetList);
    foreach (S3Object s3Object in response1.S3Objects)
    {
        output.AppendFormat("{0}", s3Object.Key);
        Response.Write("output: " + output.ToString());
    }

  7. Imran Aziz (@ManaKultras)   •  

    Thanks!! That helped me

  8. devendra   •  

    Hi Yan,

    I recently started working on AEM. Your article shows how to create a new folder in S3.
    I think I need to rewrite the connector to include that function, i.e. customize the connector,
    but only JAR files are available.

    My case: Amazon S3 stores data in a flat directory structure, but in our case we want to create a new folder every time the current folder reaches a particular object count.

    Example: suppose we want to store 5 objects and we have a restriction of 3 objects per folder; then after 3 objects the next object should be saved in a new folder.
    Is there any way to do this? Thanks in advance.

  9. theburningmonk   •     Author

    Hello devendra, unfortunately there’s no scalable (and strongly consistent) way of doing what you’re thinking in S3 since: a) S3 list operations are very expensive and don’t scale well to a high number of concurrent operations, though you can add prefix folders to encourage more sharding (i.e. have root folders named `1`, `2`, `3`, … and so on, and then your application folders are hashed to one of these); b) S3 is eventually consistent and there’s no way around it AFAIK.

    What you should consider doing is using DynamoDB or ElastiCache, where you can do atomic increments (or in the case of DynamoDB, a conditional put). So, before putting an object into a folder, you first perform a conditional write against DynamoDB that increments the `count` associated with the key (the folder name), with the pre-condition being that the count is still less than 3. If that succeeds, then proceed with the S3 put. Note that you might need to consider rolling back the DynamoDB update if the S3 put fails and you care about consistency between the count in S3 vs DynamoDB. Hope this helps.
