I’m afraid you’re thinking about AWS Lambda cold starts all wrong

When I discuss AWS Lambda cold starts with folks in the context of API Gateway, I often get responses along the line of:

Meh, it’s only the first request right? So what if one request is slow, the next million requests would be fast.

Unfortunately that is an oversimplification of what happens.

Cold start happens once for each concurrent execution of your function.

API Gateway would reuse concurrent executions of your function that are already running if possible, and based on my observations, might even queue up requests for a short time in hope that one of the concurrent executions would finish and can be reused.

If, all the user requests to an API happen one after another, then sure, you will only experience one cold start in the process. You can simulate what happens with Charles proxy by capturing a request to an API Gateway endpoint and repeat it with a concurrency setting of 1.

As you can see in the timeline below, only the first request experienced a cold start and was therefore noticeably slower compared to the rest.

1 out of 100, that’s bearable, hell, it won’t even show up in my 99 percentile latency metric.

What if the user requests came in droves instead? After all, user behaviours are unpredictable and unlikely to follow the nice sequential pattern we see above. So let’s simulate what happens when we receive 100 requests with a concurrency of 10.

All of a sudden, things don’t look quite as rosy – the first 10 requests were all cold starts! This could spell trouble if your traffic pattern is highly bursty around specific times of the day or specific events, e.g.

  • food ordering services (e.g. JustEat, Deliveroo) have bursts of traffic around meal times
  • e-commence sites have highly concentrated bursts of traffic around popular shopping days of the year – cyber monday, black friday, etc.
  • betting services have bursts of traffic around sporting events
  • social networks have bursts of traffic around, well, just about any notable events happening around the world

For all of these services, the sudden bursts of traffic means API Gateway would have to add more concurrent executions of your Lambda function, and that equates to a bursts of cold starts, and that’s bad news for you. These are the most crucial periods for your business, and precisely when you want your service to be at its best behaviour.

If the spikes are predictable, as is the case for food ordering services, you can mitigate the effect of cold starts by pre-warming your APIs – i.e. if you know there will be a burst of traffic at noon, then you can schedule a cron job (aka, CloudWatch schedule + Lambda) for 11:58am that will hit the relevant APIs with a blast of concurrent requests, enough to cause API Gateway to spawn sufficient no. of concurrent executions of your function(s).

You can mark these requests with a special HTTP header or payload, so that the handling function can distinguish it from a normal user requests and can short-circuit without attempting to execute the normal code path.

That’s great that you mitigates the impact of cold starts during these predictable bursts of traffic, but does it not betray the ethos of serverless compute, that you shouldn’t have to worry about scaling?

Sure, but making users happy thumps everything else, and users are not happy if they have to wait for your function to cold start before they can order their food, and the cost of switching to a competitor is low so they might not even come back the next day.

Alternatively, you could consider reducing the impact of cold starts by reducing the length of cold starts:

  • by authoring your Lambda functions in a language that doesn’t incur a high cold start time – i.e. Node.js, Python, or Go
  • choose a higher memory setting for functions on the critical path of handling user requests (i.e. anything that the user would have to wait for a response from, including intermediate APIs)
  • optimizing your function’s dependencies, and package size
  • stay as far away from VPCs as you possibly can! VPC access requires Lambda to create ENIs (elastic network interface) to the target VPC and that easily adds 10s (yeah, you’re reading it right) to your cold start

There are also two other factors to consider:

Finally, what about those APIs that are seldom used? It’s actually quite likely that every time someone hits that API they will incur a cold start, so to your users, that API is always slow and they become even less likely to use it in the future, which is a vicious cycle.

For these APIs, you can have a cron job that runs every 5–10 mins and pings the API (with a special ping request), so that by the time the API is used by a real user it’ll hopefully be warm and the user would not be the one to have to endure the cold start time.

This method for pinging API endpoints to keep them warm is less effective for “busy” functions with lots of concurrent executions – your ping message would only reach one of the concurrent executions and if the level of concurrent user requests drop then some of the concurrent executions would be garbage collected after some period of idleness, and that is what you want (don’t pay for resources you don’t need).

Anyhow, this post is not intended to be your one-stop guide to AWS Lambda cold starts, but merely to illustrate that it’s a more nuanced discussion than “just the first request”.

As much as cold starts is a characteristic of the platform that we just have to accept, and we love the AWS Lambda platform and want to use it as it delivers on so many fronts. It’s important not to let our own preference blind us from what’s important, which is to keep our users happy and build a product that they would want to keep on using.

To do that, you do need to know the platform you’re building on top of, and with the cost of experimentation so low, there’s no good reason not to experiment with AWS Lambda yourself and try to learn more about how it behaves and how you can make the most of it.

Finding coldstarts : how long does AWS Lambda keep your idle functions around?

In the last post I compared the coldstart time for Lambda functions with different language, memory and code size. One of the things I learnt was that idle functions are no longer terminated after 5 minutes of inactivity.

AWS Lambda – compare coldstart time with different languages, memory and code sizes

It is a fantastic news and something that Amazon has quietly changed behind the scene. However, it lead me to ask some follow up questions:

  1. what’s the new idle time that would trigger a coldstart?
  2. does it differ by memory allocation?
  3. are functions still recycled 4 hours from the creation of host VM?

To answer the first 2 questions, I devised an experiment.

First, here are my hypotheses going into the experiment.

WARNING: this experiment is intended to help us glimpse into implementation details of the AWS Lambda platform, they are fun and satisfy my curiosity but you shouldn’t build your application with the results in mind as AWS can change these implementation details without notice!


Hypothesis 1 : there is an upper bound to how long Lambda allows your function to stay idle before reclaiming the associated resources

This should be a given. Idle functions occupy resources that can be used to help other AWS customers scale up to meet their needs (and not to mention the first customer is not paying for his idle functions!), it simply wouldn’t make any sense for AWS to keep idle functions around forever.

Hypothesis 2 : the idle timeout is not a constant

From an implementor’s point-of-view, it might be simpler to keep this timeout a constant?—?ie. functions are always terminated after X mins of inactivity. However, I’m sure AWS will vary this timeout to optimise for higher utilisation and keep the utilisation levels more evenly distributed across its fleet of physical servers.

For example, if there’s an elevated level of resource contention in a region, why not terminate idle functions earlier to free up space?

Hypothesis 3 : the upper bound for inactivity varies by memory allocation

An idle function with 1536 MB of memory allocation is wasting a lot more resource than an idle function with 128 MB of memory, so it makes sense for AWS to terminate idle functions with higher memory allocation earlier.

Experiment : find the upper bound for inactivity

To find the upper bound for inactivity, we need a Lambda function to act as the system-under-test and report when it has experienced a coldstart. We also need a mechanism to progressively increase the interval between invocations until we arrive at an interval where each invocation is guaranteed to be a coldstart?—?the upper bound. We will determine the upper bound when we see 10 consecutive coldstarts when invoked X minutes apart.

To answer hypothesis 3 we will also replicate the system-under-test function with different memory allocations.

This experiment is a time consuming process, it requires discipline and a degree of precision in timing. Suffice to say I won’t be doing this by hand!

My first approach was to use a CloudWatch Schedule to trigger the system-under-test function, and let the function dynamically adjust the schedule based on whether it’s experienced a coldstart. It failed miserably?—?whenever the system-under-test updates the schedule the schedule will fire shortly after rather than wait for the newly specified interval…

Instead, I turned to Step Functions for help.

AWS Step Functions allows you to create a state machine where you can invoke Lambda functions, wait for a specified amount of time, execute parallel tasks, retry, catch errors, etc.

A Wait state allows you to drive the no. of seconds to wait using data (see SecondsPath param in the documentation). Which means I can start the state machine with an input like this:

    “target”: “when-will-i-coldstart-dev-system-under-test-128”, 
    “interval”: 600, 
    “coldstarts”: 0 

The input is passed to another find-idle-timeout function as invocation event. The function will invoke the target (which is one of the variants of the system-under-test function with different memory allocations) and increase the interval if the system-under-test function doesn’t report a coldstart. The find-idle-timeout function will return a new piece of data for the Step Function execution:

    “target”: “when-will-i-coldstart-dev-system-under-test-128”, 
    “interval”: 660, 
    “coldstarts”: 0 

Now, the Wait state will use the interval value and wait 660 seconds before switching back to the FindIdleTimeout state where it’ll invoke our system-under-test function again (with the previous output as input).

"Wait": {
    "Type": "Wait",
    "SecondsPath": "$.interval",
    "Next": "FindIdleTimeout"

With this setup I’m able to kick off multiple executions?—?one for each memory setting.

Along the way I have plenty of visibility into what’s happening, all from the comfort of the Step Functions management console.

Here are the results of the experiment:

From the data, it’s clear that AWS Lambda shuts down idle functions around the hour mark. It’s also interesting to note that the function with 1536 MB memory is terminate over 10 mins earlier, this supports hypothesis 3.

I also collected data on all the idle intervals where we saw a coldstart and categorised them into 5 minute brackets.

Even though the data is serious lacking, but from what little data I managed to collect you can still spot some high level trends:

  • over 60% of coldstarts (prior to hitting the upper bound) happened after 45 mins of inactivity
  • the function with 1536 MB memory sees significantly fewer no. of cold starts prior to hitting the upper bound (worth noting that it also has a lower upper bound (48 mins) than other functions in this test

The data supports hypothesis 2 though there’s no way for us to figure out the reason behind these coldstarts or if there’s significance to the 45 mins barrier.


To summary the findings from our little experiment in one line:

AWS Lambda will generally terminate functions after 45–60 mins of inactivity, although idle functions can sometimes be terminated a lot earlier to free up resources needed by other customers.

I hope you find this experiment interesting, but please do not build applications on the assumptions that:

    a) these results are valid, and

    b) they will remain valid for the foreseeable future

I cannot stress enough that this experiment is meant for fun and to satisfy a curious mind, and nothing more!

The results from this experiment also deserve further investigation. For instance, the 1536 MB function exhibited very different behaviour to other functions, but is it a special case or would functions with more than 1024 MB of memory all share these traits? I’d love to find out, maybe I’ll write a follow up to this experiment in the future.

Watch this space ;-)

From F# to Scala – implicits

Note: read the whole series here.


Having looked at case class and extractors recently, the next logical thing would be partial functions. Since Andrea pointed me to a really well article on the subject I don’t think there’s anything else for me to add, so instead, let’s look at Scala’s implicits, which is a very powerful language feature that enables some interesting patterns in Scala.


implicit operator in .Net

You can define both implicit and explicit operators in C#, which allows you to either:

  • implicitly converts a type to another in assignment, method argument, etc.; or
  • explicitly cast a type to another

F# on the other hand, is a more strongly typed language and does not allow such implicit type conversion. You can still implement and use existing implicit operators created in C#, which is available to you as a static member op_Implicit on the type it’s defined on.

For example.

Additionally, you can also create type extensions to add extension methods AND properties to a type. Whilst this is the idiomatic F# way, these extension members are only visible to F# (and not to C#).


implicit in Scala

Where the implicit operator in .Net (or more specifically, in C#) is concerned with type conversion, implicit in Scala is far more generalised and powerful.

Scala’s implicit comes in 3 flavours:

  • implicit parameters
  • implicit conversions
  • implicit classes

implicit parameters

You can mark the last parameter of a function as implicit, which tells the compiler that the caller can omit the argument and the compiler should find a suitable substitute from the closure.

For example, take the multiplyImplicitly function below.

The last argument is omitted at invocation but the compiler sees a suitable substitute – mult – in scope because:

  1. it’s the right type – Multiplier
  2. it’s declared as implicit

and implicitly applies it as the second argument to complete the invocation.

That’s right, only val/var/def that are declared as implicit can be used as an implicit argument.

If mult was not declared as implicit, then a compiler error awaits you instead.

What if there are more than one matching implicit value in scope?

Then you also get a compiler error.

Unsurprisingly, implicit var also works, and given the mutable nature of var it means multiplyImplicitly can yield different value depending on when it’s called.

Finally, you can also use an implicit def (which you can think of as a property, it is evaluated each time but it doesn’t have to be attached to an object).

A common use case for implicit parameters is to implicitly use the global ExecutionContext when working with Scala’s Future. Similarly, the Akka framework use implicit to pass around ActorContext and ActorSystem objects.

implicit conversions

What if you define a higher-order function that takes in another function, f, as argument, can f be chosen implicitly as well?

Yes, it can. It is in fact a common pattern to achieve implicit type conversion (similar to .Net’s implicit operator as we saw at the start of this post).

Notice in the above that show(“42”) compiles even though we haven’t defined an implicit function of the signature String => String. We have the built-in identity function to thank for that.

Just before the Scala compiler throws a typemismatch exception it’ll look for suitable implicit conversion in scope and apply it. Which means, our implicit conversions can be useful outside of the show function too.

And you’re protected by the same guarantee that there can only be one matching implicit function in scope.

What if there’s a more generic implicit conversion with the signature Any -> String, would the compiler complain about ambiguous implicit values or is it smart enough to use intToStr for Int?

It’s smart enough and does the right thing.

implicit classes

Finally, we have implicit classes which allows you to implement .Net style extension methods.

You must create the implicit class inside another object/trait/class, and it

and the class can take only one non-implicit argument in the constructor.

Note that in addition to extension methods, you can also create extension values and properties with implicit class. Which, as we mentioned at the start of the post, is something that you can also do with F#’s type extensions mechanism.



From F# to Scala – apply & unapply functions

Note: read the whole series here.


Last time around we looked at Scala’s Case Class in depth and how it compares to F#’s Discriminated Unions. F# also has Active Patterns, which is a very powerful language feature in its own right. Unsurprisingly, Scala also has something similar in the shape of extractors (via the unapply function).

Before we can talk about extractors we have to first talk about Scala’s object again. Remember when we first met object in Scala I said it’s Scala’s equivalent to F#’s module? (except it can be generic, supports inheritance, and multiple inheritance)

Well, turns out Scala has another bit of special magic in the form of an apply function.


The apply function

In Scala, if you assign a function to a value, that value will have the type Function1[TInput, TOutput]. Since everything in Scala is an object, this value also have a couple of functions on it.

You can use andThen or compose to compose it with another function (think of them as F#’s >> and << operators respectively).

The apply function applies the argument to the function, but you can invoke the function without it.

Ok, now that we know what apply function’s role is, let’s go back to object.

If you declare an apply function in an object, it essentially allows the object to be used as a factory class (indeed this is called the Factory pattern in Scala).

You see this pattern in Scala very often, and there are some useful built-in factories such as Option (which wraps an object as Some(x) unless it’s null, in which case returns None).

String and BigInt defines their own apply function too (in String‘s case, it returns the char at the specified index) .

You can also define an apply function on a class as well as an object, and it works the same way. For instance…

I find this notion of applying arguments to an object somewhat alien, almost as if this is an elaborate way of creating a delegate even though Scala already have first-class functions…

Ok, can you pass multiple arguments to apply? What about overloading?

Check, and Check.

What about case classes and case object?

Check, and Check.

Ok. Can the apply function(s) be inherited and overridden like a normal function?

Check, and Check. Although this is consistent with inheritance and OOP in Java, I can’t help but to feel it has the potential to create ambiguity and one should just stick with plain old functions.


The unapply function (aka extractors)

When you create a case class or case object, you also create a pattern that can be used in pattern matching. It’s not the only way to create patterns in Scala, you can also create a pattern by defining an unapply function in your class/object.

or, if you don’t want to return anything from the pattern.

So, the unapply function turns a Scala object into a pattern, and here are some limitations on the unapply function:

  1. it can only take one argument
  2. if you want to return a value, then the return type T must defines members:
    • isEmpty: Boolean
    • get: Any

side note: point 2 is interesting. Looking at all the examples on the Internet one might assume the unapply function must return an Option[T], but turns out it’s OK to return any type so long it has the necessary members!

Whilst I can’t think of a situation where I’d need to use anything other than an Option[T], this insight gives me a better understanding of how pattern matching in Scala works.

Whether or not the pattern matches is determined by the value of isEmpty of the result type T. And the value returned by your pattern – ie msg in the example above – is determined by the value of get of the result type T. So if you’re feeling a bit cheeky, you can always do something like this:

Since the unapply function is a member on an object (like the apply function), it means it should work with a class too, and indeed it does.

As you can see from the snippet above, this allows you to create parameterized patterns and work around the limitation of having only one argument in the unapply function.

You can nest patterns together too, for example.

Here, the Int pattern returns an Int, and instead of binding it to a name we can apply another pattern inline to check if the value is even.

And whilst it doesn’t get mentioned in any of the articles I have seen, these patterns are not limited to the match clause either. For instance, you can use it as part of declaration. (but be careful, as you’ll get a MatchError if the pattern doesn’t match!)


Primer: F# Active Patterns

Before we compare Scala’s extractors to F#’s active patterns, here’s a quick primer if you need to catch up on how F#’s active patterns work.

Like extractors, it gives you named patterns that you can use in pattern matching, and comes in 3 flavours: single-case, partial, and multi-case.

You can parameterise a pattern.

If a pattern’s declaration has multiple arguments, then the last argument is the thing that is being pattern matched (same as the single argument to unapply); the preceding arguments can be passed into the pattern at the call site. For example…

If you don’t want to return anything, then you can always return () or Some() instead (partial patterns require the latter).

You can also mix and match different patterns together using & and |. So we can rewrite the fizzbuzz function as the following..

Patterns can be nested.

Finally, patterns can be used in assignment as well as function arguments too.


extractors vs F# Active Patterns

Scala’s extractor is the equivalent of F#’s partial pattern, and although there is no like-for-like replacement for single-case and multi-case patterns you can mimic both with extractor(s):

  • an extractor that always return Some(x) is like a single-case pattern
  • multiple extractors working together (maybe even loosely grouped together via a common trait) can mimic a multi-case pattern, although it’s up to you to ensure the extractors don’t overlap on input values

Whilst it’s possible to create parameterized patterns with Scala extractors (by using class instead of object), I find the process of doing so in F# to be much more concise. In general, the syntax for declaring patterns in Scala is a lot more verbose by comparison.

The biggest difference for me though, is that in F# you can use multiple patterns in one case expression by composing them with & and |. This makes even complex patterns easy to express and understand.



From F# to Scala – case class/object (ADTs)

Note: read the whole series here.


Continuing on from where we left off with traits last time around, let’s look at Scala’s case class/object which can be used to create Algebraic Data Types (ADTs) in Scala.


Case Class

You can declare an ADT in F# using Discriminated Unions (DUs). For example, a binary tree might be represented as the following.

In Scala, you can declare this ADT with a pair of case class.

Here is how you construct and pattern match against F# DUs.

This looks very similar in Scala (minus all the comments).

Also, one often use single case DUs in F# to model a domain, for instance…

From what I can see, this is also common practice in Scala (at least in our codebase).

From this snippet, you can also see that case classes do not have to be tied together to a top level type (via inheritance), but more on this shortly.

UPDATE 16/01/2017: as @clementd pointed out, you can turn case classes with a single member into a value class and avoid boxing by extending from anyVal. For more details, see here.

On the surface, case classes in Scala looks almost identical to DUs in F#. But as we peek under the cover, there are some subtle differences which you ought to be aware of.


Case Object

In F#, if you have a union case that is not parameterised then the compiler will optimise and compile it as a singleton. For instance, the NotStarted case is compiled to a singleton as it’s always the same.

You can declare this GameStatus with case classes in Scala.

But reference equality check (which is done with .eq in Scala) reveals that:

  • NotStarted() always create a new instance
  • but equals is overridden to perform structural equality comparison

If you want NotStarted to be a singleton then you need to say so explicitly by using a case object instead.

Couple of things to note here:

  • as mentioned in my last post, object in Scala declares a singleton, so does a case object
  • a case object cannot have constructor parameters
  • a case object cannot be generic (but a normal object can)

When you pattern match against a class object you can also lose the parentheses too (see the earlier example in print[T]).


Cases as Types

For me, the biggest difference between DUs in F# and case classes in Scala is that you declare an ADT in Scala using inheritance, which has some interesting implications.

As we saw earlier, each case class in Scala is its own type and you can define a function that takes Node[T] or Empty[T].

This is not possible in F#. Instead, you rely on pattern matching (yup, you can apply pattern matching in the function params) to achieve a similar result.

It’s also worth mentioning that, case objects do not define their own types and would require pattern matching.

What this also means is that, each case class/object can define their own members! Oh, and what we have learnt about traits so far also holds true here (multiple inheritance, resolving member clashes, etc.).

In F#, all members are defined on the top level DU type. So, a like-for-like implementation of the above might look like this.

Whilst this is a frivolous example, I think it is still a good demonstration of why the ability to define members and inheritance on a per-case basis can be quite powerful. Because we can’t do that with F#’s union cases, we had to sacrifice some compile-time safety and resort to runtime exceptions instead (and the implementation became more verbose as a result).

The autoPlay function also looks slightly more verbose than its Scala counterpart, but it’s mainly down to a F# quirk where you need to explicitly cast status to the relevant interface type to access its members.


sealed and finally

“make illegal states unpresentable” – Yaron Minsky

Ever since Yaron Minsky uttered these fabulous words, it has been repeated in many FP circles and is often achieved in F# through a combination of DUs and not having nulls in the language (apart from when inter-opting with C#, but that’s where your anti-corruption layer comes in).

This works because DU defines a finite and closed set of possible states that can be represented, and cannot be extended without directly modifying the DU. The compiler performs exhaustive checks for pattern matches and will warn you if you do not cover all possible cases. So if a new state is introduced into the system, you will quickly find out which parts of your code will need to be updated to handle this new state.

For instance, using the GameStatus type we defined in the previous section…

the compiler will issue the following warning:

warning FS0025: Incomplete pattern matches on this expression. For example, the value ‘GameOver (_)’ may indicate a case not covered by the pattern(s).

You can also upgrade this particular warning – FS0025 – to error to make it much more prominent.


In Scala, because case classes/objects are loosely grouped together via inheritance, the set of possible states represented by these case classes/objects is not closed by default. This means new states (potentially invalid states introduced either intentionally or maliciously) can be added to the system and it’s not possible for you or the compiler to know all possible states when you’re interacting with the top level trait.

There’s a way you can help the compiler (and yourself!) in this case is to mark the top level trait as sealed.

A sealed trait can only be extended inside the file it’s declared in. It also enables the compiler to perform exhaustive checks against pattern matches to warn you about any missed possible input (which you can also upgrade to an error to make them more prominent).

Since case objects cannot be extended further we don’t have to worry about it in this case. But case classes can be extended by a regular class (case class-to-case class inheritance is prohibited), which presents another angle for potential new states to creep in undetected.

So the convention in Scala is to mark case classes as final (which says it cannot be extended anywhere) as well as marking the top level trait as sealed.

Voila! And, it works on abstract classes too.


But wait, turns out sealed is not transitive.

If your function takes a case class then you won’t get compiler warnings when you miss a case in your pattern matching.

You could, make the case class sealed instead, which will allow the compiler to perform exhaustive checks against it, but also opens up the possibility that the case class might be extended in the same file.

Unfortunately you can’t mark a case class as both final and sealed, so you’d have to choose based on your situation I suppose.


Reuse through Inheritance

Because case classes are their own types and they can inherit multiple traits, it opens up the possibility for you to share case classes across multiple ADTs.

For instance, many collection types have the notion of an empty case. It’s possible to share the definition of the Empty case class.

I think it’s interesting you can do this in Scala, although I’m not sure that’s such a good thing. It allows for tight coupling between unrelated ADTs, all in the name of code reuse.

Sure, “no one would actually do this”, but one thing I learnt in the last decade is that if something can be done then sooner or later you’ll find it has been done 



To wrap up this fairly long post, here are the main points we covered:

  • you can declare ADTs in Scala using case class and/or case object
  • case classes/objects are loosely grouped together through inheritance
  • a case class defines its own type, unlike discriminated unions in F#
  • a case object creates a singleton
  • case classes/objects can define their own members
  • case classes/objects support multiple inheritance
  • marking the top level trait as sealed allows compiler to perform exhaustive checks when you pattern match against it
  • Scala convention is to seal the top level trait and mark case classes as final
  • sealed is not transitive, you lose the compiler warnings when pattern matching against case classes directly
  • you can mark case classes as final or sealed, but not both
  • multiple inheritance allows you to share case classes across different ADTs, but you probably shouldn’t