What’s the most serverless way to wait for a slow HTTP response?

During the last cohort of my Production-Ready Serverless [1] workshop, a student asked:

If I have to query an ERP system and wait for its response, and it sometimes takes more than 15 minutes to respond, is there a serverless way to do this?

This is a surprisingly hard question to answer because:

A. It’s a query and not a fire-and-forget request, so they do have to wait for the ERP system to respond.

B. It’s a third-party system that they have no control over, so they can’t just add a callback mechanism to notify them when the query result is ready.

C. None of the common serverless solutions let you hold an HTTP connection open for more than 15 minutes. Lambda’s max timeout is 15 minutes. EventBridge’s API Destination is fire-and-forget and has a max timeout of 5 seconds. Step Functions’ API integration through API Gateway is subject to API Gateway’s max timeout of 30 seconds. Similarly, Step Functions’ new external endpoint integration has a timeout of 1 minute.

These limits are understandable. Ultimately, someone has to pay for the idle time while we wait for the ERP system to respond. When an AWS service does the waiting for us, AWS bears that cost, so it makes sense that they have these timeouts in place.

Given the above, we are left with these two “obvious” answers:

  1. Switch to another ERP system! It’s probably not a feasible solution and likely not an engineering decision either.
  2. Run a container. A Fargate service will charge you for uptime but you can make concurrent requests and make better use of the idle CPU cycles. If the calls to the ERP system are infrequent then you can also run Fargate tasks on-demand instead so you don’t pay for the idle time between calls.
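To make the on-demand Fargate option more concrete, here is a minimal sketch of starting a one-off task with boto3’s ECS `run_task` call. The cluster, task definition, container name, subnets and the `ERP_URL` environment variable are all hypothetical placeholders, and the container image itself (which would make the slow HTTP call) is assumed to exist already.

```python
def build_run_task_request(cluster, task_definition, subnets,
                           security_groups, container_name, erp_url):
    """Build the keyword arguments for the ECS run_task API call."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": container_name,
                    # ERP_URL is a hypothetical variable read by the container
                    "environment": [{"name": "ERP_URL", "value": erp_url}],
                }
            ]
        },
    }


def start_erp_query_task(ecs_client, **settings):
    """Start one Fargate task and return its ARN.

    ecs_client is expected to be boto3.client("ecs").
    """
    response = ecs_client.run_task(**build_run_task_request(**settings))
    if not response.get("tasks"):
        # run_task reports placement problems in the "failures" list
        raise RuntimeError(f"Task failed to start: {response.get('failures')}")
    return response["tasks"][0]["taskArn"]
```

The task stops when the container exits, so you only pay for the time spent waiting on the ERP system.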

One outside-the-box solution is to abuse the Lambda internals and “skip” the Lambda timeout (you still pay for execution time!), as described in this post. But this is a dangerous approach and one that I don’t recommend.

So that led me back to “you have to run a container”, but is there a more serverless way to do this?

I asked around on Twitter and received some really interesting suggestions! Most suggestions were ways to run a container service or task with low operational overhead, including:

  • Use CodeBuild (aka Corey Quinn’s favourite container service) to run ephemeral containers. It takes less work to set up compared to Fargate. SST has a handy Job construct that lets you run a container job with CodeBuild.

  • App Runner. As mentioned above, if the calls to the ERP system are frequent then it would be more cost-efficient to run a long-running service. App Runner is yet another low-effort way to run a containerised service.

My favourite suggestion is to run a Python shell job with AWS Glue [2]. There are no container images to configure and maintain. Just set up an IAM role and point Glue to a Python script in S3 and you’re good to go.
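As a rough illustration, the script in S3 could be as small as the sketch below. It uses only the standard library and waits up to 30 minutes for the response. The `--erp_url` and `--query` arguments, the JSON request shape and the 30-minute timeout are all assumptions made for this example, not details of any particular ERP system.

```python
import argparse
import json
import urllib.request

# Assumption: 30 minutes is long enough for the slowest ERP query
DEFAULT_TIMEOUT_SECONDS = 30 * 60


def query_erp(url, payload, timeout_seconds=DEFAULT_TIMEOUT_SECONDS,
              opener=urllib.request.urlopen):
    """POST the query and block until the ERP system responds."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The timeout applies to blocking socket operations such as
    # connecting and waiting for response data
    with opener(request, timeout=timeout_seconds) as response:
        return json.loads(response.read().decode("utf-8"))


def main(argv=None):
    """Read the (hypothetical) job arguments and print the ERP result."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--erp_url", required=True)
    parser.add_argument("--query", required=True)
    # Ignore any extra arguments that the job runner passes in
    args, _unknown = parser.parse_known_args(argv)
    print(json.dumps(query_erp(args.erp_url, {"query": args.query})))
```

In the script you upload you would call `main()` as the last line. The job arguments can then be supplied when you start a run, for example through the `Arguments` parameter of boto3’s Glue `start_job_run` call.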

You’d pay $0.44 per DPU-hour for Python shell jobs, which run on either 0.0625 or 1 DPU. If the calls to the ERP system are frequent then this will be an expensive solution. But if the calls are infrequent then it can be a viable solution. And it’s the most serverless way to wait for a slow (> 15 mins) HTTP response.
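To get a feel for the numbers, here is a back-of-the-envelope sketch of the cost per call. It assumes the $0.44 rate is charged per DPU-hour, that a Python shell job runs on either 0.0625 or 1 DPU, and that there is a one-minute minimum charge per run. Check the Glue pricing page for your region before relying on any of these figures.

```python
GLUE_RATE_PER_DPU_HOUR = 0.44  # USD, assumed rate for Python shell jobs
MIN_BILLED_SECONDS = 60        # assumed one-minute minimum per run


def python_shell_cost(duration_seconds, dpus=0.0625,
                      rate=GLUE_RATE_PER_DPU_HOUR):
    """Estimate the cost in USD of one Python shell job run."""
    billed_seconds = max(duration_seconds, MIN_BILLED_SECONDS)
    return billed_seconds / 3600 * dpus * rate


# A hypothetical ERP call that takes 20 minutes to respond
twenty_minutes = 20 * 60
print(f"1 DPU:      ${python_shell_cost(twenty_minutes, dpus=1.0):.4f}")
print(f"0.0625 DPU: ${python_shell_cost(twenty_minutes):.4f}")
```

Under these assumptions a 20-minute wait costs roughly 15 cents on 1 DPU, or about a cent on 0.0625 DPU.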

I hope you’ve enjoyed this article. If you want to level up your serverless game, why not check out the Production-Ready Serverless [1] workshop? I will teach you everything I know about building serverless applications. From structuring projects, testing, deployment and security, to monitoring and troubleshooting in production.

Hope to see you there :-)

Links

[1] Production-Ready Serverless workshop

[2] Python shell jobs in AWS Glue