Lambda functions are a crucial part of any serverless implementation on Amazon Web Services. However, they are not magic and may have some drawbacks, such as cold starts, due to the physical limitations of the hardware. Provisioned concurrency can help alleviate the problem.
What are “cold starts”?
“Cold starts” are a big problem for Lambda, especially for latency-sensitive applications. The term refers to the startup time needed to create a Lambda function’s execution environment from scratch and run its initialization code.
Lambda functions stay “warm” for a while after they are invoked. Non-VPC functions are kept warm for 5 minutes and VPC functions are kept warm for 15 minutes. During this period, if the function is invoked again, it responds without the startup delay. This is great for services that experience constant and regular traffic.
However, if your code hasn’t run in a while, or if you need to scale and run multiple copies of the function concurrently, a new execution environment has to start from scratch. According to AWS’s analysis, cold starts occur in less than 1% of requests in production workloads, which is acceptable for many scenarios.
Still, depending on the runtime you use (Java and .NET take a while to JIT), a cold start might delay the function call by several seconds, which may be unacceptable for latency-sensitive applications.
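To make the warm/cold distinction concrete, here is a minimal sketch of a Python handler that reports whether an invocation landed in a freshly created execution environment. It relies on the fact that module-level code runs once per environment; the names `handler`, `_is_cold`, and `_ENV_STARTED_AT` are illustrative, not part of any AWS API.

```python
import time

# Module-level code runs once per execution environment, during initialization.
_ENV_STARTED_AT = time.monotonic()
_is_cold = True


def handler(event, context):
    """Report whether this is the first invocation in this environment."""
    global _is_cold
    cold = _is_cold
    _is_cold = False  # every later call in this environment is warm
    return {
        "cold_start": cold,
        # How long ago the environment was initialized.
        "environment_age_seconds": round(time.monotonic() - _ENV_STARTED_AT, 3),
    }
```

Logging these values is one way to see how often your own traffic hits a new environment. Note that the flag only marks the first invocation in an environment; if the environment was initialized ahead of time, the caller may not have waited for it, which the environment age helps reveal.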
No more cold starts
Lambda’s provisioned concurrency mode can help solve this problem. You can think of it as Reserved Instances for Lambda Functions – you’re essentially reserving a certain amount of capacity, and a Lambda function will stay active for the entire period.
This has significant benefits, including the almost complete elimination of cold start latency. In fact, you won’t have to worry nearly as much about optimizing initialization code, since it runs ahead of time rather than while a request is waiting. This is a great advantage for JIT compiled languages like Java and C#/.NET, which can have large binaries that take a while to load.
With regular on-demand invocation, a function is cold started when a request arrives and no warm environment is available. Provisioned concurrency instead starts the environments in advance and keeps them alive. When an invocation comes in, Lambda uses the warm environments to execute it.
However, it comes with its own drawbacks. Because of how Lambda resolves function versions, provisioned concurrency cannot be configured on $LATEST. You’ll need to publish a version, create an alias that points to it, configure provisioned concurrency for that alias, and then update the alias whenever you publish a new version.
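The alias workflow might look roughly like the following sketch. It assumes `client` is a boto3 Lambda client (for example `boto3.client("lambda")`) and that the alias already exists; the helper name `provision_alias` and its parameters are hypothetical, while `publish_version`, `update_alias`, and `put_provisioned_concurrency_config` are the boto3 calls it wraps.

```python
def provision_alias(client, function_name, alias_name, concurrency):
    """Publish a version, point an existing alias at it, and provision it.

    `client` is assumed to be a boto3 Lambda client.
    """
    if alias_name == "$LATEST":
        raise ValueError("provisioned concurrency cannot target $LATEST")

    # Publish an immutable, numbered version from the current $LATEST code.
    version = client.publish_version(FunctionName=function_name)["Version"]

    # Repoint the alias (assumed to exist already) at the new version.
    client.update_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=version,
    )

    # Request pre-initialized execution environments for the alias.
    client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias_name,
        ProvisionedConcurrentExecutions=concurrency,
    )
    return version
```

Callers then invoke the function through the alias, so each deployment only has to repeat this step rather than change every caller.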
It’s also important to understand that even though the function runs for long periods of time, provisioned concurrency does not make your application stateful. Lambda functions can and will be destroyed, and you shouldn’t treat them like an EC2 server.
How much does provisioned concurrency cost?
The answer to this depends on how often your function runs and how often you see multiple execution environments being created to meet parallel demand.
The main number to worry about for provisioned concurrency is the number of function executions running at the same time. For example, if you have a function that is called ten times per second and each call lasts 500ms, that function will average 5 concurrent executions (10 calls per second × 0.5 seconds per call).
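The arithmetic is just request rate multiplied by average duration, which the sketch below spells out. The function names and the `headroom` multiplier (a safety margin for bursts) are illustrative assumptions, not something Lambda prescribes.

```python
import math


def average_concurrency(requests_per_second, avg_duration_seconds):
    """Average number of executions in flight at any moment."""
    return requests_per_second * avg_duration_seconds


def suggested_provisioned_concurrency(
    requests_per_second, avg_duration_seconds, headroom=1.0
):
    """Round the average up, optionally scaled by a hypothetical headroom factor."""
    return math.ceil(
        average_concurrency(requests_per_second, avg_duration_seconds) * headroom
    )


print(average_concurrency(10, 0.5))  # 5.0
```

Because this is an average, real traffic will sometimes exceed it, which is why you may want to provision somewhat above the computed figure.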
In general, provisioned concurrency doesn’t cost much more than regular Lambda functions. You can use the AWS Pricing Calculator to estimate how much it will cost you personally. For example, a Lambda called 10 times per second, with an invocation time of 500ms and 256MB of memory will cost $60/month to run.
However, that same function, but with 10 provisioned concurrent executions, comes out a bit higher at $64.50 per month. In general, this is likely to be a small 5-10% increase in cost, depending on usage.
That said, provisioned concurrency is actually cheaper per GB-second of use. This means that if you are consistently running very close to 100% utilization of the capacity you reserved, it may be cheaper to provision concurrency than to use regular Lambda pricing. It also helps that functions spend less time re-running initialization code.
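A rough cost model makes the comparison easier to reason about. The sketch below is an approximation, not a substitute for the AWS Pricing Calculator: the rates are assumed values for x86 functions in a US region and may be out of date, the free tier is ignored, a month is taken as 30 days, and every invocation is assumed to fit inside the provisioned capacity. All function and constant names are illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600

# Assumed rates in USD; confirm against the current AWS price list for your region.
REQUEST_RATE = 0.20 / 1_000_000               # per request
ON_DEMAND_GB_SECOND = 0.0000166667            # on-demand duration
PROVISIONED_DURATION_GB_SECOND = 0.0000097222  # duration on provisioned capacity
PROVISIONED_CAPACITY_GB_SECOND = 0.0000041667  # charged whether used or not


def monthly_usage(rps, duration_s, memory_gb):
    """Return (requests, GB-seconds of execution) for one month."""
    requests = rps * SECONDS_PER_MONTH
    return requests, requests * duration_s * memory_gb


def on_demand_cost(rps, duration_s, memory_gb):
    requests, gb_seconds = monthly_usage(rps, duration_s, memory_gb)
    return requests * REQUEST_RATE + gb_seconds * ON_DEMAND_GB_SECOND


def provisioned_cost(rps, duration_s, memory_gb, provisioned):
    requests, gb_seconds = monthly_usage(rps, duration_s, memory_gb)
    # Capacity is billed for every second it is configured, used or idle.
    capacity = (
        provisioned * memory_gb * SECONDS_PER_MONTH * PROVISIONED_CAPACITY_GB_SECOND
    )
    return (
        requests * REQUEST_RATE
        + gb_seconds * PROVISIONED_DURATION_GB_SECOND
        + capacity
    )


# 10 calls/s, 500 ms, 256 MB, 10 provisioned: capacity is about half used.
print(round(on_demand_cost(10, 0.5, 0.25), 2))
print(round(provisioned_cost(10, 0.5, 0.25, 10), 2))

# 20 calls/s keeps all 10 provisioned environments busy.
print(round(on_demand_cost(20, 0.5, 0.25), 2))
print(round(provisioned_cost(20, 0.5, 0.25, 10), 2))
```

Under these assumed rates the first pair comes out at roughly $59 versus $64 per month, while the fully utilized pair comes out at roughly $118 versus $100, so the provisioned option only wins once the reserved capacity is kept busy.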