AWS has announced solutions to Lambda cold starts four times: Lambda extensions warming, SnapStart for Java, Provisioned Concurrency, and improved runtime initialization. Each announcement cycles through the engineering community as “cold starts are solved.”

They are not solved. They are mitigated in specific contexts.

Teams building latency-sensitive applications on Lambda still see 200-1500ms cold starts showing up at P99. If your SLOs require < 100ms at P99, Lambda cold starts are a real problem, not a theoretical one.

Here is what actually works and what each mitigation costs.

What Causes Cold Starts

A Lambda cold start happens when AWS needs to initialize a new execution environment for a function. The stages:

  1. Container provisioning: AWS allocates compute resources and starts a container (or microVM in Firecracker). ~50-100ms
  2. Runtime initialization: The language runtime starts up. Node.js: ~30-50ms. Python: ~30-80ms. Java (JVM): ~500-2000ms.
  3. Function package loading: Your deployment package downloads from S3 and loads. Larger packages = slower. ~10-100ms depending on size.
  4. Function init code: The code outside your handler function runs. Database connections, SDK initialization, loading environment variables. Your control.

The total: Node.js cold starts are typically 200-500ms, Python 200-600ms, Java 1000-3000ms. Java is the extreme outlier: JVM startup time is the killer.

SnapStart for Java

SnapStart is the most impactful Lambda improvement for Java. When you publish a function version, Lambda initializes the function once, snapshots the initialized execution environment, and restores new environments from that cached snapshot instead of running initialization from scratch.

The result: Java cold starts go from 1-3 seconds to 200-500ms. This is a real and significant improvement.

The setup:

# SAM template excerpt - enabling SnapStart on a function
MyJavaFunction:
  Type: AWS::Serverless::Function
  Properties:
    FunctionName: my-java-function
    Runtime: java21
    SnapStart:
      ApplyOn: PublishedVersions

The limitation: SnapStart only applies to published function versions. It doesn’t apply to $LATEST, so you need to invoke through a Lambda alias (or version ARN) pointing at a published version.
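For reference, the version-plus-alias wiring in CloudFormation might look like this (resource names here are placeholders, not from the original):

```yaml
MyJavaVersion:
  Type: AWS::Lambda::Version
  Properties:
    FunctionName: !Ref MyJavaFunction   # the SnapStart-enabled function

MyJavaAlias:
  Type: AWS::Lambda::Alias
  Properties:
    FunctionName: !Ref MyJavaFunction
    FunctionVersion: !GetAtt MyJavaVersion.Version
    Name: live   # invoke via this alias to get the snapshot-restored path
```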

Also: SnapStart requires auditing your initialization code for state that shouldn’t be snapshotted and restored: random seeds, unique identifiers, open network connections, cached temporary credentials. If your init code produces something that must be unique per environment, move it into the handler (or regenerate it in a CRaC afterRestore hook).

Provisioned Concurrency

Provisioned Concurrency pre-initializes a specified number of execution environments. Those environments are ready immediately - no cold start when they handle a request.

The cost: you pay for provisioned concurrency even when it’s not handling requests. For a Node.js function with 128 MB memory:

  • Lambda regular cost: $0.0000002/request + $0.0000000021/ms
  • Provisioned Concurrency: $0.000004646/GB-s allocated + regular execution cost

If you provision 10 instances of a 512 MB function for 24 hours:

  • 10 * 0.5 GB * 86,400 seconds * $0.000004646 = ~$2/day
  • ~$60/month just to keep 10 instances warm
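The arithmetic, as a small helper (price per GB-second taken from the list above; actual pricing varies by region and changes over time):

```javascript
// Standing cost of Provisioned Concurrency: you pay per GB-second
// for every provisioned instance, whether or not it serves requests.
const PRICE_PER_GB_SECOND = 0.000004646; // from the pricing above

function provisionedConcurrencyCostPerDay(instances, memoryGb) {
  const secondsPerDay = 86400;
  return instances * memoryGb * secondsPerDay * PRICE_PER_GB_SECOND;
}

// 10 instances of a 512 MB function:
const daily = provisionedConcurrencyCostPerDay(10, 0.5);
console.log(daily.toFixed(2));        // per day
console.log((daily * 30).toFixed(2)); // per 30-day month
```

Note the cost scales linearly with both instance count and memory size, so a 2 GB function at 50 instances is 40x this figure.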

This is why Provisioned Concurrency solves cold starts without automatically making Lambda competitive with containers for latency-sensitive applications: the standing cost scales linearly with concurrency and memory, and those 10 provisioned instances buy you exactly 10 concurrent requests without cold starts. A single small ECS task or Fargate container handling the same 10 concurrent requests might cost $30-50/month, with no per-instance warm-keeping fee on top. Provisioned Concurrency is one of those AWS services where teams quietly overpay without realizing it.

Auto Scaling Provisioned Concurrency

A smarter approach: use Application Auto Scaling to scale provisioned concurrency based on traffic patterns:

ScalableTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    ServiceNamespace: lambda
    ResourceId: !Sub "function:${LambdaFunction}:${LambdaAlias}"
    ScalableDimension: lambda:function:ProvisionedConcurrency
    MinCapacity: 2
    MaxCapacity: 50
    ScheduledActions:
      - ScheduledActionName: scale-up-for-morning-peak
        Schedule: "cron(0 8 * * ? *)"  # 8 AM UTC, before peak traffic
        ScalableTargetAction:
          MinCapacity: 10
          MaxCapacity: 50

Pre-warm more instances before your peak traffic period and scale down overnight. This reduces the cost while maintaining low cold starts during actual usage.
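If traffic doesn’t follow a clean schedule, a target-tracking policy can scale provisioned concurrency off measured utilization instead. A sketch, assuming Application Auto Scaling’s predefined Lambda metric and the `ScalableTarget` resource above:

```yaml
UtilizationPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: pc-target-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ScalableTarget
    TargetTrackingScalingPolicyConfiguration:
      # Keep provisioned instances ~70% utilized; scale out before they saturate
      TargetValue: 0.7
      PredefinedMetricSpecification:
        PredefinedMetricType: LambdaProvisionedConcurrencyUtilization
```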

Code-Level Optimizations

These reduce cold start duration regardless of which AWS feature you use:

Reduce deployment package size: Every MB of your package is downloaded on cold start. Use esbuild or Rollup to tree-shake your dependencies. A 10 MB bundle cold starts noticeably faster than a 50 MB bundle.

# Typical Next.js bundle: 150 MB
# Optimized with esbuild bundling: 5-15 MB
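A minimal esbuild build script for a Node.js handler might look like this (a sketch; the entry point, output path, and Node target are assumptions, not from the original):

```javascript
// build.mjs — bundle and tree-shake the handler with esbuild
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handler.js'],
  bundle: true,          // inline dependencies so a single file ships
  minify: true,
  platform: 'node',
  target: 'node20',
  // AWS SDK v3 is already present in the Node.js Lambda runtime,
  // so keeping it external shrinks the bundle further.
  external: ['@aws-sdk/*'],
  outfile: 'dist/handler.js',
});
```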

Move expensive init outside the handler:

// This runs on every invocation - WRONG
exports.handler = async (event) => {
  const dbClient = new DatabaseClient(process.env.DB_URL);
  // ...
};

// This runs once on cold start - CORRECT
const dbClient = new DatabaseClient(process.env.DB_URL);
exports.handler = async (event) => {
  // dbClient already initialized
};

Database connections initialized outside the handler persist across invocations in the same execution environment. This adds ~50-200ms to the cold start, but eliminates connection overhead from every warm invocation afterward.

Lazy loading: Don’t import everything at the top level if you only need it conditionally:

// Always pays the import cost even if error path is rare
import { ErrorReporter } from './expensive-error-reporter';

// Better for rarely-used heavy modules
const getErrorReporter = () => require('./expensive-error-reporter');
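Note that `require()` caches modules after the first load, so the win here is deferring that first load off the cold-start path, not avoiding repeated loads. A small helper makes the pattern reusable (module and function names here are illustrative):

```javascript
// Defer an expensive load until first use; Node's own module cache
// makes every later require() of the same module cheap anyway.
function lazy(loader) {
  let cached;
  let loaded = false;
  return () => {
    if (!loaded) {
      cached = loader();
      loaded = true;
    }
    return cached;
  };
}

// The heavy module is loaded only if the error path actually runs.
const getErrorReporter = lazy(() => require('./expensive-error-reporter'));
```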

When to Just Use Containers

If cold starts regularly show up at your P99 and you need consistent sub-100ms responses:

  • ECS Fargate with min 1 task: always warm, predictable latency, ~$15-50/month for a small service
  • Fly.io or Railway: simpler than ECS, competitive pricing
  • Cloudflare Workers: sub-5ms cold starts for the right use cases

Lambda is excellent for event-driven workloads where occasional cold starts are acceptable. It’s the wrong tool for latency-sensitive synchronous APIs where P99 is a hard requirement.

| Scenario | Recommendation |
| --- | --- |
| Occasional background jobs | Lambda, no warming needed |
| APIs with < 100ms P99 SLO | Containers or Cloudflare Workers |
| APIs with < 300ms P99 SLO | Lambda + Provisioned Concurrency |
| Java APIs | Lambda + SnapStart |
| Unpredictable traffic, cost-sensitive | Lambda with cold starts accepted |

Bottom Line

Lambda cold starts are not solved but are manageable. SnapStart dramatically improves Java. Provisioned Concurrency eliminates cold starts at a cost that makes sense at moderate traffic volumes. Code-level optimizations (small bundles, init outside handler) reduce cold start duration regardless. For strict P99 latency requirements that Lambda cannot meet even with mitigations, containers are the right answer and Lambda is being used for a problem it was not optimized to solve.