I’ve been working on a feature for a client this week that involves scheduling a task at a specific point in the future. A few edge cases made this tricky, so I researched the options available within AWS to see if I could come up with a solution that didn’t involve some sort of polling. I’ve briefly documented my findings below.

Scheduling use case examples

I’ll start off by looking at generalised examples of scheduling requirements that I’ve encountered:

  1. Execute a task on a CRON schedule (e.g. run a daily report)
  2. Schedule parameterised individual tasks dynamically to execute at an absolute or relative point in the future (e.g. send a reminder notification to a user 2 hours before their appointment is due)
  3. Delay a task until an action takes place in another system (e.g. a secretary approves an appointment request)
  4. Cancel a task that has previously been scheduled (e.g. user has cancelled their appointment)
  5. Adjust the wait period (increase or decrease) of a previously scheduled task (e.g. user has rescheduled their appointment)

Operational criteria

Before we look at the tools available to implement these scenarios, there are a few operational criteria to consider that will vary between different implementation options. Yan Cui previously called out some of these items in this post and I’ve added a few of my own:

  • What level of time precision for the scheduled task time is acceptable to you? Does it need to be milliseconds or could it be days (or somewhere in between)?
  • Could there be a large number of open tasks at any one time and if so, what effects could this have?
  • Could a large number of tasks be scheduled for the same moment, causing scaling issues such as throttling? (e.g. all scheduled for a national or worldwide event)
  • What are the costs involved for scheduling each task?

AWS scheduling primitives

Below I’ve listed the primitives within AWS services that you can use to implement non-polling scheduling strategies. For details on how to compose these primitives into fuller solutions, see the Further Reading section below.

CloudWatch Events

CloudWatch Events lets you create a rule that triggers on a regular schedule using cron or rate expressions. You can then set a Lambda function as the target of that rule.
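
As a rough sketch of the daily report case, here is how the parameters for such a rule might be put together. The rule name, target ID and Lambda ARN are made up for illustration, and I'm assuming you'd deploy it with boto3, so the two dictionaries mirror the keyword arguments of its `put_rule` and `put_targets` calls on the `events` client.

```python
def daily_cron(hour_utc: int, minute: int = 0) -> str:
    """Build a schedule expression that fires once a day at the given UTC time."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Field order: minutes hours day-of-month month day-of-week year.
    # One of day-of-month / day-of-week has to be "?".
    return f"cron({minute} {hour_utc} * * ? *)"


def daily_report_rule(rule_name: str, lambda_arn: str, hour_utc: int) -> dict:
    """Return the arguments for creating the rule and attaching its target."""
    return {
        "rule": {
            "Name": rule_name,
            "ScheduleExpression": daily_cron(hour_utc),
            "State": "ENABLED",
        },
        "targets": {
            "Rule": rule_name,
            "Targets": [{"Id": "daily-report", "Arn": lambda_arn}],
        },
    }


# Hypothetical names and ARN, purely for illustration.
params = daily_report_rule(
    "daily-report",
    "arn:aws:lambda:eu-west-1:123456789012:function:daily-report",
    hour_utc=8,
)

# With boto3 this would be applied roughly like so:
#   events = boto3.client("events")
#   events.put_rule(**params["rule"])
#   events.put_targets(**params["targets"])
```

The Lambda function also needs a resource-based permission that allows the events service to invoke it, which I've left out here.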

Step Functions

AWS Step Functions seems to be the most flexible service when it comes to built-in scheduling features. Here’s what it’s got:

  • Wait states allow you to provide dynamic relative or absolute times to delay a state machine before continuing to the next state.
  • Callbacks allow you to pause execution of further tasks until an external event is received. Combined with a timeout, this covers a few different use cases.
  • For Standard Step Functions workflows, you can use the StopExecution operation to cancel a scheduled task.
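
To make these features a little more concrete, below is a minimal sketch of a state machine definition for the appointment reminder example, built as a Python dictionary. It is only one way of composing the features, not a definitive design. The state names, the `appointmentId` and `remindAt` input fields, the SQS queue used for the approval callback and the one-day timeout are all assumptions of mine.

```python
import json
from datetime import datetime, timedelta, timezone


def reminder_time(appointment: datetime, lead: timedelta = timedelta(hours=2)) -> str:
    """Return the ISO 8601 UTC timestamp at which the reminder should fire."""
    if appointment.tzinfo is None:
        raise ValueError("appointment must be timezone-aware")
    fire_at = appointment.astimezone(timezone.utc) - lead
    return fire_at.strftime("%Y-%m-%dT%H:%M:%SZ")


def reminder_state_machine(approval_queue_url: str, send_reminder_arn: str) -> dict:
    """Build a definition: wait for approval, wait until remindAt, then send."""
    return {
        "StartAt": "WaitForApproval",
        "States": {
            # Callback: pause until an external system returns the task token.
            "WaitForApproval": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
                "Parameters": {
                    "QueueUrl": approval_queue_url,
                    "MessageBody": {
                        "appointmentId.$": "$.appointmentId",
                        "taskToken.$": "$$.Task.Token",
                    },
                },
                "TimeoutSeconds": 86400,  # assumed: give up after one day
                "ResultPath": None,  # discard callback output, keep the input
                "Next": "WaitUntilReminder",
            },
            # Wait state: absolute time taken from the execution input.
            "WaitUntilReminder": {
                "Type": "Wait",
                "TimestampPath": "$.remindAt",
                "Next": "SendReminder",
            },
            "SendReminder": {
                "Type": "Task",
                "Resource": send_reminder_arn,
                "End": True,
            },
        },
    }


# Hypothetical appointment and identifiers, purely for illustration.
appointment = datetime(2030, 5, 1, 14, 30, tzinfo=timezone.utc)
execution_input = {
    "appointmentId": "appt-123",
    "remindAt": reminder_time(appointment),
}
definition_json = json.dumps(
    reminder_state_machine(
        "https://sqs.eu-west-1.amazonaws.com/123456789012/approvals",
        "arn:aws:lambda:eu-west-1:123456789012:function:send-reminder",
    ),
    indent=2,
)
```

Each appointment would be its own execution started with `execution_input`. The approving system resumes the workflow by sending the task token back (SendTaskSuccess), and cancelling is a StopExecution call against that execution. For rescheduling, one option would be to stop the execution and start a new one with a different `remindAt`.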

DynamoDB TimeToLive and Streams

As well as being a good fallback option for polling, DynamoDB has a Time To Live (TTL) setting that might be able to help you with scheduling. Combine it with a DynamoDB stream and you can trigger a Lambda function whenever a REMOVE event is received from the stream, in order to detect “dying” items. However, the precision is poor, as AWS only states that expired items will be cleaned up within 48 hours of their expiry time, making it a bad fit for many use cases.
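
Here's a small sketch of what the two halves of this approach could look like: computing the TTL attribute value when writing the item, and a Lambda handler that picks out TTL deletions from the stream. The function names are my own. The filter assumes that TTL deletions show up as REMOVE records whose `userIdentity` is the DynamoDB service principal, and that the stream is configured to include old images.

```python
import time
from typing import Optional

TTL_SERVICE_PRINCIPAL = "dynamodb.amazonaws.com"


def ttl_epoch_seconds(delay_seconds: int, now: Optional[float] = None) -> int:
    """Return the value for the TTL attribute: expiry time in epoch seconds."""
    base = time.time() if now is None else now
    return int(base) + delay_seconds


def is_ttl_expiry(record: dict) -> bool:
    """True if the stream record is a deletion performed by the TTL process."""
    identity = record.get("userIdentity") or {}
    return (
        record.get("eventName") == "REMOVE"
        and identity.get("type") == "Service"
        and identity.get("principalId") == TTL_SERVICE_PRINCIPAL
    )


def handler(event: dict, context: object = None) -> list:
    """Collect the expired items; a real handler would run the task for each."""
    due = []
    for record in event.get("Records", []):
        if is_ttl_expiry(record):
            # OldImage is only present if the stream view type includes it.
            due.append(record["dynamodb"].get("OldImage", {}))
    return due


# Hypothetical stream event: one TTL expiry and one manual delete.
sample_event = {
    "Records": [
        {
            "eventName": "REMOVE",
            "userIdentity": {"type": "Service", "principalId": TTL_SERVICE_PRINCIPAL},
            "dynamodb": {"OldImage": {"taskId": {"S": "t1"}}},
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"OldImage": {"taskId": {"S": "t2"}}},
        },
    ]
}
expired = handler(sample_event)
```

Filtering on the service principal matters for the cancellation use case: deleting the item yourself before it expires produces a REMOVE record too, but one that this handler ignores.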

Further Reading

— Paul

