As you may already know, we at ServMask help our users back up and migrate their WordPress websites with our All-in-One WP Migration plugin. One of the great features we offer is scheduled automated backups to cloud storage providers like Dropbox, Google Drive, and 16+ more. Since we also provide premium support to all our users, we are aware of their problems and challenges, and from analyzing that data we found that WP-Cron is a major issue. Long story short: WP-Cron relies on a site’s traffic. Incoming requests are what trigger scheduled tasks, so when a website’s traffic is low, those tasks may never run.
If you are interested in knowing more about WordPress Cron, there is a great article: https://code.tutsplus.com/articles/insights-into-wp-cron-an-introduction-to-scheduling-tasks-in-wordpress--wp-23119
To solve this problem, we came up with the idea of a cloud Cron. In other words, we offer our users a small external Cron service that pings their website in order to trigger pending WP-Cron jobs.
OK, enough with the preface. Let’s get technical. A not-so-obvious technology choice for this task is AWS Lambda. It’s small, fast, easy to scale, and most importantly cheap, or even free in some cases.
So, AWS Lambda. It natively supports a large number of programming languages, including Python, JavaScript, Go, and many others, but not PHP. And that’s where Bref with Lambda layers comes into play: using Bref’s PHP layers, we can easily deploy our PHP code to AWS Lambda and use it just like any other language AWS supports. The Bref project kindly provides us with three layers: PHP function, HTTP application, and console (CLI). Use this link to find the PHP layer ARNs for your region: https://runtimes.bref.sh/
Let’s make it more practical. We need a few prerequisites: the AWS CLI, Composer, PHP 7.1+, and the Serverless Framework.
$ mkdir cronapp
$ cd cronapp
$ composer require bref/bref
$ vendor/bref/bref/bref init
Here we need to choose the type of our project. Let’s choose “HTTP application” and us-east-1 as the region.
Currently, Bref still generates a template for SAM, but we are aware of their intention to switch to the Serverless Framework instead. So let’s create serverless.yml:
service: cronapp

provider:
  name: aws
  runtime: provided
  stage: production
  region: us-east-1
  apiName: ${self:service.name}-${opt:stage, self:provider.stage}
  endpointType: EDGE
  memorySize: 256
  timeout: 30
  versionFunctions: false

functions:
  subscribe:
    handler: index.php
    description: 'Cron scheduler subscribe endpoint'
    timeout: 30 # in seconds (API Gateway has a timeout of 30 seconds)
    layers:
      - 'arn:aws:lambda:us-east-1:209497400698:layer:php-73-fpm:7'
    events:
      - http: 'ANY /'
      - http: 'ANY {proxy+}'
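The index.php handler referenced by this configuration can start out trivially simple: with the php-73-fpm layer it is served like any ordinary PHP web script, and `bref init` generates a similar starter file. A minimal sketch (the greeting() helper is our own, purely illustrative):

```php
<?php
// Minimal sketch of index.php for the HTTP application type.
// With the php-fpm layer, this file runs like a regular web script.

function greeting() {
    return 'Hello world!';
}

echo greeting();
```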
Now let’s deploy our lambda by running:
$ serverless deploy
It’s important to have your credentials set in ~/.aws/credentials, and I would recommend setting the default region in ~/.aws/config like this:
[default]
region = us-east-1
If you have done everything right, you will see the endpoint of your HTTP app:
endpoints:
ANY - https://802lwyg79b.execute-api.us-east-1.amazonaws.com/production
ANY - https://802lwyg79b.execute-api.us-east-1.amazonaws.com/production/{proxy+}
If you open this URL in a browser, you’ll see the “Hello world!” greeting.
In case you don’t want to keep this lambda in your AWS account, you can easily clean it up, including all the linked resources such as the S3 bucket, API Gateway, IAM role, etc., by running a simple command:
$ serverless remove
Of course, URLs like https://802lwyg79b.execute-api.us-east-1.amazonaws.com/production are not very user-friendly, and we want something shorter and nicer. This can be done by creating a custom domain name.
First we need a certificate:
- Go to Certificate Manager
- Request a certificate
- Choose a domain name (in my case it’s a subdomain of a zone hosted with Route53)
- Use DNS validation and create CNAME automatically with Route53
Then create a custom domain:
- Go to API Gateway -> Custom Domain Names
- Create Custom Domain Name
- Input your domain name, could be a subdomain
- Choose your certificate from the dropdown
- Add a base path mapping, for example: path=v1, destination: cronapp-production:production
- Copy target domain name (we’ll need it later for actual domain CNAME value)
And finally we need to create CNAME for the domain:
- Go to Route53
- Choose your zone
- Create Record Set
- Input the subdomain name, type=CNAME, value = the target domain name that we copied in the last step of the custom domain configuration.
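If you prefer the command line, the console steps above can be sketched with the AWS CLI. The domain name, account ID, and certificate ARN below are placeholders, not values from this setup:

```shell
# Request an ACM certificate with DNS validation (placeholder domain)
aws acm request-certificate \
  --domain-name cron.example.com \
  --validation-method DNS

# Create the custom domain in API Gateway (paste the issued certificate ARN)
aws apigateway create-domain-name \
  --domain-name cron.example.com \
  --certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/placeholder

# Map base path "v1" to the production stage of our API
aws apigateway create-base-path-mapping \
  --domain-name cron.example.com \
  --base-path v1 \
  --rest-api-id 802lwyg79b \
  --stage production
```

These commands require AWS credentials and real resources, so treat them as a provisioning outline rather than a copy-paste script.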
Let’s try to access our app by the custom domain. Bingo!
Now that we have seen how to create and deploy our lambda, why don’t we stop for a moment and think about our app strategy?
What do we want? We want the user to be able to subscribe to our cron service and be pinged on a regular basis with a given interval, at a set time of day, day of week, etc.
How can we get it done?
Everybody knows that CloudWatch is used for logs and alarms, but only a few know that it is also used for scheduling events.
A scheduled event runs every N minutes/hours/days. There is also a cron notation: cron(59 23 * * ? *), for example, will trigger a daily event at 11:59 PM. Sounds like a cloud cron, huh? But what happens when an event is triggered? Well, every event has one or more targets, and AWS Lambda is one of the supported target types.
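To make this concrete, here is roughly what creating such a scheduled event and wiring it to a lambda looks like with the AWS CLI. The rule name, account ID, and function ARN are placeholders:

```shell
# Create a rule that fires daily at 11:59 PM UTC
aws events put-rule \
  --name nightly-ping \
  --schedule-expression 'cron(59 23 * * ? *)' \
  --state ENABLED

# Point the rule at a Lambda function, passing a constant JSON input
aws events put-targets \
  --rule nightly-ping \
  --targets '[{"Id":"nightly-ping-target","Arn":"arn:aws:lambda:us-east-1:123456789012:function:cron-production-ping","Input":"{\"url\":\"https://example.com\"}"}]'
```

The constant Input is handed to the target lambda verbatim as its event payload, which is exactly the mechanism our service relies on below.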
What AWS services do we need?
- First of all, we need a php-fpm lambda to receive subscription requests [Lambda: layer:php-73-fpm]
- We need to create an event rule based on the schedule provided by the user [CloudWatch Events]
- Obviously, we need to save our subscription data somewhere, and the best option here is [DynamoDB]
- Our event rule needs a target, which is another lambda function that pings the user’s wp-cron URL [Lambda: layer:php-73]
- Then we need logs for our lambdas. We use CloudWatch Logs for that. [CloudWatch Logs]
- And lastly, we need to set permissions for all these services to access each other [IAM]
Of course, there are many more services involved (S3, API Gateway, CloudFormation, Route53, etc.), but thanks to Serverless we don’t need to care much.
Let’s start with the subscription lambda:
<?php
require 'vendor/autoload.php';
date_default_timezone_set('UTC');
// Step 1. Read and parse the lambda input: SiteURL, cron schedule
$input = file_get_contents("php://input");
if ($input && ($request = json_decode($input)) && isValidRequest($request)) {
$sdk = new Aws\Sdk([
'version' => 'latest',
'region' => getenv('AWS_DEFAULT_REGION'),
'credentials' => Aws\Credentials\CredentialProvider::env(),
]);
// Step 2. Create a CloudWatch event rule for this website, putting all the details into the constant Input
$rules = scheduleEventRules($sdk, $request);
// Step 3. Store the subscription info to dynamodb including cloudwatch rules that trigger the function
storeSubscription($sdk, $request, $rules);
}
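The isValidRequest() helper isn’t shown in the listing above; a minimal sketch, assuming the request carries url, extension, and schedules fields (the SCHEDULES constant, listing the intervals we support, is also an assumption), might look like this:

```php
<?php
// Hypothetical validation helper; the real service may check more.
const SCHEDULES = ['hourly', 'daily', 'weekly', 'monthly'];

function isValidRequest($request) {
    // The site URL must be present and well-formed
    if (empty($request->url) || filter_var($request->url, FILTER_VALIDATE_URL) === false) {
        return false;
    }
    // The extension identifier must be a non-empty string
    if (empty($request->extension) || !is_string($request->extension)) {
        return false;
    }
    // At least one schedule, given as an array
    if (empty($request->schedules) || !is_array($request->schedules)) {
        return false;
    }
    // Every requested schedule must be one we support
    return count(array_diff($request->schedules, SCHEDULES)) === 0;
}
```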
As you can see, we’re using the environment credentials provider. We could also specify the access and secret keys here, but doing it that way is strongly discouraged: AWS creates an ad-hoc pair of keys for you based on the lambda’s role. In order to let it access other services, we need to specify the permissions in serverless.yml:
iamRoleStatements:
  - Effect: Allow
    Action:
      - dynamodb:*
    Resource:
      - '*'
  - Effect: Allow
    Action:
      - lambda:*
    Resource:
      - '*'
  - Effect: Allow
    Action:
      - events:*
    Resource:
      - '*'
  - Effect: Allow
    Action:
      - logs:*
    Resource:
      - '*'
Here we give full access, but later it would be advisable to narrow the permissions down to the exact scope we use, e.g.:
iamRoleStatements:
  - Effect: Allow
    Action:
      - dynamodb:Query
      - dynamodb:Scan
      - dynamodb:GetItem
      - dynamodb:PutItem
      - dynamodb:UpdateItem
      - dynamodb:DeleteItem
    Resource:
      - { "Fn::GetAtt": ["CronSubscriptionDynamoDBTable", "Arn"] }
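The CronSubscriptionDynamoDBTable resource referenced here has to be declared somewhere in serverless.yml. A possible sketch of the resources section, with the exact table settings being assumptions (only the id hash key is implied by the code later in this article):

```yaml
resources:
  Resources:
    CronSubscriptionDynamoDBTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: CronSubscription
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
```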
So, as you can see, first we need to get the method payload as a raw HTTP POST body, parse it as JSON, and do some validation. Then we want to create the event rules:
<?php
use Aws\Exception\AwsException;
function scheduleEventRules($sdk, $request) {
// converting URL to a normalized sitename
$site = sitename($request->url);
// shortening extension name to include just the name like gdrive or box
$extension = get_extension_name($request->extension);
$events = $sdk->createCloudWatchEvents();
$lambda = $sdk->createLambda();
// List all the events for this site and extension
$rulesList = $events->listRules([
'NamePrefix' => sprintf('%s-%s-', $site, $extension),
'Limit' => 100,
]);
$schedulesNotInUse = array_diff(SCHEDULES, $request->schedules);
$eventsToRemove = array_map(function($schedule) use ($site, $extension) {
return sprintf('%s-%s-%s', $site, $extension, $schedule);
}, $schedulesNotInUse);
// Remove events that are not in use anymore
if ($rulesList && $rulesList['Rules']) {
foreach ($rulesList['Rules'] as $rule) {
if (in_array($rule['Name'], $eventsToRemove)) {
// First we need to remove all the targets, luckily we have just 1
try {
$events->removeTargets([
'Ids' => [sprintf('%s-target', $rule['Name'])],
'Rule' => $rule['Name'],
]);
} catch (AwsException $e) {
// output error message if fails
error_log($e->getMessage());
}
// Then we are good to remove the event rule
try {
$events->deleteRule(['Name' => $rule['Name']]);
} catch (AwsException $e) {
// output error message if fails
error_log($e->getMessage());
}
}
}
}
$eventRules = []; // Array of format: schedule => event rule ARN
foreach ($request->schedules as $schedule) {
// Set CloudWatch event rule params
$ruleName = sprintf('%s-%s-%s', $site, $extension, $schedule);
$description = sprintf('Scheduled %s call of ping lambda for %s by %s', $schedule, $site, $extension);
$payload = json_encode(['url' => $request->url, 'rule' => $ruleName]);
if (!empty($request->timestamp)) {
$expression = schedule_cron($schedule, $request->timestamp);
} else {
$expression = schedule_rate($schedule);
}
// Create rule: https://docs.aws.amazon.com/aws-sdk-php/v3/api/api-events-2015-10-07.html#putrule
try {
// Creates or updates the specified rule.
$eventResponse = $events->putRule([
'Name' => $ruleName,
'Description' => $description,
'ScheduleExpression' => $expression,
'State' => 'ENABLED',
]);
$eventRules[$schedule] = $eventResponse['RuleArn'];
} catch (AwsException $e) {
// output error message if fails
error_log($e->getMessage());
continue; // skip this event rule
}
// Map the rule to the function
try {
$targetResponse = $events->putTargets([
'Rule' => $ruleName,
'Targets' => [[
'Arn' => sprintf(getenv('CRON_PING_LAMBDA_ARN'), getenv('AWS_DEFAULT_REGION')),
'Id' => sprintf('%s-target', $ruleName),
'Input' => $payload, // SiteURL is passed here (probably also the schedule)
]],
]);
} catch (AwsException $e) {
// output error message and skip this rule
error_log($e->getMessage());
continue; // skip this event rule
}
}
return $eventRules;
}
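The helper functions used above (sitename(), get_extension_name(), schedule_rate(), schedule_cron()) are not part of the listing. Here is an illustrative sketch of what they might do; the exact naming scheme and cron-mapping rules are our assumptions:

```php
<?php
// Normalize a site URL into a CloudWatch-safe rule-name fragment,
// e.g. "https://www.example.com/blog" -> "www-example-com"
function sitename($url) {
    return str_replace('.', '-', strtolower(parse_url($url, PHP_URL_HOST)));
}

// Shorten an extension identifier to just its name,
// e.g. "all-in-one-wp-migration-gdrive-extension" -> "gdrive"
function get_extension_name($extension) {
    return str_replace(['all-in-one-wp-migration-', '-extension'], '', $extension);
}

// Map a schedule name to a CloudWatch rate() expression
function schedule_rate($schedule) {
    $rates = [
        'hourly'  => 'rate(1 hour)',
        'daily'   => 'rate(1 day)',
        'weekly'  => 'rate(7 days)',
        'monthly' => 'rate(30 days)',
    ];
    return $rates[$schedule];
}

// Map a schedule name plus a UNIX timestamp to a cron() expression
// anchored at the timestamp's time of day (UTC)
function schedule_cron($schedule, $timestamp) {
    $minute = (int) gmdate('i', (int) $timestamp);
    $hour   = (int) gmdate('G', (int) $timestamp);
    switch ($schedule) {
        case 'hourly':
            return sprintf('cron(%d * * * ? *)', $minute);
        case 'daily':
            return sprintf('cron(%d %d * * ? *)', $minute, $hour);
        case 'weekly':
            return sprintf('cron(%d %d ? * %s *)', $minute, $hour, strtoupper(gmdate('D', (int) $timestamp)));
        default: // monthly
            return sprintf('cron(%d %d %d * ? *)', $minute, $hour, (int) gmdate('j', (int) $timestamp));
    }
}
```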
We’re using the CRON_PING_LAMBDA_ARN environment variable as a template; this way we can make the code region-independent. The getenv('AWS_DEFAULT_REGION') call returns the region in which the lambda is currently running. Environment variables can easily be added in serverless.yml:
environment:
  CRON_PING_LAMBDA_ARN: 'arn:aws:lambda:%s:${accountId}:function:cron-production-ping'
Things become a bit complicated when your event rule needs permission to invoke the ping lambda. Either you grant permission for each newly created event rule (but a function’s resource policy is limited in size), or you grant the invoke permission once for any event rule matching a pattern:
$ aws lambda add-permission --function-name cron-production-ping --action 'lambda:InvokeFunction' --principal events.amazonaws.com --statement-id '1' --source-arn "arn:aws:events:us-east-1:${accountId}:rule/*"
Now that we have all the event rules created, we need to update our DynamoDB records:
<?php
use Aws\Exception\AwsException;

function storeSubscription($sdk, $request, $rules) {
// Set dynamodb client
$ddb = $sdk->createDynamoDb();
$site = sitename($request->url);
$extension = get_extension_name($request->extension);
// Delete all records for this site+extension prior to inserting new ones
$requests = [];
foreach (SCHEDULES as $schedule) {
$id = sprintf('%s-%s-%s', $site, $extension, $schedule);
$requests[] = [
'DeleteRequest' => [
'Key' => [
'id' => ['S' => $id],
],
],
];
}
try {
$result = $ddb->batchWriteItem([
'RequestItems' => ['CronSubscription' => $requests]
]);
} catch (AwsException $e) {
error_log($e->getMessage());
}
// Prepare requests: one insert per schedule
if (count($request->schedules) > 0) {
$requests = [];
foreach ($request->schedules as $schedule) {
$id = sprintf('%s-%s-%s', $site, $extension, $schedule);
$requests[] = [
'PutRequest' => [
'Item' => [
'id' => ['S' => (string) $id],
'url' => ['S' => (string) $request->url],
'time' => ['S' => (string) $request->timestamp],
'schedule' => ['S' => (string) $schedule],
'extension' => ['S' => (string) $request->extension],
'purchase_id' => ['S' => (string) ($request->purchase_id ?? '')],
'event_rule_arn' => ['S' => (string) ($rules[$schedule] ?? '')],
'failures' => ['N' => (string) '0'],
],
],
];
}
try {
$result = $ddb->batchWriteItem([
'RequestItems' => ['CronSubscription' => $requests]
]);
} catch (AwsException $e) {
error_log($e->getMessage());
return false;
}
}
return true;
}
To save write capacity units, we’re using the batchWriteItem method to write multiple items (up to 25) in a single call. And of course, we delete the old items using the same batchWriteItem with a DeleteRequest clause.
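For completeness, the ping lambda itself (the one running on the php-73 function layer) isn’t listed in this article. A minimal sketch could look like the following; the function names are our own invention, and the wiring to Bref’s function runtime is omitted:

```php
<?php
// Build the WP-Cron trigger URL for a site
function wp_cron_url($siteUrl) {
    return rtrim($siteUrl, '/') . '/wp-cron.php?doing_wp_cron';
}

// Handle the CloudWatch event: its constant Input carries the site URL
// and the rule name that we set in putTargets()
function handle_ping(array $event) {
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => 10],
    ]);
    // Ping the site so WordPress executes any due cron jobs
    $response = @file_get_contents(wp_cron_url($event['url']), false, $context);
    return [
        'rule' => $event['rule'] ?? '',
        'ok'   => $response !== false,
    ];
}
```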
Multi-regional support
It’s all cool now and working great, but… it’s just the us-east-1 region. What if we want to make it multi-regional with automatic geo-targeting?
Here’re the steps to go multi-regional:
- Deploy your lambdas by running the deploy command for each region you need:
serverless deploy --stage production --region us-east-1
- Request a certificate in each region and configure a custom domain for API Gateway
- In DynamoDB, create global tables: just go to the Global Tables tab and add the regions. Note: the table must be empty.
- In Route53, create a latency-based traffic policy. It will automatically create a CNAME record based on this policy.
How does it work?
When a user’s browser resolves the domain name, Route53’s name server picks the best-matching region based on latency and returns the target domain name pointing to that region’s API Gateway. The API Gateway then runs the subscription lambda in the selected region, which in turn gets the current region from the environment and uses all resources from that same region. Data is replicated among the regions using DynamoDB’s global tables.
Conclusion
If you need to create a granular web app or service, or you already have a web app written in PHP, you can use the serverless approach to run it in AWS Lambda with Bref layers. In this article I tried to highlight some aspects that will help you build your first serverless app.