Scan bucket at regular intervals Add-On
This add-on enqueues all files in a bucket for scanning at regular intervals. The EC2 instances of bucketAV scan the files.
Setup
- Set the Stack name to bucketav-scheduled-bucket-scan.
- Set the BucketAVStackName parameter to the stack name of bucketAV (if you followed our docs, the name is bucketav, or s3-virusscan for older installations).
- Set the BucketName parameter to the name of the S3 bucket that you want to scan.
- Set the ScheduleExpression parameter to a valid expression, e.g., rate(1 day), rate(7 days), or rate(1 hour).
- Select I acknowledge that AWS CloudFormation might create IAM resources.
- Click on the Create stack button to save.
Constraints
If your bucket contains more than 100 million objects/versions (see the NumberOfObjects CloudWatch metric of your bucket), please send us an email for guidance!
- If you set PagingWaitInSeconds to a value greater than 0, the add-on enqueues not more than 8333*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 10000, the add-on handles up to ~83 million objects/versions.
- If you set PagingWaitInSeconds to 0, the add-on enqueues not more than 12500*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 10000, the add-on handles up to 125 million objects/versions.
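The two limits above are plain multiplications, so you can check whether a given parameter combination covers your bucket. The following is a small sketch of that arithmetic. The function name max_enqueued_objects is hypothetical, and the factors 8333 and 12500 are taken from the constraints above.

```python
def max_enqueued_objects(paging_wait_in_seconds: int, paging_batch_size: int) -> int:
    """Upper bound of objects/versions a single run can enqueue.

    Hypothetical helper; the factors come from the Constraints section.
    """
    # Without a wait between pages, more pages fit into one run.
    pages = 12500 if paging_wait_in_seconds == 0 else 8333
    return pages * paging_batch_size


if __name__ == "__main__":
    print(max_enqueued_objects(30, 10000))
    print(max_enqueued_objects(0, 10000))
```

Compare the result with the NumberOfObjects CloudWatch metric of your bucket to see whether a single run can cover it.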
Insights
To get insights into running and completed bucket scan runs:
- Visit the Step Functions Management Console.
- Click on the state machine (if you followed our docs, the name is bucketav-scheduled-bucket-scan).
You will see a list of Executions. The most recent execution is at the top and represents the latest bucket scan. If the status equals Succeeded, the bucket scan is complete. If the status equals Running, the bucket scan is still in progress.
Performance optimization
By default, the add-on is configured to protect your bucketAV Scan Queue from overload. Every 30 seconds (controlled by parameter PagingWaitInSeconds), 50 files (controlled by parameter PagingBatchSize) are enqueued for scanning.
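To get a feeling for how long enqueuing takes with a given pair of parameters, a rough estimate is the number of pages multiplied by the wait time. The sketch below assumes that the wait dominates and ignores the time needed to list and enqueue each page, so treat the result as a lower bound. The function name estimated_enqueue_seconds is hypothetical.

```python
import math


def estimated_enqueue_seconds(
    object_count: int,
    paging_batch_size: int = 50,
    paging_wait_in_seconds: int = 30,
) -> int:
    """Rough lower bound (in seconds) for enqueuing all objects.

    Hypothetical helper; it only accounts for the wait between pages.
    """
    pages = math.ceil(object_count / paging_batch_size)
    return pages * paging_wait_in_seconds


if __name__ == "__main__":
    seconds = estimated_enqueue_seconds(100_000)
    print(f"{seconds} seconds, about {seconds / 3600:.1f} hours")
```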
We recommend tweaking the two parameters in the following cases:
- You want to speed up the bucket scan.
- The last execution of the state machine has the status Timed out. To check, open the Step Functions Management Console and click on the state machine (if you followed our docs, the name is bucketav-scheduled-bucket-scan).
- The last execution of the state machine has the status Failed, and the last event errored with “The execution reached the maximum number of history events (25000)”. To check the status, open the Step Functions Management Console and click on the state machine (if you followed our docs, the name is bucketav-scheduled-bucket-scan).
  - To check for this error, open an AWS CloudShell.
  - Ensure that you are in the correct region where bucketAV runs.
  - Execute the following command (replace EXECUTION_ARN with the Execution ARN; in the Step Functions Management Console, click on the last execution to get the value):
    aws stepfunctions get-execution-history --execution-arn EXECUTION_ARN --reverse-order --max-items 3
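If you save the JSON output of the get-execution-history command to a file, you can search it for the history-limit error instead of reading it by eye. The following is a minimal sketch. It assumes that the failure shows up in the executionFailedEventDetails field of an event and that the error or cause text contains the phrase quoted above. The function names and the file name history.json are hypothetical.

```python
import json

# Phrase taken from the error message quoted in this section.
HISTORY_LIMIT_HINT = "maximum number of history events"


def hit_history_limit(history: dict) -> bool:
    """Return True if any event in the history reports the history-limit error."""
    for event in history.get("events", []):
        # Assumption: failed executions carry their details in this field.
        details = event.get("executionFailedEventDetails", {})
        text = f"{details.get('error', '')} {details.get('cause', '')}"
        if HISTORY_LIMIT_HINT in text:
            return True
    return False


def hit_history_limit_in_file(path: str) -> bool:
    """Load the saved CLI output (e.g. history.json) and check it."""
    with open(path, encoding="utf-8") as handle:
        return hit_history_limit(json.load(handle))
```

To try it, redirect the output of the command to a file (for example by appending > history.json) and call hit_history_limit_in_file("history.json").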
We recommend optimizing by:
- Doubling PagingBatchSize (maximum 10,000) and halving PagingWaitInSeconds (minimum 0).
- Executing the scan again:
  - Visit the Step Functions Management Console.
  - Click on the state machine (if you followed our docs, the name is bucketav-scheduled-bucket-scan).
  - Click on the last execution.
  - Click on New Execution.
  - Confirm with Start execution.
- If you still see the error, start again with the first step.
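The doubling and halving described above can be written down as a small calculation, which helps when you go through several rounds. This is a sketch under assumptions: PagingWaitInSeconds is treated as a whole number and rounded down when halved, and the default maximum of 10000 for PagingBatchSize is taken from the Constraints section. The function name next_paging_parameters is hypothetical.

```python
def next_paging_parameters(
    paging_batch_size: int,
    paging_wait_in_seconds: int,
    max_batch_size: int = 10000,
) -> tuple[int, int]:
    """Return the next (PagingBatchSize, PagingWaitInSeconds) pair to try.

    Hypothetical helper; the maximum is an assumption from the Constraints section.
    """
    next_batch = min(paging_batch_size * 2, max_batch_size)
    # Integer division rounds down; the wait never drops below 0.
    next_wait = max(paging_wait_in_seconds // 2, 0)
    return next_batch, next_wait


if __name__ == "__main__":
    batch, wait = 50, 30  # defaults mentioned in this section
    for _ in range(4):
        batch, wait = next_paging_parameters(batch, wait)
        print(batch, wait)
```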
Running a full bucket scan only once
To run a full bucket scan only once, set the ScheduleExpression parameter to a point in time 10 minutes from now (UTC timezone) by using the expression cron(mm hh dd MM ? yyyy) filled with:
- mm: Minute (0-59).
- hh: Hour (0-23).
- dd: Day (1-31).
- MM: Month (1-12).
- ?: Please leave the question mark as it is.
- yyyy: Year (1970-2199).
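Filling in the cron expression by hand is error-prone, especially around midnight or the end of a month. The sketch below computes the expression for a point in time 10 minutes from now in UTC. The function name one_time_cron is hypothetical, and it expects a timezone-aware datetime if you pass one in.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def one_time_cron(now: Optional[datetime] = None, delay_minutes: int = 10) -> str:
    """Build cron(mm hh dd MM ? yyyy) for now + delay_minutes in UTC.

    Hypothetical helper; pass a timezone-aware datetime or omit it.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert to UTC first, then add the delay so that date rollovers are handled.
    run_at = now.astimezone(timezone.utc) + timedelta(minutes=delay_minutes)
    return (
        f"cron({run_at.minute} {run_at.hour} {run_at.day} "
        f"{run_at.month} ? {run_at.year})"
    )


if __name__ == "__main__":
    print(one_time_cron())
```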
Update
- To update this add-on to version v2.5.0, go to the AWS CloudFormation Management Console.
- Double-check the region at the top right.
- Search for bucketav-scheduled-bucket-scan (or s3-virusscan-scheduled-bucket-scan for older installations), otherwise search for the name you specified.
- Select the stack and click on Update.
- Select Replace current template and set the Amazon S3 URL to https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.5.0/bucketav-add-on-scheduled-bucket-scan.yaml
- Click on Next.
- Scroll to the bottom of the page and click on Next.
- Scroll to the bottom of the page and click on Next.
- Scroll to the bottom of the page, enable I acknowledge that AWS CloudFormation might create IAM resources, and click on Update stack.
- While the update runs, the stack status is UPDATE_IN_PROGRESS. Reload the table from time to time and …
- … wait until the CloudFormation stack status switches to UPDATE_COMPLETE.
Architecture
The following AWS services are used:
- Step Functions State Machine to orchestrate the S3 bucket scan.
- Lambda Function to fetch the list of files from the S3 bucket and push them to the Scan Queue.
- EventBridge Cron Rule to trigger the bucket scan at regular intervals.
- CloudWatch Alarms to monitor the used AWS services.