Scheduled bucket scan
New malware shows up daily, and existing malware can be modified to evade detection. Zero-day attacks are new threats that have not yet been identified or added to the signature database. Keeping the signature database up to date to detect the latest threats is an ongoing fight.
But you can fight back with periodic malware scanning. That’s why we recommend running regular full bucket scans to ensure that zero-day threats are detected as soon as the signature database is updated.
Setup
Install Add-On (requires a running bucketAV installation)
- Set the Stack name to `bucketav-scheduled-bucket-scan`.
- Set the BucketAVStackName parameter to the stack name of bucketAV (if you followed the docs, the name is `bucketav`).
- Set the BucketName parameter to the name of the S3 bucket that you want to scan. You can also enter multiple bucket names separated by commas (e.g., `bucketa,bucketb`) or use a wildcard character (e.g., `mycompany-*-prod,bucketa`).
- Set the ScheduleExpression parameter to a valid expression, e.g., `rate(1 day)`, `rate(7 days)`, or `rate(1 hour)`.
- The PagingBatchSize, PagingWaitInSeconds, and TimeoutInSeconds parameters depend on the number of objects/versions in your bucket(s) (see the `NumberOfObjects` CloudWatch metric of your bucket). The following tables provide sample values optimized to enqueue files as steadily as possible over 12 hours. Choose the first row whose number of objects is at least as large as yours (e.g., if your bucket has 6 million objects, choose the values from the row with 10 million objects).
For ExcludeScannedObjects set to `false` (default):
Number of objects | PagingBatchSize | PagingWaitInSeconds | TimeoutInSeconds |
---|---|---|---|
50000 | 50 | 30 | 43200 (12 hours) |
100000 | 100 | 30 | 43200 (12 hours) |
1 million | 500 | 15 | 43200 (12 hours) |
10 million | 5000 | 15 | 43200 (12 hours) |
50 million | 20000 | 7 | 43200 (12 hours) |
100 million | 40000 | 0 | 43200 (12 hours) |
200 million | 60000 | 0 | 86400 (24 hours) |
350 million | 100000 | 0 | 172800 (48 hours) |
For ExcludeScannedObjects set to `true`:
Number of objects | PagingBatchSize | PagingWaitInSeconds | TimeoutInSeconds |
---|---|---|---|
50000 | 50 | 30 | 43200 (12 hours) |
100000 | 100 | 30 | 43200 (12 hours) |
1 million | 500 | 15 | 43200 (12 hours) |
10 million | 5000 | 7 | 43200 (12 hours) |
50 million | 20000 | 0 | 172800 (2 days) |
100 million | 40000 | 0 | 259200 (3 days) |
200 million | 60000 | 0 | 360000 (100 hours) |
- If you are interested in a scan report after each scheduled bucket scan:
  - Install the reporting add-on.
  - Set the ReportingAddOnStackName parameter to the stack name of the reporting add-on (if you followed the docs, the name is `bucketav-reporting`).
- Select I acknowledge that AWS CloudFormation might create IAM resources.
- Click on the Create stack button to save.
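The row selection described above can be sketched in Python. This is only an illustrative helper, not part of bucketAV: the function name `suggest_parameters` and the data layout are assumptions, and the numbers are copied from the two tables above. Values are returned as strings because CloudFormation parameters are passed as strings.

```python
from bisect import bisect_left

# (number of objects, PagingBatchSize, PagingWaitInSeconds, TimeoutInSeconds),
# copied from the table for ExcludeScannedObjects set to false (default).
ROWS_DEFAULT = [
    (50_000, 50, 30, 43_200),
    (100_000, 100, 30, 43_200),
    (1_000_000, 500, 15, 43_200),
    (10_000_000, 5_000, 15, 43_200),
    (50_000_000, 20_000, 7, 43_200),
    (100_000_000, 40_000, 0, 43_200),
    (200_000_000, 60_000, 0, 86_400),
    (350_000_000, 100_000, 0, 172_800),
]

# Copied from the table for ExcludeScannedObjects set to true.
ROWS_EXCLUDE_SCANNED = [
    (50_000, 50, 30, 43_200),
    (100_000, 100, 30, 43_200),
    (1_000_000, 500, 15, 43_200),
    (10_000_000, 5_000, 7, 43_200),
    (50_000_000, 20_000, 0, 172_800),
    (100_000_000, 40_000, 0, 259_200),
    (200_000_000, 60_000, 0, 360_000),
]


def suggest_parameters(number_of_objects, exclude_scanned_objects=False):
    """Return the sample values from the first row that covers the object count."""
    rows = ROWS_EXCLUDE_SCANNED if exclude_scanned_objects else ROWS_DEFAULT
    # Index of the first row whose object count is >= number_of_objects.
    index = bisect_left([row[0] for row in rows], number_of_objects)
    if index == len(rows):
        raise ValueError("object count exceeds the largest row in the table")
    _, batch_size, wait_seconds, timeout_seconds = rows[index]
    return {
        "PagingBatchSize": str(batch_size),
        "PagingWaitInSeconds": str(wait_seconds),
        "TimeoutInSeconds": str(timeout_seconds),
    }


print(suggest_parameters(6_000_000))
```

For a bucket with 6 million objects, the helper picks the 10 million row, matching the example given in the parameter description.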
Constraints
If your bucket contains more than 350 million objects/versions (see the `NumberOfObjects` CloudWatch metric), please send us an email for guidance!
- If you set PagingWaitInSeconds to a value greater than `0`, the add-on enqueues no more than `2777*PagingBatchSize` objects/versions. With the maximum PagingBatchSize of `100000`, the add-on handles up to ~270 million objects/versions.
- If you set PagingWaitInSeconds to `0`, the add-on enqueues no more than `3570*PagingBatchSize` objects/versions. With the maximum PagingBatchSize of `100000`, the add-on handles up to 350 million objects/versions.
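To check whether a chosen PagingBatchSize is large enough for a bucket, the two limits above can be evaluated directly. The following is a minimal sketch: the function name `max_enqueued_objects` is hypothetical, and the factors 2777 and 3570 as well as the maximum batch size of 100000 are taken from the constraints above.

```python
MAX_PAGING_BATCH_SIZE = 100_000  # maximum PagingBatchSize stated above


def max_enqueued_objects(paging_batch_size, paging_wait_in_seconds):
    """Upper bound on objects/versions enqueued in one scheduled scan."""
    if not 0 < paging_batch_size <= MAX_PAGING_BATCH_SIZE:
        raise ValueError("PagingBatchSize must be between 1 and 100000")
    # The limit depends only on whether a wait between pages is configured.
    factor = 2777 if paging_wait_in_seconds > 0 else 3570
    return factor * paging_batch_size


print(max_enqueued_objects(100_000, 15))  # 277700000
print(max_enqueued_objects(100_000, 0))   # 357000000
```

The exact products are 277.7 million and 357 million; the figures of ~270 million and 350 million quoted above are these values rounded down.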
Terraform
```hcl
resource "aws_cloudformation_stack" "bucketav_add_on_scheduled_bucket_scan" {
  name         = "bucketav-scheduled-bucket-scan"
  template_url = "https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.11.0/bucketav-add-on-scheduled-bucket-scan.yaml"
  capabilities = ["CAPABILITY_IAM"]
  parameters = {
    BucketAVStackName   = "bucketav" # if you followed the docs, the name is bucketav
    BucketName          = "mycompany-*-prod,bucketa,bucketb"
    ScheduleExpression  = "rate(7 days)" # see https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html
    PagingBatchSize     = "50"    # get value from table above
    PagingWaitInSeconds = "30"    # get value from table above
    TimeoutInSeconds    = "43200" # get value from table above
  }
}
```
Insights
To get insights into running and completed bucket scan runs:
- Visit the Step Functions Management Console.
- Click on the state machine (if you followed the docs, the name is `bucketav-scheduled-bucket-scan-orchestrator`).
You will see a list of Executions. The most recent execution is at the top and represents the latest bucket scan. If the status equals Succeeded, the bucket scan is complete. If the status equals Running, the bucket scan is running.
Remember that Succeeded means that all files are enqueued for scanning. It does not mean that all files have already been scanned. You can observe the Scan Queue in the CloudWatch Dashboard. An empty queue (or a mostly empty one, if new objects are uploaded in parallel) indicates that all files have been scanned.
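The same check can be scripted instead of using the console. The sketch below assumes a boto3 Step Functions client is passed in and relies on the `list_executions` call, which returns executions with the most recent first. The function name `latest_scan_status` and the ARN in the usage comment are placeholders, not values from bucketAV.

```python
def latest_scan_status(sfn_client, state_machine_arn):
    """Describe the most recent bucket scan run of the orchestrator state machine."""
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn,
        maxResults=1,  # the most recent execution is returned first
    )
    executions = response["executions"]
    if not executions:
        return "no bucket scan has run yet"
    status = executions[0]["status"]
    if status == "SUCCEEDED":
        # Enqueued does not mean scanned: check the Scan Queue as well.
        return "all files are enqueued for scanning"
    if status == "RUNNING":
        return "the bucket scan is running"
    return f"the bucket scan ended with status {status}"


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
# import boto3
# sfn = boto3.client("stepfunctions")
# arn = "arn:aws:states:eu-west-1:111122223333:stateMachine:bucketav-scheduled-bucket-scan-orchestrator"
# print(latest_scan_status(sfn, arn))
```

Passing the client in as an argument keeps the function easy to test with a stub and leaves region and credential handling to the caller.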
Update
Version 2.7.0 included buckets from all regions when using the wildcard character in the `BucketName` parameter. We fixed this in version 2.7.1!
- To update this add-on to version v2.11.0, go to the AWS CloudFormation Management Console.
- Double-check the region at the top right.
- Search for `bucketav-scheduled-bucket-scan`, otherwise search for the name you specified.
- Select the stack and click on Update.
- Select Replace current template and set the Amazon S3 URL to `https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.11.0/bucketav-add-on-scheduled-bucket-scan.yaml`
- Click on Next.
- Scroll to the bottom of the page and click on Next.
- Scroll to the bottom of the page and click on Next.
- Scroll to the bottom of the page, enable I acknowledge that AWS CloudFormation might create IAM resources, and click on Update stack.
- While the update runs, the stack status is UPDATE_IN_PROGRESS. Reload the table from time to time and wait until the CloudFormation stack status switches to UPDATE_COMPLETE.
Architecture
The following AWS services are used:
- Step Functions State Machine to orchestrate the S3 bucket scan.
- Lambda Function to fetch the list of files from the S3 bucket and push them to the Scan Queue.
- EventBridge Cron Rule to trigger the bucket scan at regular intervals.
- CloudWatch Alarms to monitor the used AWS services.
Limitations
- Cross-account access to S3 buckets is not supported when using the wildcard character `*` within the `BucketName` parameter. Please contact us if you need to scan multiple buckets owned by another AWS account.