Scheduled bucket scan

New malware appears daily, and existing malware can be modified to evade detection. Zero-day attacks are new threats that have not yet been identified or added to the signature database. Keeping the signature database up to date so that it detects the latest threats is an ongoing fight.

But you can fight back with periodic malware scanning. That’s why we recommend running regular full bucket scans, so that files carrying zero-day threats are detected as soon as the signature database is updated.

Setup

Install Add-On

  1. Set the Stack name to bucketav-scheduled-bucket-scan.
  2. Set the BucketAVStackName parameter to the stack name of bucketAV (if you followed the docs, the name is bucketav).
  3. Set the BucketName parameter to the name of the S3 bucket that you want to scan. You can also enter multiple bucket names separated by a comma (e.g., bucketa,bucketb) or use a wildcard character (e.g., mycompany-*-prod,bucketa).
  4. Set the ScheduleExpression parameter to a valid expression. E.g., rate(1 day), rate(7 days), or rate(1 hour).
  5. Set the PagingBatchSize, PagingWaitInSeconds, and TimeoutInSeconds parameters based on the number of objects/versions in your bucket(s) (see the NumberOfObjects CloudWatch metric of your bucket). The following tables provide sample values optimized to enqueue files as steadily as possible over 12 hours. Choose the row that fits your object count (e.g., if your bucket has 6 mio objects, choose the values from the row with 10 mio objects).

For ExcludeScannedObjects set to false (default):

Number of objects  PagingBatchSize  PagingWaitInSeconds  TimeoutInSeconds
50000              50               30                   43200 (12 hours)
100000             100              30                   43200 (12 hours)
1 mio              500              15                   43200 (12 hours)
10 mio             5000             15                   43200 (12 hours)
50 mio             20000            7                    43200 (12 hours)
100 mio            40000            0                    43200 (12 hours)
200 mio            60000            0                    86400 (24 hours)
350 mio            100000           0                    172800 (48 hours)

For ExcludeScannedObjects set to true:

Number of objects  PagingBatchSize  PagingWaitInSeconds  TimeoutInSeconds
50000              50               30                   43200 (12 hours)
100000             100              30                   43200 (12 hours)
1 mio              500              15                   43200 (12 hours)
10 mio             5000             7                    43200 (12 hours)
50 mio             20000            0                    172800 (2 days)
100 mio            40000            0                    259200 (3 days)
200 mio            60000            0                    360000 (5 days)

  6. If you are interested in a scan report after each scheduled bucket scan:
    1. Install the reporting add-on.
    2. Set the ReportingAddOnStackName parameter to the stack name of the reporting add-on (if you followed the docs, the name is bucketav-reporting).
  7. Select I acknowledge that AWS CloudFormation might create IAM resources.
  8. Click on the Create stack button to save.
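
The row selection described in the setup steps can be sketched as a small lookup. This is a minimal illustration, not part of the add-on: the names PagingSettings, DEFAULT_ROWS, and pick_settings are hypothetical, and the rows are the sample values as read from the table for ExcludeScannedObjects set to false.

```python
from typing import NamedTuple, Sequence


class PagingSettings(NamedTuple):
    """One row of the sample table (hypothetical helper type)."""
    max_objects: int
    paging_batch_size: int
    paging_wait_in_seconds: int
    timeout_in_seconds: int


# Sample values for ExcludeScannedObjects=false (default), sorted ascending.
DEFAULT_ROWS = [
    PagingSettings(50_000, 50, 30, 43_200),
    PagingSettings(100_000, 100, 30, 43_200),
    PagingSettings(1_000_000, 500, 15, 43_200),
    PagingSettings(10_000_000, 5_000, 15, 43_200),
    PagingSettings(50_000_000, 20_000, 7, 43_200),
    PagingSettings(100_000_000, 40_000, 0, 43_200),
    PagingSettings(200_000_000, 60_000, 0, 86_400),
    PagingSettings(350_000_000, 100_000, 0, 172_800),
]


def pick_settings(number_of_objects: int,
                  rows: Sequence[PagingSettings] = DEFAULT_ROWS) -> PagingSettings:
    """Return the first row whose object count covers number_of_objects."""
    for row in rows:
        if number_of_objects <= row.max_objects:
            return row
    # Larger buckets are outside the sample table.
    raise ValueError("object count exceeds the largest row of the table")
```

For a bucket with 6 mio objects, pick_settings(6_000_000) returns the 10 mio row (PagingBatchSize 5000, PagingWaitInSeconds 15, TimeoutInSeconds 43200). The second table could be passed in through the rows argument in the same way.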

Constraints

If your bucket contains more than 350 million objects/versions (see the NumberOfObjects CloudWatch metric), please send us an email for guidance!

  • If you set PagingWaitInSeconds to a value greater than 0, the add-on enqueues at most 2777*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 100000, the add-on handles up to ~270 million objects/versions.
  • If you set PagingWaitInSeconds to 0, the add-on enqueues at most 3570*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 100000, the add-on handles up to 350 million objects/versions.
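
The two limits above are simple multiplications, so a planned configuration can be checked before deployment. The following is a rough sketch using only the factors stated in the constraints (2777 and 3570) and the maximum PagingBatchSize of 100000; the function name max_enqueued_objects is hypothetical.

```python
MAX_PAGING_BATCH_SIZE = 100_000  # maximum stated in the constraints


def max_enqueued_objects(paging_batch_size: int, paging_wait_in_seconds: int) -> int:
    """Upper bound of objects/versions a single scan run can enqueue."""
    if not 0 < paging_batch_size <= MAX_PAGING_BATCH_SIZE:
        raise ValueError("PagingBatchSize must be between 1 and 100000")
    if paging_wait_in_seconds < 0:
        raise ValueError("PagingWaitInSeconds must not be negative")
    # The factor depends only on whether a wait is configured at all.
    factor = 2777 if paging_wait_in_seconds > 0 else 3570
    return factor * paging_batch_size


def fits(number_of_objects: int, paging_batch_size: int, paging_wait_in_seconds: int) -> bool:
    """True if the bucket size stays within the bound for these settings."""
    return number_of_objects <= max_enqueued_objects(paging_batch_size, paging_wait_in_seconds)
```

With the maximum batch size, the bound is 277,700,000 objects/versions when a wait is configured and 357,000,000 without one, which is where the rounded figures of ~270 million and 350 million come from.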

Terraform

resource "aws_cloudformation_stack" "bucketav_add_on_scheduled_bucket_scan" {
  name         = "bucketav-scheduled-bucket-scan"
  template_url = "https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.11.0/bucketav-add-on-scheduled-bucket-scan.yaml"
  capabilities = ["CAPABILITY_IAM"]
  parameters = {
    BucketAVStackName = "bucketav" # if you followed the docs, the name is bucketav
    BucketName = "mycompany-*-prod,bucketa,bucketb"
    ScheduleExpression = "rate(7 days)" # see https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html
    PagingBatchSize = "50" # get value from table above
    PagingWaitInSeconds = "30" # get value from table above
    TimeoutInSeconds = "43200" # get value from table above
  }
}
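
A malformed ScheduleExpression only surfaces when the stack is deployed, so it can help to check the value beforehand. The sketch below covers only the rate(...) form used in this guide (a positive whole number followed by minute(s), hour(s), or day(s), singular for 1 and plural otherwise); cron(...) expressions are also valid for EventBridge but are deliberately not handled here. The function name is hypothetical.

```python
import re

# rate(<value> <unit>) with exactly one space between value and unit.
_RATE_PATTERN = re.compile(r"rate\(([0-9]+) (minute|minutes|hour|hours|day|days)\)")


def is_valid_rate_expression(expression: str) -> bool:
    """Check a rate(...) schedule expression; cron(...) is out of scope."""
    match = _RATE_PATTERN.fullmatch(expression)
    if match is None:
        return False
    value = int(match.group(1))
    unit = match.group(2)
    if value < 1:
        return False
    # A value of 1 requires the singular unit, anything larger the plural.
    return unit.endswith("s") == (value != 1)
```

For example, rate(1 day), rate(7 days), and rate(1 hour) pass this check, while rate(1 days) and rate(7 day) do not.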

Insights

To get insights into running and completed bucket scan runs:

  1. Visit the Step Functions Management Console.
  2. Click on the state machine (if you followed the docs, the name is bucketav-scheduled-bucket-scan-orchestrator).

You will see a list of Executions. The most recent execution is at the top and represents the latest bucket scan. If the status equals Succeeded, the bucket scan is complete. If the status equals Running, the bucket scan is running.

Remember that Succeeded means that all files have been enqueued for scanning. It does not mean that all files have already been scanned. You can observe the Scan Queue in the CloudWatch Dashboard. An empty queue (or a mostly empty one, if new objects are uploaded in parallel) indicates that all files have been scanned.
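
The distinction between enqueued and scanned can be written down as a small decision function. This is only an illustration of the reasoning above: the function name, the returned labels, and the nearly_empty_threshold parameter are hypothetical, and the two inputs are assumed to be read manually from the Step Functions execution list and the Scan Queue widget of the CloudWatch Dashboard.

```python
def describe_scan_progress(execution_status: str,
                           queue_depth: int,
                           nearly_empty_threshold: int = 0) -> str:
    """Combine execution status and Scan Queue depth into one label."""
    status = execution_status.strip().upper()
    if status == "RUNNING":
        # Files are still being listed and pushed to the Scan Queue.
        return "enqueueing"
    if status == "SUCCEEDED":
        # Enqueueing is done; scanning is done once the queue has drained.
        if queue_depth <= nearly_empty_threshold:
            return "scanned"
        return "scanning"
    # Any other status (for example a failed execution) needs a closer look.
    return "needs attention"
```

Raising nearly_empty_threshold above 0 accounts for buckets that receive new uploads while the scan is draining, where the queue may never be completely empty.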

Update

Which version am I using?

In version 2.7.0, a wildcard character in the BucketName parameter matched buckets from all regions. We fixed this in version 2.7.1!

  1. To update this add-on to version v2.11.0, go to the AWS CloudFormation Management Console.
  2. Double-check the region at the top right.
  3. Search for bucketav-scheduled-bucket-scan, or for the stack name you specified if you chose a different one.
  4. Select the stack and click on Update.
  5. Select Replace current template and set the Amazon S3 URL to https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.11.0/bucketav-add-on-scheduled-bucket-scan.yaml
  6. Click on Next.
  7. Scroll to the bottom of the page and click on Next.
  8. Scroll to the bottom of the page and click on Next.
  9. Scroll to the bottom of the page, enable I acknowledge that AWS CloudFormation might create IAM resources, and click on Update stack.
  10. While the update runs, the stack status is UPDATE_IN_PROGRESS. Reload the table from time to time and …
  11. … wait until the CloudFormation stack status switches to UPDATE_COMPLETE.

Architecture

The following AWS services are used:

  • Step Functions State Machine to orchestrate the S3 bucket scan.
  • Lambda Function to fetch the list of files from the S3 bucket and push them to the Scan Queue.
  • EventBridge Cron Rule to trigger the bucket scan at regular intervals.
  • CloudWatch Alarms to monitor the used AWS services.

Limitations

  • Cross-account access to S3 buckets is not supported when using the wildcard character * within the BucketName parameter. Please contact us if you need to scan multiple buckets owned by another AWS account.
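
Before relying on a wildcard, it can be useful to preview which of your own bucket names a BucketName value would select. The sketch below is an approximation under stated assumptions: it treats * as "any run of characters" and splits the value on commas, which matches the examples in this guide but is not confirmed to be the add-on's exact matching logic. Region and account filtering are not modeled, and the function names are hypothetical.

```python
import re
from typing import Iterable, List


def _pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a pattern where only * is special into a regex."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def matching_buckets(bucket_name_parameter: str, bucket_names: Iterable[str]) -> List[str]:
    """Return the bucket names selected by a comma-separated BucketName value."""
    patterns = [p.strip() for p in bucket_name_parameter.split(",") if p.strip()]
    regexes = [_pattern_to_regex(p) for p in patterns]
    # Keep the input order and require the whole name to match a pattern.
    return [name for name in bucket_names
            if any(regex.fullmatch(name) for regex in regexes)]
```

Given the names mycompany-eu-prod, mycompany-eu-dev, bucketa, and bucketb, the value mycompany-*-prod,bucketa selects mycompany-eu-prod and bucketa under these assumptions.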

Need more help?

Write us, and we'll get back to you as soon as we can.

Send us an email