Scan bucket at regular intervals Add-On

This add-on enqueues all files from one or multiple buckets for scanning, either at regular intervals or once. The EC2 instances of bucketAV scan the files.

Setup

Install Add-On

  1. Set the Stack name to bucketav-scheduled-bucket-scan.
  2. Set the BucketAVStackName parameter to the stack name of bucketAV (if you followed our docs, the name is bucketav, or s3-virusscan for older installations).
  3. Set the BucketName parameter to the name of the S3 bucket that you want to scan.
  4. Set the ScheduleExpression parameter to a valid schedule expression, e.g., rate(1 day), rate(7 days), or rate(1 hour).
  5. The PagingBatchSize, PagingWaitInSeconds, and TimeoutInSeconds parameters depend on the number of objects/versions in your bucket (see the NumberOfObjects CloudWatch metric of your bucket). The following table provides sample values optimized to enqueue files as steadily as possible over 12 hours. Choose the first row whose number of objects is equal to or larger than your object count (e.g., if your bucket has 6 million objects, choose the values from the row with 10 million objects).
     Number of objects   PagingBatchSize   PagingWaitInSeconds   TimeoutInSeconds
     50,000              50                30                    43200 (12 hours)
     100,000             100               30                    43200 (12 hours)
     1 million           500               15                    43200 (12 hours)
     10 million          5000              15                    43200 (12 hours)
     50 million          20000             7                     43200 (12 hours)
     100 million         40000             0                     43200 (12 hours)
     200 million         60000             0                     86400 (24 hours)
     350 million         100000            0                     172800 (48 hours)
  6. Select I acknowledge that AWS CloudFormation might create IAM resources.
  7. Click on the Create stack button to save.
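The row selection from step 5 can be sketched in Python. This is only an illustration: the helper name choose_paging_parameters is hypothetical and not part of bucketAV, and the values are copied from the sample table above.

```python
# Sample rows copied from the table in step 5:
# (number of objects, PagingBatchSize, PagingWaitInSeconds, TimeoutInSeconds)
SAMPLE_ROWS = [
    (50_000, 50, 30, 43_200),
    (100_000, 100, 30, 43_200),
    (1_000_000, 500, 15, 43_200),
    (10_000_000, 5_000, 15, 43_200),
    (50_000_000, 20_000, 7, 43_200),
    (100_000_000, 40_000, 0, 43_200),
    (200_000_000, 60_000, 0, 86_400),
    (350_000_000, 100_000, 0, 172_800),
]


def choose_paging_parameters(number_of_objects: int) -> dict:
    """Return the sample values of the first row that covers the object count.

    Hypothetical helper for illustration; it only encodes the sample table.
    """
    if number_of_objects < 0:
        raise ValueError("number_of_objects must not be negative")
    for max_objects, batch_size, wait_seconds, timeout_seconds in SAMPLE_ROWS:
        if number_of_objects <= max_objects:
            return {
                "PagingBatchSize": batch_size,
                "PagingWaitInSeconds": wait_seconds,
                "TimeoutInSeconds": timeout_seconds,
            }
    # The table ends at 350 million objects/versions.
    raise ValueError("more than 350 million objects: ask bucketAV for guidance")


if __name__ == "__main__":
    # A bucket with 6 million objects falls into the 10 million row.
    print(choose_paging_parameters(6_000_000))
```

Read the object count from the NumberOfObjects CloudWatch metric of your bucket before picking a row.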

Constraints

If your bucket contains more than 350 million objects/versions (see the NumberOfObjects CloudWatch metric), please send us an email for guidance!

  • If you set PagingWaitInSeconds to a value greater than 0, the add-on enqueues no more than 2,777*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 100000, the add-on handles up to ~270 million objects/versions.
  • If you set PagingWaitInSeconds to 0, the add-on enqueues no more than 3,570*PagingBatchSize objects/versions. With the maximum PagingBatchSize of 100000, the add-on handles up to 350 million objects/versions.
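As a rough check of these limits, the arithmetic can be written down as a small Python sketch. The function name max_enqueued_objects is hypothetical; the multipliers 2,777 and 3,570 and the maximum batch size of 100000 are taken from the two constraints above.

```python
MAX_PAGING_BATCH_SIZE = 100_000  # maximum PagingBatchSize stated above


def max_enqueued_objects(paging_batch_size: int, paging_wait_in_seconds: int) -> int:
    """Upper bound of objects/versions one run can enqueue.

    Hypothetical helper; it only restates the two documented multipliers.
    """
    if not 0 < paging_batch_size <= MAX_PAGING_BATCH_SIZE:
        raise ValueError("PagingBatchSize must be between 1 and 100000")
    if paging_wait_in_seconds < 0:
        raise ValueError("PagingWaitInSeconds must not be negative")
    # 3,570 batches without waiting, 2,777 batches with a wait between batches.
    max_batches = 3_570 if paging_wait_in_seconds == 0 else 2_777
    return max_batches * paging_batch_size


if __name__ == "__main__":
    print(max_enqueued_objects(100_000, 0))   # upper bound without waiting
    print(max_enqueued_objects(100_000, 15))  # upper bound with a wait
```

If the bound for your chosen values is lower than the number of objects/versions in your bucket, increase PagingBatchSize or set PagingWaitInSeconds to 0.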

Running a full bucket scan only once

To run a full bucket scan only once, set the ScheduleExpression parameter to a point in time 10 minutes from now (UTC time zone) by using the expression cron(mm hh dd MM ? yyyy), filled in as follows:

  1. mm: Minute (0-59).
  2. hh: Hour (0-23).
  3. dd: Day (1-31).
  4. MM: Month (1-12).
  5. ?: Please leave the question mark as it is.
  6. yyyy: Year (1970-2199).
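To avoid time zone mistakes, the expression can be generated instead of typed by hand. The following Python sketch builds it from the current UTC time; the function name one_time_schedule_expression and its parameters are hypothetical and only illustrate the field order described above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def one_time_schedule_expression(
    now: Optional[datetime] = None, delay_minutes: int = 10
) -> str:
    """Build cron(mm hh dd MM ? yyyy) for a point in time shortly from now, in UTC.

    Pass a timezone-aware datetime; a naive datetime is interpreted as local time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    run_at = now.astimezone(timezone.utc) + timedelta(minutes=delay_minutes)
    # Field order: minute, hour, day of month, month, ? (day of week), year.
    return (
        f"cron({run_at.minute} {run_at.hour} {run_at.day} "
        f"{run_at.month} ? {run_at.year})"
    )


if __name__ == "__main__":
    # Paste the printed value into the ScheduleExpression parameter.
    print(one_time_schedule_expression())
```

Because every field including the year is fixed, such an expression matches a single point in time.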

Insights

To get insights into running and completed bucket scan runs:

  1. Visit the Step Functions Management Console.
  2. Click on the state machine (if you followed our docs, the name is bucketav-scheduled-bucket-scan-orchestrator).

You will see a list of Executions. The most recent execution is at the top and represents the latest bucket scan. If the status equals Succeeded, the bucket scan is complete. If the status equals Running, the bucket scan is still in progress.
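The same check can be scripted. The sketch below assumes a boto3 Step Functions client and the ARN of the state machine; the function name latest_scan_status is hypothetical, and the ARN in the usage comment is a placeholder that you must replace with your own region and account ID. The API reports status values in upper case (for example RUNNING or SUCCEEDED).

```python
def latest_scan_status(sfn_client, state_machine_arn: str):
    """Return (name, status) of the most recent execution, or None if there is none.

    sfn_client is expected to be a boto3 Step Functions client. The
    list_executions call returns the most recent execution first.
    """
    response = sfn_client.list_executions(
        stateMachineArn=state_machine_arn,
        maxResults=1,
    )
    executions = response["executions"]
    if not executions:
        return None
    latest = executions[0]
    return latest["name"], latest["status"]


# Usage (requires boto3 and AWS credentials; the ARN is a placeholder):
#
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   arn = ("arn:aws:states:REGION:ACCOUNT_ID:stateMachine:"
#          "bucketav-scheduled-bucket-scan-orchestrator")
#   print(latest_scan_status(sfn, arn))
```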

Update

Which version am I using?

  1. To update this add-on to version v2.7.0, go to the AWS CloudFormation Management Console.
  2. Double-check the region at the top right.
  3. Search for bucketav-scheduled-bucket-scan (or s3-virusscan-scheduled-bucket-scan for older installations), otherwise search for the name you specified.
  4. Select the stack and click on Update.
  5. Select Replace current template and set the Amazon S3 URL to https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/scheduled-bucket-scan/v2.7.0/bucketav-add-on-scheduled-bucket-scan.yaml
  6. Click on Next.
  7. Scroll to the bottom of the page and click on Next.
  8. Scroll to the bottom of the page and click on Next.
  9. Scroll to the bottom of the page, enable I acknowledge that AWS CloudFormation might create IAM resources, and click on Update stack.
  10. While the update runs, the stack status is UPDATE_IN_PROGRESS. Reload the table from time to time.
  11. Wait until the CloudFormation stack status switches to UPDATE_COMPLETE.
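For teams that prefer scripting over the console, the update steps could be approximated with boto3 as sketched below. This is an assumption-laden illustration, not an official bucketAV procedure: the function name update_add_on_stack is hypothetical, it keeps all existing parameter values (which fails if a new template version removes a parameter), and it passes CAPABILITY_IAM as the counterpart of the IAM acknowledgment checkbox.

```python
TEMPLATE_URL = (
    "https://bucketav-add-ons.s3.eu-west-1.amazonaws.com/"
    "scheduled-bucket-scan/v2.7.0/bucketav-add-on-scheduled-bucket-scan.yaml"
)


def update_add_on_stack(cfn_client, stack_name: str = "bucketav-scheduled-bucket-scan"):
    """Replace the stack template and wait until the update completes.

    cfn_client is expected to be a boto3 CloudFormation client created for
    the region of the stack.
    """
    stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
    # Assumption: keep every existing parameter value unchanged.
    parameters = [
        {"ParameterKey": p["ParameterKey"], "UsePreviousValue": True}
        for p in stack.get("Parameters", [])
    ]
    cfn_client.update_stack(
        StackName=stack_name,
        TemplateURL=TEMPLATE_URL,
        Parameters=parameters,
        Capabilities=["CAPABILITY_IAM"],
    )
    # Blocks while the stack is updating; raises if the update fails.
    cfn_client.get_waiter("stack_update_complete").wait(StackName=stack_name)


# Usage (requires boto3 and AWS credentials; pick the region of your stack):
#
#   import boto3
#   update_add_on_stack(boto3.client("cloudformation"))
```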

Architecture

The following AWS services are used:

  • Step Functions State Machine to orchestrate the S3 bucket scan.
  • Lambda Function to fetch the list of files from the S3 bucket and push them to the Scan Queue.
  • EventBridge Cron Rule to trigger the bucket scan at regular intervals.
  • CloudWatch Alarms to monitor the used AWS services.

Limitations

  • Cross-account access to S3 buckets is not supported when using the wildcard character * within the BucketName parameter. Please contact us if you need to scan multiple buckets owned by another AWS account.

Need more help?

Write us, and we'll get back to you as soon as we can.