Tim's Tech Thoughts

Encrypt and Copy Existing AWS Backup Recovery Points to a New Account for Enhanced Security

2024-08-30 AWS Timothy Patterson

Backup Best Practices in Data Protection

When designing a secure backup solution on AWS, it is important to ensure that:

  • Recovery points are stored in a separate account: This prevents an attacker from deleting both the production data and backups in the event of credential compromise.
  • Backups are encrypted: Even if backup data is exposed or stolen, it cannot be read or misused without access to the encryption key.

In this post, I will guide you through a process to implement a solution that satisfies both of these best practices.

Scenario: Customer’s Backup Environment

The customer has been using AWS Backup for regular backups of their EC2 instances, and their backup vault contains many unencrypted recovery points. AWS Backup preserved the encryption status of the source EBS volumes, and because those volumes were not encrypted, the resulting recovery points were not encrypted either. The recovery points were also stored in the same AWS account and region as the source EC2 instances, so a single credential compromise could put both the instances and their backups at risk.

First, establish cross-account backups by following the steps on this page: Creating backup copies across AWS accounts
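One part of that setup is an access policy on the destination vault that allows the source account to copy into it. The following is a minimal sketch of that piece only, assuming the standard "aws" partition; the helper names build_copy_into_vault_policy and apply_policy are hypothetical, and the linked AWS page remains the authoritative list of steps.

```python
import json


def build_copy_into_vault_policy(source_account_id: str) -> str:
    """Return a vault access policy that lets one account copy into the vault."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCopyFromSourceAccount",
                "Effect": "Allow",
                # Assumption: the whole source account is trusted, not a single role
                "Principal": {"AWS": f"arn:aws:iam::{source_account_id}:root"},
                "Action": "backup:CopyIntoBackupVault",
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


def apply_policy(backup_client, vault_name: str, source_account_id: str) -> None:
    """Attach the policy to the destination vault.

    backup_client is a boto3 'backup' client created with credentials
    for the destination account.
    """
    backup_client.put_backup_vault_access_policy(
        BackupVaultName=vault_name,
        Policy=build_copy_into_vault_policy(source_account_id),
    )
```

With the example values used later in this post, apply_policy(client, "secure-backup-vault", "123456789012") would attach the policy to the destination vault.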

Once cross-account copying is configured, we can tackle the existing AWS Backup recovery points: preserving them and bringing them into compliance with encryption policies at the same time.


Python Script to Copy and Encrypt Recovery Points

To help customers copy and encrypt existing AWS Backup recovery points, we provide a Python script that performs the following tasks:

  • Lists all unencrypted recovery points from a specified source AWS Backup vault.
  • Copies the recovery points to a destination vault in a different AWS account, applying encryption.

This process helps secure backups by storing them in a separate account and encrypting them, which protects the data even if credentials in the source account are compromised.
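The first task comes down to a filter on the IsEncrypted field that AWS Backup returns for each recovery point. Here is a small sketch of that filter; the function name select_unencrypted and the sample ARNs are made up for illustration.

```python
def select_unencrypted(recovery_points):
    """Return only the recovery points whose IsEncrypted flag is not set.

    recovery_points is a list of dicts shaped like the 'RecoveryPoints'
    entries returned by list_recovery_points_by_backup_vault.
    """
    return [rp for rp in recovery_points if not rp.get("IsEncrypted", False)]


# Illustrative data only; these ARNs are not real
sample = [
    {"RecoveryPointArn": "arn:aws:ec2:us-east-1::snapshot/snap-aaaa", "IsEncrypted": False},
    {"RecoveryPointArn": "arn:aws:ec2:us-east-1::snapshot/snap-bbbb", "IsEncrypted": True},
]

for rp in select_unencrypted(sample):
    print(rp["RecoveryPointArn"])
```

A recovery point with no IsEncrypted field is treated as unencrypted here, which errs on the side of copying too much rather than too little.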

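When recovery points such as these EBS-based ones are copied to a destination vault in another account, AWS Backup automatically encrypts each copy with the destination vault’s KMS encryption key, even though the source recovery point is unencrypted.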
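Since the copies are encrypted with whatever key the destination vault uses, it can be worth confirming that key before starting. The sketch below reads it from the vault description; vault_encryption_key is a hypothetical helper, and backup_client is assumed to be a boto3 'backup' client created with destination-account credentials.

```python
def vault_encryption_key(backup_client, vault_name):
    """Return the KMS key ARN a backup vault encrypts with, or None if absent."""
    response = backup_client.describe_backup_vault(BackupVaultName=vault_name)
    return response.get("EncryptionKeyArn")


def describe_key_choice(backup_client, vault_name):
    """Build a one-line, human-readable summary of the vault's key."""
    key_arn = vault_encryption_key(backup_client, vault_name)
    if key_arn is None:
        return f"Vault {vault_name} reports no encryption key"
    return f"Copies into {vault_name} will be encrypted with {key_arn}"
```

If the key shown is not the customer managed key you expect, fix the vault configuration first, because the key is applied at copy time.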

Example Command to Run the Script

Here’s an example of how to run the Python script with the necessary input parameters:

python3 copy-encrypt-recovery-points.py \
  --source-account 123456789012 \
  --destination-account 098765432109 \
  --source-vault prod-backup-vault \
  --destination-vault secure-backup-vault \
  --region us-east-1 \
  --iam-role-arn arn:aws:iam::123456789012:role/BackupRole

Input Parameters

  • Source Account: The AWS account ID where the unencrypted recovery points are stored.
  • Destination Account: The AWS account ID where the encrypted copies will be created.
  • Source Vault: The name of the backup vault in the source account.
  • Destination Vault: The name of the backup vault in the destination account where encrypted copies will be stored.
  • Region: The AWS region where both backup vaults are located.
  • IAM Role ARN: (Optional) The ARN of the IAM role to use for the copy job. If not provided, the script will attempt to use the default AWSBackupDefaultServiceRole.
  • Dry Run: (Optional) If the --dry-run flag is specified, the script does not start any copy jobs.
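The AWS Backup APIs for copy jobs identify the destination vault by ARN rather than by name, so the account, region and vault name above have to be combined. The following sketch shows one way to do that, along with the fallback role ARN; both helper names are hypothetical, and the "aws" partition and the service-role/ path of the default role are assumptions that may not hold in every account.

```python
def backup_vault_arn(region, account_id, vault_name, partition="aws"):
    """Build the ARN of a backup vault from its region, account and name."""
    return f"arn:{partition}:backup:{region}:{account_id}:backup-vault:{vault_name}"


def default_backup_role_arn(account_id, partition="aws"):
    """Build the ARN of AWSBackupDefaultServiceRole in the given account.

    Assumes the role was created under the service-role/ path.
    """
    return f"arn:{partition}:iam::{account_id}:role/service-role/AWSBackupDefaultServiceRole"


# Values taken from the example command in this post
print(backup_vault_arn("us-east-1", "098765432109", "secure-backup-vault"))
print(default_backup_role_arn("123456789012"))
```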

IAM Roles for Cross-Account Backup

  • Source Role: This role must have permissions to list recovery points and copy them from the source vault.
  • Destination Role: This role must have permissions to create recovery points and apply encryption in the destination vault.

Running the Script

The script automates the entire process of copying and encrypting recovery points. Once you’ve provided the input parameters, the script will:

  1. Retrieve the unencrypted recovery points from the source account.
  2. Copy those recovery points to the destination account, applying encryption.

The result is an encrypted copy of each recovery point in the destination account, where credentials compromised in the source account cannot be used to delete it.
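Copy jobs run asynchronously, so a job that has started is not the same as a copy that has finished. As a rough sketch of how completion could be checked, the helper below polls describe_copy_job until the job reaches a final state. The name wait_for_copy_job and the polling defaults are my own choices for illustration, and backup_client is assumed to be a boto3 'backup' client for the source account.

```python
import time

# States after which a copy job no longer changes
TERMINAL_STATES = {"COMPLETED", "FAILED", "PARTIAL"}


def wait_for_copy_job(backup_client, copy_job_id, poll_seconds=30, max_polls=120):
    """Poll a copy job until it finishes and return its final state.

    Raises TimeoutError if the job is still running after max_polls checks.
    """
    for _ in range(max_polls):
        job = backup_client.describe_copy_job(CopyJobId=copy_job_id)["CopyJob"]
        state = job["State"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Copy job {copy_job_id} did not finish after {max_polls} checks")
```

For large vaults it is usually more practical to start all copy jobs first and check on them afterwards than to wait on each one in turn.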


Full Script

Below is the full Python script that you can use to copy and encrypt your AWS Backup recovery points:

import argparse
import boto3
from botocore.exceptions import ClientError, EndpointConnectionError
import sys
import uuid
from datetime import datetime, timezone
import time
import random

def parse_arguments():
    parser = argparse.ArgumentParser(description='Copy recovery points from one AWS Backup vault to another across accounts.')
    parser.add_argument('--source-account', required=True, help='Source AWS account ID')
    parser.add_argument('--destination-account', required=True, help='Destination AWS account ID')
    parser.add_argument('--source-vault', required=True, help='Source backup vault name')
    parser.add_argument('--destination-vault', required=True, help='Destination backup vault name')
    parser.add_argument('--region', required=True, help='AWS region')
    parser.add_argument('--iam-role-arn', required=False, help='IAM role ARN to use for the copy job (if not provided, will attempt to find role named AWSBackupDefaultServiceRole)')
    parser.add_argument('--dry-run', action='store_true', help='If specified, the script will not perform any copy jobs but will list the recovery points that would be copied.')
    return parser.parse_args()

def list_recovery_points(backup_client, vault_name):
    recovery_points = []
    paginator = backup_client.get_paginator('list_recovery_points_by_backup_vault')
    page_iterator = paginator.paginate(BackupVaultName=vault_name)
    for page in page_iterator:
        recovery_points.extend(page['RecoveryPoints'])
    return recovery_points

def list_completed_copy_jobs(backup_client, destination_vault_arn):
    copied_recovery_point_arns = set()
    paginator = backup_client.get_paginator('list_copy_jobs')
    page_iterator = paginator.paginate(
        ByState='COMPLETED',
        ByDestinationVaultArn=destination_vault_arn
    )
    for page in page_iterator:
        for copy_job in page['CopyJobs']:
            source_rp_arn = copy_job['SourceRecoveryPointArn']  # ARN of the recovery point that was copied
            copied_recovery_point_arns.add(source_rp_arn)
    return copied_recovery_point_arns

def calculate_remaining_days(future_time):
    now = datetime.now(timezone.utc)
    delta = future_time - now
    remaining_days = delta.total_seconds() / 86400  # seconds in a day
    return max(int(remaining_days + 0.5), 1)  # Round and ensure at least 1 day

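def retry_on_throttling(max_attempts=5, initial_delay=1, max_delay=32):
    # Error codes AWS services commonly return when a request is rate limited
    throttling_codes = {'Throttling', 'ThrottlingException', 'TooManyRequestsException', 'RequestLimitExceeded'}

    def decorator_retry(func):
        def wrapper_retry(*args, **kwargs):
            delay = initial_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except EndpointConnectionError as e:
                    reason = f"Connection exception occurred: {e}"
                except ClientError as e:
                    if e.response['Error']['Code'] not in throttling_codes:
                        # Not a throttling error, so retrying will not help
                        print(f"Client error occurred: {e}.")
                        return None
                    reason = f"Request was throttled: {e}"
                sleep_time = delay + random.uniform(0, 1)  # Add jitter
                print(f"{reason}. Retrying in {sleep_time:.2f} seconds... (Attempt {attempt}/{max_attempts})")
                time.sleep(sleep_time)
                delay = min(max_delay, delay * 2)  # Exponential backoff
            return None  # All attempts failed
        return wrapper_retry
    return decorator_retry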

def main():
    args = parse_arguments()
    session = boto3.Session(region_name=args.region)
    backup_client = session.client('backup')

    try:
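        all_recovery_points = list_recovery_points(backup_client, args.source_vault)
        # Only unencrypted recovery points need an encrypted copy
        recovery_points = [rp for rp in all_recovery_points if not rp.get('IsEncrypted', False)]
        print(f"Found {len(recovery_points)} unencrypted recovery points (of {len(all_recovery_points)} total) in vault {args.source_vault}")

        if args.dry_run:
            print("Dry run mode: no copy jobs will be performed. Recovery points that would be copied:")
            for recovery_point in recovery_points:
                print(f"  {recovery_point['RecoveryPointArn']}")
            return

        # The copy job APIs expect the destination vault ARN, not its name (assumes the "aws" partition)
        destination_vault_arn = f"arn:aws:backup:{args.region}:{args.destination_account}:backup-vault:{args.destination_vault}"
        # Fall back to the default service role in the source account (assumes the service-role/ path)
        iam_role_arn = args.iam_role_arn or f"arn:aws:iam::{args.source_account}:role/service-role/AWSBackupDefaultServiceRole"

        completed_copy_jobs = list_completed_copy_jobs(backup_client, destination_vault_arn)
        print(f"Found {len(completed_copy_jobs)} completed copy jobs for destination vault {args.destination_vault}")

        # Wrap the API call so transient failures are retried with backoff
        start_copy_job = retry_on_throttling()(backup_client.start_copy_job)

        # Iterate through recovery points and copy the ones not yet in the destination vault
        for recovery_point in recovery_points:
            rp_arn = recovery_point['RecoveryPointArn']
            if rp_arn in completed_copy_jobs:
                continue
            print(f"Copying recovery point {rp_arn} to {args.destination_vault}")
            copy_args = {
                'RecoveryPointArn': rp_arn,
                'SourceBackupVaultName': args.source_vault,
                'DestinationBackupVaultArn': destination_vault_arn,
                'IamRoleArn': iam_role_arn,
                'IdempotencyToken': str(uuid.uuid4()),
            }
            # Assumption: keep the copy for the time the source recovery point has left
            delete_at = recovery_point.get('CalculatedLifecycle', {}).get('DeleteAt')
            if delete_at:
                copy_args['Lifecycle'] = {'DeleteAfterDays': calculate_remaining_days(delete_at)}
            response = start_copy_job(**copy_args)
            if response:
                print(f"Started copy job {response['CopyJobId']}")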
    except ClientError as e:
        print(f"An error occurred: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")

if __name__ == '__main__':
    main()
Disclaimer: The opinions expressed herein are my own personal thoughts and do not represent the views of any present or past employer in any way.