Shibboleth Containerization Working Notes

this is a work in progress; just rough notes for now

Table of Contents

Shibboleth Resources

Gemini responses

EFS vs S3

The primary difference between Amazon Elastic File System (EFS) and Amazon Simple Storage Service (S3) lies in how they present data to your applications: EFS is a file system (like a hard drive you plug into a computer), while S3 is object storage (like a web-based repository for files).

Feature          Amazon EFS                            Amazon S3
---------------  ------------------------------------  -----------------------------------------
Storage Type     File Storage (Hierarchical folders)   Object Storage (Flat structure with keys)
Access Protocol  NFSv4 (Standard network file system)  HTTP/HTTPS (REST API)
Connectivity     Mounted directly to EC2 or Fargate    Accessed via CLI, SDK, or URL
Performance      Low latency, consistent "local" feel  Higher latency, but massive throughput
Cost             Generally more expensive per GB       Significantly cheaper per GB
Best Use Case    Shared config, home directories, CMS  Media assets, backups, data lakes

How They Are Used

  • EFS (The "Shared Drive"): EFS is designed to be mounted onto multiple compute instances (like EC2 or Fargate tasks) simultaneously. To the application, it looks like a standard directory (e.g., /mnt/data). If one instance writes a file, the other instances see the change the next time they open it (EFS provides NFS close-to-open consistency), with no separate sync step.

  • S3 (The "API Storage"): S3 doesn't "mount" like a drive. To get a file, your code must "request" it using an API call. It is better for "write once, read many" scenarios.

Performance and Latency

  • EFS provides the low-latency response times required for active application files, databases, or WordPress installations where the server needs to read/write small files quickly and often.

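  • S3 is built for scale rather than low latency. It has provided strong read-after-write consistency since December 2020, so "eventual consistency" is no longer the concern it once was. Still, every operation is an HTTPS request with its own overhead, and while S3 can handle petabytes of data, that per-request cost makes it less ideal for running a live application's internal file operations.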

Scalability

  • EFS scales automatically in terms of storage. As you add files, the "drive" grows; as you delete them, it shrinks. You pay for what you use.

  • S3 is virtually infinite. It is designed for 99.999999999% (11 nines) of durability and is the industry standard for storing backups and static website assets (images, videos).

Integration with Shibboleth

In your Shibboleth on Fargate setup:

  • You would use EFS to store the dynamic metadata files because the Shibboleth software expects to find those files at a specific local path on the server.

  • You would use S3 if you wanted to store long-term logs or backups of your configuration images, but the running IdP wouldn't "talk" to S3 directly for its core operations.
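If S3 is used for long-term logs or configuration backups as described above, the bucket could be declared in the same template as the other resources. This is only a minimal sketch: the logical name ShibbolethArchiveBucket, the logs/ prefix, and the 365-day expiry are illustrative assumptions, not part of the original setup.

```yaml
Resources:
  # Hypothetical bucket for long-term IdP logs and configuration backups
  ShibbolethArchiveBucket:
    Type: AWS::S3::Bucket
    Properties:
      VersioningConfiguration:
        Status: Enabled              # keep prior versions of backed-up config
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      LifecycleConfiguration:
        Rules:
          - Id: ExpireOldLogs
            Status: Enabled
            Prefix: logs/            # assumed key prefix for archived logs
            ExpirationInDays: 365    # assumed retention period
```

The running IdP never reads from this bucket; it would only be written to by a backup or log-shipping job.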

To set up a Shibboleth IdP on Fargate using CloudFormation, you need to define three primary resources: the EFS File System, the Mount Targets (so Fargate can actually "see" the drive), and the ECS Task Definition that maps those volumes.

Cloud Implementation

The EFS Infrastructure

First, you create the file system and ensure it's accessible within your VPC subnets.

Resources:
  # 1. The actual File System
  ShibbolethFileSystem:
    Type: AWS::EFS::FileSystem
    Properties:
      Encrypted: true
      FileSystemTags:
        - Key: Name
          Value: Shibboleth-Config-Storage
  # 2. Mount Targets (create one per Availability Zone your Fargate tasks run in;
  #    EFS allows only one mount target per AZ)
  EFSMountTargetA:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ShibbolethFileSystem
      SubnetId: !Ref PrivateSubnetA
      SecurityGroups:
        - !Ref EFSSecurityGroup
  # 3. Security Group for EFS (Allows NFS traffic on port 2049)
  EFSSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow Fargate to talk to EFS
      VpcId: !Ref MyVPC
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 2049
          ToPort: 2049
          SourceSecurityGroupId: !Ref FargateServiceSecurityGroup
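If the Fargate tasks are spread across more than one Availability Zone, each zone needs its own mount target. A sketch of a second one follows; PrivateSubnetB is a hypothetical parameter standing in for a private subnet in a different AZ than PrivateSubnetA.

```yaml
Resources:
  # Hypothetical second mount target for tasks running in another Availability Zone
  EFSMountTargetB:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref ShibbolethFileSystem
      SubnetId: !Ref PrivateSubnetB    # assumed: private subnet in a second AZ
      SecurityGroups:
        - !Ref EFSSecurityGroup        # same NFS (2049) security group as mount target A
```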

The Fargate Task Definition

In the Task Definition, you define the Volumes (the link to AWS) and the MountPoints (where it appears inside your container).

ShibbolethTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: shibboleth-idp
    NetworkMode: awsvpc
    RequiresCompatibilities:
      - FARGATE
    Cpu: '1024'
    Memory: '2048'
    ExecutionRoleArn: !Ref ECSExecutionRole
    TaskRoleArn: !Ref ECSTaskRole
    # Link the EFS File System to the Task
    Volumes:
      - Name: "shib-metadata-volume"
        EFSVolumeConfiguration:
          # Spelled FilesystemId (lowercase "s") in the ECS task definition schema
          FilesystemId: !Ref ShibbolethFileSystem
          TransitEncryption: ENABLED
    ContainerDefinitions:
      - Name: "shib-idp-container"
        Image: "your-account-id.dkr.ecr.us-east-1.amazonaws.com/shibboleth-idp:latest"
        PortMappings:
          - ContainerPort: 8080 # Or 443 depending on your container setup
        # Mount the volume into the specific Shibboleth folder
        MountPoints:
          - SourceVolume: "shib-metadata-volume"
            ContainerPath: "/opt/shibboleth-idp/metadata"
            ReadOnly: false
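The task definition only describes the container; something still has to run it. A minimal sketch of the ECS service that would do so is below. ShibbolethService and ShibbolethCluster are hypothetical names, the desired count of 2 is arbitrary, and the load balancer wiring is left out for brevity.

```yaml
Resources:
  ShibbolethService:                        # hypothetical logical name
    Type: AWS::ECS::Service
    DependsOn:
      - EFSMountTargetA                     # mount target should exist before tasks start
    Properties:
      Cluster: !Ref ShibbolethCluster       # assumed: an AWS::ECS::Cluster defined elsewhere
      TaskDefinition: !Ref ShibbolethTaskDefinition
      DesiredCount: 2                       # arbitrary example value
      LaunchType: FARGATE
      PlatformVersion: '1.4.0'              # EFS volumes need platform version 1.4.0 or later
      NetworkConfiguration:
        AwsvpcConfiguration:
          AssignPublicIp: DISABLED
          Subnets:
            - !Ref PrivateSubnetA
          SecurityGroups:
            - !Ref FargateServiceSecurityGroup
```

Pinning the platform version is a deliberate choice here: EFS volume support on Fargate arrived with platform version 1.4.0, so an older pinned version would fail to mount the volume.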

Why this matters for Shibboleth

  • The Metadata Folder: By mounting EFS to /opt/shibboleth-idp/metadata, your IdP can download huge metadata files from federations (like InCommon) and keep them there. If Fargate restarts the container, the new container finds the files already sitting in EFS and can start from them rather than waiting on a full re-download, which can speed up boot time significantly.

  • Permissions Note: If your container runs as a non-root user (which is a security best practice), you should also look into EFS Access Points. They allow you to "force" a specific User ID (UID) on the EFS files so you don't run into "Permission Denied" errors when the container tries to write to the mount.

To implement a Shibboleth Identity Provider (IdP) on AWS Fargate with proper user permissions using EFS Access Points, you can add an AccessPoint resource to your CloudFormation template. This ensures that even if your container runs as a non-root user (e.g., UID 1000), it has the necessary permissions to read and write to the mounted volume.

CloudFormation Snippet: EFS Access Point & Task Mapping

This addition creates a specific entry point into your file system that "forces" a specific user and group ID for all operations.

Resources:
  # 1. Create an EFS Access Point
  ShibbolethEFSAccessPoint:
    Type: AWS::EFS::AccessPoint
    Properties:
      FileSystemId: !Ref ShibbolethFileSystem
      PosixUser:
        Uid: "1000" # Match the UID used by your Shibboleth container
        Gid: "1000"
      RootDirectory:
        Path: "/shibboleth-metadata"
        CreationInfo:
          OwnerUid: "1000"
          OwnerGid: "1000"
          Permissions: "0755"
  # 2. Update the Task Definition to use the Access Point
  ShibbolethTaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      # ... (Other properties like Family, Cpu, Memory)
      Volumes:
        - Name: "shib-metadata-volume"
          EFSVolumeConfiguration:
            # Spelled FilesystemId (lowercase "s") in the ECS task definition schema
            FilesystemId: !Ref ShibbolethFileSystem
            TransitEncryption: ENABLED
            AuthorizationConfig:
              AccessPointId: !Ref ShibbolethEFSAccessPoint # Maps the volume to the Access Point
              IAM: ENABLED
      ContainerDefinitions:
        - Name: "shib-idp-container"
          # ... (Image and PortMappings)
          MountPoints:
            - SourceVolume: "shib-metadata-volume"
              ContainerPath: "/opt/shibboleth-idp/metadata"
              ReadOnly: false
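Because the volume sets IAM: ENABLED, the task role generally needs permission to mount and write to the file system. The following is a sketch only, assuming ECSTaskRole is defined as a role in this same template (the notes reference it but never define it); the policy name is made up.

```yaml
Resources:
  ECSTaskRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: ecs-tasks.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: shib-efs-access        # hypothetical policy name
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action:
                  - elasticfilesystem:ClientMount
                  - elasticfilesystem:ClientWrite
                Resource: !GetAtt ShibbolethFileSystem.Arn
                Condition:
                  StringEquals:
                    # Only allow access through the Shibboleth access point
                    elasticfilesystem:AccessPointArn: !GetAtt ShibbolethEFSAccessPoint.Arn
```

Scoping the statement to the access point ARN keeps the task from mounting the file system root directly, which would bypass the enforced UID/GID.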

Key Implementation Details for your IdP

  • Handling Metadata Access: Your team has recently discussed the need for firewall rules to allow IdP nodes to reach metadata files. Using an EFS Access Point ensures that once the file is reached and downloaded, the container has the persistent permission to store and read it across restarts.

  • Permissions for Containers: By specifying the PosixUser and CreationInfo in the Access Point, AWS automatically creates the directory with the correct ownership (e.g., UID 1000). This solves common "permission denied" errors when running containerized Shibboleth without root privileges.

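  • Subnet Scoping: For your Fargate deployment, ensure the tasks and the EFS mount targets use the subnets identified for your environment so traffic routes properly between them. Note that the range listed for TEST, 10.17.133.56/31, is too small to be a VPC subnet (AWS requires /28 or larger), so it is presumably an address range for firewall rules rather than the subnet itself.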

What is not baked into the image

When deploying a Shibboleth IdP to Fargate, you primarily interact with the ${idp.home} directory structure (typically /opt/shibboleth-idp). Because Fargate is ephemeral, you need to be very deliberate about which of these files are "baked" into your Docker image versus which are mounted via EFS or injected via Secrets Manager.

Here are the essential files and directories you need to reference:

Primary Configuration (/conf)

These files define how your IdP behaves. Most of these are usually baked into the Docker image because they change only when you deploy a new version of your configuration.

  • idp.properties: The "master" config. It contains your entityID, scope, and links to other property files (like LDAP or Metadata).

  • attribute-resolver.xml: Defines how the IdP fetches data from your backend (LDAP/AD/Database) and maps them to SAML attributes.

  • attribute-filter.xml: Controls which attributes are released to specific Service Providers (SPs).

  • metadata-providers.xml: Configures where the IdP finds the metadata for the SPs it trusts.

  • relying-party.xml: Configures SAML-specific settings (like signing/encryption requirements) for different SPs.

  • services.xml: Controls which configuration files the IdP should watch for changes and reload automatically.


Credentials and Secrets (/credentials)

These are highly sensitive. In Fargate, it is a best practice to not store these in the image. Instead, use an entrypoint script to pull them from AWS Secrets Manager at runtime.

  • idp-signing.crt / idp-signing.key: Used for signing SAML assertions.

  • idp-encryption.crt / idp-encryption.key: Used for decrypting messages from SPs.

  • sealer.jks / sealer.kver: Crucial for session encryption. All Fargate tasks in your cluster must share these exact files to allow a user to stay logged in if their request hits a different container.
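For the entrypoint-script approach to work, the task role has to be allowed to read the secrets. One way this might look is sketched below, using the signing key as the example. IdpSigningKeySecret, the secret name, and the policy name are all hypothetical, and the sketch assumes ECSTaskRole is an AWS::IAM::Role in the same template. The secret is created empty; its value would be loaded outside CloudFormation.

```yaml
Resources:
  # Hypothetical secret holding the IdP signing key (value populated outside this template)
  IdpSigningKeySecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      Name: shibboleth/idp-signing-key       # assumed naming scheme
      Description: Private key for signing SAML assertions
  # Lets the running task (and so its entrypoint script) read that secret
  IdpSecretsReadPolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: shib-read-credentials      # hypothetical policy name
      Roles:
        - !Ref ECSTaskRole                   # assumed: an AWS::IAM::Role in this template
      PolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Action: secretsmanager:GetSecretValue
            Resource: !Ref IdpSigningKeySecret   # Ref on a secret returns its ARN
```

The same pattern would repeat for the encryption key and the sealer files, so that every task pulls identical copies at startup.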


Dynamic Data (/metadata and /logs)

This is the data that changes frequently or needs to persist across container restarts. These should be mounted to EFS.

  • /metadata: Where the IdP stores downloaded federation metadata (e.g., the InCommon XML file).

  • /logs: While Fargate usually streams logs to CloudWatch via the awslogs driver, some admins mount a persistent volume for the idp-audit.log or idp-process.log for specialized analysis.
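The CloudWatch streaming mentioned for /logs is configured per container. A sketch of the relevant fragment follows; the log group name and stream prefix are assumptions, and the log group is assumed to exist already.

```yaml
ContainerDefinitions:
  - Name: "shib-idp-container"
    # ... (Image, PortMappings, MountPoints)
    LogConfiguration:
      LogDriver: awslogs
      Options:
        awslogs-group: /ecs/shibboleth-idp   # assumed log group name; must already exist
        awslogs-region: !Ref AWS::Region
        awslogs-stream-prefix: idp           # assumed stream prefix
```

The awslogs driver only captures what the container writes to stdout and stderr, so the IdP's logging would need a console appender for idp-process.log content to show up in CloudWatch. The execution role also needs permission to create log streams and put log events.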


Summary for your Fargate Task Definition:

Reference Type              Best AWS Home    Why?
--------------------------  ---------------  ---------------------------------------------------
Static Config (/conf)       Docker Image     Changes are tied to deployment versioning.
Secrets (/credentials)      Secrets Manager  Keeps sensitive keys out of the container image.
Heavy Metadata (/metadata)  EFS              Prevents redownloading 100MB+ files on every boot.
Session Keys (sealer.*)     EFS or Secrets   Must be identical across all running Fargate tasks.