Refer to Shared Storage to learn more about Shared Storage configuration on IDEA.
Apps and Data Storage (Required)
For the IDEA Cluster to function, the shared storage configuration must include both Apps and Data storage configurations. Both Apps and Data are cluster-scoped file systems and are mounted automatically on all applicable infrastructure hosts, eVDI Linux sessions, and SOCA Compute Nodes.
Apps
Apps shared storage is used to store critical cluster configuration scripts, files, and logs.
For Scale-Out Computing workloads, additional applications (e.g. OpenMPI or IntelMPI, Python, solvers, etc.) can be installed on shared storage and leveraged by Compute Nodes.
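For example, a tool installed once under /apps can be picked up by every Compute Node through environment variables. The /apps/openmpi/4.1.5 path below is a hypothetical install location, not something IDEA creates for you:

```shell
# Hypothetical example: expose an OpenMPI build installed under /apps.
# The /apps/openmpi/4.1.5 path is a placeholder install location.
export PATH="/apps/openmpi/4.1.5/bin:$PATH"
export LD_LIBRARY_PATH="/apps/openmpi/4.1.5/lib:${LD_LIBRARY_PATH:-}"
```

Any Compute Node that mounts /apps can then invoke the shared binaries (e.g. mpirun) without a per-node installation.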
Default Configuration:
Apps storage is mounted at the /apps mount path; this path is configurable.
Amazon EFS is used as the default storage provider for Apps storage.
Custom CloudWatch monitoring rules and a Lambda function are deployed for the EFS Apps storage volume; these monitor the throughput of the file system and dynamically adjust the throughput mode between provisioned and bursting.
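The same adjustment can be made manually with the AWS CLI; the commands below are a sketch of the operation the monitoring automation performs (the file-system ID and throughput value are placeholders):

```shell
# Switch an EFS file system to provisioned throughput (placeholder values).
aws efs update-file-system \
    --file-system-id fs-0123456789abcdef0 \
    --throughput-mode provisioned \
    --provisioned-throughput-in-mibps 128

# Revert to bursting mode once sustained throughput is no longer needed.
aws efs update-file-system \
    --file-system-id fs-0123456789abcdef0 \
    --throughput-mode bursting
```

Note that EFS limits how frequently the throughput mode of a file system can be changed.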
Data
Data storage is primarily used to store User Home Directories.
Additional directories for project/group level file shares can be created on Data Storage.
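As a sketch, a group-shared project directory might be created as follows. The project path and group name are hypothetical, and in practice the group would come from the cluster's directory service:

```shell
# Hypothetical example: group-shared project directory on Data storage.
# "project-a" and the "project-a-users" group are placeholders.
sudo mkdir -p /data/projects/project-a
sudo chgrp project-a-users /data/projects/project-a
# setgid (the leading 2) keeps new files owned by the project group;
# 770 grants the group full access and denies everyone else.
sudo chmod 2770 /data/projects/project-a
```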
Default Configuration:
Data storage is mounted at the /data mount path; this path is configurable.
Amazon EFS is used as the default storage provider for Data storage.
To save cost, the EFS lifecycle policy is set to transition data to the Infrequent Access (IA) storage class after 30 days.
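An equivalent lifecycle policy can be applied to any EFS file system with the AWS CLI (the file-system ID below is a placeholder):

```shell
# Transition files to the Infrequent Access (IA) storage class after
# 30 days of inactivity, matching the IDEA default for Data storage.
aws efs put-lifecycle-configuration \
    --file-system-id fs-0123456789abcdef0 \
    --lifecycle-policies TransitionToIA=AFTER_30_DAYS
```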
Scope
The notion of scope is introduced in IDEA to enable cluster administrators to manage multiple file systems and to specify mount criteria based on access, use case, and workload needs. Shared Storage mounts can be scoped based on:
Cluster
Cluster scoped shared storage mounts are applied across all nodes in the cluster. These include applicable infrastructure nodes, SOCA Compute Nodes and eVDI Hosts.
Module
Module scoped shared storage mounts are applicable across all nodes for the module, including applicable infrastructure nodes and eVDI or Compute Nodes.
Project
Project scoped shared storage mounts are applicable to Compute Nodes or eVDI Hosts launched for an applicable Project.
Scale-Out Computing: Queue Profiles
Queue Profile scoped shared storage mounts are applicable for all Compute Nodes launched for Jobs submitted to the queues configured under a Queue Profile.
Project and Module scopes can be combined to create an AND condition.
Queue Profile and Project scopes can be combined to create an AND condition.
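As an illustration, a combined Module + Project scope might look like the fragment below. The scope tokens and the `modules`/`projects` key names are assumptions for illustration only; check the configuration generated by idea-admin.sh for the exact schema your IDEA version uses:

```yaml
# Hypothetical fragment: mount only on scheduler (SOCA) nodes that are
# launched for "project-a". Key names shown are illustrative.
projectfs:
  title: Project A Storage
  provider: efs
  scope:
    - module
    - project
  modules:
    - scheduler
  projects:
    - project-a
  mount_dir: /project-a
```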
Add or Attach Shared Storage to Cluster
The idea-admin.sh shared-storage utility enables admins to generate configurations for:
Provisioning new file systems
Re-using existing file systems
Either use case can be executed before the initial cluster deployment or after the cluster is deployed.
If shared storage configurations are updated after an IDEA Cluster is deployed, manual actions (depending on the Scope) will be required to mount the file system on applicable existing cluster nodes. All new hosts launched after the configuration update will automatically mount the configured file systems. See below for examples.
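For example, for a cluster-scoped EFS mount added after deployment, an administrator could mount the file system manually on each existing node. The file-system ID, region, and mount path below are placeholders, and the mount options match the EFS defaults shown later in this guide:

```shell
# Hypothetical example: manually mount a newly configured EFS file system
# on an existing node (placeholder file-system ID and region).
sudo mkdir -p /custom_path
sudo mount -t nfs4 \
    -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport \
    fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /custom_path
# Add a matching /etc/fstab entry to persist the mount across reboots.
```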
Provision new File System
Shared Storage config generation for provisioning new file systems is currently only supported for Amazon EFS.
To generate configurations for provisioning a new file system, use the idea-admin.sh shared-storage add-file-system command as shown below:
$ ./idea-admin.sh shared-storage add-file-system --help
Usage: idea-admin shared-storage add-file-system [OPTIONS]
add new shared-storage file-system
Options:
--cluster-name TEXT Cluster Name
--aws-region TEXT AWS Region [required]
--aws-profile TEXT AWS Profile Name
--kms-key-id TEXT KMS Key ID
-h, --help Show this message and exit.
Example
./idea-admin.sh shared-storage add-file-system \
--aws-region <REGION> \
--cluster-name <CLUSTER_NAME>
Add Shared Storage to an IDEA Cluster
Shared Storage Settings
? [Name] Enter the name of the shared storage file system (Must be all lower case, no spaces or special characters) testefs
? [Title] Enter a friendly title for the file system "New Shared EFS for Project A"
? [Shared Storage Provider] Select a provider for the shared storage file system Amazon EFS
? [Mount Directory] Location of the mount directory. eg. /my-mount-dir /custom_path
? [Mount Scopes] Select the mount scope for file system Cluster
New Amazon EFS Settings
? [Throughput Mode] Select the throughput mode Bursting
? [Performance Mode] Select the performance mode General Purpose
? [Enable CloudWatch Monitoring] Enable cloudwatch monitoring to manage throughput? No
? [Lifecycle Policy] Transition to infrequent access (IA) storage? Transition to IA Disabled
? [EFS Mount Options] Enter mount options nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
Shared Storage Config ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
testefs:
title: '"New Shared EFS for Project A"'
provider: efs
scope:
- cluster
mount_dir: /custom_path
mount_options: nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
efs:
kms_key_id: ~
encrypted: true
throughput_mode: bursting
performance_mode: generalPurpose
removal_policy: DESTROY
cloudwatch_monitoring: false
transition_to_ia: ~
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
? How do you want to proceed further? Update Cluster Settings and Exit
sync config entries to db. overwrite: True
updating config: shared-storage.testefs.title = "New Shared EFS for Project A"
updating config: shared-storage.testefs.provider = efs
updating config: shared-storage.testefs.scope = ['cluster']
updating config: shared-storage.testefs.mount_dir = /custom_path
updating config: shared-storage.testefs.mount_options = nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport 0 0
updating config: shared-storage.testefs.efs.kms_key_id = None
updating config: shared-storage.testefs.efs.encrypted = True
updating config: shared-storage.testefs.efs.throughput_mode = bursting
updating config: shared-storage.testefs.efs.performance_mode = generalPurpose
updating config: shared-storage.testefs.efs.removal_policy = DESTROY
updating config: shared-storage.testefs.efs.cloudwatch_monitoring = False
updating config: shared-storage.testefs.efs.transition_to_ia = None
The idea-admin.sh utility will automatically update your IDEA cluster environment if you select "Update Cluster Settings and Exit". You can also choose "Deploy", which automates the steps described below. For this demo, we only update the cluster settings and proceed with a manual deployment afterwards.
Once done, you can validate your new mount point in the web interface via "Cluster Management" > "Settings" > "Shared Storage"
At this point, the FileSystem ID is empty because you asked to provision a brand-new EFS. To update the backend infrastructure and trigger the EFS creation, you must run the deploy command (see this page for more details about the deploy utility).
First, run ./idea-admin.sh cdk diff to confirm that the new EFS will be created, then run the deploy command.
Once the deployment is complete, go back to the web interface and validate that the new EFS has been created and now has a valid FileSystem ID assigned.
To further validate the new mount point, we can submit a test job that runs the df command:
qsub -- /bin/df -h
The job output should display the mount point (/custom_path) of your new file system.
Attach Existing File System
To generate configurations for attaching an existing file system, use the idea-admin.sh shared-storage attach-file-system command as shown below. This utility automatically searches for existing storage (Amazon FSx for Lustre, NetApp ONTAP, OpenZFS, Windows File Server, and Amazon EFS) running in your VPC.
$ ./idea-admin.sh shared-storage attach-file-system --help
Usage: idea-admin shared-storage attach-file-system [OPTIONS]
attach existing shared-storage file-system
Options:
--cluster-name TEXT Cluster Name
--aws-region TEXT AWS Region [required]
--aws-profile TEXT AWS Profile Name
--kms-key-id TEXT KMS Key ID
-h, --help Show this message and exit.
Example
$ ./idea-admin.sh shared-storage attach-file-system \
--aws-region <REGION> \
--cluster-name <CLUSTER_NAME>
Add Shared Storage to an IDEA Cluster
Shared Storage Settings
? [Name] Enter the name of the shared storage file system (Must be all lower case, no spaces or special characters) demo
? [Title] Enter a friendly title for the file system Demo FS
? [VPC] Select the VPC from which an existing file system can be used vpc-0cb462f0bfc14526b (10.0.0.0/16) [<CLUSTER_NAME>-vpc]
? [Shared Storage Provider] Select a provider for the shared storage file system Amazon FSx for Lustre
? [Mount Directory] Location of the mount directory. eg. /my-mount-dir /demo
? [Mount Scopes] Select the mount scope for file system Cluster
Existing FSx for Lustre Settings
? [Existing FSx for Lustre] Select an existing Lustre file system fsx-lustre (FileSystemId: fs-01a2ccc035f0f007c, Provider: fsx_lustre)
? [Mount Options] Enter /etc/fstab mount options lustre defaults,noatime,flock,_netdev 0 0
Shared Storage Config -----------------------------------------------------------------------------------------------------------------
demo:
title: Demo FS
provider: fsx_lustre
scope:
- cluster
mount_dir: /demo
mount_options: lustre defaults,noatime,flock,_netdev 0 0
fsx_lustre:
use_existing_fs: true
file_system_id: fs-01a2ccc035f0f007c
dns: fs-01a2ccc035f0f007c.fsx.us-east-1.amazonaws.com
mount_name: drohpbev
version: '2.10'
----------------------------------------------------------------------------------------------------------------------------------------
? How do you want to proceed further?
Remove a File System
Run ./idea-admin.sh config delete shared-storage.<filesystem_name> to remove a shared filesystem from IDEA.
Removing a file system from IDEA won't trigger a file system deletion. Make sure to re-deploy the shared-storage module if you want to remove a file system previously created by IDEA.
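For the "demo" file system attached earlier, the removal would look like the command below. The --cluster-name/--aws-region flags mirror the other idea-admin.sh commands in this guide; confirm the exact options with --help:

```shell
# Remove the "demo" shared storage entry from the cluster configuration.
./idea-admin.sh config delete shared-storage.demo \
    --cluster-name <CLUSTER_NAME> \
    --aws-region <REGION>
```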