{#user}
# User guide

```{toctree}
:maxdepth: 2
:hidden:

user/onboarding
user/logging-in
user/quick-start
user/pipeline
user/systems
```

This guide is intended for typical users following our recommended usage. While it is written for a general audience, some pointers specific to NERSC users are given to help adapt pipelines from NERSC to the SO:UK Data Centre.

We start by pointing out the main differences between NERSC and the SO:UK Data Centre that have important implications for how workflows are deployed here.

| Facility | NERSC | SO:UK Data Centre |
|:---------|:------|:------------------|
| Nature | HPC | HTC |
| Configuration | Homogeneous within a pool | Heterogeneous by default |
| Workload Manager | SLURM | HTCondor |
| Job Classification Model | Different QoS levels can be selected, such as debug, interactive, regular, premium, etc., categorized by priority and charge factor. They all share exactly the same software environment. | Different "universes" can be selected, such as vanilla, parallel, docker, container, etc., based on software environments and job launch methods. Universes are mutually exclusive, so a job cannot be configured for multiple universes simultaneously. Interactive jobs are only available in the vanilla universe. |
| Login Node Designation | Login nodes reachable via ssh with 2-factor authentication. Passwordless login can be achieved by using the `sshproxy` service to create temporary ssh keys. | Called submit nodes in HTCondor. Tentatively, a special login node named `vm77` is reachable via ssh. Users are required to submit their ssh public keys to the maintainers, and login is passwordless by default. |
| Compute Node Designation | Compute nodes | Worker nodes |
| Home Directory | Globally mounted home directory, backed up periodically | Not available on worker nodes |
| Archive Filesystem | HPSS | Not available |
| Scratch Filesystem | Parallel distributed file system (Lustre), all-SSD. Purged once every few months. | Local to each worker node. Data does not persist after job completion. |
| Software Distribution Filesystem | Read-only global common | Read-only CVMFS |
| Large Storage Pool | CFS with a filesystem interface | Grid storage system without a filesystem interface |
| Job Configuration | SLURM directives within the batch script | ClassAd in a separate, ini-like submit description file (see the first sketch after this table) |
| Wallclock Time | Must be specified in the job configuration | Not applicable |
| Sharing Physical Nodes | Requested via the interactive QoS | Always shared by default |
| Exclusive Physical Node Allocation | Requested via the regular QoS | Not applicable |
| Utilizing Multiple Nodes | Available by default | Must specify the parallel universe in the ClassAd (sketched below) |
| Priority | Different levels permitted with various charge factors and restrictions | Not applicable |
| Fair-Share System | Fixed amount of NERSC hours allocated to be used within an allocation year. A proposal is required to request an allocation and is renewed on a year-to-year basis. | More flexible, with no strict quota limit |
| MPI Support | Native | The parallel universe is not exclusively for MPI. We maintain custom wrappers to start MPI processes within the parallel universe. |
| Container Support | Officially supported | Only officially supported in the docker/container universe (sketched below). Jobs cannot belong to both a container universe and a parallel universe. |
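
To make the "Job Configuration" row concrete, below is a minimal sketch of an HTCondor submit description file. Unlike SLURM, where `#SBATCH` directives live inside the batch script itself, the ClassAd attributes are written in a separate file that is passed to `condor_submit`. The file name, executable, arguments, and resource requests here are hypothetical placeholders, not a recommended configuration.

```ini
# my_job.sub -- minimal vanilla-universe sketch (hypothetical example)
universe       = vanilla
# Hypothetical pipeline script and arguments
executable     = run_pipeline.sh
arguments      = --config config.toml

# Resource requests; worker nodes are shared by default
request_cpus   = 4
request_memory = 8GB
request_disk   = 16GB

# Log files; note there is no wallclock-time attribute, unlike SLURM
log            = job.log
output         = job.out
error          = job.err

# Scratch is local to each worker node, so inputs and outputs
# must be transferred explicitly
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = config.toml

queue
```

The job would then be submitted from a submit node (e.g. `vm77`) with `condor_submit my_job.sub`.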
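
As noted in the "Utilizing Multiple Nodes" and "MPI Support" rows, multiple nodes are requested by switching to the parallel universe. The sketch below assumes a hypothetical wrapper script standing in for the custom MPI wrappers we maintain; `machine_count` and the `$(Node)` macro are standard parallel-universe submit commands.

```ini
# multi_node.sub -- parallel-universe sketch for a multi-node job (hypothetical example)
universe      = parallel
# Number of worker-node slots claimed for this job
machine_count = 4

# Hypothetical placeholder for a custom MPI wrapper script
executable    = mpi_wrapper.sh
arguments     = ./my_mpi_program

request_cpus   = 8
request_memory = 16GB

# $(Node) expands to the node number within the parallel job
log    = mpi.log
output = mpi.$(Node).out
error  = mpi.$(Node).err

should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

queue
```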
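
Finally, a sketch of the docker universe mentioned in the "Container Support" row. The image name is a hypothetical placeholder; remember that a job cannot belong to both a container universe and the parallel universe.

```ini
# container_job.sub -- docker-universe sketch (hypothetical example)
universe     = docker
# Hypothetical image name; replace with an image available to the pool
docker_image = ghcr.io/example/so-pipeline:latest
executable   = run_pipeline.sh
arguments    = --config config.toml

request_cpus   = 4
request_memory = 8GB

log    = docker.log
output = docker.out
error  = docker.err

should_transfer_files   = YES
when_to_transfer_output = ON_EXIT

queue
```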