1 SC17 Collaboration Support Task Force
Azher Mughal / John Graham / John Hess / Tom Hutton
Montana State University, Bozeman
August 8th, 2017
2 Why this initiative
The goal of PRP / NRP is to encourage and discuss implementation strategies for deploying interoperable Science DMZs at a national scale, and to identify potential collaborators, the demonstrations they have planned for SC17, and what other resources would benefit their research. For this purpose, we have reached out to a few potential researchers and asked them to provide details about their demonstration plans. Our job is then to reach out to the community, see where there is a common fit, and help them utilize each other's resources.
3 CENIC Network Layout Planned for SC17
Los Angeles: 4 x 100G
Sunnyvale: 2 x 100G
Seattle: 3 x 100G
AMPATH: 3 x 100G
4 Los Angeles Connected Campuses
5 Sunnyvale Connected Campuses
6 USC Remote Microscopy
Gives a remote audience the ability to operate the microscope and zoom in to see tiny creatures. Video from the microscope is available in 4K format. Planned demonstration at SC17: possibly a 100GE connection to the Caltech booth for data transfers to AMPATH (Brazil).
7 San Diego State University
Chris Paolini is working on a new BeeGFS storage deployment. Roughly 2.1 PB of storage is available, connected to CENIC Los Angeles at 100Gbps.
8 CC* Storage: Implementation of a Distributed, Shareable, and Parallel Storage Resource at San Diego State University to Facilitate High-Performance Computing for Climate Science Christopher Paolini, San Diego State University [OAC ]
9 CC* Storage: Implementation of a Distributed, Shareable, and Parallel Storage Resource at San Diego State University to Facilitate High-Performance Computing for Climate Science. Christopher Paolini, San Diego State University [OAC ]
BeeGFS Parallel File System Cluster: Provides scalable, high-throughput, low-latency, parallel I/O to network-accessible HPC clusters at SDSU used for climate science. Scientific discovery is advanced through improved runtime performance of I/O-intensive workloads. Scales linearly to a sustained throughput of 25 GB/s.
Coastal Ocean Modeling: Parallel 3-D General Curvilinear Coastal Ocean Modeling (Castillo et al.). SDSU code: GCCOM, a nonhydrostatic large eddy simulation (LES) model designed specifically for high-resolution (meter-scale) simulations. Simulations are executed on the 136-core cluster dulcinea.sdsu.edu. I/O bound: simulation time is a function of grid resolution (mesh size) and the number of data arrays and time slices to be saved.
Multiphase Turbulent Combustion: Direct numerical simulations (DNS) of turbulent combustion in two-phase mixtures (Abraham et al.). SDSU code: HOLOMAC (High-Order LOw-MAch number Combustion). Simulations are executed on the 340-core cluster fermi.sdsu.edu. I/O bound: data needs to be stored at frequent time intervals since the flows are transient.
Real-time CO2 and CH4 Data Acquisition: Understanding the role of terrestrial, atmospheric, and marine systems in global change (Kalhori et al.). Studies seasonal and inter-annual controls on carbon fluxes in arctic Alaska, air-sea CO2 exchange in coastal seas, arid and semi-arid ecosystems, and land-atmosphere carbon fluxes. Real-time data acquisition from a network of CO2 and CH4 flux/eddy-covariance towers across the North Slope of Alaska (I/O latency constrained).
Geologic CO2 Sequestration: Parallel 3-D multiphase coupled Thermal–Hydraulic–Mechanical–Chemical ("THMC") water-rock interaction during subsurface CO2 injection (Paolini et al.). Numerical simulation of microfracture evolution and heat transfer from solute-solute interaction during CO2 injection. SDSU code: Subflow. Simulations are executed on the 256-core cluster mixcoatl.sdsu.edu.
10 CALTECH
Harvey Newman: Consistent Operations paradigm, SENSE and SDN, NGenIA. Caltech HEP is connected with CENIC Los Angeles at 100Gbps.
11 Caltech + Starlight/OCC Booths and SCinet at SC17
12 Caltech, OCC and Partners at SC17
~3 Tbps each at the Caltech and OCC booths; connection to the Dell booth. 1+ Tbps between the booths and ~3 Tbps to the WAN. Caltech booth: 200G dedicated to the Caltech campus; 300G to PRP (UCSD, Stanford, UCSC, et al.); 300G to Brazil+Chile via Miami; 200G to ESnet. Waveserver Ai + other DCIs in the booths: N x 100GE to 400G, 200G waves. Wavelength Selective Switching in the Ciena 6500 platform. Microcosm: Creating the Future SCinet and the Future of Networks for Science.
13 UC Santa Cruz
Shaw DTN (NVMe) server connected via an Arista 7060CX2 to a Brocade MLXe 100G switch, connecting to CENIC. John Graham is managing the DTNs' Kubernetes deployment.
14 UCSD
15 Multi-Institution, Hyper-Converged Science DMZ
[Diagram, March 2017: 100G Ceph DTNs with Jupyter at Caltech, SDSC, UCSC, SDSU (with EDEX), Calit2, and STARLIGHT; THREDDS/ZFS servers; an 8-GPU Jupyter node; a PATTERN-LAB Ceph / Kubernetes / CentOS 7 cluster; and a Ceph MicroCloud.]
Rook/Ceph provides block, object, and file storage, with a Swift API compatible with SDSC, AWS and Rackspace. Labeling each node with its resources allows for intelligent pod placement; this ensures that if you need a certain flavor of system you get it automatically (see the sketch below).
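As an illustration of label-driven placement, the sketch below is not from the slides; the node name, label key/value, and pod image are hypothetical. A node is labeled with its hardware flavor, and a pod then schedules only onto nodes carrying that label:

  # Label a node with its hardware flavor (hypothetical node name and label)
  $ kubectl label node fiona-ml-01 gpu-type=gtx1080ti

  # pod-gpu-notebook.yaml: pin the pod to nodes with that label
  apiVersion: v1
  kind: Pod
  metadata:
    name: gpu-notebook
  spec:
    nodeSelector:
      gpu-type: gtx1080ti          # only nodes labeled this way are candidates
    containers:
    - name: notebook
      image: jupyter/scipy-notebook   # illustrative image

  $ kubectl create -f pod-gpu-notebook.yaml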
16 KubeSpawner (jupyterhub-kubernetes-spawner)
The JupyterHub Kubernetes Spawner enables JupyterHub to spawn single-user notebook servers on a Kubernetes cluster. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. If you want to run a JupyterHub setup that needs to scale across multiple nodes (anything with over ~50 simultaneous users), Kubernetes is a wonderful way to do it. Features include:
- Easily and elastically run anywhere between 2 and thousands of nodes with the same set of powerful abstractions. Scale up and down as required by simply adding or removing nodes.
- Run JupyterHub itself inside Kubernetes easily. This allows you to manage many JupyterHub deployments with only Kubernetes, without requiring an extra layer of Ansible / Puppet / Bash scripts. This also provides easy integrated monitoring and failover for the hub process itself.
- Spawn multiple hubs in the same Kubernetes cluster, with support for namespaces. You can limit the amount of resources each namespace can use, effectively limiting the amount of resources a single JupyterHub (and its users) can use. This allows organizations to easily maintain multiple JupyterHubs with just one Kubernetes cluster, allowing for easy maintenance & high resource utilization.
- Provide guarantees and limits on the amount of resources (CPU / RAM) that single-user notebooks can use. Kubernetes has comprehensive resource control that can be used from the spawner.
- Mount various types of persistent volumes onto the single-user notebook's container.
- Control various security parameters (such as userid/groupid, SELinux, etc.) via flexible Pod Security Policies.
- Run easily in multiple clouds (or on your own machines). Helps avoid vendor lock-in. You can even spread out your cluster across multiple clouds at the same time.
From: https://github.com/jupyterhub/kubespawner
In short: scales up and down; supports namespaces; guarantees and limits on CPU and RAM; persistent volumes; Pod Security Policies; runs on multiple clouds at the same time. A per-namespace quota sketch follows below.
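A minimal sketch of the per-namespace resource limiting mentioned above, using a standard Kubernetes ResourceQuota; the namespace name and quota values are illustrative assumptions, not from the slides:

  # quota-jupyterhub.yaml: cap what one JupyterHub's namespace can consume
  apiVersion: v1
  kind: ResourceQuota
  metadata:
    name: jupyterhub-quota
    namespace: jupyterhub-prp        # hypothetical namespace for one hub
  spec:
    hard:
      requests.cpu: "64"             # total CPU the namespace may request
      requests.memory: 256Gi
      limits.cpu: "128"              # total CPU limit across all pods
      limits.memory: 512Gi
      persistentvolumeclaims: "50"   # cap on user volumes

  $ kubectl create -f quota-jupyterhub.yaml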
17 Next Generation GPU JupyterHub
Multi-Tenant Containerized GPU JupyterHub running Kubernetes / CoreOS with eight GTX-1080 Ti GPUs. Kubernetes and the latest versions of Docker can now manage access to GPUs. ~$13K: dual 8-core 2.1GHz CPUs, 32GB RAM, 6x 480GB SSD, 2x NVMe bays, 40G ConnectX-4, dual 10G ports, TPM.
18 Scheduling GPUs on Kubernetes
GPUs can be specified in the limits section only. Containers (and pods) do not share GPUs. Each container can request one or more GPUs; it is not possible to request a portion of a GPU. Nodes are expected to be homogeneous, i.e. run the same GPU hardware (see the example below).
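A minimal pod spec sketch requesting one whole GPU under limits. The container image is illustrative, and the resource name assumes the NVIDIA device plugin's nvidia.com/gpu; older clusters of this era used an alpha resource name instead, so check the cluster's Kubernetes version:

  apiVersion: v1
  kind: Pod
  metadata:
    name: cuda-test
  spec:
    containers:
    - name: cuda-test
      image: nvidia/cuda:8.0-runtime   # illustrative CUDA image
      resources:
        limits:
          nvidia.com/gpu: 1            # GPUs go under limits only; whole GPUs, no fractions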
19 Rook Cloud-Native Storage
20 Rook is Ceph 'inside' Hyper-converged Kubernetes
GPU + NVMe + 40G NIC. Rook is a Ceph Pod (a collection of Docker containers) with all the containers needed to deploy a full Ceph cluster in just minutes using simple YAML config files (a sample cluster spec sketch follows below). Rook automatically discovers and partitions storage devices using the kubectl command:
$ kubectl create -f rook-operator.yaml
$ kubectl create -f rook-cluster.yaml
We are building a hyper-converged cluster where GPU, compute and storage are combined in the same chassis. We can then deploy Pods on the K8S cluster, where replication and autoscaling make sure the services are always available. Our SDSU Rook/Ceph pool is currently built from:
1 SDSU FIONA DTN (master)
1 HPWREN Ceph testbed with 240TB of SAS3 and 6TB SSD
8 PRP MicroCloud blades with 3.7TB NVMe and 1TB SSD
3 FIONA-ML with 3TB M.2 NVMe and 3TB SSD
4 100G DTNs with 8TB NVMe and 1TB SSD
We currently have two M40s and one P40 GPU installed in one server, but we will soon be running 8 TITAN X cards in one, GTX 1080 Ti cards in a second, and 8 AMD R9 Nano cards.
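For reference, a minimal rook-cluster.yaml sketch in the style of the Rook releases from this period; the API version, field names, and host path are assumptions and should be checked against the Rook version actually deployed:

  # rook-cluster.yaml: declare a Ceph cluster for the Rook operator to build
  apiVersion: rook.io/v1alpha1
  kind: Cluster
  metadata:
    name: rook
    namespace: rook
  spec:
    dataDirHostPath: /var/lib/rook   # assumed host path for mon/OSD state
    storage:
      useAllNodes: true              # let the operator discover every node
      useAllDevices: true            # and consume every unused raw device it finds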
21 rook-operator
The Rook operator is a simple container containing the rook-operator binary, which has all that is needed to bootstrap and monitor the storage cluster. The operator starts and monitors Ceph monitor pods and a daemonset for the OSDs, which provides basic RADOS storage, as well as a deployment for a RESTful API service. When requested through the API service, object storage (S3/Swift) is enabled by starting a deployment for RGW, while a shared file system is enabled with a deployment for MDS.
The operator monitors the storage daemons to ensure the cluster is healthy. Ceph mons are started or failed over when necessary, and other adjustments are made as the cluster grows or shrinks. The operator also watches for desired-state changes requested by the API service and applies them.
The Rook daemons (Mons, OSDs, RGW, and MDS) are compiled into a single binary, rookd, and included in a minimal container. rookd uses an embedded version of Ceph for storing all data; there are no changes to the data path. Rook does not attempt to maintain full fidelity with Ceph: many Ceph concepts like placement groups and crush maps are hidden so you don't have to worry about them. Instead Rook creates a much simplified UX for admins that is in terms of physical resources, pools, volumes, filesystems, and buckets (see the volume sketch below).
Rook is implemented in golang. Ceph is implemented in C++, where the data path is highly optimized. We believe this combination offers the best of both worlds.
From: https://github.com/rook/rook/blob/master/Documentation/kubernetes.md
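To give a feel for that admin-facing UX, here is a sketch of consuming Rook block storage from Kubernetes via a StorageClass and PersistentVolumeClaim; the provisioner name and pool name follow the Rook documentation of this era but should be treated as assumptions:

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: rook-block
  provisioner: rook.io/block       # assumed provisioner name for this Rook release
  parameters:
    pool: replicapool              # hypothetical Rook pool backing the volumes
  ---
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: notebook-data
  spec:
    storageClassName: rook-block
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 100Gi             # illustrative volume size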
22 Rook running on Kubernetes
Rook now supports Separate Storage Groups. This allows us to create separate pools grouped by storage performance: SAS3, SSD and NVMe. A pool sketch follows below.
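A hedged sketch of declaring one such pool; the Pool kind, API version, and replication fields are assumptions about the Rook release in use, the pool name is hypothetical, and mapping the pool onto a specific storage group (e.g. NVMe only) would follow the Rook storage-group documentation for that release:

  apiVersion: rook.io/v1alpha1
  kind: Pool
  metadata:
    name: nvme-pool                # hypothetical pool intended for the NVMe group
    namespace: rook
  spec:
    replicated:
      size: 3                      # three replicas of each object, illustrative

  $ kubectl create -f nvme-pool.yaml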