IBM Spectrum Scale™ User Group Spectrum Protect on Spectrum Scale

Author: Shannon Nicholson

1 IBM Spectrum Scale™ User Group: Spectrum Protect on Spectrum Scale
Jason Basler, IBM Spectrum Protect Development

2 About the speaker: Jason Basler is the test architect responsible for IBM Spectrum Protect™. He has been part of the development team for over twenty years, and has expertise in various Spectrum Protect technologies as well as related storage technologies. He is currently driving the test activities around new releases of Spectrum Protect with a focus on scalability and publishing blueprints based on best practices derived from experience in the test labs.

3 IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion. © Copyright IBM Corporation 2017

4 Agenda
- Overview of solutions combining IBM Spectrum Scale™ and IBM Spectrum Protect™
- Using IBM Spectrum Protect to protect data in IBM Spectrum Scale
- Using IBM Spectrum Scale as storage for IBM Spectrum Protect
- Blueprints for disk-based data protection solutions
- The IBM Elastic Storage Server blueprint

5 IBM Spectrum Protect™ (formerly IBM Tivoli® Storage Manager)
Comprehensive backup and recovery suite for physical, virtual and cloud environments
- Comprehensive data protection and recovery solution for virtual, physical and cloud data
- Over 20 years of experience protecting some of the world's largest data centers
- Over 20,000 active clients
Key capabilities:
- Protects virtual, physical, bare metal and cloud data with one solution
- Reduces backup and recovery infrastructure costs by up to 38 percent
- Delivers greater visualization and administrator productivity
- Simplifies backups by consolidating administration tasks
Client values:
- Application-aware and VM-aware data protection for any size organization
- Built-in efficiency features: deduplication, incremental 'forever' backup
- Integrated multi-site replication and disaster recovery
- Simplified administration
Notes: In case you haven't heard, IBM Spectrum Protect was formerly TSM, and before that ADSM.
© Copyright IBM Corporation 2015

6 Spectrum Protect / Spectrum Scale Integration Overview
... for data protection of IBM Spectrum Scale
... as storage for IBM Spectrum Protect
[Diagram: Spectrum Scale clusters and a Spectrum Protect Server, labeled with Spectrum Protect backup-archive client, Spectrum Protect Snapshot, SOBAR (Scale Out Backup and Restore), and Spectrum Protect for Space Management]
Notes:
- There are 4 solutions combining Scale and Protect. I will be discussing two of them in detail.
- Brief mention of 1) space management and 2) snapshot management (Spectrum Protect Snapshot, formerly FCM).
- Space management uses the DMAPI interface in Spectrum Scale to provide policy-based migration of files into Spectrum Protect and transparent recall of migrated data as it is accessed.
- Spectrum Protect Snapshot provides management of Scale snapshots at the fileset level, and is combined with the FCM custom application support.

7 IBM Spectrum Scale data protection using IBM Spectrum Protect

8 IBM Spectrum Protect progressive incremental backup
[Diagram: a Spectrum Scale cluster with the Spectrum Protect backup-archive client typically installed on one cluster node; backup (GUI or CLI) to and restore (GUI or CLI) from the Spectrum Protect Server]
Environment: Small IBM Spectrum Scale installations with a small number of nodes and file systems. IBM Spectrum Protect backup-archive client installed on one or more cluster nodes.
Scalability: Millions of files, terabytes of data, up to [...] objects (empirical value).
Processing: The standard IBM Spectrum Protect backup-archive client progressive incremental is used to perform the file system backup. Potentially a second node is used for a second file system backup.
Pros: Simple setup and usage
Cons: Limited performance and scalability
Notes:
- The simplest approach uses the standard "dsmc incremental".
- Scalability is limited by the full file system scans needed for incremental processing.
- Scalability improves when there are multiple file systems, and when resourceutilization is used for parallel sessions.

9 IBM Spectrum Scale mmbackup on file system level
[Diagram: a Spectrum Scale cluster with the Spectrum Protect backup-archive client typically installed on several cluster nodes; the Spectrum Scale mmbackup tool coordinates processing; backup (mmbackup) to and restore (GUI or CLI) from the Spectrum Protect Server]
Environment: Medium IBM Spectrum Scale installations with a single-digit number of nodes and file systems. IBM Spectrum Protect backup-archive client installed on several cluster nodes.
Scalability: Tens of millions of files, tens of terabytes of data, up to [...] objects (empirical value).
Processing: IBM Spectrum Scale mmbackup scans the file system and the IBM Spectrum Protect database and generates a list of backup candidates. mmbackup then uses the IBM Spectrum Protect backup-archive client to perform the file system backup.
Pros: Simple setup and usage, good performance and scalability
Cons: All data goes to one IBM Spectrum Protect server
Notes:
- Further scalability is provided by the mmbackup facility of Spectrum Scale.
- mmbackup uses Scale's mmapplypolicy engine to quickly scan the file system for backup candidates.
- mmbackup handles both backup and expiration.
The mmbackup command provides:
- A full backup of all files in the specified scope.
- An incremental backup of only those files that have changed or been deleted since the last backup. Files that have changed since the last backup are updated, and files that have been deleted since the last backup are expired from the TSM server.
- Utilization of a fast scan technology for improved performance.
- The ability to perform the backup operation on a number of nodes in parallel.
- Multiple tuning parameters to allow more control over each backup.
- The ability to back up the read/write version of the file system or specific global snapshots.
- Storage of the files in the backup server under their GPFS root directory path, independent of whether backing up from a global snapshot or the live file system.
- Handling of unlinked filesets to avoid inadvertent expiration.
Note: no Windows node support for mmbackup
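The capabilities listed above (full or incremental backup, parallel nodes, backup from a global snapshot) correspond to mmbackup command-line options. The following is a minimal Python sketch that only assembles an argument list; the helper name, the file system name gpfs0, and the node names are hypothetical, and the options shown (-t, -N, -S) should be checked against the mmbackup documentation for the installed release.

```python
from typing import List, Optional


def build_mmbackup_command(
    device: str,
    backup_type: str = "incremental",
    nodes: Optional[List[str]] = None,
    snapshot: Optional[str] = None,
) -> List[str]:
    """Assemble an mmbackup argument list (illustrative helper, not an IBM API)."""
    if backup_type not in ("full", "incremental"):
        raise ValueError("backup_type must be 'full' or 'incremental'")
    cmd = ["mmbackup", device, "-t", backup_type]
    if nodes:
        # -N names the cluster nodes that take part in the backup in parallel
        cmd += ["-N", ",".join(nodes)]
    if snapshot:
        # -S backs up from a named global snapshot instead of the live file system
        cmd += ["-S", snapshot]
    return cmd


if __name__ == "__main__":
    # Hypothetical file system and node names, for illustration only
    print(" ".join(build_mmbackup_command("gpfs0", nodes=["node1", "node2"])))
```

The sketch does not run mmbackup; it only shows how the slide's feature list maps onto an invocation.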

10 M: IBM Spectrum Scale mmbackup on file system level
Backup cycle:
1) After start, mmbackup evaluates the cluster environment and verifies product versions and settings.
2) Optionally, the Spectrum Protect server is queried for existing backup information. Otherwise the existing shadow DB is used for processing.
3) The policy engine is used to generate a list of files currently eligible for backup activities.
4) The existing shadow DB and the scan result are compared to calculate the file lists for the required backup activities.
5) All files deleted in the file system since the last backup run are expired.
6) All files with changed metadata in the file system since the last backup run are backed up incrementally.
7) All files with changed data in the file system since the last backup run are backed up selectively.
8) While the backup activities are ongoing, the shadow DB is updated inline.
9) The backup results from all used cluster nodes are analysed, and the backup cycle finishes with a selective backup of the current shadow DB.
[Flow diagram: initiate mmbackup, evaluate environment, optional: query Spectrum Protect server, perform file system scan, calculate backup activities, expire deleted files, backup new and changed files, analyse result and finish backup run; two steps are annotated "invoke policy engine"]
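The comparison of the shadow DB with the scan result can be sketched as a dictionary diff. This is a minimal Python illustration, not mmbackup's actual implementation: the FileState fields are hypothetical stand-ins for "data changed" and "metadata changed", and treating a new file as changed data is an assumption.

```python
from typing import Dict, List, NamedTuple


class FileState(NamedTuple):
    data_stamp: int  # hypothetical stand-in for "data changed" (e.g. modification time)
    meta_stamp: int  # hypothetical stand-in for "metadata changed" (e.g. change time)


class BackupPlan(NamedTuple):
    expire: List[str]       # deleted since the last run
    incremental: List[str]  # only metadata changed
    selective: List[str]    # new files or changed data


def plan_backup(shadow: Dict[str, FileState], scan: Dict[str, FileState]) -> BackupPlan:
    """Compare the shadow DB with the current scan and derive the three file lists."""
    expire = sorted(path for path in shadow if path not in scan)
    incremental: List[str] = []
    selective: List[str] = []
    for path in sorted(scan):
        old, new = shadow.get(path), scan[path]
        if old is None or old.data_stamp != new.data_stamp:
            # Assumption: files not yet in the shadow DB are handled like changed data
            selective.append(path)
        elif old.meta_stamp != new.meta_stamp:
            incremental.append(path)
    return BackupPlan(expire, incremental, selective)


if __name__ == "__main__":
    shadow = {"/fs/a": FileState(1, 1), "/fs/b": FileState(1, 1), "/fs/c": FileState(1, 1)}
    scan = {"/fs/a": FileState(1, 2), "/fs/b": FileState(2, 2), "/fs/d": FileState(1, 1)}
    print(plan_backup(shadow, scan))
```

Unchanged files appear in none of the lists, which is what makes the run incremental.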

11 Peta Scale Data Protection – Architecture
[Diagram: filesets of a single Spectrum Scale file system in a Spectrum Scale cluster; Spectrum Protect backup-archive client (optional: Spectrum Protect for Space Management) connected over the network to the Spectrum Protect Server]
- Extreme scalability due to multiple Spectrum Protect servers protecting a single Spectrum Scale file system
- High backend storage media flexibility due to multiple supported storage technologies (disk, tape, cloud) for a single file system
- High QoS flexibility due to a fine-grained data protection approach (fileset level)
- Integration between Spectrum Protect client products ensured (inline copy)
- Ultra-fast disaster recovery with SOBAR supported

12 Petascale Data Protection
The significant growth of data confronts storage providers with new challenges. Besides the administration and maintenance of disk pools for large file systems, the data protection and data archiving of big data clusters place serious demands on them. The following slides describe a data protection solution for large-scale environments with IBM Spectrum Protect and IBM Spectrum Scale.
This slide deck corresponds to the whitepaper "Peta Scale Data Protection". Link to the paper: https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection
The paper describes a data protection approach scaling up to hundreds of petabytes for IBM Spectrum Scale file systems using the IBM Spectrum Protect backup-archive client and IBM Spectrum Protect for Space Management. The focus of the paper is to provide configuration guidance for the setup and operation of the data protection processes in such an environment. The paper also introduces the concept of different service levels for data protection on file system and fileset level.

13 Peta Scale Data Protection – Technology
- The key technology behind the solution is Spectrum Protect "active server binding", which is implemented by Spectrum Protect for Space Management and used by the Spectrum Protect backup-archive client.
- Using Spectrum Protect for Space Management (HSM) for file migration is optional, but file system management is required for active server binding. HSM is mandatory if fast disaster recovery with SOBAR is planned.
- The first time a file is sent from the file system to a Spectrum Protect server (backup or HSM), it is bound to the specified server.
- The granularity of backup and HSM processing is the Spectrum Scale fileset level. The backup and HSM processing for each fileset is independent of the others.
- Active server binding is visible to Spectrum Scale policy engine scans.
[Diagram: Spectrum Scale cluster with Spectrum Protect backup-archive client and Spectrum Protect for Space Management; Spectrum Protect ServerA; Spectrum Protect "active server binding" FileN:ServerA. With its first backup, FileN was bound to ServerA and cannot be sent to a different server now.]
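The binding rule (the first send fixes the server, later sends to another server are refused) can be modeled in a few lines. This is a toy Python model for illustration only; the class and exception names are hypothetical, and it says nothing about how Spectrum Protect actually stores the binding.

```python
from typing import Dict


class BindingError(Exception):
    """Raised when a file is sent to a server other than the one it is bound to."""


class ServerBindings:
    """Toy model of active server binding: one server per file, fixed on first send."""

    def __init__(self) -> None:
        self._bound: Dict[str, str] = {}

    def send(self, path: str, server: str) -> str:
        # The first send (backup or HSM) records the binding; later sends must match it
        bound = self._bound.setdefault(path, server)
        if bound != server:
            raise BindingError(f"{path} is bound to {bound}, cannot send to {server}")
        return bound


if __name__ == "__main__":
    bindings = ServerBindings()
    bindings.send("FileN", "ServerA")      # first backup binds FileN to ServerA
    try:
        bindings.send("FileN", "ServerB")  # a different server is rejected
    except BindingError as err:
        print(err)
```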

14 Using IBM Spectrum Scale as storage for IBM Spectrum Protect

15 IBM Spectrum Protect™ Blueprints
Faster deployments saving time and money
What is it?
- Prescriptive hardware and build detail to handle defined workloads with build automation
- Designed around small, medium, or large workloads
- Server and storage references optimized for deduplication and disk-only environments
- Automated validation for hardware, file system, and operating system setup
- Automated configuration for IBM Spectrum Protect database, storage pool, policy, and schedules
Platforms:
- Linux and Windows on Intel x86_64
- AIX on Power
- Linux on Power, big endian
Storage:
- Storwize V7000, V5000, V3700 (direction to replace with V5010)
- IBM Elastic Storage Server GL4, GL2 (powered by IBM Spectrum Scale)
Benefits:
- Significantly improved time to value: faster server deployments than ever before, with setup in as little as 3 hours
- Systematic design and build that aligns with software defined data protection
- Reduced risk: automated best-practice configurations, less guesswork for performance optimization
Notes:
1) Blueprints for designing SP servers are given in three standard sizes that scale: S, M, L
2) They implement disk-only SP solutions optimized for container pools with deduplication and replication
3) Reference architectures for servers and storage. Operating systems include Linux and Windows for x86 and AIX for Power, with Linux on Power (big endian) coming later this year
4) Automation to speed deployment time
on developerWorks:

16 Disk-based backup solution with replication
Notes:
1) We are standardizing on 4 solutions for implementing Spectrum Protect
2) The single and multi-site disk solutions are the target design of the blueprint
3) Two blueprint systems can easily be connected to provide a disk-only solution which includes node replication
4) Also a suitable design for HSM and Archive

17 Spectrum Protect on Spectrum Scale - Overview
- Multiple Spectrum Protect instances store DB and storage pools in a Spectrum Scale file system
- Scale provides a global name space for all Protect instances
- Protect instances share Scale file system resources
- Protect instances run on Scale cluster nodes, accessing the Scale file system and disk directly
- Scale file systems balance the workload and capacity for all Protect instances on disk
- Provides a standardized, scalable and easy to use storage infrastructure for the multiple Protect instances
Notes:
- Better storage utilization: multiple TSM servers share the same storage
- Better operational efficiency with one storage for all TSM servers
- Seamless scalability of storage capacity and performance
- Highly scalable performance with intelligent striping across all disk devices
- Disaster protection with TSM or GPFS replication or GPFS Native RAID
- Cost efficient by utilizing standard infrastructure components
- High availability in a clustered file system

18 Deployment options
Options: on Scale server, as Scale client, with Elastic Storage Server
[Diagram: for each option, Protect clients connect over the LAN to Protect servers; the Protect servers run on a Scale server, on a Scale client in a Scale cluster, or on a Scale client attached to an ESS server]
- Protect server runs on Scale servers, with direct SAN storage access
- Protect runs on a Scale client connected to the Scale / ESS NSD server, with SAN or LAN access from the Protect server to the Scale / ESS server

19 Blueprint configurations, IBM® POWER8™
IBM POWER8 System S822 based
Operating systems: AIX 7.1, Power Linux (S822L)
small: IBM POWER8 S822
- 1 x 10-core p8 3.42 GHz (6 cores used)
- 64 GB RAM
- Dual port 8Gb FC
- Dual port 10Gb Ethernet
- 1TB database
- 128GB active log
- 1TB archive log
- 45TB storage pool
medium: IBM POWER8 S822
- 1 x 10-core p8 3.42 GHz
- 128 GB RAM
- Dual port 8Gb FC
- Dual port 10Gb Ethernet
- 2TB database
- 128GB active log
- 3TB archive log
- 200TB storage pool
large: IBM POWER8 S822
- 2 x 10-core p8 3.42 GHz
- 256 GB RAM
- 2 x Dual port 8Gb FC
- 2 x Dual port 10Gb Ethernet
- 6TB database
- 256GB active log
- 4TB archive log
- 1PB storage pool
Notes:
1) IBM Power models are all based on the POWER8 S822 model (AIX or Power Linux)
2) pLinux uses S822L
3) CPU cores and memory scale up across S, M, L

20 IBM Elastic Storage Server configurations
With IBM Spectrum Scale™ software
medium: IBM ESS GL-2
- 2 enclosures, 12u
- 116 x 6TB NL-SAS (stgpool, archlog, db backup)
- 2 x SSD caching disks
- 10GbE, 40GbE, or Infiniband
- 430 TB usable
large: IBM ESS GL-4
- 4 enclosures, 20u
- 232 x 6TB NL-SAS (stgpool, archlog, db backup)
- 2 x SSD caching disks
- 10GbE, 40GbE, or Infiniband
- 900 TB usable
IBM Elastic Storage Server is a bundled hardware, software, and services offering that provides:
- A scalable, fast, and low-cost software defined storage platform
- Dense JBOD expansions (4U x 60, 3.5" disks)
- IBM Spectrum Scale RAID (GPFS Native RAID)
- Data and redundancy information distributed across all disks in the JBOD
- Array sizes not limited to spindle counts
- Very fast rebuild times for failed drives
5146 Machine Type E DCS3700 Expansion Chassis 2TB, 4TB, or 6TB NL-SAS Drives
Note: SSD or Flash for the database is required from another storage system such as the IBM Flash System, a PCI Flash adapter, or SSDs in internal server drive bays
Notes:
1) ESS is a bundled solution built upon IBM Spectrum Scale
2) Scale RAID provides a very scalable and high-performing disk solution ideally suited to the SP stgpool workload
3) ESS GL-2 and GL-4 models
4) Very dense expansions with 60 NL-SAS drives per 4U expansion
5) NOTE: another storage location providing SSD/flash is needed for the TSM DB!

21 Protect + ESS blueprint design
The blueprint configuration script automates Protect server deployment with ESS:
- ESS is used for the storage pool, archive log, database backup copies, and instance home directory files
- SSD/Flash external to ESS is used to hold the Protect database and active log
- Simplified directory layout with sub-directories under a single Scale file system
- The storage pool does not require many separate file systems, as is the case with the Storwize blueprint
- Protect server options DIRECTIO and DIOENABLED are turned off to maximize throughput with Scale
ESS file system configuration:
- 6TB NL-SAS drives in the GL-4 provide ~900TB usable capacity
- The ESS file system for the Protect storage pool is created using the 8+2p RAID code and a 2MB block size
- Recent testing shows improvement with a GL-6 and a larger 8MB block size
- The 3WayReplication RAID code and a 256K block size are used for metadata
- Only 1 TB is reserved for metadata, because the container storage pool creates a relatively small number of large files
- The client-side Scale cache is increased to 24 GB on the Protect server
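As a rough cross-check of the capacity figures, the RAID codes named above determine how much raw capacity is available for user data. The Python sketch below computes an upper bound for 232 x 6TB drives under 8+2p and the raw space consumed by 1 TB of 3-way replicated metadata. It deliberately ignores spare space and other overhead, which is presumably why the quoted ~900TB usable is lower than the bound; that explanation is an assumption, not something the slides state.

```python
def raid_efficiency(data_strips: int, parity_strips: int) -> float:
    """Fraction of raw capacity that holds user data for an N+P erasure code."""
    return data_strips / (data_strips + parity_strips)


def usable_upper_bound_tb(drives: int, drive_tb: float,
                          data_strips: int, parity_strips: int) -> float:
    """Upper bound on usable capacity; spare space and other overhead are not modeled."""
    return drives * drive_tb * raid_efficiency(data_strips, parity_strips)


def raw_for_replicated_tb(usable_tb: float, copies: int) -> float:
    """Raw capacity consumed when every block is stored `copies` times."""
    return usable_tb * copies


if __name__ == "__main__":
    bound = usable_upper_bound_tb(drives=232, drive_tb=6, data_strips=8, parity_strips=2)
    print(f"8+2p upper bound for 232 x 6TB drives: {bound:.1f} TB")
    print(f"Raw space for 1 TB of 3-way replicated metadata: {raw_for_replicated_tb(1, 3):.1f} TB")
```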

22 Protect + ESS configuration details
ESS is configured with an existing Scale cluster that is ready to use via a service offering.
On the IBM Spectrum Protect server:
- Install IBM Spectrum Scale and IBM Spectrum Protect software
- On Linux, build the Scale kernel portability layer
    cd /usr/lpp/mmfs/src
    make Autoconfig
    make World
    make InstallImages
    mmstartup
- Set up passwordless ssh between the ESS storage and management nodes and the IBM Spectrum Protect server
- Add the Spectrum Protect server as a node in the existing cluster (from one of the storage nodes)
    mmaddnode -N
- Tune Spectrum Scale for the container pool workload on the Spectrum Protect server node
    mmchconfig disableDIO=yes,aioSyncDelay=10,pagepool=24G -N server_ip_address

23 Protect + ESS configuration details (continued)
Create the Spectrum Scale file system:
- Create a stanza file to use for vdisk, NSD, and file system creation
    # cat /tmp/ess_vdisk
    %vdisk: vdiskName=GL2_A_L_meta_2m_1 rg=GL2_A_L da=DA1 blocksize=256k size=500g raidCode=3WayReplication diskUsage=metadataOnly pool=system
    %vdisk: vdiskName=GL2_A_R_meta_2m_1 rg=GL2_A_R da=DA1 blocksize=256k size=500g raidCode=3WayReplication diskUsage=metadataOnly pool=system
    %vdisk: vdiskName=GL2_A_L_data_2m_1 rg=GL2_A_L da=DA1 blocksize=2m raidCode=8+2p diskUsage=dataOnly pool=data
    %vdisk: vdiskName=GL2_A_R_data_2m_1 rg=GL2_A_R da=DA1 blocksize=2m raidCode=8+2p diskUsage=dataOnly pool=data
- Create NSD disks using the stanza file
    mmcrvdisk -F /tmp/ess_vdisk
    mmcrnsd -F /tmp/ess_vdisk
- Create and mount the file system
    mmcrfs esstsm1 -F /tmp/ess_vdisk -D nfs4 -B 2m --metadata-block-size 256k -A yes -L 128M -k nfs4 -m 1 -M 2 -Q no -r 1 -R 2 -S relatime -T /esstsm1 -z no
    mmmount /esstsm1
Notes:
- Caution: the stanza file is modified by the intermediate steps, so create a backup copy before starting.
- The number of data vdisks to create depends on the available capacity. The stanza example shown does not define the size= parameter for the data disks, so they take all available space in the recovery groups.
- With larger ESS sizes, we recommend creating enough vdisks that the size of any single vdisk does not exceed 50TB.
- In recent testing on a GL-6, sequential write throughput improved when the data vdisk and file system block size were increased up to 8MB.
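The 50TB recommendation translates into a simple ceiling calculation per recovery group. The Python sketch below is a hedged illustration: the function name and the example capacities are hypothetical, and splitting the capacity into equal-sized vdisks is an assumption rather than a rule from the blueprint.

```python
import math
from typing import Tuple


def plan_data_vdisks(capacity_tb: float, max_vdisk_tb: float = 50.0) -> Tuple[int, float]:
    """Return (vdisk count, size per vdisk in TB) so that no vdisk exceeds max_vdisk_tb.

    Assumes the capacity is split into equal-sized vdisks.
    """
    if capacity_tb <= 0 or max_vdisk_tb <= 0:
        raise ValueError("capacities must be positive")
    count = max(1, math.ceil(capacity_tb / max_vdisk_tb))
    return count, capacity_tb / count


if __name__ == "__main__":
    # Hypothetical data capacities per recovery group, in TB
    for capacity in (215, 450, 460):
        count, size = plan_data_vdisks(capacity)
        print(f"{capacity} TB -> {count} vdisks of {size:.1f} TB each")
```

When more than one data vdisk is created per recovery group, each stanza then needs an explicit size= value instead of taking all remaining space.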

24 Protect + ESS configuration details (continued)
- Edit the blueprint configuration response file
    cat responsefile_ess.txt
    serverscale L
    db2user tsminst1
    db2userpw passw0rd
    db2userhomedir /esstsm1/tsminst1/tsminst1
    db2group tsmsrvrs
    instdirmountpoint /esstsm1/tsminst1
    dbdirpaths /ssd/tsminst1/database/db01,/ssd/tsminst1/database/db02
    actlogpath /ssd/tsminst1/database/alog
    tsmstgpaths /esstsm1/tsminst1/deduppool
    archlogpath /esstsm1/tsminst1/database/archlog
    dbbackdirpaths /esstsm1/tsminst1/dbback
    backupstarttime 02:00
    tsmsysadminid admin
    tsmsysadminpw passw0rd
    tcpport
    servername CLIENT21
- Run the blueprint configuration script using the response file
    perl TSMserverconfig.pl responsefile_ess.txt
Notes:
- Although the configuration script can be run interactively, it is easiest to use the response file.
- Note the path to SSD for the database volumes and the active log.
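Because the database and active log must live on SSD/flash outside the ESS file system, a quick sanity check of the response file before running the script can catch a misplaced path. The Python sketch below is not part of the blueprint tooling: the parser, the helper names, and the shortened sample are hypothetical, and it assumes the simple "key value" line format shown above.

```python
from typing import Dict, Iterable, List

# Shortened sample using keys and values from the response file shown above
SAMPLE = """\
serverscale L
instdirmountpoint /esstsm1/tsminst1
dbdirpaths /ssd/tsminst1/database/db01,/ssd/tsminst1/database/db02
actlogpath /ssd/tsminst1/database/alog
tsmstgpaths /esstsm1/tsminst1/deduppool
archlogpath /esstsm1/tsminst1/database/archlog
"""


def parse_response_file(text: str) -> Dict[str, str]:
    """Parse 'key value' lines; a key without a value maps to an empty string."""
    config: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if parts:
            config[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return config


def keys_on_ess(config: Dict[str, str], ess_mount: str,
                keys: Iterable[str] = ("dbdirpaths", "actlogpath")) -> List[str]:
    """Return the keys whose comma-separated paths fall under the ESS mount point."""
    prefix = ess_mount.rstrip("/") + "/"
    offending: List[str] = []
    for key in keys:
        paths = [p for p in config.get(key, "").split(",") if p]
        if any((p.rstrip("/") + "/").startswith(prefix) for p in paths):
            offending.append(key)
    return offending


if __name__ == "__main__":
    cfg = parse_response_file(SAMPLE)
    print(keys_on_ess(cfg, "/esstsm1"))  # an empty list means DB and active log are off the ESS
```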

25 References
IBM Knowledge Center
- IBM Spectrum Scale:
- IBM Spectrum Protect:
IBM Spectrum Protect blueprints
Petascale Data Protection
  https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Tivoli%20Storage%20Manager/page/Petascale%20Data%20Protection
Overview on Spectrum Protect – Spectrum Scale Integration
  https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager/page/Integrating%20IBM%20Tivoli%20Storage%20Manager%20with%20IBM%20Elastic%20Storage
Configuration of Spectrum Protect for Spectrum Scale AFM
  https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager/page/Configuring%20IBM%20Spectrum%20Scale%20Active%20File%20Management
Spectrum Protect for Space Management whitepaper
- Setup policy driven threshold migration:
- Setup cross platform cluster:
YouTube
- IBM Spectrum Protect - mmbackup general functions
  https://youtu.be/3PMO4Sdegs0
- IBM Spectrum Protect - mmbackup tweaks for max performance
  https://youtu.be/sg4FrZHi99Y
- IBM Spectrum Protect using Scale for db, logs & storage pools
  https://youtu.be/vIobC2MDIlE

26 Thank you