Transcript: SnapCenter for SAP HANA Data Protection
Welcome to the Webinar
Hello everyone, and good morning or good evening, depending on where you are located. We welcome you all to the webinar on “NetApp SnapCenter tool to protect SAP HANA Database.” We have successfully deployed and configured the SnapCenter tool to protect our on-prem SAP HANA database and leveraged the useful features that are available as part of SnapCenter. The SAP HANA database is critical for our SAP CRM application, which is used by our global users 24/7, 365 days a year.
In this webinar, we will share insights on the points that led us to select the SnapCenter tool, along with a few success stories. The highlight of this webinar is the demonstration of some common use cases using our SnapCenter tool. We hope you will find this session useful. We welcome any questions and would be happy to address them.
Hosts
My name is Rajan Vaidheeswaran, and I am a systems architect in the Application Platform Delivery team of NetApp IT. I mainly focus on identifying, strategizing, and delivering critical initiatives within the Application Platform Delivery team. I joined NetApp 10 years ago and had about 15 years of experience in IT before joining NetApp.
At this point, I would like to introduce my teammates Shreyas and Silar, who have contributed to the successful deployment of SnapCenter. Shreyas will go first, followed by Silar.
Shreyas: I am Shreyas, and I have 8 years of overseas experience in SAP projects. I joined the Platform and Engineering team at NetApp a year ago. As a system architect, my key focus is on delivering architectural solutions for the SAP ecosystem, which includes planning, assessing SAP deliverables, and evaluating new solutions that bring business and IT benefits. I work closely with the SAP competency center team on adopting the SnapCenter tool for NetApp’s SAP infrastructure and on leveraging its benefits for backup and restore operations, HA/DR solutions, and cloning options.
Agenda
As you can see, we have a packed agenda to cover. We will start with a brief explanation of the key elements of a NetApp storage system, such as snapshots, SnapMirror, SnapVault, and FlexClone. We think this will help you relate to the topics that we will be covering in this webinar.
We will talk about business challenges and the typical IT charter for data protection. We will share an overview of our on-prem SAP CRM system and its ecosystem, our data protection requirements, and the possible solution options that we assessed before deciding on SnapCenter. We will then share highlights of SnapCenter and how we deployed it at NetApp. We will share our backup architecture and our backup and retention schedule. We will cover backup and retention policies during our demonstration.
Before starting the system demonstration, we will share key metrics as part of our success stories.
Our system demonstration section will cover 4 key areas: a) how to set up and configure SnapCenter, b) how to take a snapshot, c) how to restore a complete system and a tenant, and d) how to clone a new system. We will have 2 Q&A checkpoints: one before the demonstration starts and a final one after the demonstration. We will share a few useful reference materials before ending this session.
Key elements of NetApp Storage System
As I mentioned earlier, we will start with the brief explanation of key elements of NetApp Storage system such as snapshots, SnapMirror, SnapVault and FlexClone.
What is a snapshot? A snapshot is a point-in-time image of the data. Note that we are not copying the entire data set (for example, 3 TB in our case); we are only recording the pointers to the data blocks. As a result, storage overhead is minimal and the time to recover the system is reduced drastically.
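To make the "recording pointers, not copying data" idea concrete, here is a minimal Python sketch. It is a toy model only, not how ONTAP is implemented; the class name, block layout, and snapshot name are all hypothetical.

```python
class Volume:
    """Toy volume: data lives in a shared block store, and the active
    file system is just a map from logical block number to block id."""

    def __init__(self):
        self.block_store = {}   # block id -> bytes (the physical blocks)
        self.active = {}        # logical block number -> block id
        self.snapshots = {}     # snapshot name -> frozen copy of the pointer map
        self._next_id = 0

    def write(self, lbn, data):
        # New data always goes to a new physical block, so blocks that a
        # snapshot points to are never overwritten.
        block_id = self._next_id
        self._next_id += 1
        self.block_store[block_id] = data
        self.active[lbn] = block_id

    def snapshot(self, name):
        # Only the pointer map is copied, not the data blocks themselves.
        self.snapshots[name] = dict(self.active)

    def restore(self, name):
        # Restoring swaps the pointer map back; no data is copied.
        self.active = dict(self.snapshots[name])

    def read(self, lbn):
        return self.block_store[self.active[lbn]]


vol = Volume()
vol.write(0, b"customer-A")
vol.write(1, b"case-100")
vol.snapshot("hourly.0")            # hypothetical snapshot name
vol.write(1, b"case-101")           # a change made after the snapshot
blocks_before_restore = len(vol.block_store)
vol.restore("hourly.0")
```

In this toy model, taking or restoring a snapshot touches only the small pointer map, which is why both operations stay fast regardless of how much data the volume holds.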
What is SnapMirror? In simple terms, it is replication. We can copy data either within the local site (intracluster replication) or to a remote site (intercluster replication) for different use cases such as HA or DR.
What is SnapVault? It is another form of replication, used primarily for archiving purposes. SnapMirror keeps a replica of the source, which means that when we delete a snapshot on the source, it gets removed from the target as well. With SnapVault, in contrast, snapshots can be retained for a longer duration, for example to recover a system with last year's data for a SOX audit.
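The difference in retention behavior described here can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions (snapshots are plain names ordered oldest first, and the function names are hypothetical); it does not represent the actual SnapMirror or SnapVault transfer logic.

```python
def mirror_update(source_snaps):
    """Mirror-style target: always an exact replica of the source list."""
    return list(source_snaps)


def vault_update(target_snaps, source_snaps, keep):
    """Vault-style target: add snapshots that are new on the source, then
    keep the newest `keep` entries, regardless of deletions on the source."""
    merged = list(target_snaps)
    for snap in source_snaps:
        if snap not in merged:
            merged.append(snap)
    return merged[max(0, len(merged) - keep):]


# First transfer: the source holds three snapshots.
source = ["snap1", "snap2", "snap3"]
mirror = mirror_update(source)
vault = vault_update([], source, keep=10)

# Later: snap1 is deleted on the source and snap4 is created.
source = ["snap2", "snap3", "snap4"]
mirror = mirror_update(source)
vault = vault_update(vault, source, keep=10)
```

After the second update the mirror no longer contains snap1, while the vault still does, which is the property that makes the vault suitable for long-term retention.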
What is FlexClone? This is a powerful feature that helps us create a new test or dev system quickly from one of the snapshots. We can create multiple systems from one snapshot, and the storage overhead is minimal.
Hope you find these definitions useful.
The Challenge
We all know that with the advent of IoT devices, data is growing exponentially. On top of that, users expect continuous availability and consistent performance levels 24/7, even in the face of ever-increasing volumes of data. We also see more frequent releases (thanks to Agile and Scrum methodologies), which call for quick provisioning of test and dev systems with production data. In addition, organizations need to adhere to compliance standards, such as periodically demonstrating and documenting disaster recovery of business-critical applications.
IT Charter
How do we address these challenges? IT organizations understand the downside of downtime caused by hardware failures, disasters, and so on. System backups cannot paralyze operations or enterprise applications. As a result, backup windows are shrinking while the amount of data to be backed up is increasing. It is important to come up with a backup strategy that relies on easy-to-use tools to deliver instant backup, recovery, and scheduling, and to manage all these aspects securely from any physical location. The backup solution should be designed to ensure minimal or zero data loss, based on the company’s tolerance for data loss.
It is equally important to provision a test or development system with production data quickly. These are the common objectives of any IT function these days.
NetApp IT Use Case
Minimizing downtime was a top concern when NetApp upgraded our SAP CRM on-prem landscape in late 2018. We chose NetApp SnapCenter®, a data protection solution that encompasses NetApp Snapshot™, NetApp SnapMirror®, and NetApp SnapVault® technologies.
We assessed other possible solution options and decided to use SnapCenter for the reasons that are covered in the upcoming slides.
About SAP CRM Deployment
SAP customer relationship management (CRM) software is a critical business application for tracking and managing NetApp customer support cases. It is used 24/7 by approximately 7,000 users ranging from engineers to support teams to external users across geographical locations. It is tightly integrated with other critical business systems such as NetApp® AutoSupport®, Oracle E-Business Suite, the NetApp Support Site, the data warehouse, mobile device management, and Oracle Fusion Middleware, to name a few.
In the next slide, we will take a deep dive into the SAP CRM ecosystem to see the systems that integrate with CRM.
As mentioned earlier, the SAP CRM on-prem landscape went through a major upgrade in December 2018: from the IBM AIX operating system to Red Hat Enterprise Linux, from Oracle to SAP HANA, and from CRM 5.2 to 7.0 EHP4. The current SAP CRM landscape runs on SAP NetWeaver 7.5.
We have 1 production system for each of CRM and the other SAP applications, and 5 non-production systems for each production counterpart. We periodically clone the CRM non-production systems from production.
SAP CRM On-Prem Production & Boundary Systems
As mentioned earlier, CRM is a business-critical application used by global teams 24/7, 365 days a year. You will see multiple application servers, which provide redundancy and scalability. Users access CRM through a web application after they are successfully authenticated by our corporate identity application.
The applications in the green and blue boxes exchange data with CRM. Applications in the green boxes use SAP Process Orchestration (SAP PO), and applications in the blue boxes use Oracle Fusion Middleware, which in turn is integrated with SAP PO. In a nutshell, SAP PO is the middleware application that integrates SAP CRM with other applications.
Our data warehouse systems pull data from the HANA database for analytical and reporting purposes. In addition, the CRM application pushes the required data to another data warehouse database called PDW, which is used by custom applications that need CRM data for read-only purposes.
Please note that the SAP HANA primary database is in scope for the data protection use case.
Data Protection Requirements
We will now review our data protection requirements for SAP HANA.
Our key objective for a backup and recovery solution is that it should not impact the performance and availability of production. It should be a comprehensive solution, which means that it should protect both production and non-production systems.
It should have features such as cloning non-production systems on demand, provisioning a new system, and testing disaster recovery. Above all, it should be cost-effective, scalable, reliable, and easy to use, learn, and support. We assessed 3 possible solutions, which are discussed at length in the upcoming slide.
Comparison of Possible Options
As you can see in the table, we pursued 3 possible solution options: A) native HANA backup and recovery, B) HANA replication, and C) the NetApp SnapCenter tool. We assessed our data protection requirements against each of these 3 options.
While native backup addressed all requirements, we observed a performance impact while the backup was running. Also, it took around 6 hours for the backup to complete at the local site. It took another 12-16 hours for this backup to be available at the non-production site for cloning and DR testing, because the data needs to be copied over the wide area network.
The HANA replication option addressed only one requirement (HA), and it carries a significant cost for hardware, licensing, and maintenance and support. We chose SnapCenter as our solution. We will share the details of how we arrived at this option in the next slide.
Why We Chose SnapCenter
NetApp SnapCenter addressed all requirements and incurred no additional licensing cost. It did not require a dedicated host and did not affect performance during backups. Time to take & replicate backups was quick, in minutes rather than hours.
It encompasses NetApp technologies such as Snapshot, SnapMirror, and SnapVault under the covers. We will see how these technologies enable us to address our data protection requirements.
About SnapCenter Tool
SnapCenter is a self-service web application that can be accessed securely from anywhere, provided there is an internet connection. If necessary, one can enable single sign-on with an identity provider for authentication.
We have enabled single sign on with our corporate Identity provider application.
SnapCenter integrates with SAP HANA databases through plug-ins to deliver data protection and flexible provisioning of SAP environments. It provides critical functions such as scheduling and managing backups, cloning a sub-production environment with production data, and testing disaster recovery without affecting system availability and performance.
The tool is bundled with customizable dashboards and reports. It has built-in workflows, and some of the common ones are a) Backup & Restore, b) SAP System Refresh, c) SAP System Copy, and d) SAP System Clone.
Backup & Restore: The backup workflow allows us to take snapshots. The restore workflow allows us to recover a complete system or a single tenant.
System Refresh copies data to an existing host. A common use case is refreshing an existing system such as Dev, QA, or Stage with production data.
System Copy provisions a new system. A common use case is a sandbox system.
System Clone delivers an identical copy of the source system. A common use case is delivering a new production system when the current production system is unusable, for example due to hardware issues. Connection attributes such as the hostname and system identifier (SID) remain the same as on the source system, so that users can continue to access it without any changes.
In our demonstration, we will be showing SnapCenter setup, backup & restore (both full system and single tenant) and System Refresh use cases.
SnapCenter Deployment at NetApp
Shreyas: Though there are other deployment options available, we chose a standalone deployment. We have deployed SnapCenter version 4.3.1 on a Windows server.
Plug-ins are pushed by SnapCenter to the HANA database server hosts that are configured in SnapCenter. We use one SnapCenter instance to maintain and manage all our SAP CRM production and non-production systems. We configured the required storage systems, hosts, data protection policies, resource groups, and resources.
We needed to engage our storage team to create the SnapMirror and SnapVault relationships for the storage volumes.
Backup Architecture
We have 2 geographically separated data centers to ensure business continuity in times of disaster. One data center hosts the production systems and the other hosts the non-production systems. As part of our data center and business continuity strategy, the stage systems in the non-production data center are sized the same as production so that they can be repurposed as production when necessary.
We have implemented SAP HANA replication to provide high availability for the SAP HANA database in production. We take snapshots and file-system-based HANA backups using SnapCenter, and transaction backups using the Unix system scheduler, cron.
We will go over backup schedule in the next slide.
We have 2 dedicated storage systems in the non-production data center. The DR storage is maintained exclusively to support business continuity in case of a disaster at the production site. The non-production storage supports the test, stage, and dev systems that developers and testers primarily use.
We have configured asynchronous SnapMirror replication to sync data from the production to the non-production data center. In addition, we have configured SnapVault to store snapshots. As I shared earlier, SnapVault gives us the flexibility to maintain a different retention policy so that we can keep snapshots for an extended period of time.
We sync the DR storage every 4 hours; the sync to the non-production storage is triggered as soon as a snapshot is created at the production site. We sync transaction backups to the non-production storage every 30 min. In the next slide, we will review the backup schedule we have configured in SnapCenter.
Current Backup and Retention Schedule
We use the operating system scheduler, cron, to back up SAP HANA transactions every 15 min. For the rest, we use SnapCenter to protect both database and non-database storage volumes. As you can see, we have configured database snapshots every 4 hours and a full HANA backup once a week.
Transaction backups are synced every 30 min, at the 10th and 40th minute of each hour.
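To see how these two schedules interact, here is a small Python sketch that estimates how many minutes of transactions would be missing at the non-production site if production were lost at a given minute. It rests on assumptions that are not in the talk: the cron job is aligned to minutes 0, 15, 30, and 45, backups and syncs complete instantly, and a sync picks up any backup taken at or before the sync time. In cron syntax these schedules would typically be written with the minute fields `*/15` and `10,40`.

```python
LOG_BACKUP_MINUTES = (0, 15, 30, 45)   # assumed alignment of the 15-min cron job
SYNC_MINUTES = (10, 40)                # from the talk: 10th and 40th minute


def latest_at_or_before(minute, marks):
    """Latest time at or before `minute` (minutes since midnight, negative
    values meaning the previous day) whose minute-of-hour is in `marks`."""
    t = minute
    while t % 60 not in marks:
        t -= 1
    return t


def remote_data_exposure(failure_minute):
    """Minutes of transactions not yet at the remote site at failure time."""
    last_sync = latest_at_or_before(failure_minute, SYNC_MINUTES)
    last_backup_in_sync = latest_at_or_before(last_sync, LOG_BACKUP_MINUTES)
    return failure_minute - last_backup_in_sync


# Exposure for every whole minute of one hour, under the assumptions above.
exposure_by_minute = {m: remote_data_exposure(m) for m in range(60)}
worst_case = max(exposure_by_minute.values())
best_case = min(exposure_by_minute.values())
```

Under these simplified assumptions the exposure ranges from 10 to 39 minutes depending on when the failure happens; real figures depend on transfer times and on how the jobs are actually aligned.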
We have also configured SnapCenter to protect non-data volumes such as the CRM app servers, the SAP HANA software and its libraries, and so on. This table also shows which snapshots need to be replicated to DR storage versus non-production storage, along with the snapshot retention policies.
Key Metrics
For a 3 TB HANA database, we observed that SnapCenter took around 2-3 min to take a snapshot, compared with the 5-6 hours that the HANA database tool typically takes for a backup.
To recover the complete system, including all tenants, SnapCenter took around 27 min. For a single tenant, it took around 27 min as well, of which 3 min was required to restore the snapshot. The rest of the time was needed to bring up the HANA database as part of the recovery, which was performed outside SnapCenter.
SnapCenter has built-in workflows that take care of all the steps to recover a complete system in 1 click. For tenant recovery, in contrast, we need to perform a few tasks outside SnapCenter, such as recovering the tenant after SnapCenter has restored the snapshot.
We will go over recovery process as part of our demonstration.
Because the snapshots are replicated to the non-production site, we can use a SnapCenter workflow to clone an existing system with production data in 45 min. We need around 3 min to restore the snapshot and the rest for HANA-related tasks, such as recovering and bringing up the HANA database.
We will go over System Refresh as part of our demonstration
We successfully conducted a DR simulation test without impacting production, using the SAP System Refresh workflow as explained just now. We did a point-in-time recovery up to the latest transaction that was available at the non-production site during the recovery.
Before we go over the next metric, let me clarify: what are RPA and RTA?
RPA (recovery point actual) is a metric that represents the actual amount of data an organization would lose, should there be a disaster, when the application is restored from backups.
RTA (recovery time actual) is a metric that represents the actual time an organization would spend to recover an application.
We documented an RPA of 13 min of data loss and an RTA of 3 h 15 min. Out of this 3 h 15 min, we spent 2 hours on post-recovery steps and on technical and functional validations. Note that it took only 15 min to restore the database from the snapshot.
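The RTA breakdown can be checked with simple arithmetic. The short Python sketch below assumes that the 15-minute snapshot restore is counted inside the 3 h 15 min total; the talk does not itemize the remaining time, so it is only computed here, not explained.

```python
# Figures quoted in the talk, converted to minutes.
rta_total_min = 3 * 60 + 15        # RTA of 3 h 15 min
post_recovery_min = 2 * 60         # post-recovery steps and validations
snapshot_restore_min = 15          # restoring the database from the snapshot

# Assumption: the restore time is part of the RTA total.
remaining_min = rta_total_min - post_recovery_min - snapshot_restore_min

# Share of the total RTA spent on the snapshot restore itself.
restore_share = snapshot_restore_min / rta_total_min
```

Under that assumption, about 60 minutes of the RTA is not itemized, and the snapshot restore itself accounts for less than a tenth of the total recovery time.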
System Demonstration
We are now moving into the system demonstration section of this webinar. We plan to a) show how to set up SnapCenter with master data, such as adding storage systems, hosts, and backup policies, b) show the monitoring, reporting, and dashboard capabilities, c) demonstrate how to take a snapshot, d) demonstrate recovering a full system followed by recovering a tenant, and e) demonstrate cloning an existing system from a production snapshot.
My colleague Shreyas will drive these demonstrations. Over to Shreyas.
Shreyas: Let’s have a look at how we can do the setup in the SnapCenter tool. We log in to SnapCenter and land on the Dashboard tab. On the left, you can see many important tabs.
Let me walk you through how we can add a storage system to SnapCenter. Click the Storage Systems tab, where you see the preconfigured storage server and its IP address. To add a new storage system, just click New.
Let’s see how this existing system is configured. Here we can see that the storage system name needs to be provided at the time of configuration, and you need a username with the proper privileges to connect to the storage system. For example, here we have used snapctr. Under the auto-settings, you can see that the Event Management System and AutoSupport settings are both used to send error notifications and to record event logs.
By clicking Submit, we complete the one-time configuration of the storage system in SnapCenter, and you can alter these settings at any time by clicking the Reset tab.
Adding Host:
Let’s see how to add a host in SnapCenter. Click the Hosts tab, then click “Add” in the top right corner. Here we can see options for the host type, the host name or IP address, and the credentials to connect to our host. Below, we can see the plug-ins, where we are going to select SAP HANA.
Then click Submit, and SnapCenter adds the host and deploys the plug-in accordingly.
This is a specialty of the SnapCenter tool: plug-ins are installed automatically once the host is detected, with no manual intervention needed. In the Monitor tab, we can see the job status of the installation. Now you can see the overall status is “Running”.
Defining Policies:
Click Settings, then click “New”. This opens a workflow. Define a policy name, for example “daily backup”, and a description if you wish; here we are leaving it blank. We set the backup type to snapshot based and, for the schedule frequency, we select daily. Click Next.
Here, under the retention settings for on-demand backups, we keep 7 snapshots. Under the daily retention settings, we also select 7 days for snapshot backups. Click Next.
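The two retention styles chosen here, keeping a fixed number of snapshots versus keeping snapshots for a number of days, can be illustrated with a short Python sketch. This is our own simplified model, not SnapCenter's internal logic; the function names and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def keep_by_count(snapshot_times, keep):
    """Count-based retention: keep only the newest `keep` snapshots."""
    ordered = sorted(snapshot_times)
    return ordered[max(0, len(ordered) - keep):]


def keep_by_age(snapshot_times, now, days):
    """Age-based retention: keep snapshots younger than `days` days."""
    cutoff = now - timedelta(days=days)
    return [t for t in sorted(snapshot_times) if t > cutoff]


now = datetime(2020, 12, 10, 12, 0)                       # sample "current" time
on_demand = [now - timedelta(hours=h) for h in range(10)]  # 10 on-demand snapshots
daily = [now - timedelta(days=d) for d in range(10)]       # 10 daily snapshots

kept_on_demand = keep_by_count(on_demand, keep=7)   # mirrors "keep 7 snapshots"
kept_daily = keep_by_age(daily, now, days=7)        # mirrors "7 days"
```

In this model both settings happen to leave 7 snapshots, but they behave differently when backups are skipped or taken more often: the count-based rule always keeps 7, while the age-based rule keeps whatever falls inside the 7-day window.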
Replication setting:
Here we can configure the SnapMirror and SnapVault settings. We will learn more about these two terms in the Refresh demo. Since we are not doing any replication here, we leave them unchecked and proceed.
Summary:
The summary shows all the settings that we selected in the previous steps. Click Next to create the policy. The policy is now created.
Adding resources:
Click Resources. Here we can see that the HANA system is auto-detected. This is one of the main features of the SnapCenter tool. Click the detected HANA system.
Now we need to configure the HANA database. This step is very simple: the host and OS user are auto-detected, and we only need to provide the HDB user store key.
Normally, the HDB user store key is configured by the HANA administrator. SnapCenter needs this key to read backups and update the backup catalog. Here we have configured “SNAP” as the key; click “OK”. The database is configured. This is a one-time setup.
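For context, a user store key such as “SNAP” is created on the HANA host with the `hdbuserstore SET` command, which takes a key name, a host:port pair, a database user, and a password. The Python sketch below only builds that command line for review and does not run it. The host name, instance number, and database user are hypothetical placeholders, and the `3<instance>13` port pattern is the usual SQL port of the system database in a multitenant system; please verify it for your landscape.

```python
import shlex


def hdbuserstore_set_command(key, host, instance_number, db_user,
                             password_placeholder="<password>"):
    """Build (but do not execute) an `hdbuserstore SET` command line."""
    # Assumption: system database SQL port follows the 3<instance>13 pattern.
    sql_port = f"3{instance_number:02d}13"
    return ["hdbuserstore", "SET", key, f"{host}:{sql_port}",
            db_user, password_placeholder]


# Hypothetical values for illustration only.
cmd = hdbuserstore_set_command(
    key="SNAP",                    # key name used in the demo
    host="hana-host-01",           # hypothetical host name
    instance_number=0,             # hypothetical instance number
    db_user="SNAPCENTER_USER",     # hypothetical database user
)
printable = shlex.join(cmd)        # shell-quoted string for documentation
```

The key would normally be created as the operating system user that the plug-in runs under, so that SnapCenter can look it up without storing the database password itself.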
This is the page which shows all settings for this particular HANA system.
Now let’s configure Resource protection and attach policy for this HANA resource.
There are five steps in this workflow. Let’s start with Resource. Here we can customize the policy label. If you don’t want to, just leave it blank and click Next.
In the Application step, we can add commands, scripts, and custom configuration to be executed before and after snapshot backups, and custom snapshot tools can be selected based on requirements. You can explore more in the standard NetApp documentation.
Click Next to add policies. Select a policy from the drop-down box. These policies were defined earlier in the Settings tab, or we can define a policy here as well. Configure the schedule for the policy, where we also have the option to end the schedule after a certain period. Click OK and click Next.
In Notification, we can trigger emails to a defined email ID in case of errors, or inform team members about all events. Click Next, and the summary shows an overview of the parameters selected in the previous steps. Click Finish, and the resource is configured.
Here you can see the summary card, which records backup events, shows primary backups, and shows available backups. Above it, you can see other useful tabs.
Snapshot Recovery Workflow
This workflow depicts the steps for the 3 available use cases: a) single tenant with manual recovery, b) complete resource with automated recovery, and c) complete resource with manual recovery.
The boxes in dark blue depict SnapCenter steps and those in light blue depict manual steps. As you can see, when you follow complete resource with automated recovery, you are given 3 additional options: a) recover to the most recent state, b) point-in-time recovery, and c) recover to a specific backup. For the other workflows, these options are presented in SAP HANA Studio when you perform the recovery.
For the purposes of this demo, we will show tenant recovery and complete resource with automated recovery.
Recovery Demo: Full (Shreyas)
Complete Resource:
Let’s have a look at how we can restore HANA with the help of the SnapCenter tool and a local snapshot backup.
Select your backup based on the need and situation. Click the Restore tab.
We can see 6 steps for this procedure.
In Restore scope, we need to select the restore type; for this use case, we will select Complete Resource. Complete Resource includes all database tenants. If you choose the Volume Revert option, old snapshot backups attached to that volume will be deleted, and we will not have the option to restore to another backup, so the choice depends on the use case. Click Next.
Recovery scope:
Choose your type of recovery based on the use case. Here we have selected “recover to most recent state”.
The log backup location is auto-detected. This is one of the highlighted features of the SnapCenter tool: it detects the log path automatically with the help of the plug-ins.
The Pre-Ops and Post-Ops steps are for restore commands that are specific to your system; if needed, you can enter them here. This saves the user from logging in to the OS to perform OS tasks, which is one of the SnapCenter features I want to highlight. Click Next.
In Notification, you can enter the email ID and subject to notify team members about progress. Click Next.
The summary shows all the parameters that we entered in the previous steps. Click Finish.
The restore job has started; let us monitor it. You can see the validation of the plug-ins and the stopping of the HANA DB from the SnapCenter tool itself. As you can see, the recovery of the system database has now started, and it took about 3 min.
The recovery of the tenant database started and took 17 min, and the HANA system is now started.
The restore took 20 min, and we notice that the entire activity was performed from the SnapCenter tool itself. SnapCenter is intelligent enough to take care of the complete resource restore activity.
Let’s verify HANA DB now, we can see all services started here and Tenant Database services also started successfully.
Recovery Demo: Tenant (Shreyas)
Let’s see how we can restore a tenant database from the SnapCenter tool.
Select your desired backup and click the Restore tab in the top right corner.
This opens a workflow of 6 steps.
Let us look at each step:
- Restore scope: Here we need to select Tenant Database. Under Tenant Database, we can see the tenants detected in the HANA system. These are auto-detected by the SnapCenter tool. Before you proceed further, please stop the tenant database in HANA Studio. Click Next.
- Recovery scope: We need to select No Recovery because, as of today, we need to recover the tenant database from HANA Studio. Click Next.
- PreOps and PostOps: In these steps, we can enter commands specific to your system to run before and after the restore. It is up to the customer whether to use this option.
- Notification: Here we can add details for sending emails to team members for notifications.
- Summary: Here we can see all the parameters entered in the previous steps.
Restore Job started:
Even at the operating system level, we can see that the data volume is created. Now log in to HANA Studio or the graphical interface. Click Recover Tenant Database. Select the tenant database that we have to recover. Click Next. Here, specify the recovery type.
Since we are demonstrating recovery, let’s select “Recover the database to its most recent state.” We need the logs to recover, right? Click Next.
Here we need to specify the backup catalog path for the recovery. The backup catalog enables SAP HANA to determine whether recovery is possible and which backups to use to recover the database. Click Next.
We need to select a backup. The snapshot backup will be detected here.
Here we can see the external backup ID, which we selected in the SnapCenter tool. Select it and click Next. Specify the log backup location and click Next. Review the recovery settings and click Finish.
The recovery of the tenant database has now started. We can see the recovery progress at the operating system level. The recovery completed successfully. We can see the recovery time and log position information. Click Close.
Let’s verify the system from HANA Studio; we can see that all processes have started in the tenant DB.
SAP System Refresh
As you can see, the boxes in green depict steps outside SnapCenter and the boxes in blue depict steps within SnapCenter.
The process starts with exporting any data that needs to be preserved, such as user master data and transports.
We need to stop the SAP system to start the clone process. We then mount the cloned volume, recover the SAP HANA database, and start the SAP system.
The next few steps are for importing the data and running the post steps that are relevant for the application.
Note that we will be using HANA Studio to show the recovery and the starting of the HANA database, though we could have scripted these steps and used them in SnapCenter under the post-recovery steps. We chose the manual method for now and plan to automate these steps in the future.
Shreyas, you can take over from me for the demonstration of SAP system Refresh use case.
Shreyas: Under Location, select the target system and the NFS export IP. This is required because the clone volume will be created on the storage system that is linked to our target system. Click Next.
In the Script section, we need to enter the mount command. As you can see, we are pointing SnapCenter to the /root/sc-mount.sh file. This setting lets us get the clone volume name at the operating system level, so there is no need to log in to the storage system to get this information. Click Next. The Notification tab can be used to send notifications.
The summary shows an overview of all the parameters.
Let’s monitor the cloning job. Cloning completed in 1.5 min.
Log in to the operating system, check the sc-mount.sh script, and go to the path /tmp/sc-mount.txt.
We can see that the clone volume is created. Now stop the target database.
Unmount the data volume and mount it again with the newly created clone volume. The next step is to change the ownership of the data file system to the target system SID.
This completes the operating system activities.
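The operating system steps just described (unmount, mount the clone, change ownership) could be scripted along the following lines. This Python sketch only assembles the commands for review and does not execute anything. The mount point, NFS server address, clone volume name, and SID are hypothetical placeholders, and it assumes the clone is exported under a path equal to its volume name and that the data files belong to the usual `<sid>adm` user and `sapsys` group.

```python
def refresh_os_commands(data_mount, nfs_server, clone_volume, sid):
    """Return the OS commands for swapping in a cloned data volume.

    The commands are returned as argument lists and are not executed.
    """
    sidadm = f"{sid.lower()}adm"            # conventional SAP admin user name
    return [
        ["umount", data_mount],                                       # detach old data volume
        ["mount", "-t", "nfs", f"{nfs_server}:/{clone_volume}", data_mount],  # attach the clone
        ["chown", "-R", f"{sidadm}:sapsys", data_mount],              # hand files to target SID
    ]


# Hypothetical values for illustration only.
commands = refresh_os_commands(
    data_mount="/hana/data/QS1/mnt00001",
    nfs_server="192.0.2.10",
    clone_volume="clone_volume_name_from_sc_mount_txt",
    sid="QS1",
)
for command in commands:
    print(" ".join(command))
```

In practice the clone volume name would be read from the file written by the mount script rather than typed in, and the target database must already be stopped before the first command runs.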
Now login to HANA studio or graphical interface.
Step 1: Here we recover the system database. Select “Recover the database to a specific data backup”.
Select “Recover without backup catalog”.
Under “Specify backup type”, select Snapshot. The available snapshot will be fetched, and we can start the recovery of the system DB.
After this, reset the system DB password at the operating system level. Now recover the tenant DB. The tenant recovery completed in 20 min.
Now let us check the system status in HANA Studio, where you can see that all services started successfully.
Reference Materials
Hope you found this session useful.
We are sharing a few reference materials. The first 2 links are for the documents that we followed to configure SnapCenter. We found these documents very useful for setting up and configuring SnapCenter and for understanding the different built-in workflows.
In the other 2 links, you will find our blog and a white paper on how we protected our SAP HANA system using SnapCenter. We would like to thank you all for attending this session and hope to see you all in one of our future sessions soon. Wishing you all a very happy holiday season and new year.