Computing at Cornell Servers and Hosts

Cornell's Enterprise Server (CornellC)
Backup Documentation

Ken Frost
June 2007

Introduction

This document explains the current scheme used by CIT for backing up the CornellC Mainframe system.

For production MVS we do three kinds of backups – dataset backups, media failure/disaster recovery backups, and ADABAS database backups. We separately back up the RACF database dataset – SYS1.RACFPRIM – as part of the daily and weekly Vanguard review of RACF changes and violations. For production VM we do file backups and media failure/disaster recovery backups.

For the A51MVS test MVS and A51VM test VM LPARs we do media failure/disaster recovery backups only.

Dataset (MVS) and file (VM) backups are done so that we can restore lost, incorrectly deleted, or corrupted files. The separate backup of the RACF security database dataset makes sure that we have a good backup of this important system resource. Media failure/disaster recovery backups are done so that we can recover in the event of catastrophic loss of a disk volume or volumes. Special backups of the VM directory and the current VM NSSs/DCSSs are done to make sure we have a good backup of these important system components. ADABAS database backups are done for three reasons: so that we can restore lost, incorrectly deleted, or corrupted database files or records; so that we can recover in the event of catastrophic loss of a disk volume or volumes containing database data; and for legal reasons.

In addition to the above there are regular application level backups that are done by various user areas, e.g. the Financial Aid Office. Those backups are beyond the scope of this document.

Production MVS Backups

Dataset backups

At around 4:30 AM each morning, Monday through Friday, we back up disk datasets that have changed since the previous backup run; see Daily incremental dataset backups, below. With this scheme, if a dataset is accidentally destroyed or deleted, at most one day of changes to the dataset will normally be lost. Because no incremental backup runs on Saturday or Sunday morning, two days of changes can be lost if the dataset is changed on Friday and Saturday or on Saturday and Sunday, and three days of changes can be lost if it is changed on Friday, Saturday, and Sunday. In addition, we can restore previous versions of a dataset (up to 12) if those versions were changed within the previous 3 months. Note, though, that only the last version per day is saved (Sunday through Thursday), and only the last version produced Friday, Saturday, or Sunday is saved. If a dataset is deleted, only the last backup copy is kept, and it is retained for 90 days. However, if another dataset is created with the name of the deleted dataset and this second dataset is subsequently deleted, the backup of the original dataset is lost.
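The exposure window described above can be worked out mechanically. The following is a minimal sketch, assuming runs start at exactly 4:30 AM on Monday through Friday; the function names are hypothetical and are not part of any CIT tooling.

```python
from datetime import datetime, time, timedelta

BACKUP_TIME = time(4, 30)          # from the text: runs start around 4:30 AM
BACKUP_WEEKDAYS = {0, 1, 2, 3, 4}  # Monday=0 .. Friday=4


def previous_backup_run(moment: datetime) -> datetime:
    """Return the start of the most recent incremental run at or before `moment`."""
    candidate = datetime.combine(moment.date(), BACKUP_TIME)
    if candidate > moment:
        candidate -= timedelta(days=1)
    # Step back over Saturday and Sunday, when no incremental run is scheduled.
    while candidate.weekday() not in BACKUP_WEEKDAYS:
        candidate -= timedelta(days=1)
    return candidate


def exposure(moment: datetime) -> timedelta:
    """How much change history is unprotected if a dataset is lost at `moment`."""
    return moment - previous_backup_run(moment)


# A loss early on Monday, June 11, 2007, before that morning's run:
print(exposure(datetime(2007, 6, 11, 4, 0)))
```

A loss just before the Monday run falls back to Friday's run, which is the three-day case mentioned above.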

We also do a “dump” of all datasets on nine operating system disk volumes each Sunday night around 10 PM, and a daily backup of the RACF security database dataset, SYS1.RACFPRIM. See Weekly System volumes dataset dumps and Daily MVS RACF database backups, below. Once a week a copy of the most recent RACF database backup is made. See Weekly MVS RACF database backups, below.

Daily incremental dataset backups

We do incremental disk dataset backups every Monday through Friday starting at 4:30 AM. Up to 12 previous versions of a dataset are kept (1 version a day, no more than 12 versions). These are done to the 3494 tape robot on the 6th floor of Rhodes Hall. These backups are kept for no more than 90 days. When a dataset is deleted, all backup copies except the last are deleted. The last copy is retained for 90 days.

Some datasets are not included in these incremental backups: any datasets that are open when the backups run, and all datasets in management class NOBACKUP (ADABAS and operating system support files).

Notes about migration and backup

Inactive datasets are migrated to the 3494 tape robot at CCC by the MVS Hierarchical Storage Manager (HSM). Before the dataset is migrated, a backup copy of the dataset is automatically created (if a backup copy doesn’t already exist) so that we don’t lose the only copy in the event of media failure. The migrated copy of a dataset is in the 3494 robot in CCC and the backup copy is in the 6th floor 3494 robot in Rhodes. The migrated copy is kept until the dataset is recalled or deleted. The backup copy is retained for at least 90 days after the dataset is deleted. The migration tapes managed by HSM are themselves backed up daily, which gives a third copy of migrated data: this copy is stored in the 3494 on the 6th floor of Rhodes, while the original migrated data is at CCC.

A backup copy is retained for at least 90 days unless 12 later backups are made. For example, if MTG.ONE.TIME is created and never modified, the backup will be kept for at least 90 days after the dataset is deleted. After 90 days, the backup may be deleted, depending on tape usage. If a dataset called MTG.DAILY.LOG is deleted and recreated (or modified) every day, there will only be 12 backups, which is less than 90 days' worth.
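The interaction between the 12-version limit and the 90-day retention can be illustrated with a small model. This is only a sketch of the rules as stated in this document, not of HSM's actual expiry logic. It assumes that the newest backup of a dataset that still exists is never expired (following the MTG.ONE.TIME example) and that 90 days is a hard cutoff; the function and constant names are hypothetical.

```python
from datetime import date, timedelta

MAX_VERSIONS = 12                 # from the text: no more than 12 versions
RETENTION = timedelta(days=90)    # from the text: 90 days


def retained_backups(backup_dates, today, deleted_on=None):
    """Return the backup dates this simplified model would still hold on `today`."""
    newest_first = sorted(backup_dates, reverse=True)
    if not newest_first:
        return []
    if deleted_on is not None:
        # Deleted dataset: only the last copy survives, for 90 days after deletion.
        return newest_first[:1] if today - deleted_on <= RETENTION else []
    # Assumption: the newest backup of an existing dataset is always kept.
    latest, older = newest_first[0], newest_first[1:MAX_VERSIONS]
    return [latest] + [d for d in older if today - d <= RETENTION]


today = date(2007, 6, 12)
# MTG.DAILY.LOG: modified every day, so only 12 versions remain.
daily_log = [today - timedelta(days=n) for n in range(30)]
print(len(retained_backups(daily_log, today)))
```

For the daily log the version limit, not the 90-day period, decides how far back a restore can reach.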

Weekly System volumes dataset dumps

Weekly on Sunday night at 10 PM we do “VTOC dumps” - a backup of all datasets - on nine MVS System disk volumes. These are done to the 6th floor 3494 tape robot in Rhodes, and are retained for 3 months.

Daily MVS RACF database backups

Every night we make a backup copy of the production MVS RACF security database, SYS1.RACFPRIM. This is done as part of the VANDAILY job that produces a report of the RACF commands and violations on the MVS production system. The purpose of doing the RACF database backup as part of VANDAILY is to coordinate the backup with the Vanguard report of changes to the RACF database (the RACF command report). This combination allows us to restore the production MVS RACF database to a point in time. VANDAILY runs daily at 11:50 PM.

Weekly MVS RACF database backups

Once a week we make and save a copy of the most recent production MVS RACF database backup. This is done as part of the VANGWKLY job that rolls up the daily Vanguard reports into a weekly archive. Again, the purpose of doing this as part of VANGWKLY is to keep the RACF database backups and the Vanguard RACF command reports in sync. 52 weeks of production MVS RACF database backups are kept. VANGWKLY runs each Monday morning at 3 AM.

MVS Media Failure/disaster recovery backups

Weekly volume backups

We do Shark FlashCopy backups of all the production MVS volumes weekly on Sunday morning, which are then copied onto stacked 3590 volumes currently in the CCC 3494 robot. (“Stacked” means more than one disk volume backup on a single tape volume.) This provides us with offsite backups of current MVS datasets. We keep five generations (five weeks) of these backups.

Similar backups are done once a week on Saturday morning for the A51MVS (test) MVS volumes. Note that the A51MVS backups are done from the production MVS system.

Daily volume backups

We do Shark FlashCopy backups of all the production MVS volumes daily (except for Sunday, see Weekly volume backups), which are then copied onto stacked 3590 volumes currently in the CCC 3494 robot. (“Stacked” means more than one disk volume backup on a single tape volume.) This provides us with offsite backups of current MVS datasets. We keep five generations (five days) of these backups.

As part of the daily MVS volume backups we pause the ADABAS images, do the FlashCopy, then resume the ADABAS images.

We do not do daily backups of the A51MVS (test) MVS volumes.
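Keeping "five generations" means that each new backup set pushes the oldest one out of the rotation. A minimal sketch of that rotation follows; the class name and the labels are hypothetical and do not correspond to real tape volume serials.

```python
from collections import deque


class GenerationSet:
    """Fixed-size rotation of backup generations; the oldest rolls off first."""

    def __init__(self, generations):
        self._labels = deque(maxlen=generations)

    def add(self, label):
        """Record a new backup and return the label that rolled off, if any."""
        full = len(self._labels) == self._labels.maxlen
        expired = self._labels[0] if full else None
        self._labels.append(label)  # deque drops the oldest entry when full
        return expired

    def current(self):
        return list(self._labels)


daily = GenerationSet(5)  # from the text: five generations of daily backups
rolled_off = [daily.add(f"day{n}") for n in range(1, 8)]
print(daily.current())
print(rolled_off)
```

The same structure with five weekly labels models the weekly volume backups.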

Year-end volume backups

We copy the MVS weekly disaster recovery backups run the last weekend in June (the end of the fiscal year) to 3590 tapes on MVS by ADRDSSU (the DSS utility). These become our year-end backups for MVS. VM year-end disaster recovery backups are copied from the VM “All Track CP Volume Type” Dump Backups created the last weekend in June (see below). In addition we keep the VMBACKUP weekly complete file backups (see below) created on that weekend. A complete year-end backup set consists of about 12 3590 tapes for MVS and VM combined. We keep two year-end sets, the most recent in the CCC tape robot. Copies of the JCL listings from the jobs that created these tapes are kept in Ken Frost's office.

ADABAS database backups

We do nightly full backups of our production ADABAS databases. The databases are up while the backups are done. An online backup copies all files, as does an offline backup. During the course of the online backup, records in some files may be updated. ADABAS updates the Checkpoint file when the backup starts and finishes. The Checkpoint file is also updated when any utility runs (in this case, the Protection Log, or PLOG, copies). From that information, ADABAS knows which PLOG is active and would contain the Before/After images of any records updated during the backup.

If you need to run a restore from an online backup, you would include the appropriate PLOGs in the job stream with DD cards. Technically, you don't need the PLOGs if the file being restored was not updated, but it is often just easier to include the PLOGs in all cases. ADABAS can figure out if it needs them. This information is available in the output of the backup job.

These backups are run to tapes in the 3494 at CCC providing us with daily offsite database backups. Ten generations of these backups are kept.

We do full backups of our production ADABAS databases on Sunday mornings with the databases down to ensure that we have a complete backup of the databases that is viable for a full restore. These backups are run to tapes in the 3494 at CCC providing us with offsite database backups. Five generations of these backups are kept.

Additionally, we do a monthly backup of the production databases (ADA1, ADA6) the last Saturday night/Sunday morning of each month to the 3494 tape robot in CCC. We keep 18 months' worth of these monthly backups.

We also run a yearly copy of the production database backups (ADA1, ADA6) the last Saturday night/Sunday morning of June to the 3494 tape robot ROBBIE in Rhodes Hall on the 6th floor. We keep these backups for 7 years.
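The four ADABAS retention tiers above can be summarized as data, which makes it easy to see which tiers could still hold a backup of a given age. This is an illustrative sketch only: the tier names are hypothetical, and the month and year lengths are rounded assumptions (30-day months, 365-day years).

```python
from datetime import timedelta

# (tier name, approximate reach back in time), from the retention figures above.
ADABAS_TIERS = [
    ("nightly online full", timedelta(days=10)),    # ten generations
    ("weekly offline full", timedelta(weeks=5)),    # five generations
    ("monthly", timedelta(days=18 * 30)),           # 18 months, rounded
    ("yearly", timedelta(days=7 * 365)),            # 7 years, rounded
]


def tiers_still_holding(age):
    """Return the tiers whose retention period has not yet expired for `age`."""
    return [name for name, reach in ADABAS_TIERS if age <= reach]


print(tiers_still_holding(timedelta(days=20)))
```

A tier being listed only means its retention has not run out; whether a backup was actually taken on the wanted date depends on that tier's schedule.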

 

Production VM backups

VM minidisk file backups

As with MVS, we do a backup of all VM files once a week and a nightly backup of files that have changed that day. With this scheme if a file is accidentally destroyed, at most one day of changes to the file will be lost. In addition we can restore to previous versions of a file created within the previous 35 days. Both the Weekly and Daily VMBACKUPs are initiated by the VMSCHED service virtual machine at 23:00.

Weekly complete file backups

Weekly, on Friday night, a complete VMBACKUP file backup is run. The VMBACKUPs are based on the VM object directory database. These are done to both the 3494 tape robots in CCC and Rhodes Hall with the primary backup tapes in CCC and the duplicate copies in the Rhodes Hall 3494. The backup media are 3590 cartridges.

Daily incremental file backups

Incremental VMBACKUP file level backups are done every night with the exception of Friday and are based on changes found since the previous VMBACKUP full backup. Like the Friday night full backups, the backups are written to 3590 cartridges in the tape robots in CCC and Rhodes Hall with the primary on the CCC robot. All previous versions of files copied during the Weekly and Daily backups are kept for the full 35 day retention period.
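Because every saved version is kept for the 35-day retention period, choosing a version to restore amounts to picking the newest saved version on or before the wanted date that is still within retention. A minimal sketch, assuming version dates are already known (how VMBACKUP exposes its catalog is not covered here, and the function name is hypothetical):

```python
from datetime import date, timedelta

RETENTION = timedelta(days=35)  # from the text: 35-day retention period


def version_to_restore(version_dates, wanted, today):
    """Newest saved version on or before `wanted` that is still retained."""
    usable = [d for d in version_dates
              if d <= wanted and today - d <= RETENTION]
    return max(usable, default=None)


today = date(2007, 6, 12)
versions = [date(2007, 4, 20), date(2007, 5, 25), date(2007, 6, 5), date(2007, 6, 11)]
print(version_to_restore(versions, date(2007, 6, 8), today))
```

A result of `None` means no retained version is old enough to satisfy the request.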

Daily backup of system information

Every night a backup of the VMBACKUP catalog, the VM source directory, and the NSS’s and DCSS’s (Named Saved Systems and DisContiguous Saved Systems, software images that are kept in the VM spool) is done.

The NSS’s and DCSS’s are dumped to a 3590 tape cartridge in the Rhodes Hall 3494. The nightly backups are each stacked on a tape cartridge that is alternated every month. The service machine doing this backup is DCS which is XAUTOLOG'd by the VMSCHED scheduling service machine at 23:45.

The VM source directory is backed up to a USER BACKUP disk file on the DIRMAINT 1DB minidisk and that minidisk is backed up with the next VMBACKUP full or incremental backup. The backup of the VM object directory is done by DIRMAINT at 22:58 and is triggered by a timed event that is set in DIRMAINT’s DIRMAINT DATADVH file on the DIRMAINT 1DB disk.

The VMBACKUP catalog information is copied nightly, after the nightly VMBACKUP full or incremental dump completes. The copy is triggered by the Programmable Operator (PROP) code running in the RTX userid when the VMBMNP0167I message is issued, signaling the end of the backup run. RTX XAUTOLOGs the VMBCOPY userid, which does a DDR copy of all of the VMBACKUP minidisks to its own disks. These copied disks are images of the VMBACKUP minidisks after that night's backups, but they are not dumped to tape until the next night’s backup. In the event of a loss of the drive containing the VMBACKUP 1B0 disk, which holds the VMBACKUP catalog, the most recent copy would be on the VMBCOPY 1B0 disk on a different volume. In the event of the loss of both volumes, the previous night's copy would be on tape.
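The fallback order for recovering the VMBACKUP catalog can be written out as a simple decision. This sketch only restates the paragraph above; the function name and its boolean parameters are hypothetical, and the returned strings are descriptions, not commands.

```python
def catalog_recovery_source(vmbackup_volume_ok, vmbcopy_volume_ok):
    """Pick where to recover the VMBACKUP catalog from, in order of preference."""
    if vmbackup_volume_ok:
        return "VMBACKUP 1B0 (live catalog)"
    if vmbcopy_volume_ok:
        return "VMBCOPY 1B0 (most recent DDR copy)"
    # Both disk volumes lost: the copy on tape is one night behind.
    return "tape (previous night's copy)"


print(catalog_recovery_source(vmbackup_volume_ok=False, vmbcopy_volume_ok=True))
```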

VM media failure/disaster recovery backups

Weekly volume backups

We do an “All Track CP Volume Type” Dump of all production VM system and storage volumes once a week on Sunday morning from production MVS. These dumps are stacked on 3590 volumes currently in the CCC 3494 robot. (“Stacked” means more than one disk volume backup on a single tape volume.) This provides us with offsite backups of current VM data. We keep five generations of these backups.

Similar backups are done once a week on Saturday morning for the test VM LPAR volumes. These are also done from the production MVS LPAR.

Daily volume backups

We do an “All Track CP Volume Type” Dump of all production VM system and storage volumes daily, except on Sunday (see Weekly volume backups), from production MVS. These dumps are stacked on 3590 volumes currently in the CCC 3494 robot. (“Stacked” means more than one disk volume backup on a single tape volume.) This provides us with offsite backups of current VM data. We keep five generations of these backups.

We do not do daily backups of the test VM LPAR volumes.

Year-end volume backups

The set of “All Track CP Volume Type” Dump backups run the last weekend in June is copied to 3590 tapes and saved as our year-end set. MVS year-end backups are done at the same time (see above). A complete year-end backup set currently consists of about 13 3590 tapes for MVS and VM combined. We keep two year-end sets, with the most recent set currently being in the tape robot ROBBIE on the 6th floor of Rhodes Hall.

LPAR mode can be used for recovery

Since we now have the test LPARs on CornellC, we can, if necessary, use them for recovery purposes, rebuilding the production systems from the test LPARs.




Last modified: June 12, 2007