Backup and Restore Procedure

This article provides best practice recommendations and guidelines for the following procedures:

  • Backing up and restoring the BaseSpace Clarity LIMS database

  • Creating an archive of the Audit Trail database (for details on this feature, see the Enabling, Validating and Disabling Audit Trail article), including performing a 'vacuum and analyze' database maintenance task to reclaim space and optimize performance.

Note that BaseSpace Clarity LIMS works with a variety of industry-standard operating systems and databases, and leverages their inherent file management systems. You should follow the guidelines that best meet the needs of your specific environment.

Assumptions

  • The user performing the procedures described below must:

    • Be a Linux administrator who can create and restore database backups and associated file system backups, using standard Linux tools.

    • Know how to start and stop BaseSpace Clarity LIMS and its installed services.

    • Have access to the source and destination servers.

  • The BaseSpace Clarity LIMS file store is located in /home/glsftp or /opt/gls/clarity/users/glsftp. If a different location is being used, substitute that directory in the relevant steps below.

  • The destination server has been integrated with the BaseSpace Clarity LIMS repository.

  • BaseSpace Clarity LIMS has been successfully validated, and is using an independent database.

  • The Audit Trail feature is enabled on the BaseSpace Clarity LIMS server (required for Audit Trail archiving procedure only).

Recommendations

Database backup process

Your database backup process will depend on the size of your installation, your hardware environment, and how much data your laboratory processes. However, we recommend that you:

  • Perform a full backup of the BaseSpace Clarity LIMS database at least once per week, and configure the database server to use archived logging.

  • Perform a full backup of the BaseSpace Clarity LIMS file system at least once per week, and perform incremental backups daily.

  • Use a file storage solution that has fail-over capabilities, such as a RAID array.

You do not need to back up any other BaseSpace Clarity LIMS-specific data. The LIMS does not modify files after they are imported into the system, and configuration information is stored in the database.
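The weekly full and daily incremental file system backups recommended above could be scripted along the following lines. This is a minimal sketch, not a supported procedure: it assumes GNU tar is available, treats Sunday as the full-backup day, and uses hypothetical function names (backup_level, backup_filestore) and an example destination directory.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): GNU tar is installed,
# Sunday is the weekly full-backup day, and all paths are examples.

# backup_level WEEKDAY
# WEEKDAY is 1 (Monday) to 7 (Sunday), as printed by `date +%u`.
backup_level() {
    if [ "$1" -eq 7 ]; then echo full; else echo incremental; fi
}

# backup_filestore SRC_DIR DEST_DIR LEVEL STAMP
backup_filestore() {
    src=$1; dest=$2; level=$3; stamp=$4
    snapshot="$dest/filestore.snar"
    # Without an existing snapshot file, GNU tar writes a full (level 0) archive.
    if [ "$level" = full ]; then rm -f "$snapshot"; fi
    tar --create --gzip \
        --listed-incremental="$snapshot" \
        --file="$dest/filestore-$level-$stamp.tar.gz" \
        --directory="$(dirname "$src")" "$(basename "$src")"
}

# Example daily entry point (commented out so that nothing runs on load):
# backup_filestore /opt/gls/clarity/users/glsftp /backup/clarity \
#     "$(backup_level "$(date +%u)")" "$(date +%Y-%m-%d)"
```

Restoring from such a set means extracting the full archive first and then each incremental archive in order. Confirm the approach with your system administrator before relying on it.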

Incremental backups

An incremental backup is a recovery log (or redo log) that records database changes. Once configured, the relational database management system automatically updates the logs.

When performing incremental backups, there are a number of backup strategies and configurations, including the archival of recovery logs, which will depend on your environment. Your database administrator should establish a robust backup process that suits your organization’s environment and follows the database vendor’s administrative guidelines.

To improve performance and reduce risk, we recommend that you segregate the recovery logs onto a physical disk channel or device separate from where you store the data.

Audit Trail archive process

For best performance, and to mitigate potential maintenance issues associated with disk space and large database size, we recommend that you periodically archive the Audit Trail database. How often you perform this operation will depend on your regulatory requirements.

Backing up the BaseSpace Clarity LIMS Database

1. Creating the backup copy

The following steps should be executed by the root user, on the source server.

To create the backup copy on the source server:

  1. Print a list of BaseSpace Clarity LIMS installed components and note the version numbers (the version number should match the version of the LIMS previously installed):

    yum list installed "Clarity*" "BaseSpace*" > file.txt

    Note: The contents of file.txt for yum are formatted as follows:

    Package     Version     Repo

    For example:

    ClarityLIMS-PreReqs.x86_64   5.0.2   @ClarityLIMS-ADS-QA-repo
  2. Create an export of the database that can be restored on the destination system.

  3. Archive the customextensions directory: /opt/gls/clarity/customextensions

  4. Back up the configuration file: /etc/httpd/conf.d/clarity.conf

    • To back up configuration and all attached files, archive the BaseSpace Clarity LIMS file store: /home/glsftp or /opt/gls/clarity/users/glsftp

    • To back up configuration only, archive the following:

      • /home/glsftp/*Scripts

      • /home/glsftp/ProcessType

      • /home/glsftp/Protocol

  5. Back up the SSL certificates: /etc/httpd/sslcertificates/
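Steps 2 to 5 above might be scripted as in the following sketch. It assumes a PostgreSQL database; the database name clarityDB, the postgres operating system user, the output directory, and the function names are illustrative assumptions rather than values taken from this article. Run it as root and adjust it to your environment.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): PostgreSQL with pg_dump
# on the PATH, a database named "clarityDB", and /tmp/clarity-backup as output.

BACKUP_DIR=${BACKUP_DIR:-/tmp/clarity-backup}
DB_NAME=${DB_NAME:-clarityDB}                      # hypothetical database name
FILE_STORE=${FILE_STORE:-/opt/gls/clarity/users/glsftp}

# backup_paths FILE_STORE
# Prints the paths named in steps 3 to 5, one per line.
backup_paths() {
    printf '%s\n' \
        /opt/gls/clarity/customextensions \
        /etc/httpd/conf.d/clarity.conf \
        "$1" \
        /etc/httpd/sslcertificates
}

run_backup() {
    mkdir -p "$BACKUP_DIR" || return 1
    # Step 2: custom-format dump, which pg_restore can read on the destination.
    su - postgres -c "pg_dump --format=custom $DB_NAME" \
        > "$BACKUP_DIR/$DB_NAME.dump" || return 1
    # Steps 3 to 5: one compressed archive; tar drops the leading "/" from names.
    backup_paths "$FILE_STORE" |
        tar --create --gzip --file="$BACKUP_DIR/clarity-files.tar.gz" --files-from=-
}

# run_backup    # uncomment to execute
```

To archive configuration only, replace the file store path in backup_paths with the three configuration directories listed in step 4.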

2. Transferring the archives to the destination server

Once the archives have been created, you can transfer them to the destination server.
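One possible way to transfer the archives and confirm that they arrived intact is sketched below. It assumes rsync and GNU coreutils (sha256sum) are installed on both servers; the function names and the destination host are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): rsync and sha256sum are
# installed; the host name and directories are examples.

# write_checksums DIR
# Records a SHA-256 checksum for every file in DIR (run on the source server).
write_checksums() {
    ( cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS )
}

# verify_checksums DIR
# Fails if any file in DIR differs from its recorded checksum (run on the destination).
verify_checksums() {
    ( cd "$1" && sha256sum --check --quiet SHA256SUMS )
}

# transfer_archives SRC_DIR DESTINATION
transfer_archives() {
    rsync --archive --partial "$1/" "$2"
}

# Example (commented out):
# write_checksums /tmp/clarity-backup
# transfer_archives /tmp/clarity-backup root@destination.example.com:/tmp/clarity-backup/
```

After the transfer, running verify_checksums on the destination directory gives an early warning of a truncated or corrupted archive, before you attempt a restore.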

3. Restoring the copy

Execute the following steps on the destination server.

To restore the copy on the destination server:

  1. Make sure the BaseSpace Clarity LIMS repository file exists in /etc/yum.repos.d/

  2. As the root user, install the components listed in file.txt.

```
yum --enablerepo=<repo text ID> install <component_name>-<version>
```

For example:

```
yum --enablerepo=GLS_Clarity install ClarityLIMS-App-5.0.2
```

  3. Configure and validate the destination system, following the procedure outlined in Installation Procedure.

  4. Stop the application server stack:

```
/opt/gls/clarity/bin/run_clarity.sh stop
```
  5. Restore the database exported from the source server.

  6. Restore the archives created.

  7. As the glsjboss user, update configuration for external interface points (api, glsftp):

    /opt/gls/clarity/config/update_claritylims_endpoints.sh
  8. As the glsjboss user, manually migrate the database forward:

    /opt/gls/clarity/config/migrate_claritylims_database.sh
  9. (Optional) To ensure that you have the correct Automated Informatics (AI) credentials, you may also want to rerun the AI configuration script. As the glsai user, from the /opt/gls/clarity/config/ directory run:

    /opt/gls/clarity/config/configure_limsserver_ai.sh
  10. Restart the application stack:

     /opt/gls/clarity/bin/run_clarity.sh start
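When file.txt lists many components, the install commands for the step that installs them can be generated rather than typed by hand. The sketch below assumes that each row of file.txt has the Package, Version, Repo layout shown earlier and that the package column ends in an architecture suffix such as .x86_64. The function names are hypothetical. The commands are only printed, so you can review them before running them as root.

```shell
#!/bin/sh
# Sketch only. Assumption (not from this article): rows look like
#   ClarityLIMS-PreReqs.x86_64   5.0.2   @ClarityLIMS-ADS-QA-repo
# Rows that yum wraps onto a second line are not handled.

# packages_from_list FILE
# Prints <component_name>-<version> for each package row in FILE.
packages_from_list() {
    awk 'NF >= 3 && $1 ~ /\.(x86_64|noarch|i686)$/ {
        name = $1
        sub(/\.[^.]+$/, "", name)    # drop the architecture suffix
        print name "-" $2
    }' "$1"
}

# install_commands REPO_ID FILE
# Prints one yum command per component for review.
install_commands() {
    packages_from_list "$2" | while read -r pkg; do
        printf 'yum --enablerepo=%s install %s\n' "$1" "$pkg"
    done
}

# Example (commented out):
# install_commands GLS_Clarity file.txt
```

Header lines such as "Installed Packages" are skipped because they do not match the expected row layout.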

Troubleshooting

During the restoration steps, if the migrate_claritylims_database.sh script fails, try dropping and recreating the database, restoring it from the backup (as in step 5 above), and then rerunning the migration script (step 8).

For assistance with backing up and restoring your database, consult with your database administrator or IT group.

4. Possible scenarios

After restoring a system, the following scenarios may apply:

  • The database and file system are from the same point in time

  • The database is newer than the file system

  • The file system is newer than the database

When the file system and database are from the same point in time

When you have synchronized file system and database backups, the system has maintained its integrity and no further action is required.

When the database is newer than the file system

When the file system backup is older than the database backup, files referenced in the database may no longer exist in the file system.

For example, suppose you have a database backup from 10am and a file system backup from 8am. To update the database, roll back the database and manually re-import the files into BaseSpace Clarity LIMS from their source locations.

To roll back and update the database:

  1. Roll back the file system to the point of the last backup. In the example above, this is 8am.

  2. In the current database, select all the files whose creation date is after the last backup of the file system. This finds all the files imported into the database that will not be in the file system backup. In the example above, these are the files created between 8am and 10am.

  3. For each of the missing files found in step 2, find and record the associated LIMS projects, samples, and processes.

  4. Roll back the database to the time of the last file system backup. In the example above, this is 8am or the nearest restore point before it.

  5. Recreate the associated projects, samples, and processes as recorded in step 3.

  6. Re-import the files. You can find the origin of the files by looking at the database records queried in step 2.

  7. The database and file system are once again synchronized.

Finding missing files

It can sometimes be difficult to find missing files as there may be no way to recover them from their source locations (such as instrument computers).

In this case, we recommend that you retain the database entries for the missing files. The LIMS displays an error message when a user tries to retrieve one of these files, but the operation of the system is not affected.

If the files are present in their source locations, and you use the Automated Informatics Automatic Data Capture (ADC) plug-in, you can reset the ADC to re-capture the files and reference them in the database automatically.

To do this, delete the corresponding entries from the file transfer log file stored in the logs folder of the instrument’s ADC installation.

When the file system is newer than the database

When the database backup is older than the file system backup, files residing in the file system may not be referenced in the database. To update the database, you'll need to manually re-import the files into the LIMS.

To update the database:

  1. In the file system, search for files created after the last database backup.

  2. In BaseSpace Clarity LIMS, re-import the files into the applicable projects.

It is sometimes difficult to determine the location(s) in which to re-import files in the LIMS. You can manually check for files created by processes that ran during the outage, but ultimately, there is no easy way to do this.
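The file system search in step 1 can be done with GNU find, as in the sketch below. Most Linux file systems do not expose a creation time to find, so the sketch uses the modification time as a stand-in. That is a reasonable proxy here because the LIMS does not modify files after import. The function name and the example timestamp are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): GNU find is available and
# modification time is an acceptable proxy for creation time.

# files_newer_than DIR TIMESTAMP
# Lists regular files under DIR modified after TIMESTAMP (local time).
files_newer_than() {
    find "$1" -type f -newermt "$2" -print | sort
}

# Example (commented out): files added since an 8am database backup
# files_newer_than /opt/gls/clarity/users/glsftp '2017-01-15 08:00:00'
```

The resulting list is a starting point only. You still need to determine which project each file belongs to before re-importing it.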

Archiving the Audit Trail Database

The Audit Trail Archiver tool archives the audit table records older than a given date, saves them to a file, and removes those records from the Audit Trail database.

For instructions on using the tool, see the Enabling, Validating and Disabling Audit Trail article in the Audit Trail section.

Rules & constraints

  • The tool currently only supports Postgres databases. Oracle is not supported at this time.

  • The tool prompts for Postgres superuser credentials, because the Postgres COPY command it uses requires them.

  • When archiving audit data that was created in a version of the LIMS prior to version 4.1, the tool temporarily disables the audit trigger on the loginaudit table. This prevents the loginaudit delete operations performed by the tool from being included in the auditchangelog. Once the deletes have been performed, the trigger is re-enabled.

Arguments

| Name | Description | Required |
|------|-------------|----------|
| -db | Path to tenant lookup jdbc.properties file | Yes |
| -date | All audit entries older than this date (yyyy-MM-dd) will be archived | Yes |
| -F | Fully Qualified Domain Name (FQDN) of the BaseSpace Clarity LIMS server (may be omitted if there is only 1 server) | No |
| -dir | Absolute path of the destination directory where the audit archive will be written. This directory must be writable by the Postgres user. | Yes |
| -U | Database superuser username. May be prompted for otherwise. | No |
| -P | Database superuser password. May be prompted for otherwise. | No |
| -f | Proceed without prompting for confirmation. | No |

Running the Archiver tool

Follow the steps below to create the Audit Trail archive. Once you have successfully created the archive, perform a 'vacuum and analyze' database maintenance task to reclaim database space and optimize performance.

To create the Audit Trail database archive:

  1. Run the Audit Trail Archiver tool using the following command:

```
java -jar ClarityAuditTrailArchiver-0.0.#-jar-with-dependencies.jar \
    -db <path-to-tenant-lookup>/jdbc.properties \
    -date <yyyy-MM-dd> -F <domain-name> -dir </path-to-destination-directory/>
```
  2. A warning message displays:

```
"WARNING: This operation will remove all audit entries earlier than <yyyy-MM-dd> and export them to a file. 
This operation cannot be undone. Are you sure you would like to proceed? [Y/N]"
```
  3. Enter Y or Yes (case does not matter) to proceed. Any other answer aborts the tool.

  4. All audit entries in the loginaudit, auditeventlog, and auditchangelog tables older than the supplied date are archived to a zip file.

    • The file contains comma-separated values (CSV) files of exported audit table data.

    • The name of the zip file indicates the 'before date' that was used to archive the audit data and the date on which the archive was generated.

    For example, suppose that on January 15, 2017 you generate an Audit Trail archive and supply the date December 31, 2016 to the tool:

    java -jar ClarityAuditTrailArchiver-0.0.3-jar-with-dependencies.jar -db ../src/conf/jdbc.properties -date 2016-12-31 -F localhost -dir /opt/gls/clarity/backups/

    The name of the generated zip file will be: ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-213246.zip

  • After the archive is complete and records have been deleted, an audit entry is added to the auditeventlog. The message entry in the auditeventlog indicates where the archive was generated. For example:

    Audit trail archive complete. File written to /opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-213246.zip. Deleted audit records before 2016-12-31 for tables [loginaudit, auditeventlog, auditchangelog]
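Because the archive name encodes both the 'before date' and the generation timestamp, a script can locate the most recent archive for a given date by name alone. The following sketch assumes the naming pattern shown in the example above. The function name latest_archive is hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption (not confirmed beyond the example in this article):
# archives are named
#   ClarityAuditTrailArchive_before<yyyy-MM-dd>_generated<yyyy-MM-dd-HHmmss>.zip

# latest_archive DIR BEFORE_DATE
# Prints the newest archive in DIR for BEFORE_DATE, or nothing if none exists.
# The generation timestamp sorts correctly as plain text.
latest_archive() {
    ls -1 "$1"/ClarityAuditTrailArchive_before"$2"_generated*.zip 2>/dev/null |
        sort | tail -n 1
}

# Example (commented out): list the CSV files inside the newest archive
# unzip -l "$(latest_archive /opt/gls/clarity/backups 2016-12-31)"
```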

Example output

The Audit Trail Archiver tool does not produce a log file. All output is directed to stdout and stderr. An example output is shown below:

    $ java -jar ClarityAuditTrailArchiver-0.0.3-jar-with-dependencies.jar -db ../src/conf/jdbc.properties -date 2016-12-31 -F localhost -dir /opt/gls/clarity/backups/
    Tenant Lookup DB connection established
    Using FQDN: localhost
    > Postgres superuser name: proteo
    > Postgres superuser password:
    Tenant DB connection established
    > WARNING: This operation will remove all audit entries earlier than 2016-12-31 and export them to a file. This operation cannot be undone. Are you sure you would like to proceed? [Y/N]
    Y
    Audit trail archive in progress...
    Archiving loginaudit...
    Finished archiving loginaudit
    Archiving auditeventlog...
    Finished archiving auditeventlog
    Archiving auditchangelog...
    Finished archiving auditchangelog
    Audit trail archive complete. File written to /opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-213246.zip.
    Deleting audit records from tenant DB...
    Deleting records from loginaudit...
    Finished deleting records from loginaudit
    Deleting records from auditeventlog...
    Finished deleting records from auditeventlog
    Deleting records from auditchangelog...
    Finished deleting records from auditchangelog
    Finished deleting audit records from tenant DB

Error conditions

The Archiver tool will exit with a -1 return code and an informative message for the following error conditions:

  • One or more of the required arguments are not specified

  • Invalid date format provided for the -date argument

  • No servers found in the tenant lookup database

  • Multiple servers found, but the -F argument was not supplied

  • The destination directory either does not exist or is not writable for the database user

  • The Archiver tool cannot obtain the command line console to prompt for the warning

  • Could not open the jdbc.properties file

  • Could not find tenant lookup database server information: jdbc.url, jdbc.username, jdbc.password, jdbc.driverClassName, jdbc.tenantUrl

  • Template properties are not specified in jdbc.properties

  • Could not establish database connection

  • Unrecognized fully qualified domain name

  • Could not find tenant database server information for the FQDN

  • The zip file was not created
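Some of these conditions can be checked before the tool is started, and the return code can be inspected afterwards. The wrapper below is a sketch under stated assumptions: the function names are hypothetical, and only the date format and the existence of the destination directory are pre-checked. Note that on Linux a Java exit code of -1 is reported by the shell as 255.

```shell
#!/bin/sh
# Sketch only. The wrapper and its checks are illustrative assumptions,
# not part of the Archiver tool.

# valid_date STRING
# Succeeds only for a yyyy-MM-dd shaped value with a plausible month and day.
valid_date() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$'
}

# run_archiver JAR JDBC_PROPERTIES BEFORE_DATE DEST_DIR [extra tool arguments...]
run_archiver() {
    jar=$1; props=$2; before_date=$3; dest_dir=$4
    shift 4
    if ! valid_date "$before_date"; then
        echo "Invalid -date value (expected yyyy-MM-dd): $before_date" >&2
        return 2
    fi
    if [ ! -d "$dest_dir" ]; then
        echo "Destination directory does not exist: $dest_dir" >&2
        return 2
    fi
    java -jar "$jar" -db "$props" -date "$before_date" -dir "$dest_dir" "$@"
    status=$?
    # The tool's -1 return code appears here as 255.
    if [ "$status" -ne 0 ]; then
        echo "Archiver exited with status $status" >&2
    fi
    return "$status"
}

# Example (commented out):
# run_archiver ClarityAuditTrailArchiver-0.0.3-jar-with-dependencies.jar \
#     ../src/conf/jdbc.properties 2016-12-31 /opt/gls/clarity/backups/ -F localhost
```

The wrapper leaves the confirmation prompt in place, so it still needs an interactive console unless you pass the -f, -U, and -P arguments.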

Vacuuming and analyzing the database

After completing an Audit Trail archive, you can use the vacuumdb operation described below to vacuum and analyze the database. This operation will reclaim space and optimize performance.

Note:

The vacuumdb operation is a database maintenance task that should be performed by a database administrator. Ideally, no backup or other database-specific job should be running while performing the operation. You may wish to include the operation in your routine schedule of database maintenance tasks.

Before running the vacuumdb operation, please note the following:

  • The steps below are provided for illustration purposes only and assume a Software as a Service (SaaS) installation.

  • Paths shown assume a SaaS installation. In a typical on-premise installation, replace /opt/gls/pgsql/ paths with /var/lib/pgsql/. However, this path may have been changed, so confirm the correct path for your system.

  • Other options may vary depending on your Postgres version and database setup.

To vacuum and analyze the database:

  1. Check the database size on disk:

    du -sh /opt/gls/pgsql/<<db_version>>/
  2. Stop BaseSpace Clarity LIMS, BaseSpace Clarity LIMS Reporting (if applicable), and all sequencing services (if applicable).

  3. Restart PostgreSQL to drop any remaining connections to the database.

  4. Run the vacuum command with Full (-f) and Analyze (-z) options in verbose (-v) mode:

    vacuumdb -v -f -z -d <<DB_NAME>> -U <<DB_USER>>
  5. Re-check the database size on disk:

    du -sh /opt/gls/pgsql/<<db_version>>/
  6. Restart BaseSpace Clarity LIMS, BaseSpace Clarity LIMS Reporting, and all sequencing services, as applicable.
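Steps 1, 4, and 5 can be combined so that the space reclaimed is reported automatically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the LIMS and related services have already been stopped and PostgreSQL restarted as described in steps 2 and 3.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): vacuumdb is on the PATH
# and the caller supplies the data directory, database name, and user.

# dir_size_kb DIR
# Prints the size of DIR on disk in kilobytes.
dir_size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# vacuum_and_report DATA_DIR DB_NAME DB_USER
vacuum_and_report() {
    data_dir=$1; db_name=$2; db_user=$3
    before=$(dir_size_kb "$data_dir") || return 1
    # Full (-f) vacuum with analyze (-z) in verbose (-v) mode, as in step 4.
    vacuumdb -v -f -z -d "$db_name" -U "$db_user" || return 1
    after=$(dir_size_kb "$data_dir") || return 1
    echo "Size before: ${before} KB, after: ${after} KB, reclaimed: $((before - after)) KB"
}

# Example (commented out; substitute your own values for the placeholders):
# vacuum_and_report "/opt/gls/pgsql/$DB_VERSION" "$DB_NAME" "$DB_USER"
```

Restart the LIMS and related services afterwards, as in step 6.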

Re-importing archived data

You can import archived data back into a database using the Postgres COPY command.

To re-import data:

  1. Unzip the archive.

  2. Run the COPY command in Postgres for each archived table. For example:

```
COPY loginaudit from '/opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-111334/ClarityAuditTrailArchive_loginaudit_before2016-12-31_generated2017-01-15-111334.csv' 
WITH DELIMITER ',' CSV HEADER;

COPY auditeventlog from '/opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-111334/ClarityAuditTrailArchive_auditeventlog_before2016-12-31_generated2017-01-15-111334.csv' 
WITH DELIMITER ',' CSV HEADER;

COPY auditchangelog from '/opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-111334/ClarityAuditTrailArchive_auditchangelog_before2016-12-31_generated2017-01-15-111334.csv' 
WITH DELIMITER ',' CSV HEADER;
```
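The two re-import steps can be wrapped in a small script so that the long file names do not have to be typed by hand. This is a sketch under stated assumptions: the function names are hypothetical, the CSV files are assumed to follow the naming shown above, and psql is assumed to connect as a Postgres superuser. Because COPY reads the file on the database server, the working directory must be an absolute path that the Postgres server process can read.

```shell
#!/bin/sh
# Sketch only. Assumptions (not from this article): unzip and psql are
# installed, and the connecting user is a Postgres superuser.

# copy_statement TABLE CSV_PATH
# Prints the COPY statement for one archived table.
copy_statement() {
    printf "COPY %s FROM '%s' WITH DELIMITER ',' CSV HEADER;\n" "$1" "$2"
}

# reimport_archive ZIP_FILE WORK_DIR DB_NAME DB_USER
reimport_archive() {
    zip_file=$1; work_dir=$2; db_name=$3; db_user=$4
    unzip -o -q "$zip_file" -d "$work_dir" || return 1
    for table in loginaudit auditeventlog auditchangelog; do
        csv=$(find "$work_dir" -type f \
            -name "ClarityAuditTrailArchive_${table}_*.csv" | head -n 1)
        if [ -z "$csv" ]; then
            echo "No CSV file found for table $table" >&2
            return 1
        fi
        # Stop at the first SQL error rather than continuing with later tables.
        copy_statement "$table" "$csv" |
            psql -v ON_ERROR_STOP=1 -U "$db_user" -d "$db_name" || return 1
    done
}

# Example (commented out; substitute your own values for the placeholders):
# reimport_archive /opt/gls/clarity/backups/ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-111334.zip \
#     /opt/gls/clarity/backups/restore "$DB_NAME" "$DB_USER"
```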
