Backup and Restore Procedure
This article provides best practice recommendations and guidelines for the following procedures:
Backing up and restoring the BaseSpace Clarity LIMS database
Creating an archive of the Audit Trail database (for details on this feature, see the Enabling, Validating and Disabling Audit Trail article), including performing a 'vacuum and analyze' database maintenance task to reclaim space and optimize performance.
Note that BaseSpace Clarity LIMS works with a variety of industry-standard operating systems and databases, and leverages their inherent file management systems. You should follow the guidelines that best meet the needs of your specific environment.
Assumptions
The user performing the procedures described below must:
Be a Linux administrator who can create and restore database backups and associated file system backups, using standard Linux tools.
Know how to start and stop BaseSpace Clarity LIMS and its installed services.
Have access to the source and destination servers.
The BaseSpace Clarity LIMS file store is located in /home/glsftp or /opt/gls/clarity/users/glsftp. If a different location is being used, substitute that directory in the relevant steps below.
The destination server has been integrated with the BaseSpace Clarity LIMS repository.
BaseSpace Clarity LIMS has been successfully validated, and is using an independent database.
The Audit Trail feature is enabled on the BaseSpace Clarity LIMS server (required for Audit Trail archiving procedure only).
Recommendations
Database backup process
Your database backup process will depend on the size of your installation, your hardware environment, and how much data your laboratory processes. However, we recommend that you:
Perform a full backup of the BaseSpace Clarity LIMS database at least once per week, and configure the database server to use archived logging.
Perform a full backup of the BaseSpace Clarity LIMS file system at least once per week, and perform incremental backups daily.
Use a file storage solution that has fail-over capabilities, such as a RAID array.
You do not need to back up any other BaseSpace Clarity LIMS-specific data. The LIMS does not modify files once imported into the system and configuration information is stored in the database.
Incremental backups
An incremental backup is a recovery log (or redo log) that records database changes. Once configured, the relational database management system automatically updates the logs.
Incremental backups can follow a number of strategies and configurations, including the archival of recovery logs. The right choice depends on your environment. Your database administrator should establish a robust backup process that suits your organization’s environment and follows the database vendor’s administrative guidelines.
To improve performance and reduce risk, we recommend that you segregate the recovery logs onto a physical disk channel or device separate from where you store the data.
Audit Trail archive process
For best performance, and to mitigate potential maintenance issues associated with disk space and large database size, we recommend that you periodically archive the Audit Trail database. How often you do this depends on your regulatory requirements.
Backing up the BaseSpace Clarity LIMS Database
1. Creating the backup copy
The following steps should be executed by the root user, on the source server.
To create the backup copy on the source server:
Print a list of BaseSpace Clarity LIMS installed components and note the version numbers (the version number should match the version of the LIMS previously installed):
For example:
Create an export of the database that can be restored on the destination system.
Archive the customextensions directory: /opt/gls/clarity/customextensions
Back up the configuration file: /etc/httpd/conf.d/clarity.conf
To back up configuration and all attached files, archive the BaseSpace Clarity LIMS file store: /home/glsftp or /opt/gls/clarity/users/glsftp
To back up configuration only, archive the following:
/home/glsftp/*Scripts
/home/glsftp/ProcessType
/home/glsftp/Protocol
Back up the SSL certificates: /etc/httpd/sslcertificates/
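The archiving steps above can be sketched in Python with the standard tarfile module. This is a minimal illustration, not the vendor's procedure. The function name create_archive and the output location are hypothetical, and the path list simply repeats the locations named in the steps. For a configuration-only backup, replace the /home/glsftp entry with the three configuration patterns listed above.

```python
import glob
import tarfile

# Locations named in the steps above. Substitute your own file store
# directory if it differs (for example /opt/gls/clarity/users/glsftp).
BACKUP_PATTERNS = [
    "/opt/gls/clarity/customextensions",
    "/etc/httpd/conf.d/clarity.conf",
    "/home/glsftp",
    "/etc/httpd/sslcertificates",
]


def create_archive(patterns, archive_path):
    """Write every existing path matching `patterns` to a gzip tar archive.

    Returns the sorted list of paths that were actually archived, so that
    missing locations are easy to spot.
    """
    matched = []
    for pattern in patterns:
        # glob expands wildcards such as /home/glsftp/*Scripts and yields
        # nothing for paths that do not exist.
        matched.extend(glob.glob(pattern))
    matched = sorted(set(matched))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in matched:
            tar.add(path)  # directories are added recursively
    return matched
```

A call such as create_archive(BACKUP_PATTERNS, "/root/clarity_backup.tar.gz"), run as root, would produce a single archive. Compare the returned list with BACKUP_PATTERNS to confirm that nothing was skipped.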
2. Transferring the archives to the destination server
Once the archives have been created, you can transfer them to the destination server.
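The procedure does not prescribe a transfer method. Whichever tool you use, one optional precaution is to compare a checksum of each archive on the source and destination servers. The sketch below is an illustration only, and the function name sha256_of_file is hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in fixed-size chunks keeps memory use flat for large archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run the function on both servers. If the two digests match, the archive arrived intact.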
3. Restoring the copy
Execute the following steps on the destination server.
To restore the copy on the destination server:
Make sure the BaseSpace Clarity LIMS repository file exists in /etc/yum.repos.d/
As the root user, install the components listed in file.txt.
Configure and validate the destination system, following the procedure outlined in Installation Procedure.
Stop the application server stack:
Restore the database exported from the source server.
Restore the archives created.
As the glsjboss user, update configuration for external interface points (api, glsftp):
As the glsjboss user, manually migrate the database forward:
(Optional) To ensure that you have the correct Automated Informatics (AI) credentials, you may also want to rerun the AI configuration script. As the glsai user, from the /opt/gls/clarity/config/ directory run:
Restart the application stack:
4. Possible scenarios
After restoring a system, the following scenarios may apply:
The database and file system are from the same point in time
The database is newer than the file system
The file system is newer than the database
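As an illustration, the three scenarios can be told apart by comparing the timestamps of the two backups. This is a simplified sketch. The function name classify_restore is hypothetical, and treating only identical timestamps as the same point in time is an assumption you may want to relax.

```python
from datetime import datetime


def classify_restore(db_backup_time, fs_backup_time):
    """Name the restore scenario implied by the two backup timestamps."""
    if db_backup_time == fs_backup_time:
        return "same point in time"
    if db_backup_time > fs_backup_time:
        return "database newer than file system"
    return "file system newer than database"


# A database backup from 10am with a file system backup from 8am
# on the same day (the date itself is arbitrary).
scenario = classify_restore(
    datetime(2017, 1, 15, 10, 0), datetime(2017, 1, 15, 8, 0)
)
```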
When the file system and database are from the same point in time
When you have synchronized file system and database backups, the system has maintained its integrity and no further action is required.
When the database is newer than the file system
When the file system backup is older than the database backup, files referenced in the database may no longer exist in the file system.
For example, suppose you have a database backup from 10am and a file system backup from 8am. To update the database, roll back the database and manually re-import the files into BaseSpace Clarity LIMS from their source locations.
To roll back and update the database:
Roll back the file system to the point of the last backup. In the example above, this is 8am.
In the current database, select all the files whose creation date is after the last backup of the file system. This finds all the files imported into the database that will not be in the file system backup. In the example above, these are the files created between 8am and 10am.
For each of the missing files found in step 2, find and record the associated LIMS projects, samples, and processes.
Roll back the database to the time of the last file system backup. In the example above, this is earlier than 8am.
Recreate the associated projects, samples, and processes as recorded in step 3.
Re-import the files. You can find the origin of the files by looking at the database records queried in step 2.
The database and file system are once again synchronized.
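The selection in step 2 amounts to filtering file records by creation date. The sketch below shows that logic on in-memory records. FileRecord and its fields are hypothetical stand-ins for whatever your database query returns, not the actual LIMS schema.

```python
from datetime import datetime
from typing import List, NamedTuple


class FileRecord(NamedTuple):
    """Hypothetical shape of one file row returned by your database query."""
    name: str
    created: datetime
    origin: str  # where the file was originally imported from


def files_missing_from_backup(
    records: List[FileRecord], fs_backup_time: datetime
) -> List[FileRecord]:
    """Return records created after the file system backup was taken.

    These files exist in the database but not in the restored file system.
    """
    return [record for record in records if record.created > fs_backup_time]
```

With a file system backup from 8am, a record created at 9:45am is returned and one created at 7:30am is not. The origin field of each returned record tells you where to re-import the file from.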
Finding missing files
It can sometimes be difficult to find missing files as there may be no way to recover them from their source locations (such as instrument computers).
In this case, we recommend that you retain database entries for the missing files. The LIMS displays an error message when a user tries to retrieve one of these files, but the operation of the system is not otherwise affected.
If the files are present in their source locations, and you use the Automated Informatics Automatic Data Capture (ADC) plug-in, you can reset the ADC to re-capture the files and reference them in the database automatically.
To do this, delete the corresponding entries from the file transfer log file stored in the logs folder of the instrument’s ADC installation.
When the file system is newer than the database
When the database backup is older than the file system backup, files residing in the file system may not be referenced in the database. To update the database, you'll need to manually re-import the files into the LIMS.
To update the database:
In the file system, search for files created after the last database backup.
In BaseSpace Clarity LIMS, re-import the files into the applicable projects.
It is sometimes difficult to determine the location(s) in which to re-import files in the LIMS. You can manually check for files created by processes that ran during the outage, but ultimately, there is no easy way to do this.
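The file system search in the first step can be sketched as follows. Linux file systems do not reliably expose a creation time, so this example uses the modification time as an approximation, which is an assumption. The function name files_newer_than is hypothetical.

```python
import os
from datetime import datetime


def files_newer_than(root, cutoff):
    """List files under `root` whose modification time is later than `cutoff`.

    `cutoff` is a datetime in local time, for example the time of the last
    database backup.
    """
    cutoff_ts = cutoff.timestamp()
    newer = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > cutoff_ts:
                newer.append(path)
    return sorted(newer)
```

For example, files_newer_than("/home/glsftp", datetime(2017, 1, 15, 8, 0)) lists candidate files to re-import. You still need to determine which project each file belongs to.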
Archiving the Audit Trail Database
The Audit Trail Archiver tool archives audit table records older than a given date, saves them to a file, and removes those records from the Audit Trail database.
For instructions on using the tool, see the Enabling, Validating and Disabling Audit Trail article in the Audit Trail section.
Rules & constraints
The tool currently only supports Postgres databases. Oracle is not supported at this time.
The tool prompts for a Postgres superuser because the Postgres COPY command depends on it.
When archiving audit data that was created in a version of the LIMS prior to version 4.1, the tool temporarily disables the audit trigger on the loginaudit table. This prevents the loginaudit delete operations performed by the tool from being included in the auditchangelog. Once the deletes have been performed, the trigger is re-enabled.
Arguments
Running the Archiver tool
Follow the steps below to create the Audit Trail archive. Once you have successfully created the archive, perform a vacuuming and analyzing database maintenance task to reclaim database space and optimize performance.
To create the Audit Trail database archive:
Run the Audit Trail Archiver tool using the following command:
A warning message displays:
Any answer other than Y or Yes (case does not matter) aborts the tool.
All audit entries in the loginaudit, auditeventlog, and auditchangelog tables older than the supplied date are archived to a zip file.
The file contains comma-separated values (CSV) files of exported audit table data.
The name of the zip file indicates the 'before date' that was used to archive the audit data and the date on which the archive was generated.
For example, suppose that on January 15, 2017 you generate an Audit Trail archive and supply the date December 31, 2016 to the tool:
The name of the generated zip file will be: ClarityAuditTrailArchive_before2016-12-31_generated2017-01-15-213246.zip
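The naming convention can be reproduced or parsed with a few lines of Python, which may help when cataloguing archives. This sketch is inferred from the single example above. In particular, reading the trailing 213246 as a 24-hour HHMMSS time is an assumption, and both function names are hypothetical.

```python
import re
from datetime import date, datetime

_ARCHIVE_NAME = re.compile(
    r"^ClarityAuditTrailArchive_before(\d{4}-\d{2}-\d{2})"
    r"_generated(\d{4}-\d{2}-\d{2}-\d{6})\.zip$"
)


def archive_file_name(before_date, generated_at):
    """Build an archive file name in the format shown in the example."""
    return "ClarityAuditTrailArchive_before{}_generated{}.zip".format(
        before_date.strftime("%Y-%m-%d"),
        generated_at.strftime("%Y-%m-%d-%H%M%S"),  # assumed HHMMSS suffix
    )


def parse_archive_file_name(name):
    """Return (before_date, generated_at) extracted from an archive name."""
    match = _ARCHIVE_NAME.match(name)
    if match is None:
        raise ValueError("not an Audit Trail archive name: {!r}".format(name))
    before = datetime.strptime(match.group(1), "%Y-%m-%d").date()
    generated = datetime.strptime(match.group(2), "%Y-%m-%d-%H%M%S")
    return before, generated
```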
After the archive is complete and records have been deleted, an audit entry is added to the auditeventlog. The message entry in the auditeventlog indicates where the archive was generated. For example:
Example output
The Audit Trail Archiver tool does not produce a log file. All output is directed to stdout and stderr. An example output is shown below:
Error conditions
The Archiver tool will exit with a -1 return code and an informative message for the following error conditions:
One or more of the required arguments are not specified
Invalid date format provided for the -date argument
No servers found in the tenant lookup database
Multiple servers found, but the -F argument was not supplied
The destination directory either does not exist or is not writable for the database user
The Archiver tool cannot obtain the command line console to prompt for the warning
Could not open the jdbc.properties file
Could not find tenant lookup database server information: jdbc.url, jdbc.username, jdbc.password, jdbc.driverClassName, jdbc.tenantUrl
Template properties are not specified in jdbc.properties
Could not establish database connection
Unrecognized fully qualified domain name
Could not find tenant database server information for the FQDN
The zip file was not created
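If you call the Archiver tool from a script, note that POSIX systems report a process exit status as a value from 0 to 255, so a return code of -1 is normally seen by the calling shell as 255. The helpers below illustrate that arithmetic. The function names are hypothetical, and the sketch assumes the tool returns 0 on success, which the list above implies but does not state.

```python
def exit_status_seen_by_shell(return_code):
    """Map a process return code to the 0-255 status a POSIX shell reports."""
    return return_code % 256


def archiver_failed(shell_status):
    """Treat any non-zero status as a failure (assumes 0 means success)."""
    return shell_status != 0


# The -1 return code described above surfaces as 255.
status = exit_status_seen_by_shell(-1)
```

A wrapper script should therefore test for a non-zero status rather than for the literal value -1.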
Vacuuming and analyzing the database
After completing an Audit Trail archive, you can use the vacuumdb operation described below to vacuum and analyze the database. This operation will reclaim space and optimize performance.
Before running the vacuumdb operation, please note the following:
The steps below are provided for illustration purposes only and assume a Software as a Service (SaaS) installation.
In a typical on-premise installation, replace /opt/gls/pgsql/ paths with /var/lib/pgsql/. However, this path may have been changed, so confirm the correct path for your system.
Other options may vary depending on your Postgres version and database setup.
To vacuum and analyze the database:
Check the database size on disk:
Stop BaseSpace Clarity LIMS, BaseSpace Clarity LIMS Reporting (if applicable), and all sequencing services (if applicable).
Restart PostgreSQL to drop any remaining connections to the database.
Run the vacuum command with Full (-f) and Analyze (-z) options in verbose (-v) mode:
Re-check the database size on disk:
Restart BaseSpace Clarity LIMS, BaseSpace Clarity LIMS Reporting, and all sequencing services, as applicable.
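The size check and the vacuum command can be sketched as follows. This is an illustration under stated assumptions, not the exact commands from the original procedure. The database name is a placeholder you must supply, the function names are hypothetical, and the only vacuumdb options used are the documented -f, -z, -v and -d flags.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under `path`, similar to `du`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):  # do not follow symlinks
                total += os.path.getsize(file_path)
    return total


def vacuum_command(database_name):
    """Build a vacuumdb call with Full (-f), Analyze (-z) and verbose (-v)."""
    return ["vacuumdb", "-f", "-z", "-v", "-d", database_name]
```

Record directory_size_bytes for the PostgreSQL data directory before and after the vacuum to see how much space was reclaimed. The command list can be passed to subprocess.run(..., check=True) while running as a user with rights on the database.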
Re-importing archived data
You can import archived data back into a database using the Postgres COPY command.
To re-import data:
Unzip the archive.
Run the COPY command in Postgres for each archived table. For example:
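As a hedged sketch of the second step, the helper below builds one COPY statement per archived table. The table names come from the archive description earlier in this article. The CSV file paths, the presence or absence of a header row, and the function name copy_statement are assumptions, so check the contents of your archive before running anything.

```python
AUDIT_TABLES = ("loginaudit", "auditeventlog", "auditchangelog")


def copy_statement(table, csv_path, has_header=False):
    """Build a Postgres COPY ... FROM statement for one archived CSV file."""
    if table not in AUDIT_TABLES:
        raise ValueError("unexpected table name: {!r}".format(table))
    # Double any single quotes so the path is a valid SQL string literal.
    escaped_path = csv_path.replace("'", "''")
    options = "FORMAT csv, HEADER true" if has_header else "FORMAT csv"
    return "COPY {} FROM '{}' WITH ({});".format(table, escaped_path, options)


# Hypothetical path to a file extracted from the archive.
example = copy_statement("loginaudit", "/tmp/archive/loginaudit.csv")
```

COPY ... FROM with a server-side file path reads the file on the database server and requires elevated database privileges, so extract the archive where the Postgres server process can read it.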