
Backup and Restore in ClickHouse

This section broadly covers backups and restores in ClickHouse. For a more detailed description of each backup method, see the pages for specific methods in the sidebar.

Introduction

While replication provides protection from hardware failures, it does not protect against human errors: accidental deletion of data, deletion of the wrong table or a table on the wrong cluster, and software bugs that result in incorrect data processing or data corruption.

In many cases mistakes like these will affect all replicas. ClickHouse has built-in safeguards to prevent some types of mistakes: for example, by default you cannot drop a table with a MergeTree-family engine containing more than 50 GB of data. However, these safeguards do not cover all possible cases, and problems can still occur.

To effectively mitigate possible human errors, you should carefully prepare a strategy for backing up and restoring your data in advance.

Each company has different resources available and different business requirements, so there is no universal solution for ClickHouse backups and restores that will fit every situation. What works for one gigabyte of data likely won't work for tens of petabytes. There are a variety of possible approaches, each with its own pros and cons, which are presented in this section of the docs. It is a good idea to combine several approaches rather than rely on just one, so as to compensate for their various shortcomings.

Note

Keep in mind that if you backed something up and never tried to restore it, chances are that the restore will not work properly when you actually need it (or at least it will take longer than the business can tolerate). So whatever backup approach you choose, make sure to automate the restore process as well, and practice it on a spare ClickHouse cluster regularly.

The following pages detail the various backup and restore methods available in ClickHouse:

| Page | Description |
|------|-------------|
| Backup/restore using local disk or S3 disk | Details backup/restore to or from a local disk or S3 disk |
| Backup/restore using S3 endpoint | Details backup/restore to or from an S3 endpoint |
| Backup/restore using AzureBlobStorage | Details backup/restore to or from Azure blob storage |
| Alternative methods | Discusses alternative backup methods |

Backup types

Backups can be either full or incremental. Full backups are a complete copy of the data, while incremental backups store only the changes since a previous (base) backup.

Full backups have the advantage of being a simple, independent (of other backups) and reliable recovery method. However, they can take a long time to complete and can consume a lot of space. Incremental backups, on the other hand, are more efficient in terms of both time and space, but restoring the data requires all the backups to be available.

Depending on your needs, you may want to use:

  • Full backups for smaller databases or critical data.
  • Incremental backups for larger databases or when backups need to be done frequently and cost effectively.
  • Both, for instance, weekly full backups and daily incremental backups.
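The weekly-full/daily-incremental combination can be sketched with the `base_backup` setting described below. The database and file names here are hypothetical:

```sql
-- Weekly full backup (a complete, self-contained copy)
BACKUP DATABASE my_db TO Disk('backups', 'full_monday.zip');

-- Daily incremental backup: stores only the delta relative to the base backup
BACKUP DATABASE my_db TO Disk('backups', 'incremental_tuesday.zip')
    SETTINGS base_backup = Disk('backups', 'full_monday.zip');
```

Restoring from the incremental backup requires the base backup to still be available at its original location.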

Synchronous vs asynchronous backups

BACKUP and RESTORE commands can also be marked ASYNC. In this case, the command returns immediately and the operation runs in the background. If the command is not marked ASYNC, the operation is synchronous and the command blocks until it completes.
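As a sketch (table and disk names are hypothetical), an asynchronous backup and a later status check could look like:

```sql
-- Returns immediately with an id and status CREATING_BACKUP
BACKUP TABLE my_db.my_table TO Disk('backups', 'nightly.zip') ASYNC;

-- Monitor progress later using the returned id
SELECT status, error
FROM system.backups
WHERE id = '<id returned by the BACKUP command>';
```

The system.backups table used here is covered in more detail in the "Administration and troubleshooting" section below.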

Concurrent vs non-concurrent backups

By default, ClickHouse allows concurrent backups and restores. This means you can initiate multiple backup or restore operations simultaneously. However, there are server-level settings that let you disallow this behavior. If you set these settings to false, only one backup or restore operation is allowed to run on a cluster at a time. This can help avoid resource contention or potential conflicts between operations.

To disallow concurrent backups and restores, set the following server-level settings:

<clickhouse>
    <backups>
        <allow_concurrent_backups>false</allow_concurrent_backups>
        <allow_concurrent_restores>false</allow_concurrent_restores>
    </backups>
</clickhouse>

The default value for both is true, so by default concurrent backup/restores are allowed. When these settings are false on a cluster, only a single backup/restore is allowed to run on a cluster at a time.

Compressed vs uncompressed backups

ClickHouse backups support compression through the compression_method and compression_level settings.

When creating a backup, you can specify:

BACKUP TABLE test.table
  TO Disk('backups', 'filename.zip')
  SETTINGS compression_method='lzma', compression_level=3
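The compression settings apply when the backup archive is created; restoring the same archive does not require them to be repeated. A corresponding restore, using the same names as above, can be written as:

```sql
RESTORE TABLE test.table FROM Disk('backups', 'filename.zip')
```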

Using named collections

Named collections allow you to store key-value pairs (like S3 credentials, endpoints, and settings) that can be reused across backup/restore operations. They help to:

  • Hide credentials from users without admin access
  • Simplify commands by storing complex configuration centrally
  • Maintain consistency across operations
  • Avoid credential exposure in query logs

See "named collections" for further details.
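As a sketch, a named collection holding S3 credentials could be created as follows; the collection name and all values are placeholders:

```sql
CREATE NAMED COLLECTION backup_s3 AS
    url = 'https://<bucket>.s3.amazonaws.com/backups/',
    access_key_id = '<Access key ID>',
    secret_access_key = '<Secret access key>';
```

The collection can then be referenced in backup/restore commands instead of inlining credentials; see the named collections page for the exact usage supported by your version.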

Backing up system, log or access management tables

System tables can also be included in your backup and restore workflows, but their inclusion depends on your specific use case.

System tables that store historic data, such as those with a _log suffix (e.g., query_log, part_log), can be backed up and restored like any other table. If your use case relies on analyzing historic data - for example, using query_log to track query performance or debug issues - it's recommended to include these tables in your backup strategy. However, if historic data from these tables is not required, they can be excluded to save backup storage space.

System tables related to access management, such as users, roles, row_policies, settings_profiles, and quotas, receive special treatment during backup and restore operations. When these tables are included in a backup, their content is exported to a special accessXX.txt file, which encapsulates the equivalent SQL statements for creating and configuring the access entities. Upon restoration, the restore process interprets these files and re-applies the SQL commands to recreate the users, roles, and other configurations. This feature ensures that the access control configuration of a ClickHouse cluster can be backed up and restored as part of the cluster's overall setup.

This functionality only works for configurations managed through SQL commands (referred to as "SQL-driven Access Control and Account Management"). Access configurations defined in ClickHouse server configuration files (e.g. users.xml) are not included in backups and cannot be restored through this method.
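As a sketch, assuming that ALL covers the access tables in your version, a full backup carrying the SQL-driven access configuration could look like:

```sql
-- SQL-driven access entities (users, roles, quotas, ...) are exported
-- as accessXX.txt files inside the backup
BACKUP ALL TO Disk('backups', 'cluster_with_access.zip');
```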

General syntax

-- core commands
BACKUP | RESTORE [ASYNC]
--- what to backup/restore (or exclude)
TABLE [db.]table_name           [AS [db.]table_name_in_backup]
    [PARTITION[S] partition_expr [,...]] |
DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup] |
DATABASE database_name          [AS database_name_in_backup] |
TEMPORARY TABLE table_name      [AS table_name_in_backup] |
VIEW view_name                  [AS view_name_in_backup] |
[EXCEPT TABLES ...] |
ALL [EXCEPT {TABLES|DATABASES} ...] [,...]
--- 
[ON CLUSTER 'cluster_name']
--- where to backup or restore to or from
TO|FROM 
File('<path>/<filename>') | 
Disk('<disk_name>', '<path>/') | 
S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>', '<extra_credentials>') |
AzureBlobStorage('<connection string>/<url>', '<container>', '<path>', '<account name>', '<account key>')
--- additional settings
[SETTINGS ...]

See "command summary" for more details of each command.

Command summary

Each of the commands above is detailed below:

| Command | Description |
|---------|-------------|
| BACKUP | Creates a backup of specified objects |
| RESTORE | Restores objects from a backup |
| [ASYNC] | Makes the operation run asynchronously (returns immediately with an id you can monitor) |
| TABLE [db.]table_name [AS [db.]table_name_in_backup] | Backs up/restores a specific table (can be renamed) |
| [PARTITION[S] partition_expr [,...]] | Only backs up/restores specific partitions of the table |
| DICTIONARY [db.]dictionary_name [AS [db.]name_in_backup] | Backs up/restores a dictionary object |
| DATABASE database_name [AS database_name_in_backup] | Backs up/restores an entire database (can be renamed) |
| TEMPORARY TABLE table_name [AS table_name_in_backup] | Backs up/restores a temporary table (can be renamed) |
| VIEW view_name [AS view_name_in_backup] | Backs up/restores a view (can be renamed) |
| [EXCEPT TABLES ...] | Excludes specific tables when backing up a database |
| ALL | Backs up/restores everything (all databases, tables, etc.). Prior to ClickHouse version 23.4, ALL was only applicable to the RESTORE command. |
| [EXCEPT {TABLES\|DATABASES}...] | Excludes specific tables or databases when using ALL |
| [ON CLUSTER 'cluster_name'] | Executes the backup/restore across a ClickHouse cluster |
| TO\|FROM | Direction: TO for the backup destination, FROM for the restore source |
| File('<path>/<filename>') | Stores to/restores from the local file system |
| Disk('<disk_name>', '<path>/') | Stores to/restores from a configured disk |
| S3('<S3 endpoint>/<path>', '<Access key ID>', '<Secret access key>') | Stores to/restores from Amazon S3 or S3-compatible storage |
| [SETTINGS ...] | See below for the complete list of settings |

Settings

Generic backup/restore settings

| Setting | Description | Default value |
|---------|-------------|---------------|
| id | ID of the backup or restore operation; a randomly generated UUID is used if not specified. If there is already a running operation with the same ID, an exception is thrown. | |
| compression_method | Specifies the compression method for the backup. See section "column compression codecs". | |
| compression_level | Specifies the compression level for the backup. | |
| password | Password for the file on disk. | |
| base_backup | The destination of the base backup used for incremental backups. For example: Disk('backups', '1.zip') | |
| use_same_password_for_base_backup | Whether the base backup archive should inherit the password from the query. | |
| structure_only | If enabled, only backs up or restores the CREATE statements, without the actual table data. | |
| storage_policy | Storage policy for the tables being restored. See "using multiple block devices for data storage". Only applicable to the RESTORE command. Applies only to tables with an engine from the MergeTree family. | |
| allow_non_empty_tables | Allows RESTORE TABLE to insert data into non-empty tables. This will mix earlier data in the table with the data extracted from the backup. This setting can therefore cause data duplication in the table; use with caution. | 0 |
| backup_restore_keeper_max_retries | Max retries for [Zoo]Keeper operations in the middle of a BACKUP or RESTORE operation. Should be big enough so that the whole operation won't fail because of a temporary [Zoo]Keeper failure. | 1000 |
| backup_restore_keeper_retry_initial_backoff_ms | Initial backoff timeout for [Zoo]Keeper operations during backup or restore. | 100 |
| backup_restore_keeper_retry_max_backoff_ms | Max backoff timeout for [Zoo]Keeper operations during backup or restore. | 5000 |
| backup_restore_failure_after_host_disconnected_for_seconds | If a host during a BACKUP ON CLUSTER or RESTORE ON CLUSTER operation doesn't recreate its ephemeral 'alive' node in ZooKeeper for this amount of time, the whole backup or restore is considered failed. This value should be bigger than any reasonable time for a host to reconnect to ZooKeeper after a failure. Zero means unlimited. | 3600 |
| backup_restore_keeper_max_retries_while_initializing | Max retries for [Zoo]Keeper operations during the initialization of a BACKUP ON CLUSTER or RESTORE ON CLUSTER operation. | 20 |
| backup_restore_keeper_max_retries_while_handling_error | Max retries for [Zoo]Keeper operations while handling an error of a BACKUP ON CLUSTER or RESTORE ON CLUSTER operation. | 20 |
| backup_restore_finish_timeout_after_error_sec | How long the initiator should wait for other hosts to react to the 'error' node and stop their work on the current BACKUP ON CLUSTER or RESTORE ON CLUSTER operation. | 180 |
| backup_restore_keeper_value_max_size | Maximum size of data of a [Zoo]Keeper node during backup. | 1048576 |
| backup_restore_batch_size_for_keeper_multi | Maximum batch size for a multi request to [Zoo]Keeper during backup or restore. | 1000 |
| backup_restore_batch_size_for_keeper_multiread | Maximum batch size for a multiread request to [Zoo]Keeper during backup or restore. | 10000 |
| backup_restore_keeper_fault_injection_probability | Approximate probability of failure for a Keeper request during backup or restore. Valid values are in the interval [0.0f, 1.0f]. | 0 |
| backup_restore_keeper_fault_injection_seed | 0 for a random seed, otherwise the setting value. | 0 |
| backup_restore_s3_retry_attempts | Setting for Aws::Client::RetryStrategy; Aws::Client does retries itself, and 0 means no retries. Applies only to backup/restore. | 1000 |
| max_backup_bandwidth | The maximum read speed in bytes per second for a particular backup on the server. Zero means unlimited. | 0 |
| max_backups_io_thread_pool_size | ClickHouse uses threads from the Backups IO thread pool for S3 backup IO operations. This setting limits the maximum number of threads in the pool. | 1000 |
| max_backups_io_thread_pool_free_size | If the number of idle threads in the Backups IO thread pool exceeds this value, ClickHouse releases the resources occupied by idling threads and decreases the pool size. Threads can be created again if necessary. | 0 |
| backups_io_thread_pool_queue_size | The maximum number of jobs that can be scheduled on the Backups IO thread pool. It is recommended to keep this queue unlimited due to the current S3 backup logic. A value of 0 (the default) means unlimited. | 0 |
| backup_threads | The maximum number of threads to execute BACKUP requests. | |
| max_backup_bandwidth_for_server | The maximum read speed in bytes per second for all backups on the server. Zero means unlimited. | 0 |
| shutdown_wait_backups_and_restores | If set to true, ClickHouse will wait for running backups and restores to finish before shutting down. | 1 |
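For example, the structure_only setting above can be used to clone a schema without copying any data. A minimal sketch, with hypothetical database and file names:

```sql
-- Recreates the tables from their CREATE statements only; no data is restored
RESTORE DATABASE my_db FROM Disk('backups', 'full.zip')
    SETTINGS structure_only = 1;
```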

S3 specific settings

| Setting | Description | Default value |
|---------|-------------|---------------|
| use_same_s3_credentials_for_base_backup | Whether a base backup to S3 should inherit credentials from the query. Only works with S3. | |
| s3_storage_class | The storage class used for S3 backup. For example: STANDARD | |

Azure specific settings

| Setting | Description | Default value |
|---------|-------------|---------------|
| azure_attempt_to_create_container | When using Azure Blob Storage, whether to attempt creating the specified container if it doesn't exist. | true |

Administration and troubleshooting

The backup command returns an id and status, and the id can be used to look up the status of the operation. This is useful for checking the progress of long-running ASYNC backups. The example below shows a failure that happened when trying to overwrite an existing backup file:

BACKUP TABLE helloworld.my_first_table TO Disk('backups', '1.zip') ASYNC
┌─id───────────────────────────────────┬─status──────────┐
│ 7678b0b3-f519-4e6e-811f-5a0781a4eb52 │ CREATING_BACKUP │
└──────────────────────────────────────┴─────────────────┘

1 row in set. Elapsed: 0.001 sec.
SELECT *
FROM system.backups
WHERE id='7678b0b3-f519-4e6e-811f-5a0781a4eb52'
FORMAT Vertical
Row 1:
──────
id:                7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:              Disk('backups', '1.zip')
status:            BACKUP_FAILED
num_files:         0
uncompressed_size: 0
compressed_size:   0
error:             Code: 598. DB::Exception: Backup Disk('backups', '1.zip') already exists. (BACKUP_ALREADY_EXISTS) (version 22.8.2.11 (official build))
start_time:        2022-08-30 09:21:46
end_time:          2022-08-30 09:21:46

1 row in set. Elapsed: 0.002 sec.

Along with the system.backups table, all backup and restore operations are also tracked in the system log table system.backup_log:

SELECT *
FROM system.backup_log
WHERE id = '7678b0b3-f519-4e6e-811f-5a0781a4eb52'
ORDER BY event_time_microseconds ASC
FORMAT Vertical
Row 1:
──────
event_date:              2023-08-18
event_time_microseconds: 2023-08-18 11:13:43.097414
id:                      7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:                    Disk('backups', '1.zip')
status:                  CREATING_BACKUP
error:
start_time:              2023-08-18 11:13:43
end_time:                1970-01-01 03:00:00
num_files:               0
total_size:              0
num_entries:             0
uncompressed_size:       0
compressed_size:         0
files_read:              0
bytes_read:              0

Row 2:
──────
event_date:              2023-08-18
event_time_microseconds: 2023-08-18 11:13:43.174782
id:                      7678b0b3-f519-4e6e-811f-5a0781a4eb52
name:                    Disk('backups', '1.zip')
status:                  BACKUP_FAILED
error:                   Code: 598. DB::Exception: Backup Disk('backups', '1.zip') already exists. (BACKUP_ALREADY_EXISTS) (version 23.8.1.1)
start_time:              2023-08-18 11:13:43
end_time:                2023-08-18 11:13:43
num_files:               0
total_size:              0
num_entries:             0
uncompressed_size:       0
compressed_size:         0
files_read:              0
bytes_read:              0

2 rows in set. Elapsed: 0.075 sec.
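Since system.backup_log is an ordinary system log table, it can also be aggregated, for example to summarize the outcomes of recent operations (a sketch):

```sql
SELECT status, count() AS operations
FROM system.backup_log
WHERE event_date >= today() - 7
GROUP BY status
ORDER BY operations DESC;
```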