iceberg Table Function
Provides a read-only table-like interface to Apache Iceberg tables stored in Amazon S3, Azure Blob Storage, HDFS, or on the local filesystem.
Syntax
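A sketch of typical invocations, mirroring the argument lists of the corresponding storage table functions (the exact optional arguments may vary between versions):

```sql
icebergS3(url [, NOSIGN | access_key_id, secret_access_key [, session_token]] [, format] [, compression_method])
icebergS3(named_collection [, option=value [, ...]])

icebergAzure(connection_string|storage_account_url, container_name, blobpath [, account_name, account_key] [, format] [, compression_method])
icebergAzure(named_collection [, option=value [, ...]])

icebergHDFS(path_to_table [, format] [, compression_method])
icebergHDFS(named_collection [, option=value [, ...]])

icebergLocal(path_to_table [, format] [, compression_method])
icebergLocal(named_collection [, option=value [, ...]])
```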
Arguments
The description of the arguments coincides with the description of the arguments in the table functions s3, azureBlobStorage, HDFS, and file, respectively.
format stands for the format of the data files in the Iceberg table.
Returned value
A table with the specified structure for reading data in the specified Iceberg table.
Example
ClickHouse currently supports reading v1 and v2 of the Iceberg format via the icebergS3, icebergAzure, icebergHDFS, and icebergLocal table functions, and via the IcebergS3, IcebergAzure, IcebergHDFS, and IcebergLocal table engines.
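For example, reading an S3-hosted table (the bucket URL and credentials below are illustrative):

```sql
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test');
```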
Defining a named collection
Here is an example of configuring a named collection for storing the URL and credentials:
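One way to do this (the collection name, URL, and credentials below are illustrative) is with a CREATE NAMED COLLECTION statement; the collection can then be passed to the table function in place of the explicit URL and credentials:

```sql
CREATE NAMED COLLECTION iceberg_conf AS
    url = 'http://test.s3.amazonaws.com/clickhouse-bucket/',
    access_key_id = 'test',
    secret_access_key = 'test';

SELECT * FROM icebergS3(iceberg_conf, filename = 'test_table');
```

The same collection can alternatively be defined in the server configuration file.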
Schema Evolution
At the moment, ClickHouse can read Iceberg tables whose schema has changed over time. We currently support reading tables where columns have been added and removed and where their order has changed. You can also change a column where a value is required into one where NULL is allowed. Additionally, we support permitted type casting for simple types, namely:
- int -> long
- float -> double
- decimal(P, S) -> decimal(P', S) where P' > P.
Currently, it is not possible to change nested structures or the types of elements within arrays and maps.
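For instance, a column widened on the Spark side with a permitted promotion remains readable through the Iceberg table functions (the catalog, table, column, and bucket names below are illustrative):

```sql
-- Spark: widen a column from int to long (a permitted promotion)
ALTER TABLE spark_catalog.db.sales ALTER COLUMN quantity TYPE bigint;
```

```sql
-- ClickHouse: the evolved table can still be read as usual
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/sales', 'test', 'test');
```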
Partition Pruning
ClickHouse supports partition pruning during SELECT queries for Iceberg tables, which helps optimize query performance by skipping irrelevant data files. It currently works only with identity transforms and time-based transforms (hour, day, month, year). To enable partition pruning, set use_iceberg_partition_pruning = 1.
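For example, assuming a table partitioned by an identity transform on event_date (the bucket URL, credentials, and column name below are illustrative):

```sql
SELECT *
FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
WHERE event_date = '2024-05-01'
SETTINGS use_iceberg_partition_pruning = 1;
```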
Time Travel
ClickHouse supports time travel for Iceberg tables, allowing you to query historical data with a specific timestamp or snapshot ID.
Basic usage
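A sketch of both forms, using the settings named below (the table path and the literal timestamp and snapshot values are illustrative):

```sql
-- Query the table state as of a point in time (Unix epoch milliseconds)
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
SETTINGS iceberg_timestamp_ms = 1714636800000;

-- Query the table state as of a specific snapshot
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
SETTINGS iceberg_snapshot_id = 3547395809148285433;
```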
Note: You cannot specify both the iceberg_timestamp_ms and iceberg_snapshot_id parameters in the same query.
Important considerations
- Snapshots are typically created when:
  - New data is written to the table
  - Some kind of data compaction is performed
- Schema changes typically don't create snapshots, which leads to important behaviors when using time travel with tables that have undergone schema evolution.
Example scenarios
All scenarios are written in Spark because ClickHouse doesn't support writing to Iceberg tables yet.
Scenario 1: Schema Changes Without New Snapshots
Consider this sequence of operations:
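A sketch of the sequence in Spark SQL (the catalog, database, and table names are illustrative; ts1, ts2, and ts3 denote timestamps captured between the steps):

```sql
-- Create a table with two columns
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example (
    order_number bigint,
    product_code string
)
USING iceberg
TBLPROPERTIES ('format-version'='2');

-- Insert the first row; this creates a snapshot
INSERT INTO spark_catalog.db.time_travel_example VALUES (1, 'Mars');
-- ts1 captured here

-- Add a new column; the schema changes but no snapshot is created
ALTER TABLE spark_catalog.db.time_travel_example ADD COLUMNS (price double);
-- ts2 captured here

-- Insert a row that uses the new column; this creates another snapshot
INSERT INTO spark_catalog.db.time_travel_example VALUES (2, 'Venus', 100);
-- ts3 captured here

-- Time travel queries at each timestamp
SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts1;
SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts2;
SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts3;
```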
Query results at different timestamps:
- At ts1 & ts2: Only the original two columns appear
- At ts3: All three columns appear, with NULL for the price of the first row
Scenario 2: Historical vs. Current Schema Differences
A time travel query at the current moment may show a schema that differs from the current table's:
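A sketch in Spark SQL (names are illustrative); the column is added before any data is written, and ts denotes a timestamp captured before the insert:

```sql
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_2 (
    order_number bigint,
    product_code string
)
USING iceberg
TBLPROPERTIES ('format-version'='2');

-- Schema change only: no snapshot is created
ALTER TABLE spark_catalog.db.time_travel_example_2 ADD COLUMNS (price double);
-- ts captured here

INSERT INTO spark_catalog.db.time_travel_example_2 VALUES (1, 'Mars', 120);

-- Time travel to ts: returns no rows and only the two original columns
SELECT * FROM spark_catalog.db.time_travel_example_2 TIMESTAMP AS OF ts;

-- Current table: shows all three columns
SELECT * FROM spark_catalog.db.time_travel_example_2;
```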
This happens because ALTER TABLE doesn't create a new snapshot, while for the current table Spark takes the value of schema_id from the latest metadata file rather than from a snapshot.
Scenario 3: Time Travel Before the First Write
Another limitation of time travel is that you can't get the state of a table before any data was written to it:
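A sketch in Spark SQL (names are illustrative); ts denotes a timestamp captured after the table is created but before the first insert:

```sql
CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_3 (
    order_number bigint,
    product_code string
)
USING iceberg
TBLPROPERTIES ('format-version'='2');
-- ts captured here, before any data is written

INSERT INTO spark_catalog.db.time_travel_example_3 VALUES (1, 'Mars');

-- Fails: there is no snapshot at or before ts to read from
SELECT * FROM spark_catalog.db.time_travel_example_3 TIMESTAMP AS OF ts;
```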
In ClickHouse the behavior is consistent with Spark. You can mentally replace the Spark SELECT queries with ClickHouse SELECT queries and they will work the same way.
Metadata File Resolution
When using the iceberg table function in ClickHouse, the system needs to locate the correct metadata.json file that describes the Iceberg table structure. Here's how this resolution process works:
Candidate Search (in Priority Order)
1. Direct Path Specification:
   - If you set iceberg_metadata_file_path, the system will use this exact path by combining it with the Iceberg table directory path.
   - When this setting is provided, all other resolution settings are ignored.
2. Table UUID Matching:
   - If iceberg_metadata_table_uuid is specified, the system will:
     - Look only at .metadata.json files in the metadata directory
     - Filter for files containing a table-uuid field matching your specified UUID (case-insensitive)
3. Default Search:
   - If neither of the above settings is provided, all .metadata.json files in the metadata directory become candidates
Selecting the Most Recent File
After identifying candidate files using the above rules, the system determines which one is the most recent:
- If iceberg_recent_metadata_file_by_last_updated_ms_field is enabled:
  - The file with the largest last-updated-ms value is selected
- Otherwise:
  - The file with the highest version number is selected (the version appears as V in filenames formatted as V.metadata.json or V-uuid.metadata.json)
Note: All mentioned settings are table function settings (not global or query-level settings) and must be specified as shown below:
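For example (the bucket path and UUID below are illustrative):

```sql
SELECT * FROM iceberg(
    's3://bucket/path/to/iceberg_table',
    SETTINGS iceberg_metadata_table_uuid = '6f6f6407-c6a5-465f-a808-ea8900e35a38'
);
```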
Note: While Iceberg Catalogs typically handle metadata resolution, the iceberg table function in ClickHouse directly interprets files stored in S3 as Iceberg tables, which is why understanding these resolution rules is important.
Metadata cache
The Iceberg table engine and table function support a metadata cache that stores information from the manifest files, the manifest list, and the metadata JSON. The cache is kept in memory. This feature is controlled by the setting use_iceberg_metadata_files_cache, which is enabled by default.
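To disable it for a particular query (the table path below is illustrative):

```sql
SELECT * FROM iceberg('s3://bucket/path/to/iceberg_table')
SETTINGS use_iceberg_metadata_files_cache = 0;
```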
Aliases
The table function iceberg is currently an alias for icebergS3.