Big Lake Storage: An Open Data Lakehouse on Google Cloud
Large Lake Storage
Built open, high-performance, enterprise Big Lake storage Lakehouses iceberg native
Businesses may use Apache Iceberg to develop open, high-performance, enterprise-grade data lakehouses on Google Cloud with recent Big Lake storage engine improvements. Customers no longer have to choose between completely managed, enterprise-grade storage management and open formats like Apache Iceberg.
Businesses want adaptive, open, and interoperable architectures that let several engines work on a single copy of data while data management is revolutionised. Apache Iceberg is a popular open table style. The latest Big Lake storage development offers Apache Iceberg access to Google's infrastructure, enabling open data lakehouses.
Major advances include:
BigLake Metastore is normally available: BigLake Metastore, formerly BigQuery, is now public. This completely managed, serverless, and scalable solution simplifies runtime metadata maintenance and operations for BigQuery and other Iceberg-compatible engines. Use of Google's global metadata management infrastructure reduces the need to control proprietary metastore implementation. BigLake Metastore is necessary for open interoperability.
Iceberg REST Catalogue API Preview Introduction: To complement the GA Custom Iceberg Catalogue, the Iceberg REST Catalogue (Preview) provides a standard REST interface for interoperability. Users, including Spark users, can use the BigLake metastore as a serverless Iceberg catalogue. The Custom Iceberg Catalogue lets Spark and other open-source engines connect with Apache Iceberg and BigQuery BigLake tables.
Google Cloud is simplifying lakehouse upkeep using Apache Iceberg and Google Cloud Storage management. Cloud Storage features like auto-class tiering, encryption, and automatic table maintenance including compaction and trash collection are supported. This enhances Iceberg data management in Cloud Storage.
BigQuery usually has Apache Iceberg BigLake tables: These publicly available tables combine BigQuery's scalable, real-time metadata with Iceberg formats' transparency. This enables BigQuery's Write API's high-throughput streaming ingestion and zero-latency reads at tens of GiB/second. It also has automatic table management (compaction, garbage collection), native Vertex AI interface, auto-reclustering speed improvements, and future fine-grained DML and multi-table transactions (coming soon in preview). These tables maintain Iceberg's openness while providing controlled, enterprise-ready functionality. BigLake automatically creates and registers an Apache Iceberg V2 metadata snapshot in its metastore. This snapshot updates automatically after edits.
BigLake natively supports Dataplex Universal Catalogue for AI-Powered Governance. This interface provides consistent and fine-grained access restrictions to apply Dataplex governance standards across engines. Direct Cloud Storage access supports table-level access control, whereas BigQuery can use Storage API connectors for open-source engines for finer control. Dataplex integration improves BigQuery and BigLake Iceberg table governance with search, discovery, profiling, data quality checks, and end-to-end data lineage. Dataplex simplifies data discovery with AI-generated insights and semantic search. End-to-end governance benefits are automatic and don't require registration.
The BigLake metastore enables interoperability with BigQuery, AlloyDB (preview), Spark, and Flink. This increased compatibility allows AlloyDB users to easily consume analytical BigLake tables for Apache Iceberg from within AlloyDB (Preview). PostgreSQL users can link real-time AlloyDB transactional data with rich analytical data for operational and AI-driven use cases.
CME Group Executive Director Zenul Pomal noted, âWe needed teams throughout the company to access data in a consistent and secure way â regardless of where it stored or what technologies they were using.â They used Google's BigLake. BigLake from Google was clear. The uniform layer for accessing data and a fully managed experience with enterprise capabilities via BigQuery are available without moving or duplicating data, whether the data is in traditional tables or open table formats like Apache Iceberg. Metadata quality is critical as it explores gen AI applications. BigLake Metastore and Data Catalogue help us preserve high-quality metadata.
At Google Cloud Next '25, Google Cloud announced support for change data capture, multi-statement transactions, and fine-grained DML in the coming months.
Google Cloud is evolving BigLake into a comprehensive storage engine that uses open-source, third-party, and Google Cloud services by eliminating trade-offs between open and managed data solutions. This boosts data and AI innovation.














