06-10-2025 02:19 PM
Hi -- I'm trying to connect to BigQuery as a foreign catalog. I'm able to create and successfully test the connection, but when I create a foreign catalog it appears empty, and queries against that catalog return a "TABLE_OR_VIEW_NOT_FOUND" error.
The BQ instance is owned by a vendor; I'm using a service account that's specifically for data sharing. Our permissions are limited, but we do have the DataViewer role, and I can execute queries just fine using other tools (e.g., the BigQuery Python client) with the same service account.
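For context, the kind of direct query that works with the same service account looks roughly like this (project, dataset, and table names are placeholders, and the client call is illustrative rather than a literal copy of our code):

```python
# Illustrative sketch: querying the vendor's BigQuery project while running
# (and billing) jobs in our own service-account project. All names below
# are placeholders.

def fully_qualified_table(project: str, dataset: str, table: str) -> str:
    """Build a backticked BigQuery table reference, e.g. `proj.ds.tbl`."""
    return f"`{project}.{dataset}.{table}`"

# Jobs run in the SA's own project...
billing_project = "sa-project"  # placeholder
# ...but the table lives in the vendor's project.
table_ref = fully_qualified_table("vendor-bq-project", "shared_dataset", "events")
sql = f"SELECT COUNT(*) FROM {table_ref}"

# With google-cloud-bigquery installed and credentials configured,
# the actual call would be roughly:
#   from google.cloud import bigquery
#   client = bigquery.Client(project=billing_project)
#   rows = list(client.query(sql).result())
```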
Some information that may be relevant based on what I've tried so far:
The project that owns our service account (and is specified in the service account JSON) is different from the project that holds the BigQuery data. Creating a Databricks "Connection" succeeds when I leave the "project" field blank or provide the SA project; if I provide the BQ project, it fails (we don't have permission to create jobs in that project).
I've tried creating the foreign catalog with the BQ project, the SA project, no project at all, various combinations (including the region, both escaped and unescaped), and even garbage names. The catalog creation always succeeds; the only time we get an error at this stage is when the project name contains an invalid character. In every other case, the error we get when running a query is the same: "table or view not found".
I noticed that we don't have access to the INFORMATION_SCHEMA tables for our dataset/region. This doesn't prevent querying the tables we do have access to, but could it affect the catalog integration? I would hope it isn't a hard blocker on every query against every table.
I'll also note that we're still in the trial period with our Databricks account. Hoping to validate that this connection is possible, since managing access across clouds is a big pain point!
Thanks,
--Matt
06-11-2025 07:33 PM
This is a common issue when setting up BigQuery as a foreign catalog in Databricks, especially with cross-project service accounts and limited permissions. The problem you're experiencing is likely related to how Databricks discovers and catalogs BigQuery metadata.
The most likely culprit is that Databricks needs to query BigQuery's metadata tables to populate the foreign catalog, but your service account has limited access. Even though you can query specific tables directly, the catalog discovery process requires broader metadata access.
Solutions to Try:
1. Request Additional IAM Roles
Ask your vendor to grant your service account these additional roles on the BigQuery project:
- roles/bigquery.metadataViewer - This is crucial for catalog discovery
- roles/bigquery.resourceViewer - Helps with project-level resource discovery
These roles allow reading metadata without granting data access beyond what you already have.
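If the vendor manages IAM from the command line, the grants would look something like this (the project ID and service-account email are placeholders; they may of course use the Cloud Console or Terraform instead):

```shell
# Placeholders: vendor-bq-project, sa-name@sa-project.iam.gserviceaccount.com
gcloud projects add-iam-policy-binding vendor-bq-project \
  --member="serviceAccount:sa-name@sa-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.metadataViewer"

gcloud projects add-iam-policy-binding vendor-bq-project \
  --member="serviceAccount:sa-name@sa-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.resourceViewer"
```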
2. Specify the Correct Project in Catalog Creation
When creating the foreign catalog, use the BigQuery project ID (where the data lives), not your service account's project.
The connection can use your SA project, but the catalog needs to point at the data project.
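In SQL, that would look roughly like the following (catalog, connection, and project names are placeholders, and option names may vary by Databricks release):

```sql
CREATE FOREIGN CATALOG bq_data
USING CONNECTION my_bq_connection
OPTIONS (
  project_id 'vendor-bq-project'  -- the project that holds the data, not the SA project
);
```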
3. Try Dataset-Specific Catalog Creation
Instead of cataloging the entire project, try creating a catalog that points to specific datasets you have access to.
Some Databricks versions allow you to specify dataset-level scope.
4. Manual Table Registration
As a workaround, you might be able to manually register specific tables in the catalog using SQL commands like:
CREATE TABLE catalog_name.schema_name.table_name
USING bigquery
OPTIONS (
  table 'project.dataset.table'
);
a month ago
Thanks @lingareddy_Alva! This is helpful.
I reached out to ask about the required permissions, we'll see how that goes.
In the meantime, I'm trying to test out the other two paths. Do you know where I could find docs or examples for those operations? I'm having trouble finding more info relevant to troubleshooting after initial attempts didn't work.
On dataset-specific catalogs, I'm wondering if there are options that can be specified under "CREATE FOREIGN CATALOG" to enable that, but I can't find docs on which options are valid. Replacing the "dataset project ID" in the UI "create catalog" flow yields the same result as before: an empty catalog.
On manually registering the table: should I be creating it within the foreign catalog or in a separate Databricks-managed catalog? I get an error trying to create schemas in a foreign catalog, and an error that the schema doesn't exist when trying to create a table. If the latter, I feel like I would need to specify more options (e.g., the connection to use), but again I can't find references in the docs to how this is configured.
Thanks again!
a month ago
Dataset-Specific Catalogs
Unfortunately, Databricks doesn't support dataset-level scoping in the CREATE FOREIGN CATALOG command for BigQuery.
The catalog always tries to discover all datasets in the specified project. The options are quite limited:
CREATE FOREIGN CATALOG catalog_name
USING CONNECTION connection_name
OPTIONS (
  project_id 'your-bq-project-id'  -- that's basically it for BigQuery-specific options
);
The UI field you're seeing is just a different way to specify the same project_id option.
Manual Table Registration
For manual registration, you need to create tables in a Databricks-managed catalog, not the foreign catalog. Here's the correct approach:
1. Create a regular Databricks catalog and schema:
CREATE CATALOG my_bq_tables;
CREATE SCHEMA my_bq_tables.dataset_name;
2. Create tables that reference BigQuery using your connection:
CREATE TABLE my_bq_tables.dataset_name.table_name
USING bigquery
OPTIONS (
  path 'your-bq-project.dataset.table',
  connectionName 'your_connection_name'
);
3. You can also create views:
CREATE VIEW my_bq_tables.dataset_name.view_name
AS SELECT * FROM bigquery.`your_connection_name`.`your-bq-project.dataset.table`;
Troubleshooting Manual Registration
If you're getting errors, try:
- Verify your connection works with direct queries first
- Check the exact table path in BigQuery (project.dataset.table)
- Ensure your connection name is correct (case-sensitive)
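Once a table is registered, a quick sanity check is a bounded read against it (names follow the examples above):

```sql
-- Bounded read to confirm the registration actually reaches BigQuery
SELECT * FROM my_bq_tables.dataset_name.table_name LIMIT 10;
```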
The key insight is that foreign catalogs are for automatic discovery (which needs broad permissions), while manual registration lets you work around permission limitations by explicitly defining what you need access to.