cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Create External Catalog when dbname has special characters

micheloh
New Contributor

Hi experts,

I'm having a problem when trying to create an external catalog with my PostgreSQL database. 

The connection is fine. But the database name that I want to connect has dashes and colon (eg. my-db-prod:all). When trying to connect with it, I always get the error 

[DATA_SOURCE_OPTION_CONTAINS_INVALID_CHARACTERS] Option database must not be empty and should not contain invalid characters, query strings, or parameters. SQLSTATE: 42602

 Is it has any way to create the catalog with this dbname? Unfortunately I cannot change the dbname.

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

Ashwin_DSA
Databricks Employee
Databricks Employee

Hi @micheloh,

From what we’ve seen, this is currently a limitation of Lakehouse Federation foreign catalog creation rather than a problem with the connection itself.

The PostgreSQL connection can succeed, but the database value used when creating the foreign catalog is still validated, and names containing special characters such as : can trigger the [DATA_SOURCE_OPTION_CONTAINS_INVALID_CHARACTERS] error. Unfortunately, quoting or escaping the database name does not currently get around that validation.

So if the source database name cannot be changed, there isn’t a supported way today to create a foreign catalog directly against that database name.

The practical workaround for now is to access the PostgreSQL database through JDBC from a notebook or job instead of using a foreign catalog.

Sorry, we don’t have a better answer here.

FYI. There was a similar question in the past, too. 

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

View solution in original post

4 REPLIES 4

ziafazal
Databricks Partner

Hi @micheloh 

You should try url-encoding your db(my-db-prod%3Aall)
Any reserved characters for URLs (for example, /, :, @, (, ), [, ], &, #, =, ?, and space) that appear in any part of the connection URL must be percent encoded. See RFC 3986 for details.

unfortunately didn't work. As Ashwin_DSA said, probably I need to find another way to get the data from the postgreSQL

Ashwin_DSA
Databricks Employee
Databricks Employee

Hi @micheloh,

From what we’ve seen, this is currently a limitation of Lakehouse Federation foreign catalog creation rather than a problem with the connection itself.

The PostgreSQL connection can succeed, but the database value used when creating the foreign catalog is still validated, and names containing special characters such as : can trigger the [DATA_SOURCE_OPTION_CONTAINS_INVALID_CHARACTERS] error. Unfortunately, quoting or escaping the database name does not currently get around that validation.

So if the source database name cannot be changed, there isn’t a supported way today to create a foreign catalog directly against that database name.

The practical workaround for now is to access the PostgreSQL database through JDBC from a notebook or job instead of using a foreign catalog.

Sorry, we don’t have a better answer here.

FYI. There was a similar question in the past, too. 

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

Thank you for the reply.

Well, I think there's no other way for now but create a function query using JDBC on a notebook. No problem with that. I was just trying to evade the work to complicate for final users to get the data as some tables are huge and the connection provided is limited (we're connecting with Query Service from Adobe Experience Platform and it has some limitations to query like time and processing as guardrails).

It's a good case to use genie though! 🙂

 

Thank you and regards,

micheloh