Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Hi! I'm using GCP. Does a Databricks workspace always need two e2-highmem-2 instances running as soon as I create a workspace? I see them in my VM list in the GCP console no matter what (I can stop or remove a cluster, but these two machines are always th...
To clarify: Databricks on GCP will automatically delete the underlying GKE cluster after 5 days of inactivity (no cluster launches or non-empty instance pools) in the workspace. You can contact Databricks support if you want to shorten the idle TTL for th...
I am looking through Google Cloud Platform and I am looking to get started with Databricks on GCP. Happy if anyone can point me in the direction of guidance on how to get started. Thanks
Hey boyelana! Databricks on Google Cloud Platform is definitely an interesting and powerful combination, and I'm thrilled to see that you're looking to get started with it! To begin your journey with Databricks on GCP, there are a few steps y...
On GCP I subscribed to Databricks in one project within the organization. Then I canceled this subscription and subscribed to Databricks in another project. When I try to log in to the newly subscribed Databricks workspace with Google SSO: > There was an error s...
I can see the issue might be related to organizations or billing accounts. The new Databricks project I tried creating was in a different organization/billing account than the test Databricks subscription I created a month back. I went back to the ori...
Hello, I have a Databricks account on Azure, and the goal is to compare different image tagging services from Azure, GCP, and AWS via the corresponding API calls, in a Python notebook. I have problems with GCP Vision API calls, specifically with credentials...
OK, here is a trick: in my case, the file with the GCP credentials is stored in notebook workspace storage, which is not visible to code that reads it via os.environ. So the solution is to read the content of this file and save it to the cluster storage attached to the no...
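The workaround described above can be sketched roughly as follows. This is a minimal sketch, assuming the key file lives in workspace storage and that driver-local `/tmp` is acceptable as "cluster storage"; the paths are placeholders, not the poster's actual ones:

```python
import json
import os

# Hypothetical paths -- adjust to where your key file actually lives.
workspace_key = "/Workspace/Users/me@example.com/gcp-key.json"  # notebook workspace storage
local_key = "/tmp/gcp-key.json"                                 # driver-local storage

def stage_gcp_credentials(src: str, dst: str) -> str:
    """Copy a service-account key to driver-local storage and point
    GOOGLE_APPLICATION_CREDENTIALS at it, so GCP client libraries
    (which read that environment variable) can find it."""
    with open(src) as f:
        key = json.load(f)  # parse first, so a truncated copy fails loudly
    with open(dst, "w") as f:
        json.dump(key, f)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = dst
    return dst
```

In a notebook you would then call `stage_gcp_credentials(workspace_key, local_key)` once before constructing any GCP client.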
I started using Databricks on Google Cloud but I am seeing some unexpected costs. When I create a cluster I notice some compute resources being created in GCP, but when I stop the cluster these resources are still up and never shut down. This issue res...
The answer to the question about the Kubernetes cluster that keeps running regardless of Databricks compute and DWH resources is provided in this thread: https://community.databricks.com/s/question/0D58Y00009TbWqtSAF/auto-termination-for-clusters-jobs-and-delta-live-t...
Hi there, I have a batch process configured in a workflow which fails due to a JDBC timeout on a Postgres DB. I checked the JDBC connection configuration and it seems to work: when I query a table and do a df.show() in the process, it displays th...
Hi @Fred Foucart, the above code looks good to me. Can you try the code below as well?

spark.read \
    .format("jdbc") \
    .option("url", f"jdbc:postgresql://{host}/{database}") \
    .option("driver", "org.postgresql.Driver") \
    .option("user", username) \
    ...
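Since the original failure was a JDBC timeout, it may also help to set the driver-side timeouts explicitly. A minimal sketch, assuming a PostgreSQL source; `socketTimeout` is a PostgreSQL JDBC driver URL parameter (in seconds), and the host/database names are placeholders:

```python
def jdbc_url(host: str, database: str, socket_timeout_s: int = 600) -> str:
    """Build a PostgreSQL JDBC URL with an explicit socket timeout.

    socketTimeout is a PostgreSQL JDBC driver parameter (seconds); raising
    it can help when long-running queries trip the default network timeout.
    """
    return (
        f"jdbc:postgresql://{host}/{database}"
        f"?socketTimeout={socket_timeout_s}"
    )

# The URL then plugs into the usual Spark JDBC read (requires a SparkSession;
# table and credential names below are placeholders):
# df = (spark.read.format("jdbc")
#       .option("url", jdbc_url("pg-host:5432", "mydb"))
#       .option("driver", "org.postgresql.Driver")
#       .option("dbtable", "public.my_table")
#       .option("user", username)
#       .option("password", password)
#       .load())
```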
Hi all, hope everyone is doing well. We are currently validating Databricks on GCP and Azure. We have a Python notebook that does some ETL (copy, extract zip files, and process the files within the zip files). Our cluster config on Azure: DBX Runtime - 10.4 - Dr...
I have a Databricks job running in Azure Databricks. A similar job is also running in Databricks on GCP. I would like to compare the cost. If I assign a custom tag to the job cluster running in Azure Databricks, I can see the cost incurred by that job i...
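On both clouds, per-job cost attribution is typically done by setting `custom_tags` on the job's cluster spec, which Databricks propagates to the underlying cloud resources. A hedged sketch of the relevant fragment of a Jobs API cluster spec; the tag names, node type, and runtime version are placeholders, not a verified configuration:

```python
# Fragment of a Databricks job cluster spec. "CostCenter"/"JobName" are
# placeholder tag names -- use your own tagging convention on both clouds
# so the cloud billing consoles can group costs by the same keys.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "n2-highmem-4",   # a GCP node type; use an Azure VM size there
    "num_workers": 2,
    "custom_tags": {
        "CostCenter": "analytics",
        "JobName": "nightly-etl",
    },
}
```

With the same tag keys on both job clusters, the Azure cost analysis and GCP billing reports can each be filtered by that label for a like-for-like comparison.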
I was using the 14-day Databricks trial and had some important notebooks where I had made all my changes. Now I have extended the service and subscribed to Databricks on GCP. When I enter the workspace section I cannot see the w...
Hi @Aditya Aranya, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...
In the documentation at https://registry.terraform.io/providers/databricks/databricks/latest/docs and https://docs.gcp.databricks.com/dev-tools/terraform/index.html I could not find how to provision Databricks workspaces in GCP. Only cre...
Hi @horatiu guja, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else, we can help you with more details.
I'm wondering if you can help me with a Google auth issue related to Structured Streaming and long-running Databricks jobs in general. I will get this error after running for 8+ hours. Any tips on this? GCP auth issues for long-running jobs? Caused by...
I'm trying to read data from GCP Kafka through Azure Databricks, but I'm getting the warning below and the notebook simply never completes. Any suggestions? WARN NetworkClient: Consumer groupId Bootstrap broker rack disconnected. Please note I've properly c...
Hello, we have several workspaces in GCP and want to create another one in another region. For some reason, after we enter all the GKE IP ranges, we get a BAD_REQUEST error that implies it couldn't get our OAuth token. We tried logging out and in aga...
Hi @Leon Bam, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!
Which cloud hosting environment is best to use for Databricks? My question comes down to whether there is a difference in latency, throughput, result consistency, and reproducibility between the different cloud hosting environments of ...
Hi @Vikas Sinha, does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
We are trying to move a GCP project to a newly created org and a new billing account. We have a Databricks instance from GCP Marketplace with licensing. As per the docs, when we change the billing account for a project, the license on the first billing acco...
Hi @Ankit K, hope you are well. Just wanted to see if you were able to find an answer to your question, and would you like to mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Cheers!