Hello, I am trying to write a simple upsert statement following the steps in the tutorials. Here is what my code looks like:

from pyspark.sql import functions as F

def upsert_source_one(self
    df_source = spark.readStream.format("delta").table(self.so...
Using sample data sets. Here is the full code. This error does seem to be related to runtime version 15.

df_source = spark.readStream.format("delta").table("`cat1`.`bronze`.`officer_info`")
df_orig_state = spark.read.format("delta").table("`sample-db`....
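For a streaming upsert on Databricks, the usual pattern is foreachBatch plus a Delta MERGE. A minimal sketch, assuming a target table `cat1.silver.officer_info` and key column `officer_id` (both placeholders, not from the original post):

```python
# Hedged sketch of a streaming upsert with foreachBatch + MERGE.
# Table names and the key column `officer_id` are assumptions.

def merge_condition(keys, target_alias="t", source_alias="s"):
    """Build the MERGE ON clause from a list of key columns."""
    return " AND ".join(f"{target_alias}.{k} = {source_alias}.{k}" for k in keys)

def upsert_batch(micro_batch_df, batch_id):
    # Imported lazily so this module parses outside a Databricks runtime.
    from delta.tables import DeltaTable
    target = DeltaTable.forName(micro_batch_df.sparkSession,
                                "cat1.silver.officer_info")
    (target.alias("t")
           .merge(micro_batch_df.alias("s"), merge_condition(["officer_id"]))
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())

# In the streaming job:
# (spark.readStream.table("`cat1`.`bronze`.`officer_info`")
#      .writeStream
#      .foreachBatch(upsert_batch)
#      .option("checkpointLocation", "/Volumes/cat1/silver/_chk/officer_info")
#      .start())
```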
Hi, we have been planning to migrate the Synapse Databricks activity executions from 'All-purpose cluster' to 'New job cluster' to reduce overall cost. We are using Standard_D3_v2 as the cluster node type, which has 4 CPU cores in total. The current quota ...
Hi @IshaBudhiraja,
Quotas apply at different scopes: resource groups, subscriptions, accounts, and regions. The number of cores available in a particular region may be restricted by your subscription. To verify your subscription's usage and quotas, follow these st...
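As a rough sanity check before requesting more quota, you can compute how many vCPUs a job cluster will consume; the per-node vCPU count below is the one from the question (Standard_D3_v2 has 4 vCPUs), and the worker counts are assumptions:

```python
# Rough core-budget check before switching to job clusters.
# vCPU counts per node type are assumptions for illustration.

CORES_PER_NODE = {"Standard_D3_v2": 4}

def cores_required(node_type, num_workers):
    """Total vCPUs a cluster consumes: one driver plus num_workers workers."""
    return CORES_PER_NODE[node_type] * (num_workers + 1)

# A 2-worker Standard_D3_v2 job cluster needs 12 cores of regional Dv2 quota,
# on top of whatever any concurrently running clusters already consume.
```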
Hello, I have a question. Context: I have a Unity Catalog organized with three schemas (bronze, silver, and gold). Logically, I would like to create tables in each schema. I tried to organize my pipelines around the layers, which means that I would like to ...
Hello, thanks for the answers @YuliyanBogdanov, @standup1. So the solution is to use catalog.schema.table, and not LIVE.table; that's the key, you were right, standup! But you won't have visibility of the tables from the Bronze pipeline if you are on Si...
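A sketch of the cross-pipeline read described above; all catalog, schema, and table names are placeholders, and the dlt-decorated part is commented out because it only runs inside a Delta Live Tables pipeline:

```python
# Hedged sketch: a Silver-pipeline DLT table reading a Bronze table that
# belongs to a *different* pipeline via its fully qualified name.

def fq_name(catalog, schema, table):
    """Fully qualified Unity Catalog name, each part backtick-quoted."""
    return ".".join(f"`{part}`" for part in (catalog, schema, table))

# import dlt  # only available inside a Delta Live Tables pipeline
#
# @dlt.table(name="customers_clean")
# def customers_clean():
#     # Cross-pipeline read: fully qualified name instead of LIVE.customers_raw
#     return spark.read.table(fq_name("main", "bronze", "customers_raw"))
```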
Dear all, we have a PySpark streaming job (DBR: 14.3) that continuously writes new data to a Delta table (TableA). On this table, there is a PySpark batch job (DBR: 14.3) that runs every 15 minutes and in some cases may delete some records from ...
Hi @EDDatabricks,
Thank you for providing the details about your PySpark streaming and batch jobs operating on a Delta Table.
The concurrency issue you’re encountering seems to be related to the deletion of records from your Delta Table (TableA) du...
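One common mitigation for write conflicts like ConcurrentDeleteReadException is to retry the conflicting operation with exponential backoff. A generic, hedged sketch; in a real job, `retryable` would be Delta's concurrent-modification exception types rather than the placeholder below:

```python
import random
import time

# Generic retry-with-backoff helper for operations that can hit Delta
# write conflicts (e.g. ConcurrentDeleteReadException).

def with_retries(fn, retryable=(Exception,), max_attempts=5, base_delay=1.0):
    """Run fn(), retrying on retryable exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the conflict
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Hypothetical usage, with run_batch_delete standing in for the 15-minute job:
# with_retries(run_batch_delete, retryable=(ConcurrentDeleteReadException,))
```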
Hello, I am trying to create a Storage Credential. I have created the access connector and granted the managed identity "Storage Blob Data Owner" permissions. However, when I want to create a storage credential I get the following error: Creating a storage...
Hi @Kaniz, can you elaborate on the error "Refresh token not found for userId"? I have exactly the same problem as described in this thread. I am trying to create a storage credential using a Personal Access Token from a Service Principal. This results...
I am writing a frontend webpage that will log into Databricks and allow the user to select datasets. I am new to frontend development, so there may be some things I am missing here, but I know that the Databricks SQL connector for JavaScript only wor...
Hi @BenDataBricks,
Ensure that the auth_code variable in your Python script contains the correct authorization code obtained from the browser. Verify that the code_verifier you're using matches the one you generated earlier. Confirm that the redirect_...
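The code_verifier/code_challenge pair mentioned above must be linked by the S256 transform from RFC 7636: the challenge sent in the authorize URL is the base64url-encoded SHA-256 of the verifier sent to the token endpoint. A self-contained sketch of generating a matching pair:

```python
import base64
import hashlib
import secrets

# PKCE helper: generate a code_verifier and its matching S256 code_challenge.

def make_pkce_pair():
    """Return (code_verifier, code_challenge) per RFC 7636 S256."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

Keeping the verifier from this exact call and reusing it at the token endpoint is what the "matches the one you generated earlier" check refers to.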
Hi, I am working on a DR design for Databricks in Azure. The recommendation from Databricks is to use Deep Clone to clone the Unity Catalog tables (within or across catalogs). My design is to ensure that DR is managed across different regions, i.e. pri...
Hi @SenthilJ, The recommendation from Databricks to use Deep Clone for cloning Unity Catalog (UC) tables is indeed a prudent approach. Deep Clone facilitates the seamless replication of UC objects, including schemas, managed tables, access permission...
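Deep Clone replication can be driven by one SQL statement per table, and re-running the statement copies only incremental changes. A minimal sketch, with catalog and table names as placeholders:

```python
# Hedged sketch of per-table DR replication with Deep Clone.
# Catalog and table names are placeholders.

def deep_clone_sql(source_table, target_table):
    """SQL that creates or incrementally refreshes a deep clone of a UC table."""
    return f"CREATE OR REPLACE TABLE {target_table} DEEP CLONE {source_table}"

# Hypothetical table list, run on a schedule in the DR region:
# for table in ["sales.orders", "sales.customers"]:
#     spark.sql(deep_clone_sql(f"prod_catalog.{table}", f"dr_catalog.{table}"))
```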
I recently discovered MLflow managed by Databricks, so I'm very new to this and I need some help. Can someone clearly explain the steps to track my runs with the Databricks API? Here are the steps I followed: 1/ Installing Databr...
Hi @Khaled75,
The specific error message you provided is related to fetching the experiment by name. It's essential to understand the exact error message. Can you share the complete error text? Confirm that your Databricks authentication is working c...
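A hedged sketch of pointing the MLflow client at a Databricks workspace; the host, token, and experiment path are placeholders, and a common cause of experiment-by-name errors is that on Databricks `set_experiment` expects an absolute workspace path:

```python
import os

# Environment the MLflow client reads when the tracking URI is "databricks".
# Host, token, and experiment path below are placeholders.

def databricks_tracking_env(host, token):
    """Variables the MLflow client uses to reach a Databricks workspace."""
    return {"DATABRICKS_HOST": host, "DATABRICKS_TOKEN": token}

# os.environ.update(databricks_tracking_env(
#     "https://adb-1234567890123456.7.azuredatabricks.net", "<personal-access-token>"))
# import mlflow
# mlflow.set_tracking_uri("databricks")
# # On Databricks the experiment name must be an absolute workspace path:
# mlflow.set_experiment("/Users/me@example.com/my-experiment")
# with mlflow.start_run():
#     mlflow.log_metric("rmse", 0.42)
```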
Hi, I noticed that there is quite a significant delay (2 - 10s) between making a change to some file in Repos via Databricks file edit window and propagation of such change to the filesystem. Our engineers and scientists use YAML config files. If the...
I am unable to write data from Databricks into an S3 bucket. I have set up the permissions both on the bucket policy level, and the user level as well (Put, List, and others are added, have also tried with s3*). Bucket region and workspace region are...
Hi @Debi-Moha,
Ensure that the IAM role associated with your Databricks cluster has the necessary permissions to access the S3 bucket. Specifically, it should have permissions for s3:PutObject and s3:ListBucket. Double-check that the IAM role is corr...
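The split between those two actions matters because s3:ListBucket applies to the bucket ARN itself, while object actions like s3:PutObject apply to objects (the `/*` resource). A hedged example policy with a placeholder bucket name:

```python
import json

# Hedged example of the minimal IAM statements described above.
# "my-bucket" is a placeholder bucket name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": "arn:aws:s3:::my-bucket",  # bucket-level actions
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::my-bucket/*",  # object-level actions
        },
    ],
}
print(json.dumps(policy, indent=2))
```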
Hi all, I am trying the new "git folder" feature, with a repo that works fine from "Repos". In the new folder location, my imports from my own repo don't work anymore. Has anyone faced something similar? Thanks in advance for sharing your experience.
Hi @Kaniz, thanks for all the suggested options. I tried again with a brand new git folder. I just changed the cluster from DBR 14.2 ML to 14.3 ML, and now the imports work as expected. Kind regards
I have a source table A in Unity Catalog. This table is constantly written to and is a streaming table.I also have another table B in Unity Catalog. This is a managed table with liquid clustering.Using Auto Loader I move new data from A to B using a ...
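A minimal sketch of the A-to-B stream, with table names and checkpoint path as placeholders; the spark calls are commented out since they need a Databricks runtime:

```python
# Hedged sketch: continuously move new rows from streaming table A into
# liquid-clustered managed table B. All names below are assumptions.

def writer_options(checkpoint_location):
    """Options every production writeStream should carry."""
    if not checkpoint_location:
        raise ValueError("checkpointLocation is required for exactly-once progress tracking")
    return {"checkpointLocation": checkpoint_location}

# (spark.readStream.table("main.bronze.table_a")
#      .writeStream
#      .options(**writer_options("/Volumes/main/bronze/_chk/a_to_b"))
#      .trigger(availableNow=True)       # batch-style catch-up runs
#      .toTable("main.silver.table_b"))  # B: managed table with liquid clustering
```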
Hi, we are trying to use the Python SDK and create a workspace client using the following code:

%pip install databricks-sdk --upgrade
dbutils.library.restartPython()

from databricks.sdk import WorkspaceClient
w = WorkspaceClient()

Here is the notebook: https:...
Hi @databrick_usert , Hope you are doing well!
Can you check the version of the SDK running in this notebook? If it's not an upgraded version, could you please try to upgrade the SDK version and then restart Python after the pip install?
%p...
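To confirm which SDK version the notebook actually picked up after the restart, a standard-library-only check works even where the SDK is absent:

```python
# Check which databricks-sdk version the environment resolved,
# using only the standard library.
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# print(installed_version("databricks-sdk"))
# None here means the %pip install did not take effect in this Python process.
```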
I'm trying to utilise the option to create UDFs in Unity Catalog. That would be a great way to have functions available in a fairly straightforward manner without e.g. putting the function definitions in an extra notebook that I %run to make them ava...
I can see someone has asked a very similar question with the same error message: https://community.databricks.com/t5/data-engineering/unable-to-use-sql-udf/td-p/61957 The OP hasn't yet provided sufficient details about their function, so no proper res...
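For reference, a hedged example of a simple Unity Catalog SQL UDF; `main.default` is a placeholder catalog.schema where you would need CREATE FUNCTION privileges:

```python
# Hedged example of a Unity Catalog SQL UDF, built as a SQL string so the
# definition can be reviewed before running. `main.default` is a placeholder.

def create_udf_sql():
    return (
        "CREATE OR REPLACE FUNCTION main.default.fahrenheit_to_celsius(f DOUBLE)\n"
        "RETURNS DOUBLE\n"
        "RETURN (f - 32) * 5 / 9"
    )

# spark.sql(create_udf_sql())
# spark.sql("SELECT main.default.fahrenheit_to_celsius(212)")  # 100.0
```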
Hi, I have a requirement to read a table from Azure SQL DB, update it in Azure Databricks with transformations, and overwrite the updated table back to Azure SQL DB. But due to PySpark's lazy evaluation, I'm unable to overwrite the table in Azure SQL ...
Hi @Kingston, make sure that you have the proper permissions on the SQL Server for the user you authenticate with through JDBC, i.e. database reader / database writer. Then your approach can go in two directions: push the data from Databrick...
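The lazy-evaluation trap here is that overwriting the same JDBC table you are reading from can truncate the source before Spark has materialized it; forcing materialization first (or staging to a separate table) avoids that. A hedged sketch with placeholder connection details:

```python
# Hedged sketch: materialize the transformed DataFrame before overwriting
# the same Azure SQL table it was read from. Server, database, table, and
# credentials are placeholders.

def jdbc_url(server, database):
    """Azure SQL JDBC URL for a logical server and database."""
    return (f"jdbc:sqlserver://{server}.database.windows.net:1433;"
            f"database={database};encrypt=true")

# url = jdbc_url("myserver", "mydb")
# df = (spark.read.format("jdbc")
#         .option("url", url).option("dbtable", "dbo.customers")
#         .option("user", user).option("password", pwd).load())
# transformed = df  # ... apply transformations here ...
# transformed.cache()
# transformed.count()  # force the full read BEFORE the overwrite truncates the source
# (transformed.write.format("jdbc")
#     .option("url", url).option("dbtable", "dbo.customers")
#     .option("user", user).option("password", pwd)
#     .mode("overwrite").save())
```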