Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

sharma_kamal (New Contributor III)
  • 962 Views
  • 2 replies
  • 1 kudos

Resolved! Getting errors while reading data from URL

I'm encountering some issues while trying to read a public dataset from a URL using Databricks. Here's the code snippet (along with errors) I'm working with. I'm confused about the Delta format error here: when I read data from a URL, how would it have a D...

Latest Reply: MuthuLakshmi (New Contributor III)

@sharma_kamal Please disable the formatCheck in the notebook and check whether you can read the data. The configuration command %sql SET spark.databricks.delta.formatCheck.enabled=false will disable the format check for Delta tables in Databricks. Databrick...
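As a runnable notebook cell, the suggestion above amounts to (session-scoped setting; use with care, since the check exists to catch non-Delta data):

```sql
-- Disable the Delta format check for the current session only
SET spark.databricks.delta.formatCheck.enabled = false;
```

Note this only affects the current session. A common alternative for public URLs is to download the file first (e.g. with pandas) and convert it to a Spark DataFrame, since Spark cannot read http:// paths directly.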

Yuki (New Contributor)
  • 1213 Views
  • 3 replies
  • 0 kudos

Can I use a Git provider with a Service Principal in a job?

Hi everyone, I'm trying to use a Git provider in a Databricks job. First, I was using my personal user account as `Run as`. But when I changed `Run as` to a Service Principal, it failed because of a permission error, and I can't find a way to solve it. Could I...

Latest Reply: martindlarsson (New Contributor III)

The documentation is lacking in this area, which should be easy to set up. Instead, we are forced to search among community topics such as this one.
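For reference, the usual cause is that the service principal has no Git credential of its own. One documented route is the Git Credentials REST API (`/api/2.0/git-credentials`), called with a token that belongs to the service principal rather than your user. A hedged sketch that only builds the request (host, tokens, and username below are hypothetical placeholders):

```python
import json
import urllib.request

def git_credential_request(host: str, sp_token: str, provider: str,
                           username: str, pat: str) -> urllib.request.Request:
    """Build a Git Credentials API call to be executed *as* the service principal.

    sp_token must authenticate the service principal itself (not your user),
    and pat is the SP's personal access token for the Git provider.
    """
    body = json.dumps({
        "git_provider": provider,          # e.g. "gitHub"
        "git_username": username,
        "personal_access_token": pat,
    }).encode()
    return urllib.request.Request(
        f"{host}/api/2.0/git-credentials",
        data=body,
        headers={"Authorization": f"Bearer {sp_token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Once the credential exists for the SP, a job with `Run as` set to that SP should be able to check out the repo.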

r-goswami (New Contributor II)
  • 797 Views
  • 3 replies
  • 0 kudos

Unable to create/save job of type "python script"

Hi all, we are facing an issue while creating a simple job of type "python script". A Python file in the workspace is selected as the source, and no arguments/job parameters are provided. This is strange behavior that just started occurring this morning...

Latest Reply: r-goswami (New Contributor II)

Hi Ayushi, how can I call the reset API? This issue occurs when creating a new job from the Databricks web UI. It looks like the REST API is for resetting the job settings of an existing job. Can this be an issue with the Databricks workspace we are using? A...

hyedesign (New Contributor II)
  • 1184 Views
  • 3 replies
  • 0 kudos

Getting SparkConnectGrpcException: (java.io.EOFException) error when using foreachBatch

Hello, I am trying to write a simple upsert statement following the steps in the tutorials. Here is what my code looks like: from pyspark.sql import functions as F; def upsert_source_one(self): df_source = spark.readStream.format("delta").table(self.so...

Latest Reply: hyedesign (New Contributor II)

Using sample data sets. Here is the full code. This error does seem to be related to runtime version 15. df_source = spark.readStream.format("delta").table("`cat1`.`bronze`.`officer_info`"); df_orig_state = spark.read.format("delta").table("`sample-db`....
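For context, the upsert inside a foreachBatch handler typically registers the micro-batch DataFrame as a temp view and runs a MERGE against the target. A sketch using the table name from the snippet above (the key column is a hypothetical placeholder):

```sql
-- Inside the foreachBatch function: `updates` is the micro-batch
-- registered as a temporary view via df.createOrReplaceTempView("updates")
MERGE INTO `cat1`.`bronze`.`officer_info` AS t
USING updates AS s
  ON t.officer_id = s.officer_id    -- hypothetical key column
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```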

IshaBudhiraja (New Contributor II)
  • 996 Views
  • 3 replies
  • 0 kudos

Migration of Synapse Databricks activity executions from All-purpose cluster to New job cluster

Hi, we have been planning to migrate the Synapse Databricks activity executions from 'All-purpose cluster' to 'New job cluster' to reduce overall cost. We are using Standard_D3_v2 as the cluster node type, which has 4 CPU cores in total. The current quota ...

Latest Reply: Kaniz_Fatma (Community Manager)

Hi @IshaBudhiraja, quotas are used for different resource groups, subscriptions, accounts, and scopes. The number of cores for a particular region may be restricted by your subscription. To verify your subscription's usage and quotas, follow these st...

AxelBrsn (New Contributor III)
  • 2131 Views
  • 3 replies
  • 2 kudos

Resolved! Use DLT from another pipeline

Hello, I have a question. Context: I have a Unity Catalog organized with three schemas (bronze, silver, and gold). Logically, I would like to create tables in each schema. I tried to organize my pipelines around the layers, which means that I would like to ...

Latest Reply: AxelBrsn (New Contributor III)

Hello, thanks for the answers @YuliyanBogdanov, @standup1. So the solution is to use catalog.schema.table, not LIVE.table; that's the key, you were right, standup! But you won't have visibility of the tables in the Bronze pipeline if you are on Si...
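A sketch of the accepted approach in DLT SQL (catalog, schema, and table names are hypothetical): tables produced by another pipeline are referenced by their fully qualified Unity Catalog name, while the LIVE keyword only resolves within the same pipeline:

```sql
-- Silver pipeline: read the bronze table by its full Unity Catalog name,
-- not LIVE.my_bronze_table (LIVE only resolves within this pipeline)
CREATE OR REFRESH LIVE TABLE my_silver_table AS
SELECT * FROM my_catalog.bronze.my_bronze_table;
```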

maikelos272 (New Contributor II)
  • 2658 Views
  • 4 replies
  • 2 kudos

Cannot create storage credential without Contributor role

Hello, I am trying to create a storage credential. I have created the access connector and given the managed identity "Storage Blob Data Owner" permissions. However, when I try to create a storage credential I get the following error: Creating a storage...

Latest Reply: Kim3 (New Contributor II)

Hi @Kaniz_Fatma, can you elaborate on the error "Refresh token not found for userId"? I have exactly the same problem as described in this thread. I am trying to create a storage credential using a Personal Access Token from a Service Principal. This r...

BenDataBricks (New Contributor)
  • 771 Views
  • 1 reply
  • 0 kudos

OAuth U2M Manual token generation failing

I am writing a frontend webpage that will log into Databricks and allow the user to select datasets. I am new to front-end development, so there may be some things I am missing here, but I know that the Databricks SQL connector for JavaScript only wor...

Latest Reply: Kaniz_Fatma (Community Manager)

Hi @BenDataBricks, ensure that the auth_code variable in your Python script contains the correct authorization code obtained from the browser. Verify that the code_verifier you're using matches the one you generated earlier. Confirm that the redirect_...
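The code_verifier/code_challenge pair mentioned above must be related exactly as RFC 7636 specifies, or the token exchange fails. A self-contained sketch of generating a valid pair (the Databricks token-endpoint URL and its request parameters are not shown here):

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate an RFC 7636 code_verifier and its S256 code_challenge."""
    # 32 random bytes -> 43-character URL-safe string (base64url, padding stripped)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # code_challenge = base64url( SHA-256(code_verifier) ), again without padding
    challenge = base64.urlsafe_b64encode(
        hashlib.sha256(verifier.encode("ascii")).digest()
    ).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

The code_verifier sent in the token request must be the exact string whose SHA-256 hash was sent earlier as the code_challenge; regenerating it between the two steps is a common reason the manual flow fails.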

SenthilJ (New Contributor III)
  • 1584 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Deep Clone

Hi, I am working on a DR design for Databricks in Azure. The recommendation from Databricks is to use Deep Clone to clone the Unity Catalog tables (within or across catalogs). My design is to ensure that DR is managed across different regions, i.e. pri...

Data Engineering
Disaster Recovery
Unity Catalog
Latest Reply: Kaniz_Fatma (Community Manager)

Hi @SenthilJ, The recommendation from Databricks to use Deep Clone for cloning Unity Catalog (UC) tables is indeed a prudent approach. Deep Clone facilitates the seamless replication of UC objects, including schemas, managed tables, access permission...
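A minimal sketch of the Deep Clone pattern for a DR catalog (catalog, schema, and table names are hypothetical placeholders):

```sql
-- Copy a Unity Catalog table into a DR catalog; re-running the statement
-- refreshes the clone incrementally (only new or changed files are copied)
CREATE OR REPLACE TABLE dr_catalog.sales.orders
DEEP CLONE prod_catalog.sales.orders;
```

Scheduling this statement as a periodic job is a common way to keep the secondary-region copy current.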

Khaled75 (New Contributor)
  • 651 Views
  • 1 reply
  • 0 kudos

Connecting to Databricks

I recently discovered MLflow managed by Databricks, so I'm very new to this and need some help. Can someone clearly explain the steps needed to track my runs in the Databricks API? Here are the steps I followed: 1/ Installing Databr...

Data Engineering
Data
tracking_ui
Latest Reply: Kaniz_Fatma (Community Manager)

Hi @Khaled75, the specific error message you provided is related to fetching the experiment by name, so it's essential to see the exact error. Can you share the complete error text? Confirm that your Databricks authentication is working c...

Debi-Moha (New Contributor II)
  • 1117 Views
  • 1 reply
  • 1 kudos

Unable to write to S3 bucket from Databricks using boto3

I am unable to write data from Databricks into an S3 bucket. I have set up permissions at both the bucket policy level and the user level (Put, List, and others are added; I have also tried with s3*). Bucket region and workspace region are...

Latest Reply: Kaniz_Fatma (Community Manager)

Hi @Debi-Moha, ensure that the IAM role associated with your Databricks cluster has the necessary permissions to access the S3 bucket. Specifically, it should have permissions for s3:PutObject and s3:ListBucket. Double-check that the IAM role is corr...
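For reference, a minimal IAM policy along those lines might look like this (the bucket name is a hypothetical placeholder; note that s3:ListBucket must target the bucket ARN, while object actions target the object ARN):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::my-bucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
```

Mixing up the two Resource scopes is a common reason Put/List appear to be granted but writes still fail.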

Kibour (Contributor)
  • 4318 Views
  • 2 replies
  • 2 kudos

Resolved! Import from repo

Hi all, I am trying the new "git folder" feature with a repo that works fine from "Repos". In the new folder location, my imports from my own repo don't work anymore. Has anyone faced something similar? Thanks in advance for sharing your experience.

Latest Reply: Kibour (Contributor)

Hi @Kaniz_Fatma, thanks for all the suggested options. I tried again with a brand new git folder. I just changed the cluster from DBR 14.2 ML to 14.3 ML, and now the imports work as expected. Kind regards

cosminsanda (New Contributor III)
  • 2453 Views
  • 9 replies
  • 0 kudos

Adding a new column triggers reprocessing of Auto Loader source table

I have a source table A in Unity Catalog. This table is constantly written to and is a streaming table. I also have another table B in Unity Catalog; this is a managed table with liquid clustering. Using Auto Loader I move new data from A to B using a ...

Data Engineering
auto-loader
Latest Reply: -werners- (Esteemed Contributor III)

Change data feed might be a solution for you: https://docs.databricks.com/en/delta/delta-change-data-feed.html
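A sketch of the change-data-feed approach (catalog, schema, table name, and starting version are hypothetical): enable CDF on the source table, then consume only row-level changes instead of reprocessing the whole table:

```sql
-- One-time: enable the change data feed on the source table
ALTER TABLE main.bronze.source_a
  SET TBLPROPERTIES (delta.enableChangeDataFeed = true);

-- Read only the row-level changes since a given table version
SELECT * FROM table_changes('main.bronze.source_a', 2);
```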

databrick_usert (New Contributor)
  • 745 Views
  • 1 reply
  • 0 kudos

Workspace client creation error

Hi, we are trying to use the Python SDK and create a workspace client using the following code:
%pip install databricks-sdk --upgrade
dbutils.library.restartPython()
from databricks.sdk import WorkspaceClient
w = WorkspaceClient()
Here is the notebook: https:...

Latest Reply: Ayushi_Suthar (Honored Contributor)

Hi @databrick_usert, hope you are doing well! Can you check the version of the SDK running in this notebook? If it's not the upgraded version, could you please try upgrading the SDK and then restarting Python after the pip install? %p...

MartinIsti (New Contributor III)
  • 944 Views
  • 1 reply
  • 0 kudos

Python UDF in Unity Catalog - spark.sql error

I'm trying to utilise the option to create UDFs in Unity Catalog. That would be a great way to have functions available in a fairly straightforward manner without e.g. putting the function definitions in an extra notebook that I %run to make them ava...

Data Engineering
function
udf
Latest Reply: MartinIsti (New Contributor III)

I can see someone has asked a very similar question with the same error message: https://community.databricks.com/t5/data-engineering/unable-to-use-sql-udf/td-p/61957. The OP hasn't yet provided sufficient details about his/her function, so no proper res...
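For comparison, a minimal Unity Catalog Python UDF that registers and runs correctly might look like this (catalog and schema names are hypothetical placeholders):

```sql
CREATE OR REPLACE FUNCTION main.default.add_one(x INT)
RETURNS INT
LANGUAGE PYTHON
AS $$
return x + 1
$$;

-- Callable from any SQL context once created:
SELECT main.default.add_one(41);
```

If a definition in this shape still fails from spark.sql, the error usually points at the runtime version or at missing USAGE/EXECUTE privileges rather than the function body.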

