Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

garciargs
by New Contributor III
  • 144 Views
  • 2 replies
  • 3 kudos

DLT multiple source table to single silver table generating unexpected result

Hi, I've been trying this all day long. I'm building a POC of a pipeline that would be used in my everyday ETL. I have two initial tables, vendas and produtos, and they are as follows: vendas_raw: venda_id, produto_id, data_venda, quantidade, valor_total, dth_in...

Latest Reply
NandiniN
Databricks Employee

When dealing with Change Data Capture (CDC) in Delta Live Tables, it's crucial to handle out-of-order data correctly. You can use the APPLY CHANGES API to manage this. The APPLY CHANGES API ensures that the most recent data is used by specifying a co...
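As an illustration of the out-of-order handling described above, here is a minimal pure-Python sketch of the "highest sequence wins" semantics that APPLY CHANGES enforces through its sequencing column. The column names (venda_id, dth_inclusao) are assumptions based on the post, not confirmed by it.

```python
# Sketch of "latest sequence wins" CDC semantics (what APPLY CHANGES
# enforces via its sequence_by column). Pure Python for illustration;
# venda_id / dth_inclusao are assumed column names from the post.
def latest_by_key(rows, key="venda_id", seq="dth_inclusao"):
    latest = {}
    for row in rows:  # rows may arrive out of order
        k = row[key]
        if k not in latest or row[seq] > latest[k][seq]:
            latest[k] = row
    return list(latest.values())

events = [
    {"venda_id": 1, "dth_inclusao": "2024-01-02", "valor_total": 20.0},
    {"venda_id": 1, "dth_inclusao": "2024-01-01", "valor_total": 10.0},  # late, stale
]
print(latest_by_key(events))  # keeps only the 2024-01-02 row
```

In Delta Live Tables the same effect comes from passing the timestamp column as `sequence_by`, so a late-arriving stale row never overwrites a newer one.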

1 More Replies
LorenRD
by Contributor
  • 9324 Views
  • 12 replies
  • 10 kudos
Latest Reply
miranda_luna_db
Databricks Employee

Hi Mike - We're working on a capability that will allow auth to be delegated to the app. Happy to set up some time to share plans/get feedback if of interest. If you reach out to your account team they can help make it happen!

11 More Replies
Anmol_Chauhan
by New Contributor II
  • 238 Views
  • 4 replies
  • 1 kudos

How to use Widgets with SQL Endpoint in Databricks?

I'm trying to use widgets with SQL endpoints, but I'm encountering an error, whereas they work seamlessly with a Databricks interactive cluster. While query parameters can substitute for widgets in SQL endpoints, I specifically require dropdown and multi...

Latest Reply
Walter_C
Databricks Employee

Got it, let me test. I think there is no specific way to do it, but if you add the option "All" as the first item in the list, it should be selected by default.
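The "All" trick above is usually paired with a predicate that becomes a no-op when "All" is selected. A hedged sketch of that pattern; the column and parameter names are invented for illustration:

```python
# Hypothetical sketch of supporting an "All" dropdown choice: when "All"
# is selected the predicate filters nothing; otherwise it matches the
# chosen value. Names below are invented for illustration.
def region_predicate(selected: str) -> str:
    if selected == "All":
        return "1 = 1"  # no-op filter
    # naive string interpolation for illustration only; use real
    # parameter binding in production to avoid SQL injection
    return f"region = '{selected}'"

query = "SELECT * FROM sales WHERE " + region_predicate("All")
print(query)  # SELECT * FROM sales WHERE 1 = 1
```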

3 More Replies
lukamu
by New Contributor II
  • 198 Views
  • 3 replies
  • 1 kudos

Resolved! Issue with filter_by in Databricks SQL Query History API (/api/2.0/sql/history/queries)

Hi everyone, I'm trying to use the filter_by parameter in a GET request to /api/2.0/sql/history/queries, but I keep getting a 400 Bad Request error. When I use max_results it works fine, but adding filter_by causes the request to fail. Example value f...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @lukamu, glad that it worked for you!
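For readers hitting the same 400: one common cause is sending filter_by as a URL query parameter, while the Query History API expects it as a JSON request body. A hedged sketch of such a payload (field names follow the REST reference; all values are invented):

```python
import json

# Sketch of a filter_by payload for GET /api/2.0/sql/history/queries.
# filter_by goes in the JSON request body, not the query string.
# Timestamps and statuses below are made-up example values.
payload = {
    "filter_by": {
        "statuses": ["FINISHED"],
        "query_start_time_range": {
            "start_time_ms": 1735689600000,
            "end_time_ms": 1738368000000,
        },
    },
    "max_results": 100,
}
body = json.dumps(payload)

# e.g. with the requests library (not executed here):
# requests.get(f"{host}/api/2.0/sql/history/queries",
#              headers={"Authorization": f"Bearer {token}"}, data=body)
print(body)
```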

2 More Replies
GinoBarkley
by New Contributor II
  • 129 Views
  • 1 reply
  • 1 kudos

Resolved! Extracting data from GCP BigQuery using Foreign Catalog

On Databricks, I have created a connection of type Google BigQuery and tested the connection successfully. I have then created a foreign catalog from the connection to a Google BigQuery project. I can see all the datasets and tables in the foreign catalog...

Labels: Data Engineering, bigquery, Databricks, location US
Latest Reply
Alberto_Umana
Databricks Employee

Hi @GinoBarkley, could you please advise which DBR version you are using in your personal access mode cluster? I see these requirements: Databricks clusters must use Databricks Runtime 16.1 or above and shared or single-user access mode. SQL warehouses must ...

hari-prasad
by Valued Contributor II
  • 334 Views
  • 6 replies
  • 2 kudos

'from_json' spark function not parsing value column from Confluent Kafka topic

To complete one of the badges, it was mandatory to finish a Spark Streaming demo practice. Due to the absence of a Kafka broker setup required for the demo practice, I configured a Confluent Kafka cluster and made several modifications to the Spark sc...

Latest Reply
saurabh18cs
Valued Contributor III

I am not sure if I read the full explanation, but how about this:

df
    .withColumn('value_str', F.decode(F.col('value'), 'utf-8'))
    .withColumn('value_json', F.explode(F.from_json(F.col('value_str'), json_schema)))
    .select('value_json.*'...

5 More Replies
amarnadh-gadde
by New Contributor II
  • 162 Views
  • 5 replies
  • 0 kudos

Default catalog created wrong on my workspace

We have provisioned a new Databricks account and workspace on the premium plan. When we built out the workspace using Terraform, we expected to see a default catalog matching the workspace name, as per this documentation. However, I don't see it. All I see are the 3 c...

(screenshots attached)
Latest Reply
amarnadh-gadde
New Contributor II

@saurabh18cs Ours is a new account and, as shown in the screenshots above, for new accounts the Databricks documentation suggests a new catalog will be created matching the workspace name. However, that never happened in our account, and the default ca...

4 More Replies
Balram-snaplogi
by New Contributor II
  • 157 Views
  • 2 replies
  • 0 kudos

Not able to run jobs using M2M authentication from our code

Hi, I am using OAuth machine-to-machine (M2M) authentication with the JDBC approach:

String url = "jdbc:databricks://<server-hostname>:443";
Properties p = new java.util.Properties();
p.put("httpPath", "<http-path>");
p.put("AuthMech", "11");
p.put("Auth_F...

Latest Reply
szymon_dybczak
Esteemed Contributor III

Hi @Balram-snaplogi, it looks like a permissions problem. Could you check if the service principal has the necessary permissions to execute jobs? In Databricks, permissions for jobs can be managed to control access. The following permissions are availabl...
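If the service principal is indeed missing job permissions, they can be granted through the Permissions API. A hedged sketch of the request body (the application-ID placeholder is hypothetical; CAN_MANAGE_RUN is typically the level needed to trigger runs):

```python
import json

# Hedged sketch: granting a service principal the right to run a job via
# PATCH /api/2.0/permissions/jobs/{job_id}. The application ID is a
# placeholder, not a real value.
payload = {
    "access_control_list": [
        {
            "service_principal_name": "<service-principal-application-id>",
            "permission_level": "CAN_MANAGE_RUN",
        }
    ]
}
print(json.dumps(payload, indent=2))

# e.g. (not executed here):
# requests.patch(f"{host}/api/2.0/permissions/jobs/{job_id}",
#                headers={"Authorization": f"Bearer {token}"}, json=payload)
```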

1 More Replies
hedbergjacob
by New Contributor II
  • 192 Views
  • 2 replies
  • 0 kudos

Resolved! Delta Live Table "Default Schema" mandatory but not editable

Hi, we have an issue with a DLT pipeline. We want to add some source code to an existing pipeline. However, when we save, an error message says that "Default schema" is a mandatory field, yet we are not able to edit the field. The DLT pipeline does...

Labels: Data Engineering, deltalivetables
Latest Reply
NandiniN
Databricks Employee

Does the pipeline settings JSON include the "schema" field? If you have full admin rights, you can update the existing pipeline settings to include the "schema" field, like: curl -X PATCH https://<databricks-instance>/api/2.0/pipelines/<pipeline-id>...
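The curl command in the reply is truncated; a hedged sketch of what such a call might look like, with placeholder values throughout and an invented schema name:

```python
import json

# Hedged sketch of patching a DLT pipeline to add a "schema" field, in
# the spirit of the truncated curl above. All values are placeholders.
payload = {"schema": "my_default_schema"}  # hypothetical schema name
body = json.dumps(payload)

# curl equivalent (not executed here):
# curl -X PATCH https://<databricks-instance>/api/2.0/pipelines/<pipeline-id> \
#   -H "Authorization: Bearer <token>" \
#   -H "Content-Type: application/json" \
#   -d '{"schema": "my_default_schema"}'
print(body)
```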

1 More Replies
simha08
by New Contributor II
  • 285 Views
  • 2 replies
  • 0 kudos

Unable to Read Collection/Files from MongoDB using Azure Databricks

Hi there, can someone help me read data from MongoDB using Azure Databricks? Surprisingly, I am able to connect from a Jupyter Notebook and read data, but not from Azure Databricks. 1) I have installed the required spark-connector packages in the clust...

Latest Reply
simha08
New Contributor II

I am using the following code to read the data from MongoDB using Databricks:

from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("myApp") \
    .config("spark.mongodb.connection.uri", "mongodb+srv://username:password@cluster.xxxx.mongo...

1 More Replies
RamanBajetha
by New Contributor II
  • 152 Views
  • 2 replies
  • 1 kudos

Issue with Generic DLT Pipeline Handling Multiple BUs

We are implementing a data ingestion framework where data flows from a foreign catalog (source) to a raw layer (Delta tables) and then to a bronze layer (DLT streaming tables). Currently, each Business Unit (BU) has a separate workflow and DLT pipeli...

Latest Reply
NandiniN
Databricks Employee

You can create separate schemas within the same catalog for each BU. For example, you can have schemas like BU1_schema, BU2_schema, etc., within the same catalog. By using Unity Catalog, you can segregate BU-specific tables within the same DLT pipeli...
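The schema-per-BU layout above amounts to a naming convention inside one catalog. A tiny sketch; catalog, schema, and table names here are examples only:

```python
# Example of the one-catalog, schema-per-BU naming convention suggested
# above. All names are invented for illustration.
def qualified_name(catalog: str, bu: str, table: str) -> str:
    return f"{catalog}.{bu.lower()}_schema.{table}"

targets = [qualified_name("main", bu, "bronze_orders") for bu in ("BU1", "BU2")]
print(targets)  # ['main.bu1_schema.bronze_orders', 'main.bu2_schema.bronze_orders']
```

A generic DLT pipeline can then be parameterized by the BU name alone, writing each BU's streaming tables into its own schema.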

1 More Replies
desertstorm
by New Contributor II
  • 3850 Views
  • 8 replies
  • 0 kudos

Driver Crash on processing large dataframe

I have a dataframe with about 2 million text rows (1 GB). I partition it into about 700 partitions, as that's the number of cores available on my cluster executors. I run the transformations extracting medical information and then write the results in parquet...

Latest Reply
Isi
New Contributor III

Hey @Svish, your problem is probably caused by using pandas. Pandas loads all the data into driver memory, which is likely why you are experiencing issues. If you can modify your code to use Spark instead, you will probably avoid this problem. How...
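The driver-memory point can be illustrated even without Spark: pandas-style code materializes every row at once, while Spark-style per-partition processing keeps memory bounded. A toy sketch with arbitrary sizes:

```python
# Toy illustration of the driver-memory point: materializing all rows at
# once (pandas-style) vs. streaming bounded chunks (Spark-style
# per-partition processing). Sizes are arbitrary.
def rows(n=100_000):
    for i in range(n):
        yield f"record-{i}"

# pandas-style: the whole dataset lives in memory at once
all_rows = list(rows())

# spark-style: only one chunk is held at a time
def count_in_chunks(it, size=10_000):
    total, chunk = 0, []
    for r in it:
        chunk.append(r)
        if len(chunk) == size:
            total += len(chunk)
            chunk.clear()
    return total + len(chunk)

print(count_in_chunks(rows()))  # 100000
```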

7 More Replies
developer321
by New Contributor II
  • 203 Views
  • 2 replies
  • 0 kudos

Getting "NoSuchMethodError" while using DBR 15.4 LTS and Spark 3.5

Hi, I am using Databricks Runtime 15.4 and Spark 3.5 and getting a "NoSuchMethodError". In all the resources I found, the only solution is to downgrade the Spark and Databricks versions. Is there any solution apart from this, as I can't do that in my case? Regar...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @developer321, are you upgrading any of the default libraries of DBR 15.4 LTS? Please share more details on your use case and the commands/settings being used.

1 More Replies
cool_cool_cool
by New Contributor II
  • 120 Views
  • 1 reply
  • 0 kudos

Job Stuck with single user access mode

Heya! So I'm working on a new workflow. I started by writing a notebook and running it on an interactive cluster with "Single User" access mode, and everything worked fine. I created a workflow for this task with the same interactive cluster, and ev...

Latest Reply
Isi
New Contributor III

Hey! You cannot access an instance profile (IAM role) in "Shared" access mode, so discard this option if your job relies on AWS credentials via an instance profile. If your workflow depends on accessing S3 or other AWS resources using an IAM role, you must u...

