Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Shivanshu_
by Contributor
  • 5383 Views
  • 4 replies
  • 3 kudos

Parallelizing function calls in Databricks

I have a use case where I have to process streaming data and create categorical tables (500 tables). I'm using concurrent thread pools to parallelize the whole process, but looking at the Spark UI, my code doesn't utilize all the workers (...

Data Engineering
parallelism
threading
threadpool executor
Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

You can use DLT, reading from many tables into one.

3 More Replies
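As a general pattern (not the poster's exact code), driver-side parallelism with a thread pool looks like the sketch below. `process_table` and the table list are placeholders; on a real cluster each call would trigger a Spark action, and actions submitted from different Python threads become separate concurrent Spark jobs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_table(table_name):
    # Placeholder for the real per-table work. In Databricks this would be a
    # Spark action, e.g. filtering the stream and writing one categorical table.
    return f"processed {table_name}"

# Hypothetical list standing in for the ~500 categorical tables.
table_names = [f"category_{i}" for i in range(500)]

results = []
# Keep the pool modest: threads beyond the cluster's scheduling capacity only
# queue jobs, which is one reason the Spark UI can show idle workers.
with ThreadPoolExecutor(max_workers=16) as pool:
    futures = [pool.submit(process_table, name) for name in table_names]
    for future in as_completed(futures):
        results.append(future.result())

print(len(results))  # 500
```

Worker utilization is ultimately bounded by how the cluster schedules the submitted jobs (and each job's partition count), not by the number of driver-side threads.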
Fnazar
by New Contributor II
  • 2475 Views
  • 3 replies
  • 0 kudos

Streaming live table

I am trying to create a streaming live table using the below syntax: CREATE OR REFRESH STREAMING LIVE TABLE revenue_stream AS (SELECT * FROM stream(finance_silver.finance_db.revenue)) And as I am trying to execute this notebook via a DLT pipeline, I a...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can use materialized views only in serverless.

2 More Replies
FurqanAmin
by New Contributor II
  • 1628 Views
  • 1 reply
  • 0 kudos

Spark Logs inaccessible - from the UI and dbfs (GCS)

We have a lot of jobs with spark-submit tasks; previously we were able to see the logs for the jobs. Now we are not able to see the logs in the DBX UI. We created a test job for this, 'test_job_2', in our workspace to test it out. When the job finishes ...

Latest Reply
Yeshwanth
Databricks Employee
  • 0 kudos

@FurqanAmin Could you please attach a screenshot of this?

OvZ
by New Contributor III
  • 31780 Views
  • 13 replies
  • 1 kudos

Resolved! Is it possible to disable JDBC/ODBC connections to an (Azure) Databricks cluster

Hi, I want to know if it is possible to disable JDBC/ODBC connections to an (Azure) Databricks cluster, so that no (downloaded) tools could connect this way? Thanks in advance, Oscar

Latest Reply
wdphilli
Databricks Partner
  • 1 kudos

Hi @OvZ & @LandanG, as a point of clarification, the script provided is intended to be run in a notebook first. After running the below in a notebook, it creates the init script at the location "dbfs:/databricks/init_scripts/disable_jdbc_odbc.conf" %...

12 More Replies
manohar3
by New Contributor III
  • 4818 Views
  • 2 replies
  • 0 kudos

Resolved! Spark Databricks JDBC driver integration returns rows having column names as values

Hi all, I am using the below code to query a table, but the query returns rows having column names as values:
spark.read
  .format("jdbc")
  .option("url", "jdbc:databricks://acme.cloud.databricks.com:443/myschema;transportMode=http;ssl=1;httpPath=<httppath>;Au...

Latest Reply
manohar3
New Contributor III
  • 0 kudos

This seems to be an issue with Spark; I was able to fix it by following these posts: https://stackoverflow.com/questions/47020379/bigquery-simba-jdbc-error-with-spark https://stackoverflow.com/questions/68013347/how-to-register-a-jdbc-spark-dialect-in-python...

1 More Replies
Yoni
by New Contributor
  • 17476 Views
  • 5 replies
  • 3 kudos

Resolved! MLFlow failed: You haven't configured the CLI yet

I'm getting an error: "You haven't configured the CLI yet! Please configure by entering `/databricks/python_shell/scripts/db_ipykernel_launcher.py configure`". My cluster is running Databricks Runtime Version 10.1. I've also installed mlflow to the cluster l...

Latest Reply
HemantKumar
New Contributor II
  • 3 kudos

Add dbutils.library.restartPython() after you run the pip install mlflow; it worked for me on a non-ML cluster.

4 More Replies
DumbBeaver
by New Contributor II
  • 3105 Views
  • 2 replies
  • 1 kudos

Resolved! ERROR: Writing to Unity Catalog from Remote Spark using JDBC

This is my code here:
df = spark.createDataFrame([[1,1,2]], schema=['id','first_name','last_name'])
(df.write.format("jdbc")
    .option("url", <jdbc-url>)
    .option("dbtable", "hive_metastore.default.test")
    .option("driver", "com.databricks.clien...

Latest Reply
feiyun0112
Honored Contributor
  • 1 kudos

%scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

JdbcDialects.registerDialect(new JdbcDialect() {
  override def canHandle(url: String): Boolean = url.toLowerCase.startsWith("jdbc:databricks:")
  override def quoteIde...

1 More Replies
Azure_Data_Bric
by New Contributor III
  • 4745 Views
  • 6 replies
  • 0 kudos

Historical Data Clean-up from Silver tables

Hi everyone, I need your help/suggestions. We are using a DLT framework for our ELT process; data is received from the source to the RAW layer in Parquet format. This raw data is loaded into the Bronze layer, which acts like a history table. From the BRONZ...

Latest Reply
Azure_Data_Bric
New Contributor III
  • 0 kudos

Hi, I see OPTIMIZE and VACUUM are running on all tables once per day automatically. That day, when we performed the historical deletion, we deleted the data first and then ran VACUUM with zero-hour retention. After some time, OPTIMIZE and VACUUM (wi...

5 More Replies
CloudPlatformer
by New Contributor II
  • 4129 Views
  • 1 reply
  • 0 kudos

Npip Tunnel Setup Failure

Hi everyone, I'm currently running into an issue when trying to create any type of compute cluster in a workspace (premium, with VNet injection and a private DNS zone + private endpoint). The operation always fails with: Compute terminated. Reason: Npip...

Latest Reply
CloudPlatformer
New Contributor II
  • 0 kudos

I forgot to add: the workspace as well as the other resources are hosted in Azure.

Etyr
by Contributor II
  • 2936 Views
  • 2 replies
  • 0 kudos

Cannot connect to Databricks from an Azure Machine Learning Compute Cluster.

Hello, I'm having an issue where I have: a local machine in WSL 1 (Python 3.8 and 3.10, OpenJDK 19.0.1, version "build 19.0.1+10-21"); a Compute Instance in Azure Machine Learning (Python 3.8, OpenJDK 8, version "1.8.0_392"); a Compute Cluster in Azure Machine Lear...

Latest Reply
Etyr
Contributor II
  • 0 kudos

Additional information I forgot to write: the Compute Instance has a User Managed Identity in Azure, and a Service Principal access is created in Databricks with its Application ID. Same with the compute cluster: it has its own User Managed Identity that is a...

1 More Replies
learning_1989
by New Contributor II
  • 3107 Views
  • 2 replies
  • 1 kudos

You have a JSON file which is nested with multiple key-value pairs; how do you read it in Databricks?

You have a JSON file which is nested with multiple key-value pairs; how do you read it in Databricks?

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

You should be able to read the JSON file with the below code: val df = spark.read.format("json").load("file.json") After this, you will need to use the explode function to add columns to the DataFrame from the nested values.

1 More Replies
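The reply above uses Spark's JSON reader plus explode; as a portable illustration of the same flattening idea, here is a minimal pure-Python sketch (the sample record is invented, and in Databricks the equivalent would be spark.read.json followed by selecting/exploding the nested fields):

```python
import json

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dot-separated column names,
    mirroring what nested-field selection achieves on a Spark DataFrame."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat

raw = '{"id": 1, "name": {"first": "Ada", "last": "Lovelace"}}'
row = flatten(json.loads(raw))
print(row)  # {'id': 1, 'name.first': 'Ada', 'name.last': 'Lovelace'}
```

For arrays nested inside the JSON, Spark's explode is the piece this sketch does not cover: it turns each array element into its own row.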
RKNutalapati
by Valued Contributor
  • 4697 Views
  • 3 replies
  • 0 kudos

How to use Oracle Wallet to connect from Databricks

How to connect Databricks to Oracle DAS / Autonomous Database using a cloud wallet? What are the typical steps and best practices to follow? I'd appreciate an example code snippet for connecting to the above data source.

Latest Reply
RKNutalapati
Valued Contributor
  • 0 kudos

Followed the below steps to build the connection: unzip the Oracle Wallet objects and copy them to a secure location accessible by your Databricks workspace; collaborate with your network team and Oracle Autonomous Instance admins to open firewalls between yo...

2 More Replies
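The steps above typically end in a Spark JDBC read. The sketch below shows the shape of the connection options; the wallet path, service name, and credentials are placeholders, and the TNS_ADMIN-in-URL form assumes a recent Oracle thin driver that supports it:

```python
# Hypothetical connection settings -- adjust the wallet path and service name.
wallet_dir = "/dbfs/FileStore/oracle_wallet"  # unzipped wallet location
jdbc_url = f"jdbc:oracle:thin:@my_service_high?TNS_ADMIN={wallet_dir}"

oracle_options = {
    "url": jdbc_url,
    "dbtable": "MY_SCHEMA.MY_TABLE",
    "user": "db_user",          # or pull from a Databricks secret scope
    "password": "db_password",  # never hard-code real credentials
    "driver": "oracle.jdbc.driver.OracleDriver",
}

# In a Databricks notebook this dict would feed a JDBC read, e.g.:
#   df = spark.read.format("jdbc").options(**oracle_options).load()
print(sorted(oracle_options))
```

Storing the user and password in a secret scope (dbutils.secrets.get) rather than in the notebook is the usual practice.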
Snentley
by New Contributor II
  • 2045 Views
  • 1 reply
  • 0 kudos

Free Voucher for Data Engineering Associate Certification

Could you please inform me which specific webinar participation might grant eligibility for a certification exam voucher? Additionally, I would like to know whether this voucher would cover the full cost of the certification exam or only a partial am...

Latest Reply
Kiv9
New Contributor II
  • 0 kudos

Did you get any response on this?

Phani1
by Databricks MVP
  • 1047 Views
  • 1 reply
  • 0 kudos

Databricks masking

Should we convert the Python-based masking logic to SQL in Databricks for implementing masking? Will the masking feature continue to work while connected to Power BI? Regards, Phanindra

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Phani1 - could you please be more precise about the question? Are you asking about the mask function in DBSQL?
