Data Engineering
Forum Posts

Mohit_m
by Valued Contributor II
  • 14367 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job that runs a main class from a JAR file; the JAR's code base is in Scala. When the job starts running, we need to log the Job ID and Run ID into a database for future reference. How can we achieve this?

Latest Reply
Bruno-Castro
  • 4 kudos

That article is for members only; could we also explain here how to do it (for those who are not Medium members)? Thanks!
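For readers without access to the article, one common approach (a sketch, not necessarily the article's method, and shown in Python for brevity even though the original job is a Scala JAR) is to pass Databricks' dynamic value references `{{job.id}}` and `{{job.run_id}}` as task parameters and read them from the program arguments at runtime; the database write is left as a placeholder.

```python
# Sketch: configure the job task with parameters such as
#   ["--job-id", "{{job.id}}", "--run-id", "{{job.run_id}}"]
# Databricks substitutes the dynamic value references when the run starts.

def parse_job_identity(argv):
    """Return (job_id, run_id) from flag-style arguments like
    ['--job-id', '123', '--run-id', '456']."""
    args = dict(zip(argv[0::2], argv[1::2]))
    return args["--job-id"], args["--run-id"]

# In the real job this would be sys.argv[1:]; sample values shown here.
job_id, run_id = parse_job_identity(["--job-id", "123", "--run-id", "456"])
# Placeholder: replace the print with your own database insert (e.g. JDBC).
print(f"job_id={job_id} run_id={run_id}")  # job_id=123 run_id=456
```

The same idea carries over to a Scala main class reading its `args` array; the parameter names above are illustrative.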

2 More Replies
SreeG
by New Contributor II
  • 293 Views
  • 3 replies
  • 0 kudos

CICD for Work Flows

Hi, I am facing issues when deploying workflows to a different environment. The same works for notebooks and scripts, but when deploying the workflows, it fails with "Authorization Failed. Your token may be expired or lack the valid scope". Anything shoul...

Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

@SreeG thanks for confirming!

2 More Replies
MarkD
by Visitor
  • 44 Views
  • 1 reply
  • 0 kudos

Is it possible to migrate data from one DLT pipeline to another?

Hi, we have a DLT pipeline that has been running for a while with a Hive metastore target and has stored billions of records. We'd like to move the data to Unity Catalog. The documentation says "Existing pipelines that use the Hive metastore cannot...

Data Engineering
Delta Live Tables
dlt
Unity Catalog
Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

@MarkD good day! I'm sorry, but according to the description, existing pipelines using the Hive metastore cannot be upgraded to use Unity Catalog. To migrate an existing pipeline that writes to Hive metastore, you must create a new pipeline and re-in...

TheDataDexter
by New Contributor III
  • 2123 Views
  • 4 replies
  • 3 kudos

Resolved! Single-Node cluster works but Multi-Node clusters do not read data.

I am currently working with a VNet-injected Databricks workspace. At the moment I have mounted an ADLS Gen2 resource on the Databricks cluster. When running notebooks on a single node that read, transform, and write data, we do not encounter any probl...

Latest Reply
ellafj
Visitor
  • 3 kudos

@TheDataDexter Did you find a solution to your problem? I am facing the same issue

3 More Replies
Ameshj
by New Contributor II
  • 340 Views
  • 8 replies
  • 0 kudos

DBFS init script migration

I need help with migrating from DBFS on Databricks to workspace files. I am new to Databricks and am struggling with what is on the links provided. My workspace.yml also has DBFS hard-coded. Included is a full deployment with Great Expectations. This was don...

Data Engineering
Azure Databricks
dbfs
Great expectations
python
Latest Reply
NandiniN
Valued Contributor II
  • 0 kudos

One of the other suggestions is to use Lakehouse Federation. It's also possible this is a driver issue (we'll know more from the logs).

7 More Replies
Red_blue_green
by New Contributor III
  • 2040 Views
  • 3 replies
  • 0 kudos

Databricks: Change the existing schema of columns to non-nullable for a delta table using Pyspark?

Hello, I currently have a Delta folder as a table with several columns that are nullable. I want to migrate data to the table and overwrite the content using PySpark, add several new columns, and make them non-nullable. I have found a way to make the c...

Latest Reply
kanjinghat
Visitor
  • 0 kudos

Not sure if you found a solution; you can also try the below. In this case you pass the full path to the Delta table rather than the table name: spark.sql(f"ALTER TABLE delta.`{full_delta_path}` ALTER COLUMN {column_name} SET NOT NULL")
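A small runnable sketch of building that statement (the helper name and sample path are illustrative; note that SET NOT NULL only succeeds if the column contains no existing nulls):

```python
def set_not_null_sql(delta_path: str, column_name: str) -> str:
    """Build an ALTER TABLE statement for a Delta table addressed by path,
    marking the given column as non-nullable."""
    return (
        f"ALTER TABLE delta.`{delta_path}` "
        f"ALTER COLUMN {column_name} SET NOT NULL"
    )

stmt = set_not_null_sql("/mnt/lake/my_table", "id")
print(stmt)
# On Databricks you would then execute it with: spark.sql(stmt)
```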

2 More Replies
venkata_kishore
by Visitor
  • 80 Views
  • 1 reply
  • 1 kudos

delta live tables - oracle connectivity

Do Delta Live Tables pipelines support Oracle or other external database connectivity? I am getting an "Oracle Driver not found" error, and DLT does not seem to support Maven installs through asset bundles. ERRORS: 1) py4j.protocol.Py4JJavaError: An error occurred while call...

Data Engineering
Delta Live Tables
dlt
oracle
pipelines
Latest Reply
RamGoli
New Contributor II
  • 1 kudos

Hi @venkata_kishore, as of now DLT does not support Oracle, and one cannot install third-party libraries and JARs: https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#limitations If Lakehouse Federation has support for Oracle, then ...

israelst
by New Contributor II
  • 456 Views
  • 5 replies
  • 4 kudos

DLT can't authenticate with kinesis using instance profile

When running my notebook on personal compute with an instance profile, I am indeed able to readStream from Kinesis. But adding it as a DLT pipeline with UC, while specifying the same instance profile in the DLT pipeline settings, causes a "MissingAuthenticatio...

Data Engineering
Delta Live Tables
Unity Catalog
Latest Reply
Babu_Krishnan
New Contributor III
  • 4 kudos

@Mathias_Peters, thanks for the details. Curious how you made the roleArn part work; we are able to make it work only by passing the access key and secret key, not with roleArn. If you are using SQL-based DLT tables, could you please share some code samp...

4 More Replies
dprutean
by New Contributor III
  • 581 Views
  • 1 reply
  • 0 kudos

JDBC Driver Error

Connecting to Databricks Unity Catalog, I hit this error: java.sql.SQLException: [Databricks][DatabricksJDBCDriver](500540) Error caught in BackgroundFetcher. Foreground thread ID: 59. Background thread ID: 61. Error caught: null. at com.databricks.cli...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dprutean, Thank you for providing the details about the error you’re encountering while connecting to the Databricks Unity Catalog using the Databricks JDBC driver. Let’s troubleshoot this step by step: Check your connection string: The conn...

AdityaM
by New Contributor II
  • 130 Views
  • 2 replies
  • 0 kudos

Creating external tables using gzipped CSV file - S3 URI without extensions

Hi Databricks community, hope you are doing well. I am trying to create an external table using a gzipped CSV file uploaded to an S3 bucket. The S3 URI of the resource doesn't have any file extension, but the content of the file is a gzipped comma-sepa...

Latest Reply
AdityaM
New Contributor II
  • 0 kudos

Hey , thanks for your response. I tried using a SerDe (I think the OpenCSVSerde should work for me), but unfortunately I'm getting the below from Unity Catalog: [UC_DATASOURCE_NOT_SUPPORTED] Data source format hive is not supported in Unity Catalog....
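As an aside on the extensionless-file part of the question: compression is a property of the bytes, not of the key name, so a reader only needs to be told the codec explicitly when it cannot infer it from a suffix. A small stdlib illustration (a hypothetical local file standing in for the S3 object):

```python
import csv
import gzip
import io
import os
import tempfile

# Write a gzipped CSV to a file with NO extension, like the S3 object.
rows = [["id", "name"], ["1", "alice"]]
buf = io.StringIO()
csv.writer(buf).writerows(rows)

path = os.path.join(tempfile.mkdtemp(), "data")  # note: no .csv.gz suffix
with gzip.open(path, "wt", newline="") as f:
    f.write(buf.getvalue())

# Reading works fine as long as we open with the right codec explicitly;
# nothing about the file name matters.
with gzip.open(path, "rt", newline="") as f:
    recovered = list(csv.reader(f))
print(recovered)  # [['id', 'name'], ['1', 'alice']]
```

On Spark/Databricks the analogue is telling the reader the codec rather than relying on the file name; whether and how the CSV reader accepts an explicit compression option is version-dependent, so check the current documentation.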

1 More Replies
KrishnaK135
by New Contributor
  • 241 Views
  • 1 reply
  • 0 kudos

Advanced Data Engineering

Just finished the final day of training. Great content and delivery!

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @KrishnaK135, That's wonderful to hear! We're thrilled that you found the content and delivery of the training at DAIS 2023 to be excellent. Your positive feedback means a lot to us! We also wanted to share some exciting news with you all. The Dat...
