Data Engineering

Forum Posts

ipreston
by New Contributor II
  • 3493 Views
  • 6 replies
  • 0 kudos

Possible false positive warning on DLT pipeline

I have a DLT pipeline script that starts by extracting metadata on the tables it should generate from a delta table. Each record returned from the table should be a dlt table to generate, so I use .collect() to turn each row into a list and then iter...

Latest Reply
ipreston
New Contributor II
  • 0 kudos

Thanks for the reply. Based on that response though, it seems like the warning itself is a bug in the DLT implementation. Per the docs "However, you can include these functions outside of table or view function definitions because this code is run on...

  • 0 kudos
5 More Replies
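The metadata-driven pattern the poster describes (collect config rows from a Delta table once on the driver, then generate one DLT table per row) can be sketched outside a pipeline like this. All names here (`make_table`, `config_rows`) are illustrative, not from the thread, and the `@dlt.table` wiring is only noted in comments because it runs solely inside a DLT pipeline.

```python
# Hypothetical sketch of the collect-then-iterate pattern described above.

def make_table(name, source):
    """Build one generator function per config row. A closure captures the
    row's values so Python's late binding doesn't reuse the last row when
    functions are defined in a loop."""
    def generate():
        # In a real DLT pipeline this body would read `source` into a
        # DataFrame, and `generate` would be wrapped with @dlt.table(name=name).
        return {"name": name, "source": source}
    return generate

# Stand-in for rows collected from the metadata Delta table via .collect().
config_rows = [
    {"name": "bronze_orders", "source": "raw.orders"},
    {"name": "bronze_items", "source": "raw.items"},
]

tables = {row["name"]: make_table(row["name"], row["source"]) for row in config_rows}
```

The closure-per-row shape is the important part: without it, every generated table function would see only the final row of the loop.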
fix_databricks
by Visitor
  • 10 Views
  • 0 replies
  • 0 kudos

Cannot run another notebook from same directory

Hello, I am having a similar problem from this thread which was never resolved: https://community.databricks.com/t5/data-engineering/unexpected-error-while-calling-notebook-string-matching-regex-w/td-p/18691 I renamed a notebook (utility_data_wrangli...

Hubert-Dudek
by Esteemed Contributor III
  • 28 Views
  • 0 replies
  • 0 kudos

RocksDB for storing state stream

Now, you can keep the state of stateful streaming in RocksDB. For example, retrieving keys from memory to check for duplicate records inside the watermark is now faster. #databricks

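For reference, switching the streaming state store to RocksDB is a one-line Spark configuration. This is a sketch only: the conf key and provider class below are the ones documented for Databricks runtimes, but verify them against your runtime version before relying on this.

```python
# Assumed conf key and provider class (check your Databricks runtime docs).
ROCKSDB_STATE_STORE_CONF = {
    "spark.sql.streaming.stateStore.providerClass":
        "com.databricks.sql.streaming.state.RocksDBStateStoreProvider",
}

def enable_rocksdb_state_store(spark):
    """Apply the state-store setting to an existing SparkSession,
    e.g. at the top of a notebook before starting the stream."""
    for key, value in ROCKSDB_STATE_STORE_CONF.items():
        spark.conf.set(key, value)
```

With this provider, stateful operators such as watermark-based deduplication keep their state in RocksDB instead of JVM memory.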
israelst
by New Contributor II
  • 480 Views
  • 7 replies
  • 5 kudos

DLT can't authenticate with kinesis using instance profile

When running my notebook on personal compute with an instance profile, I am indeed able to readStream from Kinesis. But adding it as a DLT pipeline with UC, while specifying the same instance profile in the DLT pipeline settings, causes a "MissingAuthenticatio...

Data Engineering
Delta Live Tables
Unity Catalog
Latest Reply
Mathias_Peters
New Contributor III
  • 5 kudos

We have used the roleArn and role session name like this: CREATE STREAMING TABLE table_name as SELECT * FROM STREAM read_kinesis ( streamName => 'stream', initialPosition => 'earliest', roleArn => 'arn:aws:iam::ACCT_ID:role/R...

  • 5 kudos
6 More Replies
jenshumrich
by New Contributor III
  • 28 Views
  • 0 replies
  • 0 kudos

R install - cannot open URL

Neither the standard nor a non-standard repo seems available. Any idea how to debug/fix this? %r install.packages("gghighlight", lib="/databricks/spark/R/lib", repos = "http://cran.us.r-project.org") Warning: unable to access index for repository http://cra...

NataliaCh
by Visitor
  • 31 Views
  • 0 replies
  • 0 kudos

Delta table cannot be reached with INTERNAL_ERROR

Hi all! I've been dropping and recreating delta tables at the new location. For one table something went wrong and now I can neither DROP nor recreate it. It is visible in the catalog; however, when I click on the table I see the message: [INTERNAL_ERROR] The ...

madhumitha
by New Contributor
  • 126 Views
  • 5 replies
  • 0 kudos

Connect power bi desktop semantic model output to databricks

Hello, I am trying to connect the Power BI semantic model output (basically the data that has already been pre-processed) to Databricks. Does anybody know how to do this? I would like it to be an automated process, so I would like to know any way to p...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @madhumitha, Connecting Power BI semantic model output to Databricks can be done in a few steps. Here are a couple of options: Databricks Power Query Connector: The new Databricks connector is natively integrated into Power BI. You can configu...

  • 0 kudos
4 More Replies
ashraf1395
by New Contributor
  • 45 Views
  • 1 reply
  • 0 kudos

How to extend free trial period or enter free startup tier to complete our POC for a client.

We are a data consultancy. Our free trial period is about to end, and we are still doing a POC for one of our potential clients, focusing on providing expert services around Databricks. 1. Is there a possibility that we can extend the free t...

Latest Reply
Mo
Contributor III
  • 0 kudos

Hey @ashraf1395, I suggest you contact your Databricks representative or account manager.

  • 0 kudos
Mohit_m
by Valued Contributor II
  • 14424 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks job running with a main class and a JAR file. Our JAR code base is in Scala. Now, when our job starts running, we need to log the Job ID and Run ID into the database for future reference. How can we achieve this?

Latest Reply
Bruno-Castro
  • 4 kudos

That article is for members only; can we also specify here how to do it (for those who are not Medium members)? Thanks!

  • 4 kudos
2 More Replies
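One non-paywalled approach: inside a job, the notebook context JSON carries the job and run identifiers as tags, which can be parsed out and written wherever needed. This is a Python sketch of the parsing step only; the exact `"tags"` layout is an assumption based on what the context typically contains, so inspect the `toJson()` output in your own workspace (and adapt to Scala for a JAR job) before relying on it.

```python
import json

def extract_job_run_ids(context_json: str):
    """Pull jobId/runId out of a Databricks notebook-context JSON string.
    Assumed layout: {"tags": {"jobId": ..., "runId": ...}} -- verify in
    your workspace, as the tag names are not guaranteed here."""
    tags = json.loads(context_json).get("tags", {})
    return tags.get("jobId"), tags.get("runId")

# Inside a Databricks notebook the JSON is obtained roughly like:
# ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext().toJson()
sample = '{"tags": {"jobId": "123", "runId": "456"}}'
job_id, run_id = extract_job_run_ids(sample)
```

The returned IDs can then be inserted into the target database with an ordinary JDBC or Spark write.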
SreeG
by New Contributor II
  • 295 Views
  • 3 replies
  • 0 kudos

CICD for Work Flows

Hi, I am facing issues when deploying workflows to a different environment. The same works for notebooks and scripts, but when deploying the workflows, it failed with "Authorization Failed. Your token may be expired or lack the valid scope". Anything shoul...

Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

@SreeG thanks for confirming!

  • 0 kudos
2 More Replies
MarkD
by New Contributor
  • 53 Views
  • 1 reply
  • 0 kudos

Is it possible to migrate data from one DLT pipeline to another?

Hi,We have a DLT pipeline that has been running for a while with a Hive Metastore target that has stored billions of records. We'd like to move the data to a Unity Catalog. The documentation says "Existing pipelines that use the Hive metastore cannot...

Data Engineering
Delta Live Tables
dlt
Unity Catalog
Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

@MarkD good day! I'm sorry, but according to the description, existing pipelines using the Hive metastore cannot be upgraded to use Unity Catalog. To migrate an existing pipeline that writes to Hive metastore, you must create a new pipeline and re-in...

  • 0 kudos
TheDataDexter
by New Contributor III
  • 2127 Views
  • 4 replies
  • 3 kudos

Resolved! Single-Node cluster works but Multi-Node clusters do not read data.

I am currently working with a VNET-injected Databricks workspace. At the moment I have mounted the Databricks cluster on an ADLS Gen2 resource. When running notebooks on a single node that read, transform, and write data we do not encounter any probl...

Latest Reply
ellafj
Visitor
  • 3 kudos

@TheDataDexter Did you find a solution to your problem? I am facing the same issue

  • 3 kudos
3 More Replies
Ameshj
by New Contributor II
  • 351 Views
  • 8 replies
  • 0 kudos

Dbfs init script migration

I need help with migrating from DBFS on Databricks to workspace files. I am new to Databricks and am struggling with what is on the links provided. My workspace.yml also has dbfs hard-coded. Included is a full deployment with Great Expectations. This was don...

Data Engineering
Azure Databricks
dbfs
Great expectations
python
Latest Reply
NandiniN
Valued Contributor II
  • 0 kudos

One of the other suggestions is to use Lakehouse Federation. It is possible it may be a driver issue (we will find out from the logs).

  • 0 kudos
7 More Replies
Red_blue_green
by New Contributor III
  • 2060 Views
  • 3 replies
  • 0 kudos

Databricks: Change the existing schema of columns to non-nullable for a delta table using Pyspark?

Hello,I have currently a delta folder as a table with several columns that are nullable. I want to migrate data to the table and overwrite the content using Pyspark, add several new columns and make them not nullable. I have found a way to make the c...

Latest Reply
kanjinghat
New Contributor
  • 0 kudos

Not sure if you found a solution; you can also try the below. In this case you pass the full path to the Delta table, not the table name itself: spark.sql(f"ALTER TABLE delta.`{full_delta_path}` ALTER COLUMN {column_name} SET NOT NULL")

  • 0 kudos
2 More Replies
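The path-based statement from the reply above can be wrapped in a small helper so the string construction is easy to test before running it on a cluster. The path and column names below are hypothetical examples, not values from the thread.

```python
def set_not_null_sql(delta_path: str, column: str) -> str:
    """Build an ALTER TABLE statement that addresses a Delta table by its
    storage path rather than by catalog name. Backticks around the path are
    needed so the slashed location parses as a single identifier."""
    return f"ALTER TABLE delta.`{delta_path}` ALTER COLUMN {column} SET NOT NULL"

# On a cluster you would then execute, e.g.:
# spark.sql(set_not_null_sql("/mnt/lake/events", "event_id"))
```

Note that SET NOT NULL fails if the column already contains nulls, so any backfill has to happen before altering the constraint.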