Hello, I cloned a repo my_repo into the Databricks Repos space. Inside my_repo, I created a notebook new_experiment where I can import functions from my_repo, which is really handy. When I want to modify a function in my_repo, I open my local IDE, do the...
Hi All, could you please suggest the best way to write PySpark code in Databricks? I don't want to write my code in a Databricks notebook, but rather create Python files (a modular project) in VSCode and call only the primary function in the notebook (the res...
Certainly! To write PySpark code in Databricks while maintaining a modular project in VSCode, you can organize your PySpark code into Python files in VSCode, with a primary function encapsulating the main logic. Then, upload these files to Databricks...
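For illustration, a minimal sketch of that pattern, assuming a repo my_repo containing a module src/main.py with a primary run() function (all names here are hypothetical):

# notebook cell inside my_repo; the repo root is already on sys.path for notebooks in Repos
from src.main import run

# the notebook only calls the primary function; all logic lives in the VSCode project
result_df = run(spark)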
When I use the following code:

df
  .coalesce(1)
  .write.format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/path/mydata.csv")

it writes several files, and when used with .mode("overwrite"), it will overwrite everything in th...
Hi Daniel, may I know how you fixed this issue? I am facing a similar issue while writing csv/parquet to blob/ADLS: it creates a separate folder with the filename and creates a partition file within that folder. I need to write just a file on to the b...
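For anyone landing here, a hedged sketch of one common workaround: write to a temporary directory with coalesce(1), then move the single part file to the desired name (the paths below are hypothetical):

tmp_dir = "/mnt/cntnr/tmp_csv_out"          # hypothetical scratch directory
final_path = "/mnt/cntnr/demo/mydata.csv"   # hypothetical target file

(df.coalesce(1)
   .write.mode("overwrite")
   .option("header", "true")
   .csv(tmp_dir))

# Spark always writes a directory of part files; pick out the single part file and rename it
part_file = [f.path for f in dbutils.fs.ls(tmp_dir) if f.name.startswith("part-")][0]
dbutils.fs.mv(part_file, final_path)
dbutils.fs.rm(tmp_dir, True)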
Has anyone found a nice way to run code formatting (like black) on the notebooks **in the workspace**? My current workflow is to commit the file, pull it locally, format, re-push and pull. It would be nice if there was some relatively easy way to run blac...
Hi Erik, I don't know if you are aware of this feature: there is currently an option to format the code in your Databricks notebooks using the black code style formatter. You just need to have a DBR version equal to or greater than 11.2...
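For completeness: on runtimes below 11.2, the formatter's dependencies reportedly need installing first; the pinned versions below are an assumption taken from the docs, not verified here:

%pip install black==22.3.0 tokenize-rt==4.2.1

After that, the Format cell(s) action in the notebook UI should pick them up.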
Hello,

forecast_date = '2017-12-01'
spark.conf.set('spark.sql.shuffle.partitions', 500)

# generate forecast for this data
forecasts = (
  history
    .where(history.date < forecast_date)  # limit training data to prior to our forecast date
    .groupBy...
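The truncated groupBy call in posts like this typically feeds into applyInPandas for per-group forecasting; a hedged sketch of that shape, with a hypothetical forecast_store_item function and result schema:

# hypothetical pandas function applied independently to each store/item group
def forecast_store_item(pdf):
    # fit a model on this group's history and return its forecast as a pandas DataFrame
    return pdf  # placeholder for the real forecasting logic

forecasts = (
  history
    .where(history.date < forecast_date)
    .groupBy('store', 'item')
    .applyInPandas(forecast_store_item,
                   schema='store INT, item INT, date DATE, forecast DOUBLE')
)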
@Mr_K applyInPandas is a higher-order function in Python. As of now, we do not support higher-order functions in Unity Catalog. We do support direct calls made to Python UDFs. Here is an example of how to reference UDFs in UC - https://docs.databrick...
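For reference, a minimal sketch of a Python UDF registered directly in Unity Catalog and then called from SQL (the catalog, schema, and function names are hypothetical):

spark.sql("""
CREATE OR REPLACE FUNCTION main.default.double_it(x INT)
RETURNS INT
LANGUAGE PYTHON
AS $$
  return x * 2
$$
""")

spark.sql("SELECT main.default.double_it(21)").show()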
I need help understanding why I can't open a file. In a Databricks notebook, I use this code:

%fs
ls /mnt/cntnr/demo

I get back dbfs:/mnt/cntnr/demo/circuits.csv as one of the path values. When I use this code, I get an error:

circuits_df = spark.read....
It turns out my Spark config was wrong:

# Set Spark configuration
configs = {
  "fs.azure.account.auth.type": "OAuth",
  "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azu...
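For comparison, the standard ADLS Gen2 OAuth mount configuration looks roughly like this (application id, secret scope/key, tenant, container, and account values are placeholders):

configs = {
  "fs.azure.account.auth.type": "OAuth",
  "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id": "<application-id>",
  "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<key>"),
  "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
}

dbutils.fs.mount(
  source = "abfss://<container>@<account>.dfs.core.windows.net/",
  mount_point = "/mnt/cntnr",
  extra_configs = configs)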
Hello, I'm trying to connect to our Databricks instance using the VSCode extension. However, when following this guide we cannot get the configuration to proceed past the point where it asks for our instance URL. The prompt appears to expect a URL of t...
Hello, yes, the Databricks team shared a modified version of the VS Code plugin which did not include the URL-matching logic. It connects successfully. However, our custom URL is as it is because our organisation is hosting its own instance of Databri...
I have a scheduled job that is executed using a notebook. Within one of the notebook cells, there is a check to determine if a table exists. However, even when the table does exist, it incorrectly identifies it as non-existent and proceeds to execut...
Hi @Mahesh Chahare, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer. Thanks.
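For what it's worth, a hedged sketch of an existence check that avoids parsing SHOW TABLES output (the table name is hypothetical; spark.catalog.tableExists requires Spark 3.3+):

table_name = "my_db.my_table"  # hypothetical

if spark.catalog.tableExists(table_name):
    print(f"{table_name} exists, skipping creation")
else:
    print(f"{table_name} not found, creating it")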
I have the following code:

from pyspark.sql.functions import *
!pip install dbl-tempo
from tempo import TSDF

# interpolate target_cols column linearly for tsdf dataframe
def interpolate_tsdf(tsdf_data, target_c...
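For context, a hedged sketch of how dbl-tempo interpolation is typically invoked (the column names and exact keyword arguments are assumptions based on the tempo README, not verified here):

tsdf = TSDF(raw_df, ts_col="event_ts", partition_cols=["device_id"])  # hypothetical columns

# resample to a fixed frequency, aggregate, and fill gaps linearly
interpolated = tsdf.interpolate(
    freq="1 minute",
    func="mean",
    method="linear",
    target_cols=["temperature"])  # hypothetical target column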
Hi, below I am trying to read data from Kafka, determine whether it's fraud or not, and then write it back to MongoDB. Below is my code, read_kafka.py:

from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.sql.types i...
Hi Saswata,

Can you remove the filter and see if it is printing output to the console?

kafka_df5 = kafka_df4.filter(kafka_df4.status == "FRAUD")

Thanks and regards,
Swetha Nandajan
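To debug upstream of the filter, a minimal structured-streaming sketch that prints the raw stream to the console (bootstrap servers and topic are placeholders):

kafka_raw = (spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "<host:port>")
  .option("subscribe", "<topic>")
  .option("startingOffsets", "latest")
  .load())

# cast the binary Kafka value to a string and print it, before any fraud filtering
query = (kafka_raw
  .selectExpr("CAST(value AS STRING) AS value")
  .writeStream
  .format("console")
  .outputMode("append")
  .start())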
Hi, I tried the following code, but it seems like the cluster runs for a long period of time and then stops without any results. My code is attached below (I used the 'com.springml.spark.sftp' library and installed it via Maven). Also, I whitelisted my lo...
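For context, a hedged sketch of how the spark-sftp reader is usually wired up (host, credentials, and file path are placeholders; the option names follow the springml README):

sftp_df = (spark.read
  .format("com.springml.spark.sftp")
  .option("host", "<sftp-host>")
  .option("username", "<user>")
  .option("password", "<password>")
  .option("fileType", "csv")
  .option("header", "true")
  .load("/remote/path/data.csv"))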
I am using the below code to create and read widgets, assigning a default value:

dbutils.widgets.text("pname", "default", "parameter_name")
pname = dbutils.widgets.get("pname")

I am using this widget parameter in some SQL scripts. One example is given below...
Hi @Ashwathy P P, which Databricks Runtime are you using? A known issue is that widget state may not be adequately cleared after pressing Run All, even after clearing or removing the widget in the code. If this happens, you will see a discrepancy be...
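If stale widget state is the culprit, one hedged workaround is to remove and recreate the widget at the top of the run:

# clear any lingering widget state before redefining
dbutils.widgets.removeAll()

dbutils.widgets.text("pname", "default", "parameter_name")
pname = dbutils.widgets.get("pname")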
When I follow the instructions in Modularize your code using files, I get the following error: I am on Azure, use DBR 12.2 LTS, and use ADLS as storage; I am happy to provide more details if needed. My research suggests that the reason is that the dbfs fuse...
I am trying to use the following code to create a Delta table:

%sql
CREATE TABLE rectangles(a INT, b INT, area INT GENERATED ALWAYS AS IDENTITY (START WITH 1, STEP BY 1))

I don't know why, but I am always getting a ParseException. I tried all other...
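The likely cause of the ParseException: the identity clause is spelled INCREMENT BY, not STEP BY (with no comma between the clauses), and Databricks identity columns must be BIGINT. A hedged sketch of the corrected statement, run via spark.sql to stay in Python:

spark.sql("""
CREATE TABLE rectangles (
  a INT,
  b INT,
  area BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1)
)
""")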