Data Engineering

Forum Posts

by qyu • New Contributor II

02-11-2022 10:21:13 AM

12003 Views
3 replies
3 kudos

Resolved! Need help with this Python import error.

I am using Databricks Runtime 9.1 LTS ML and got an error when I tried to import the scikit-learn package. The error message was: TypeError Traceback (most recent call last) <command-181041> in <module> ...

Latest Reply

qyu
New Contributor II

02-14-2022 3:43:17 PM

3 kudos

@Atanu Sarkar I am using Databricks Runtime 9.1 LTS ML, and the Python version is 3.8.10. I am only running these import statements:
from sklearn.metrics import *
from sklearn.preprocessing import LabelEncoder

by danielveraec • New Contributor III

02-14-2022 7:55:08 AM

12358 Views
3 replies
1 kudos

Resolved! Error writing a partitioned Delta table from a multitasking job in Azure Databricks

I have a notebook that writes a Delta table with a statement similar to the following:
match = "current.country = updates.country and current.process_date = updates.process_date"
deltaTable = DeltaTable.forPath(spark, silver_path)
deltaTable.alias("cu...

Latest Reply

danielveraec
New Contributor III

02-16-2022 11:20:26 AM

1 kudos

Initially, the affected table was partitioned only by a date field, so I repartitioned it by country and date. The new partitioning created the country and date directories; however, the old date-partition directories remained and were not de...

by sudhanshu1 • New Contributor III

02-16-2022 2:33:02 PM

9831 Views
1 reply
0 kudos

Query to list all table and column names in Delta Lake

Hi all, does anyone know how to write a simple SQL query to get all table and column names? In Oracle we run select * from all_tab_columns. Similarly, in SQL Server we run select * from information_schema.columns. Do we have something like this in dat...

Latest Reply

AmanSehgal
Honored Contributor III

02-16-2022 3:05:45 PM

0 kudos

To view the columns in a table, use SHOW COLUMNS.
%sql
show columns in <schema_name>.<table_name>
To show all the tables in a schema, use the following PySpark code:
%python
schema_name = "default"
tbl_columns = {}
# Get all tables in a schema
tables = spar...
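
The PySpark snippet in this reply is cut off, so the following is only a minimal sketch of the same idea, not the poster's code. It assumes an active `spark` session (as Databricks notebooks provide) and uses the catalog calls `spark.catalog.listTables` and `spark.catalog.listColumns`; the helper name `collect_table_columns` is hypothetical.

```python
def collect_table_columns(spark, schema_name="default"):
    """Return {table_name: [column_name, ...]} for every table in a schema.

    `spark` is assumed to be an active SparkSession (predefined in
    Databricks notebooks). The function name is hypothetical.
    """
    tbl_columns = {}
    # Get all tables in the schema via the catalog API.
    for table in spark.catalog.listTables(schema_name):
        # Temporary views are not stored in the schema, so skip them.
        if table.isTemporary:
            continue
        # List the columns of each table and keep only their names.
        columns = spark.catalog.listColumns(table.name, schema_name)
        tbl_columns[table.name] = [column.name for column in columns]
    return tbl_columns
```

In a notebook, `collect_table_columns(spark, "default")` would return a dictionary mapping each table name in the `default` schema to its list of column names.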

by Jreco • Contributor

02-01-2022 3:01:03 AM

8416 Views
6 replies
4 kudos

Resolved! Messages from Event Hub do not flow after a time

Hi Team, I'm trying to build a real-time solution using Databricks and Event Hubs. Something weird happens some time after the process starts. At the beginning the messages flow through the process as expected with this rate: Please note that the last ...

Latest Reply

Jreco
Contributor

02-01-2022 7:18:38 AM

4 kudos

Thanks for your answer, @Hubert Dudek. It is already specified. What do you mean by this? This is the weird part, because the data flows fine, but at some point it is as if the job stops reading or something like that, and if I restart the ...

by wpenfold • New Contributor II

02-10-2022 1:45:32 PM

34235 Views
5 replies
2 kudos

Resolved! Is there a way I can tell when a Notebook was last run, so I can identify and delete Notebooks that are no longer being used?

Latest Reply

AmanSehgal
Honored Contributor III

02-10-2022 4:39:16 PM

2 kudos

Using the Workspace API, you can list all the notebooks for a given user. The API response tells you whether each object under the path is a folder or a notebook. If it's a folder, you can add it to the path and get the notebooks within that folder. Put a...
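
The folder-by-folder walk described in this reply can be sketched as follows. This is an illustration under assumptions, not the poster's code: it calls the Workspace API endpoint `GET /api/2.0/workspace/list` with a bearer token, the helper names `list_workspace` and `find_notebooks` are hypothetical, and the host, token, and user path in the usage note are placeholders. It only lists notebooks; it does not report when they were last run.

```python
import json
import urllib.parse
import urllib.request


def list_workspace(host, token, path):
    """Call GET /api/2.0/workspace/list and return the listed objects."""
    query = urllib.parse.urlencode({"path": path})
    request = urllib.request.Request(
        f"{host}/api/2.0/workspace/list?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        # An empty folder returns a body without an "objects" key.
        return json.load(response).get("objects", [])


def find_notebooks(path, list_fn):
    """Recursively collect notebook paths under `path`.

    `list_fn(path)` must return a list of dicts with "path" and
    "object_type" keys, as the workspace list endpoint does.
    """
    notebooks = []
    for obj in list_fn(path):
        if obj["object_type"] == "NOTEBOOK":
            notebooks.append(obj["path"])
        elif obj["object_type"] == "DIRECTORY":
            # A folder is walked by listing it in turn.
            notebooks.extend(find_notebooks(obj["path"], list_fn))
    return notebooks
```

With a workspace URL in `HOST` and a personal access token in `TOKEN` (both placeholders), the call would look like `find_notebooks("/Users/someone@example.com", lambda p: list_workspace(HOST, TOKEN, p))`. Passing the listing function as a parameter keeps the traversal testable without network access.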

by Juniper_AIML • New Contributor

02-15-2022 8:50:49 AM

4082 Views
1 reply
0 kudos

How to set up an instance profile for initializing a Databricks cluster using Docker?

I was trying to start a Databricks cluster from a Docker image. I followed the setup instructions, excluding the additional step to set up the IAM role and instance profile, as I was facing issues with it. The image is stored on AWS ECR in a public repo...

Latest Reply

Anonymous
Not applicable

02-16-2022 8:44:24 AM

0 kudos

Hello again, @Aman Gaurav. Thanks for your second question. As before, we'll see what the community has to say first.

by fff_ds • New Contributor

02-15-2022 1:05:44 PM

1778 Views
1 reply
1 kudos

Manual overwrite in the S3 console of a collection of Parquet files and now we can't read them.

org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 26.0 failed 4 times, most recent failure: Lost task 19.3 in stage 26.0 (TID 4205, 10.66.225.154, executor 0): com.databricks.sql.io.FileReadException: Error while rea...

Latest Reply

Anonymous
Not applicable

02-16-2022 8:24:05 AM

1 kudos

Hello, @Lili Ehrlich. Welcome! My name is Piper, and I'm a moderator for Databricks. Thank you for bringing your question to us. Let's give it a while for the community to respond first. Thanks in advance for your patience.

by Mihai1 • New Contributor III

02-10-2022 3:03:15 AM

3940 Views
3 replies
4 kudos

Resolved! MLflow Model Serving on Azure Databricks General Availability

When is MLflow Model Serving on Azure Databricks expected to become Generally Available?

Latest Reply

User16764241763
Databricks Employee

02-15-2022 12:50:26 PM

4 kudos

Hello Mihai, we plan to make Model Serving generally available by the end of this year, as we are working on a lot of improvements.

by omran • New Contributor II

01-10-2022 11:10:44 AM

4501 Views
2 replies
2 kudos

Resolved! What is the procedure for using the IPOPT solver in Azure Databricks?

I have installed version 3.11.1 of the ipopt solver in Azure Databricks, but running the code throws an error:
WARNING: Could not locate the 'ipopt' executable, which is required for solver ipopt
ApplicationError: No executable found for solv...

Latest Reply

User16764241763
Databricks Employee

02-15-2022 12:59:24 PM

2 kudos

Hello @omran shaik, in the past we have recommended that customers use Docker containers with Databricks, as some of these solvers required native compilation and did not work well on the runtimes. With Databricks Container Services (DCS) you have full control of what you want to in...

by Anonymous • Not applicable

02-15-2022 9:36:28 AM

1310 Views
0 replies
1 kudos

The Next Databricks Office Hours

Our next Office Hours session is scheduled for February 23, 2022 - 8:00 am PDT. Do you have questions about how to set up or use Databricks? Do you want to get best practices for deploying your use case or tips on data a...

by ninjadev999 • New Contributor II

02-11-2022 12:15:11 AM

9367 Views
7 replies
1 kudos

Resolved! Can't write big DataFrame into MSSQL Server by using JDBC driver on Azure Databricks

I'm reading a huge CSV file containing 39,795,158 records and writing it to MSSQL Server on Azure Databricks. The Databricks notebook is running on a cluster node with 56 GB of memory, 16 cores, and 12 workers. This is my code in Python and PySpark: from ...

Latest Reply

User16764241763
Databricks Employee

02-15-2022 6:29:06 AM

1 kudos

Hi, if you are using Azure SQL DB Managed Instance, could you please file a support request with the Azure team? This is to review any timeouts or performance issues on the backend. Also, it seems the timeout is coming from SQL Server, which is closing the conn...

by my_community2 • New Contributor III

02-11-2022 10:56:05 AM

4885 Views
1 reply
2 kudos

Resolved! SQL cast operator not working properly

Please have a look at the attached screenshot. Three strings converted to float, each resulting in the same number.
22015683.000000000000000000 => 22015684
22015684.000000000000000000 => 22015684
22015685.000000000000000000 => 22015684

Latest Reply

MartinB
Contributor III

02-14-2022 1:17:41 PM

2 kudos

Hi @Maciej G, I guess this has something to do with the data type FLOAT and its precision. Floats are only an approximation with a given precision. Either you should consider using data type DOUBLE (double precision compared to FLOAT) - or, if you ...
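
The rounding described in this reply can be reproduced outside Spark. The sketch below assumes that FLOAT is a 32-bit IEEE 754 single-precision value and DOUBLE a 64-bit one, and uses Python's `struct` module to round a 64-bit value to 32 bits; the helper name `to_float32` is hypothetical. Between 2^24 and 2^25 a 32-bit float can only represent even integers, so the odd inputs from the question land on a neighbouring even value.

```python
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


for text in (
    "22015683.000000000000000000",
    "22015684.000000000000000000",
    "22015685.000000000000000000",
):
    as_double = float(text)           # 64-bit: holds these integers exactly
    as_float = to_float32(as_double)  # 32-bit: only 24 significant bits
    print(text, "=> FLOAT", int(as_float), "| DOUBLE", int(as_double))
```

All three lines print 22015684 for the 32-bit value, matching the results in the question, while the 64-bit value keeps each input unchanged.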

by Serhii • Contributor

02-14-2022 9:49:29 AM

1623 Views
0 replies
0 kudos

What is the nuid field in the Jupyter Notebook cell metadata?

What is the nuid field in the Jupyter Notebook cell metadata? Could it be used to uniquely identify each cell? Thanks for your help in advance!

by jimnaik • New Contributor III

02-09-2022 10:53:28 AM

22596 Views
2 replies
3 kudos

Resolved! How to execute .sh and .py files in the workspace?

I want to execute a shell script that runs a .py file. How can I run .sh and .py files in a Databricks workspace?

Latest Reply

jimnaik
New Contributor III

02-14-2022 8:45:58 AM

3 kudos

I tried executing like this and it worked: %sh /dbfs/***/***/***.sh
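
The `%sh` line above is notebook-specific. As a hedged alternative, the same script could be started from a Python cell with the standard `subprocess` module. This is a sketch that assumes the script is readable from the driver's file system and that `sh` is available there; the helper name `run_script` is hypothetical.

```python
import subprocess


def run_script(script_path, *args):
    """Run a shell script with sh and return its standard output.

    Raises subprocess.CalledProcessError if the script exits non-zero.
    """
    result = subprocess.run(
        ["sh", script_path, *args],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as str rather than bytes
        check=True,           # fail loudly on a non-zero exit code
    )
    return result.stdout
```

Calling `print(run_script(path_to_script))` would show the script's output in the cell, and a failing script surfaces as a Python exception rather than a silent non-zero exit.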

by IvNen • New Contributor II

02-11-2022 2:30:33 AM

4602 Views
1 reply
1 kudos

Azure Functions error communicating with a Databricks notebook

I have a connection between Azure Functions and a Databricks notebook to pull data from the notebook. That was working fine until 7th of Feb, but then I started getting an error without a sensible error code. I have attached the stack trace and the err...

Latest Reply

IvNen
New Contributor II

02-14-2022 7:46:02 AM

1 kudos

A notebook on a Databricks cluster keeps a cache while the cluster is running. If you add an import statement and then remove it, the notebook still has a cached instance of that import and will continue to work. Running code in Visual Studio against t...
