Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

qyu
by New Contributor II
  • 12003 Views
  • 3 replies
  • 3 kudos

Resolved! Need help with this Python import error

I am using Databricks Runtime 9.1 LTS ML and get an error when I try to import the scikit-learn package. The error message is: TypeError Traceback (most recent call last) <command-181041> in <module> ...

Latest Reply
qyu
New Contributor II
  • 3 kudos

@Atanu Sarkar I am using Databricks Runtime 9.1 LTS ML, and the Python version is 3.8.10. I am only running these import statements: from sklearn.metrics import * and from sklearn.preprocessing import LabelEncoder

  • 3 kudos
2 More Replies
danielveraec
by New Contributor III
  • 12358 Views
  • 3 replies
  • 1 kudos

Resolved! Error writing a partitioned Delta table from a multitask job in Azure Databricks

I have a notebook that writes a Delta table with a statement similar to the following:
match = "current.country = updates.country and current.process_date = updates.process_date"
deltaTable = DeltaTable.forPath(spark, silver_path)
deltaTable.alias("cu...

Latest Reply
danielveraec
New Contributor III
  • 1 kudos

Initially, the affected table was partitioned only by a date field, so I repartitioned it by the country and date fields. The new partitioning created the country and date directories; however, the old directories of the date partition remained and were not de...

  • 1 kudos
2 More Replies
sudhanshu1
by New Contributor III
  • 9831 Views
  • 1 replies
  • 0 kudos

Query to list all table and column names in Delta Lake

Hi all, does anyone know how to write a simple SQL query to get all table and column names? In Oracle we run select * from all_tab_columns. Similarly, in SQL Server we run select * from information_schema.columns. Do we have something like this in dat...

Latest Reply
AmanSehgal
Honored Contributor III
  • 0 kudos

To view the columns in a table, use SHOW COLUMNS:
%sql
show columns in <schema_name>.<table_name>
To show the columns of all tables in a schema, use the following PySpark code:
%python
schema_name = "default"
tbl_columns = {}
# Get all tables in a schema
tables = spar...

  • 0 kudos
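
The reply above is cut off, so the following is not its original code. It is a minimal sketch of the same idea, assuming the PySpark catalog calls `spark.catalog.listTables` and `spark.catalog.listColumns` are what one would use to collect the column names of every table in a schema. The helper name `columns_by_table` and the `fake_spark` stand-in are hypothetical; the stand-in exists only so the sketch runs without a cluster.

```python
from types import SimpleNamespace


def columns_by_table(spark, schema_name="default"):
    """Map each table name in a schema to the list of its column names."""
    tbl_columns = {}
    # listTables yields one entry per table or view registered in the schema.
    for table in spark.catalog.listTables(schema_name):
        columns = spark.catalog.listColumns(table.name, schema_name)
        tbl_columns[table.name] = [column.name for column in columns]
    return tbl_columns


# Hypothetical stand-in for a SparkSession (assumption: illustration only).
_fake_columns = {"orders": ["id", "amount"], "users": ["id", "email"]}
fake_spark = SimpleNamespace(
    catalog=SimpleNamespace(
        listTables=lambda schema: [SimpleNamespace(name=n) for n in _fake_columns],
        listColumns=lambda table, schema: [
            SimpleNamespace(name=c) for c in _fake_columns[table]
        ],
    )
)

print(columns_by_table(fake_spark))
```

In a notebook, passing the real `spark` session instead of `fake_spark` should give the same dictionary shape for an actual schema.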
Jreco
by Contributor
  • 8416 Views
  • 6 replies
  • 4 kudos

Resolved! Messages from Event Hubs stop flowing after a while

Hi Team, I'm trying to build a real-time solution using Databricks and Event Hubs. Something weird happens some time after the process starts. At the beginning, the messages flow through the process as expected at this rate: please note that the last ...

Latest Reply
Jreco
Contributor
  • 4 kudos

Thanks for your answer, @Hubert Dudek. It is already specified. What do you mean by this? This is the weird part, because the data flows well, but at some point it is as if the job stops reading or something like that, and if I restart the ...

  • 4 kudos
5 More Replies
wpenfold
by New Contributor II
  • 34235 Views
  • 5 replies
  • 2 kudos
Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

Using the Workspace API, you can list all the notebooks for a given user. The API response tells you whether each object under the path is a folder or a notebook. If it's a folder, you can add it to the path and get the notebooks within the folder. Put a...

  • 2 kudos
4 More Replies
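
The recursion described in the reply above can be sketched as follows. This is an illustration under stated assumptions, not the code from the truncated reply: `list_dir` is a hypothetical callable that returns the `objects` list of a Workspace API list response for one path, each object is assumed to carry `object_type` (`NOTEBOOK` or `DIRECTORY`) and `path` fields, and `FAKE_WORKSPACE` is made-up data so the sketch runs offline.

```python
def list_notebooks(list_dir, path):
    """Recursively collect the paths of all notebooks under `path`."""
    notebooks = []
    for obj in list_dir(path):
        if obj["object_type"] == "NOTEBOOK":
            notebooks.append(obj["path"])
        elif obj["object_type"] == "DIRECTORY":
            # A folder: descend into it and gather its notebooks too.
            notebooks.extend(list_notebooks(list_dir, obj["path"]))
    return notebooks


# Made-up workspace tree (assumption: illustration only).
FAKE_WORKSPACE = {
    "/Users/someone@example.com": [
        {"object_type": "NOTEBOOK", "path": "/Users/someone@example.com/etl"},
        {"object_type": "DIRECTORY", "path": "/Users/someone@example.com/reports"},
    ],
    "/Users/someone@example.com/reports": [
        {"object_type": "NOTEBOOK", "path": "/Users/someone@example.com/reports/weekly"},
    ],
}


def fake_list_dir(path):
    """Hypothetical stand-in for one Workspace API list call."""
    return FAKE_WORKSPACE.get(path, [])


print(list_notebooks(fake_list_dir, "/Users/someone@example.com"))
```

Against a real workspace, `list_dir` would wrap an authenticated request to the workspace list endpoint and return the `objects` field of the response, or an empty list when the folder is empty.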
Juniper_AIML
by New Contributor
  • 4082 Views
  • 1 replies
  • 0 kudos

How to set up an instance profile for initializing a Databricks cluster using Docker?

I was trying to start a Databricks cluster from a Docker image. I followed the setup instructions, except for the additional steps to set up the IAM role and instance profile, as I was facing issues. The image is stored on AWS ECR in a public repo...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello again, @Aman Gaurav. Thanks for your second question. As before, we'll see what the community has to say first.

  • 0 kudos
fff_ds
by New Contributor
  • 1778 Views
  • 1 replies
  • 1 kudos

Manual overwrite in the S3 console of a collection of Parquet files and now we can't read them

org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 26.0 failed 4 times, most recent failure: Lost task 19.3 in stage 26.0 (TID 4205, 10.66.225.154, executor 0): com.databricks.sql.io.FileReadException: Error while rea...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hello, @Lili Ehrlich. Welcome! My name is Piper, and I'm a moderator for Databricks. Thank you for bringing your question to us. Let's give it a while for the community to respond first. Thanks in advance for your patience.

  • 1 kudos
Mihai1
by New Contributor III
  • 3940 Views
  • 3 replies
  • 4 kudos

Resolved! MLflow Model Serving on Azure Databricks General Availability

When is MLflow Model Serving on Azure Databricks expected to become Generally Available?

Latest Reply
User16764241763
Databricks Employee
  • 4 kudos

Hello Mihai, we plan to GA Model Serving by the end of this year, as we are working on a lot of improvements.

  • 4 kudos
2 More Replies
omran
by New Contributor II
  • 4501 Views
  • 2 replies
  • 2 kudos

Resolved! What is the procedure for using the IPOPT solver in Azure Databricks?

I have installed version 3.11.1 of the ipopt solver in Azure Databricks, but running the code throws an error:
WARNING: Could not locate the 'ipopt' executable, which is required for solver ipopt
ApplicationError: No executable found for solv...

Latest Reply
User16764241763
Databricks Employee
  • 2 kudos

Hello @omran shaik, in the past we have recommended that customers use Docker containers with Databricks, as some of these solvers required native compilation and did not work well on the runtimes. With DCS you have full control of what you want to in...

  • 2 kudos
1 More Replies
Anonymous
by Not applicable
  • 1310 Views
  • 0 replies
  • 1 kudos

The Next Databricks Office Hours

Our next Office Hours session is scheduled for February 23, 2022 - 8:00 am PDT. Do you have questions about how to set up or use Databricks? Do you want to get best practices for deploying your use case or tips on data a...

ninjadev999
by New Contributor II
  • 9367 Views
  • 7 replies
  • 1 kudos

Resolved! Can't write a big DataFrame to MSSQL Server using the JDBC driver on Azure Databricks

I'm reading a huge CSV file containing 39,795,158 records and writing it to MSSQL Server on Azure Databricks. The Databricks notebook is running on a cluster node with 56 GB of memory, 16 cores, and 12 workers. This is my code in Python and PySpark: from ...

Latest Reply
User16764241763
Databricks Employee
  • 1 kudos

Hi, if you are using an Azure SQL DB Managed Instance, could you please file a support request with the Azure team? This is to review any timeouts or performance issues on the backend. Also, it seems like the timeout is coming from SQL Server, which is closing the conn...

  • 1 kudos
6 More Replies
my_community2
by New Contributor III
  • 4885 Views
  • 1 replies
  • 2 kudos

Resolved! SQL cast operator not working properly

Please have a look at the attached screenshot. Three strings converted to float, each resulting in the same number:
22015683.000000000000000000 => 22015684
22015684.000000000000000000 => 22015684
22015685.000000000000000000 => 22015684

Latest Reply
MartinB
Contributor III
  • 2 kudos

Hi @Maciej G, I guess this has something to do with the data type FLOAT and its precision. Floats are only an approximation with a given precision. Either you should consider using data type DOUBLE (double precision compared to FLOAT) or, if you ...

  • 2 kudos
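
The precision effect described in the reply above can be reproduced outside Spark. The sketch below assumes that SQL FLOAT is a 32-bit IEEE 754 single-precision value and uses Python's `struct` module to round each string to that width; the helper name `to_float32` is hypothetical. Between 2^24 and 2^25 a 32-bit float can only represent even integers, which is why neighbouring odd values collapse onto the same number.

```python
import struct
from decimal import Decimal


def to_float32(text):
    """Parse a decimal string and round it to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", float(text)))[0]


values = [
    "22015683.000000000000000000",
    "22015684.000000000000000000",
    "22015685.000000000000000000",
]

for text in values:
    print(
        text,
        "| float32:", int(to_float32(text)),
        "| float64:", int(float(text)),
        "| decimal:", Decimal(text),
    )
```

The 64-bit and decimal columns keep the three inputs distinct, which matches the suggestion to use a wider or exact type when every digit matters.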
jimnaik
by New Contributor III
  • 22596 Views
  • 2 replies
  • 3 kudos

Resolved! How to execute .sh and .py files in the workspace?

I want to execute a shell script that runs a .py file. How can I run .sh and .py files in a Databricks workspace?

Latest Reply
jimnaik
New Contributor III
  • 3 kudos

I tried executing like this and it worked: %sh /dbfs/***/***/***.sh

  • 3 kudos
1 More Replies
IvNen
by New Contributor II
  • 4602 Views
  • 1 replies
  • 1 kudos

Azure Functions error communicating with a Databricks notebook

I have a connection between Azure Functions and a Databricks notebook to pull data from the notebook. That was working fine until the 7th of Feb, but then I started getting an error without a sensible error code. I have attached the stack trace and the err...

Latest Reply
IvNen
New Contributor II
  • 1 kudos

A notebook on a Databricks cluster keeps a cache while the cluster is running. If you add an import statement and then remove it, the notebook still has a cached instance of that import and will continue to work. Running code in Visual Studio against t...

  • 1 kudos
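
The behaviour described in the reply above can be imitated in plain Python. The reply is truncated, so this is only a sketch of the mechanism it seems to describe, assuming a notebook's cells share one long-lived interpreter namespace. The names `notebook_globals` and `run_cell` are hypothetical stand-ins for that session state.

```python
# Shared namespace standing in for a notebook's long-lived interpreter state.
notebook_globals = {}


def run_cell(source):
    """Execute one 'cell' against the shared namespace."""
    exec(source, notebook_globals)


# First run: the cell contains the import.
run_cell("import json")

# Later run: the import line has been removed, but the name is still bound.
run_cell("result = json.dumps({'ok': True})")
print(notebook_globals["result"])

# The same code in a fresh namespace (for example a newly started process) fails.
fresh_globals = {}
try:
    exec("result = json.dumps({'ok': True})", fresh_globals)
    fresh_failed = False
except NameError:
    fresh_failed = True
print("fails in a fresh session:", fresh_failed)
```

Under this assumption, code that only works because of a leftover import would keep running on the cluster until the session state is cleared, and then fail the same way the fresh namespace does here.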