Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

BenzDriver
by New Contributor II
  • 2578 Views
  • 2 replies
  • 1 kudos

Resolved! SQL command FSCK is not found

Hello there, I currently have the problem that deleted files are still referenced in the transaction log when I try to query a Delta table. I found this statement: %sql FSCK REPAIR TABLE table_name [DRY RUN] But using it returned the following error: Error in ...

Latest Reply
RKNutalapati
Valued Contributor
  • 1 kudos

Remove the square brackets and try executing the command: %sql FSCK REPAIR TABLE table_name DRY RUN

  • 1 kudos
1 More Replies
qyu
by New Contributor II
  • 10737 Views
  • 3 replies
  • 3 kudos

Resolved! Need help with this python import error.

I am using Databricks Runtime 9.1 LTS ML and got an error when I tried to import the scikit-learn package. The error message was: TypeError Traceback (most recent call last) <command-181041> in <module> ...

Latest Reply
qyu
New Contributor II
  • 3 kudos

@Atanu Sarkar I am using Databricks Runtime 9.1 LTS ML and the Python version is 3.8.10. I am only running these import statements:
from sklearn.metrics import *
from sklearn.preprocessing import LabelEncoder

  • 3 kudos
2 More Replies
danielveraec
by New Contributor III
  • 10050 Views
  • 3 replies
  • 1 kudos

Resolved! Error writing a partitioned Delta Table from a multitasking job in azure databricks

I have a notebook that writes a Delta table with a statement similar to the following:
match = "current.country = updates.country and current.process_date = updates.process_date"
deltaTable = DeltaTable.forPath(spark, silver_path)
deltaTable.alias("cu...

Latest Reply
danielveraec
New Contributor III
  • 1 kudos

Initially, the affected table was partitioned only by a date field, so I repartitioned it by the country and date fields. The new partitioning created the country and date directories; however, the old directories of the date partition remained and were not de...

  • 1 kudos
2 More Replies
sudhanshu1
by New Contributor III
  • 7784 Views
  • 1 replies
  • 0 kudos

Query to list all table and column names in Delta Lake

Hi all, does anyone know how to write a simple SQL query to get all table and column names? In Oracle we run select * from all_tab_columns. Similarly, in SQL Server we run select * from information_schema.columns. Do we have something like this in dat...

Latest Reply
AmanSehgal
Honored Contributor III
  • 0 kudos

To view the columns in a table, use SHOW COLUMNS:
%sql
show columns in <schema_name>.<table_name>
To show the columns of all the tables in a schema, use the following PySpark code:
%python
schema_name = "default"
tbl_columns = {}
# Get all tables in a schema
tables = spar...
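The reply's snippet is cut off, so the following is only a minimal sketch of the same idea and not the poster's code. It assumes an active `spark` session (as Databricks notebooks provide); the function name `columns_by_table` is hypothetical. It lists the tables with `SHOW TABLES IN <schema>` and reads each table's column names from `spark.table(...).columns`.

```python
def columns_by_table(spark, schema_name="default"):
    """Return {table_name: [column names]} for every table in schema_name.

    `spark` is assumed to be an active SparkSession. The function name is
    illustrative and does not come from the thread.
    """
    tbl_columns = {}
    # SHOW TABLES returns one row per table with `tableName` and `isTemporary`.
    for row in spark.sql(f"SHOW TABLES IN {schema_name}").collect():
        if row["isTemporary"]:
            # Temporary views are not part of the schema, so skip them.
            continue
        full_name = f"{schema_name}.{row['tableName']}"
        # DataFrame.columns is a plain list of column-name strings.
        tbl_columns[row["tableName"]] = spark.table(full_name).columns
    return tbl_columns
```

In a notebook this could be called as `columns_by_table(spark, "default")`. On large schemas it issues one metadata lookup per table, so it may be slow.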

  • 0 kudos
Jreco
by Contributor
  • 5921 Views
  • 6 replies
  • 4 kudos

Resolved! Messages from Event Hub stop flowing after a while

Hi Team, I'm trying to build a real-time solution using Databricks and Event Hubs. Something weird happens some time after the process starts. At the beginning the messages flow through the process as expected with this rate: please, note that the last ...

Latest Reply
Jreco
Contributor
  • 4 kudos

Thanks for your answer, @Hubert Dudek. It is already specified. What do you mean by this? This is the weird part: the data flows fine, but at some point it is as if the job stops reading or something like that, and if I restart the ...

  • 4 kudos
5 More Replies
wpenfold
by New Contributor II
  • 31447 Views
  • 5 replies
  • 2 kudos
Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

Using the Workspace API, you can list all the notebooks for a given user. The API response tells you whether each object under the path is a folder or a notebook. If it's a folder, you can add it to the path and get the notebooks within that folder. Put a...
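The folder-by-folder walk described in this reply can be sketched as follows. This is an illustration under assumptions, not the poster's code: `list_notebooks` and `list_objects` are hypothetical names, and `list_objects(path)` stands for whatever call returns the `objects` list of a Workspace API list response, where each entry carries a `path` and an `object_type` such as `DIRECTORY` or `NOTEBOOK`.

```python
def list_notebooks(list_objects, root):
    """Return the paths of all notebooks under `root`, descending into folders.

    `list_objects(path)` is a caller-supplied function (hypothetical) that
    returns a list of dicts with at least `path` and `object_type` keys.
    """
    notebooks = []
    pending = [root]  # folders that still need to be listed
    while pending:
        current = pending.pop()
        for obj in list_objects(current):
            if obj["object_type"] == "DIRECTORY":
                # A folder: queue it so its contents are listed too.
                pending.append(obj["path"])
            elif obj["object_type"] == "NOTEBOOK":
                notebooks.append(obj["path"])
    return sorted(notebooks)
```

Against a real workspace, `list_objects` could wrap a GET request to `/api/2.0/workspace/list` with a `path` parameter and a bearer token, returning the response's `objects` field or an empty list when the folder is empty. Passing the lister in as a parameter keeps the traversal testable without network access.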

  • 2 kudos
4 More Replies
Juniper_AIML
by New Contributor
  • 3093 Views
  • 1 replies
  • 0 kudos

How to set up an instance profile for initializing a Databricks cluster using Docker?

I was trying to start a Databricks cluster from a Docker image. I followed the setup instructions, except for the additional step of setting up the IAM role and instance profile, as I was facing issues. The image is stored on AWS ECR in a public repo...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello again, @Aman Gaurav​. Thanks for your second question. As before, we'll see what the community has to say first.

  • 0 kudos
fff_ds
by New Contributor
  • 1358 Views
  • 1 replies
  • 1 kudos

We manually overwrote a collection of Parquet files in the S3 console and now we can't read them.

org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 26.0 failed 4 times, most recent failure: Lost task 19.3 in stage 26.0 (TID 4205, 10.66.225.154, executor 0): com.databricks.sql.io.FileReadException: Error while rea...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hello, @Lili Ehrlich. Welcome! My name is Piper, and I'm a moderator for Databricks. Thank you for bringing your question to us. Let's give it a while for the community to respond first. Thanks in advance for your patience.

  • 1 kudos
Mihai1
by New Contributor III
  • 2813 Views
  • 3 replies
  • 4 kudos

Resolved! MLflow Model Serving on Azure Databricks General Availability

When is MLflow Model Serving on Azure Databricks expected to become Generally Available?

Latest Reply
User16764241763
Honored Contributor
  • 4 kudos

Hello Mihai, we plan to make Model Serving generally available by the end of this year, as we are working on a lot of improvements.

  • 4 kudos
2 More Replies
omran
by New Contributor II
  • 3483 Views
  • 2 replies
  • 2 kudos

Resolved! What is the procedure for using the IPOPT solver in Azure Databricks?

I have installed version 3.11.1 of the ipopt solver in Azure Databricks, but running the code throws an error: WARNING: Could not locate the 'ipopt' executable, which is required for solver ipopt ApplicationError: No executable found for solv...

Latest Reply
User16764241763
Honored Contributor
  • 2 kudos

Hello @omran shaik, in the past we have recommended that customers use Docker containers with Databricks, as some of these solvers required native compilation and did not work well on the runtimes. With DCS you have full control of what you want to in...

  • 2 kudos
1 More Replies
Anonymous
by Not applicable
  • 966 Views
  • 0 replies
  • 1 kudos

The Next Databricks Office Hours

Our next Office Hours session is scheduled for February 23, 2022 - 8:00 am PDT. Do you have questions about how to set up or use Databricks? Do you want to get best practices for deploying your use case or tips on data a...

ninjadev999
by New Contributor II
  • 7431 Views
  • 7 replies
  • 1 kudos

Resolved! Can't write big DataFrame into MSSQL server by using jdbc driver on Azure Databricks

I'm reading a huge CSV file containing 39,795,158 records and writing it into MSSQL Server on Azure Databricks. The Databricks notebook is running on a cluster node with 56 GB memory, 16 cores, and 12 workers. This is my code in Python and PySpark: from ...

Latest Reply
User16764241763
Honored Contributor
  • 1 kudos

Hi, if you are using Azure SQL DB Managed Instance, could you please file a support request with the Azure team? This is to review any timeouts or performance issues on the backend. Also, it seems like the timeout is coming from SQL Server, which is closing the conn...

  • 1 kudos
6 More Replies
my_community2
by New Contributor III
  • 3453 Views
  • 1 replies
  • 2 kudos

Resolved! SQL cast operator not working properly

Please have a look at the attached screenshot. Three strings converted to float, each resulting in the same number:
22015683.000000000000000000 => 22015684
22015684.000000000000000000 => 22015684
22015685.000000000000000000 => 22015684

Latest Reply
MartinB
Contributor III
  • 2 kudos

Hi @Maciej G, I guess this has something to do with the data type FLOAT and its precision. Floats are only an approximation with a given precision. Either you should consider using data type DOUBLE (double precision compared to FLOAT) - or, if you ...
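The rounding reported in the question can be reproduced outside Spark. The sketch below assumes FLOAT is an IEEE 754 32-bit value and emulates it in plain Python by round-tripping through `struct`; the helper name `to_float32` is illustrative. Between 2^24 and 2^25 a 32-bit float can only represent even integers, so both odd inputs land on 22015684, while a 64-bit double or a decimal keeps all three values distinct.

```python
import struct
from decimal import Decimal


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


values = [
    "22015683.000000000000000000",
    "22015684.000000000000000000",
    "22015685.000000000000000000",
]

as_float32 = [to_float32(float(v)) for v in values]  # emulates a cast to FLOAT
as_double = [float(v) for v in values]               # emulates a cast to DOUBLE
as_decimal = [Decimal(v) for v in values]            # exact decimal arithmetic

print(as_float32)
print(as_double)
print(as_decimal)
```

Here the 32-bit list holds the same number three times, whereas the double and decimal lists preserve the original integers.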

  • 2 kudos
jimnaik
by New Contributor III
  • 20964 Views
  • 2 replies
  • 3 kudos

Resolved! How to execute .sh and .py file in the workspace?

I want to execute a shell script that runs a .py file. May I know how to run .sh and .py files in the Databricks workspace?

Latest Reply
jimnaik
New Contributor III
  • 3 kudos

I tried executing like this and it worked: %sh /dbfs/***/***/***.sh

  • 3 kudos
1 More Replies
