Data Engineering

Forum Posts

Jennifer
by New Contributor III
  • 121 Views
  • 1 reply
  • 0 kudos

Optimization failed for timestampNtz

We have a table using the timestampNtz type for a timestamp column, which is also a cluster key for this table using liquid clustering. I ran OPTIMIZE <table-name>, and it failed with the error: Unsupported datatype 'TimestampNTZType'. But the failed optimization also broke ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Jennifer, since TimestampNTZType is not currently supported for optimization, you can try a workaround by converting the timestamp column to a different data type before running the OPTIMIZE command. For example, you could convert the timestampNt...

vpacik
by New Contributor
  • 288 Views
  • 1 reply
  • 0 kudos

Databricks-connect OpenSSL Handshake failed on WSL2

When trying to set up databricks-connect on WSL2 using a 13.3 cluster, I receive the following error regarding OpenSSL: CERTIFICATE_VERIFY_FAILED. The authentication is done via the SPARK_REMOTE env variable. E0415 11:24:26.646129568 142172 ssl_transport_sec...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jp_allard, one approach to resolve this is to disable SSL certificate verification. However, keep in mind that this approach may compromise security. In your Databricks configuration file (usually located at ~/.databrickscfg), add the following l...
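The truncated reply presumably points at a profile setting like the sketch below. This is an assumption, not a quote from the reply: the host and token values are placeholders, and the exact key name depends on the tool version (the legacy Databricks CLI reads `insecure`, while the newer unified SDK config reads `insecure_skip_verify`), so check the docs for your client. Disabling verification weakens security, so treat it strictly as a debugging aid:

```ini
[DEFAULT]
host  = https://<your-workspace-url>
token = <personal-access-token>
; Skip TLS certificate verification -- debugging only, weakens security.
; Key name is an assumption; older CLIs use `insecure` instead.
insecure_skip_verify = true
```

On WSL2 specifically, the underlying cause is often a missing or stale CA bundle, so updating the distro's ca-certificates package may fix the handshake without disabling verification at all.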

pernilak
by New Contributor III
  • 171 Views
  • 1 reply
  • 0 kudos

Working with Unity Catalog from VSCode using the Databricks Extension

Hi! As suggested by Databricks, we are working with Databricks from VSCode, using Databricks bundles for our deployment and the VSCode Databricks Extension and Databricks Connect during development. However, there are some limitations that we are ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @pernilak, It’s great that you’re using Databricks with Visual Studio Code (VSCode) for your development workflow! Let’s address the limitations you’ve encountered when working with files from Unity Catalog using native Python. When running Python...

jp_allard
by New Contributor
  • 150 Views
  • 1 reply
  • 0 kudos

Selective Overwrite to a Unity Catalog Table

I have been able to perform a selective overwrite using replaceWhere to a hive_metastore table, but when I use the same code for the same table in a Unity Catalog, no data is written. Has anyone else had this issue, or are there common mistakes that ar...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jp_allard, Unity Catalog is a newer feature in Databricks, designed to replace the traditional Hive Metastore. When transitioning from Hive Metastore to Unity Catalog, there might be differences in behavior due to underlying architectural ch...

CDICSteph
by New Contributor
  • 888 Views
  • 5 replies
  • 0 kudos

permission denied listing external volume when using vscode databricks extension

Hey, I'm using the Databricks extension for VSCode (Databricks Connect v2). When using dbutils to list an external volume defined in UC, like so: dbutils.fs.ls("/Volumes/dev/bronze/rawdatafiles/"), I get this error: "databricks.sdk.errors.mapping.PermissionD...

Latest Reply
lukasjh
New Contributor II
  • 0 kudos

We still face the problem (UC-enabled shared cluster). Is there any resolution? @Kaniz

4 More Replies
JeanT
by New Contributor
  • 158 Views
  • 1 reply
  • 0 kudos

Help with Identifying and Parsing Varying Date Formats in Spark DataFrame

Hello Spark Community, I'm encountering an issue with parsing dates in a Spark DataFrame due to inconsistent date formats across my datasets. I need to identify and parse dates correctly, irrespective of their format. Below is a brief outline of my p...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

How about not specifying the format? This will already match common formats. When you still have nulls, you can use your list with known exotic formats. Another solution is working with regular expressions: looking for 2-digit numbers not larger than...
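The fallback idea in this reply (try common formats first, then a list of known exotic ones) can be sketched in plain Python; in PySpark the same pattern is often written as a `coalesce` over several `to_date(col, fmt)` calls. The format list here is a made-up example, not the poster's actual formats:

```python
from datetime import datetime, date

# Hypothetical format list: common formats first, then known exotic ones.
FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m-%d-%Y", "%d %b %Y"]

def parse_date(raw: str):
    """Try each known format in turn; return None when nothing matches,
    so the caller can collect unparseable rows for inspection."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(raw, fmt).date()
        except ValueError:
            continue
    return None

print(parse_date("2024-04-15"))   # 2024-04-15
print(parse_date("15/04/2024"))   # 2024-04-15
print(parse_date("not a date"))   # None
```

The remaining `None` values play the same role as the nulls mentioned in the reply: they flag rows whose format is not yet in the known list.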

Phuonganh
by New Contributor
  • 173 Views
  • 1 reply
  • 0 kudos

Databricks SDK for Python: Errors with parameters for Statement Execution

Hi team, I'm using the Databricks SDK for Python to run SQL queries. I created a variable as below: param = [{'name' : 'a', 'value' :x'}, {'name' : 'b', 'value' : 'y'}] and passed it to the statement as below: _ = w.statement_execution.execute_statement( warehous...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Phuonganh, This error is not directly related to the Databricks SDK, but rather a misunderstanding of how to pass parameters in your SQL query. The param dictionary you’ve defined seems to have a typo in the value for the ‘a’ parameter. It should...
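For reference, the truncated reply is pointing at the stray and missing quotes in the first dict. A corrected parameter list would look like the sketch below; the `execute_statement` call is shown only as a comment, since `warehouse_id` and the SQL text are placeholders, and the exact parameter shape accepted may vary by SDK version:

```python
# Original (broken): [{'name' : 'a', 'value' :x'}, ...] -- the value for
# 'a' has one quote missing and one misplaced. Corrected, with both
# values as plain strings:
params = [
    {"name": "a", "value": "x"},
    {"name": "b", "value": "y"},
]

# Hypothetical usage with the Databricks SDK (not executed here):
# _ = w.statement_execution.execute_statement(
#     warehouse_id=warehouse_id,
#     statement="SELECT * FROM t WHERE a = :a AND b = :b",
#     parameters=params,
# )
print(params[0]["value"])  # x
```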

AnkithP
by New Contributor
  • 185 Views
  • 1 reply
  • 1 kudos

Infer schema eliminating leading zeros.

Upon reading a CSV file with schema inference enabled, I've noticed that a column originally designated as string datatype contains numeric values with leading zeros. However, upon reading the data into a PySpark DataFrame, it undergoes automatic conver...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

If you set .option("inferSchema", "false"), all columns will be read as strings. You will have to cast all the other columns to their appropriate types, though. So passing a schema seems easier to me.
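A minimal plain-Python illustration of why inference drops the zeros: once a parser decides the column is numeric, the leading zeros cannot survive the conversion. Reading the column as text (which is what an explicit schema with StringType, or inferSchema=false, gives you in Spark) preserves them. The CSV content here is made up:

```python
import csv
import io

# Hypothetical CSV with an ID column whose leading zeros matter.
raw = "id,qty\n00042,3\n00007,1\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Numeric conversion (what type inference effectively does) loses zeros:
inferred = [int(r["id"]) for r in rows]   # [42, 7]

# Keeping the column as text preserves them:
as_text = [r["id"] for r in rows]         # ['00042', '00007']

print(inferred, as_text)
```

In PySpark the equivalent fix is passing a schema such as `StructType([StructField("id", StringType()), ...])` to the reader so the id column is never treated as a number.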

zmsoft
by New Contributor
  • 162 Views
  • 1 reply
  • 0 kudos

Why is the DLT pipeline processing streaming data so slow?

Running a single table is fast, but running 80 tables at the same time takes a long time. Is it serial, queued execution? Isn't it concurrent?

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @zmsoft, the processing power of the nodes running your DLT pipeline matters. Using more powerful node types can significantly impact performance. Consider using a more robust node type, such as the Standard_E16ds_v4 or Standard_E32ds_v4.

PrebenOlsen
by New Contributor III
  • 233 Views
  • 2 replies
  • 0 kudos

Job stuck while utilizing all workers

Hi! I started a job yesterday. It was iterating over data, two months at a time, and writing to a table. It was successfully doing this for 4 out of 6 time periods. The 5th time period, however, got stuck 5 hours in. I can find one Failed Stage that reads ...

Data Engineering
job failed
Job froze
need help
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

As Spark is lazily evaluated, using only small clusters for reads and large ones for writes is not something that will happen. The data is read when you apply an action (e.g. a write). That being said: I have no knowledge of a bug in Databricks on clusters...

1 More Reply
laurenskuiper97
by New Contributor
  • 228 Views
  • 1 replies
  • 0 kudos

JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters

Hi everybody, I'm trying to set up a connection between Databricks notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this works perfectly fine. However, when this is run on a multi-node cluster, this co...

Data Engineering
clusters
JDBC
spark
SSH
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

I doubt it is possible. The driver runs the program and sends tasks to the executors. But since creating the SSH tunnel is not a Spark task, I don't think it will be established on any executor.

Jotav93
by New Contributor II
  • 304 Views
  • 2 replies
  • 0 kudos

Move a delta table from a non UC metastore to a UC metastore preserving history

Hi, I am using Azure Databricks, and we recently enabled UC in our workspace. We have some tables in our non-UC metastore that we want to move to a UC-enabled metastore. Is there any way we can move these tables without losing the delta table history...

Data Engineering
delta
unity
Latest Reply
ThomazRossito
New Contributor III
  • 0 kudos

Hello, it is possible to get the expected result with dbutils.fs.cp("Origin location", "Destination location", True), and then create the table with the LOCATION of the destination location. Hope this helps.

1 More Reply
MathewDRitch
by New Contributor II
  • 249 Views
  • 3 replies
  • 1 kudos

Connecting from Databricks to Network Path

Hi all, I will appreciate it if someone can help me with some reference links on connecting from Databricks to an external network path. I have Databricks on AWS and previously used to connect to files on an external network path using the Mount method. Now Databri...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

I don't think that it is possible at the moment. UC focuses on cloud data. You might want to try to use Minio, but apparently UC does not support Minio yet. Pity, because that would be an awesome solution.

2 More Replies
Dp15
by Contributor
  • 304 Views
  • 2 replies
  • 2 kudos

Using UDF in an insert command

Hi, I am trying to use a UDF to get the last day of the month and use the boolean result of the function in an insert command. Please find herewith the function and my query. Function: import calendar; from datetime import datetime, date, timedelta; def...
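The function body is cut off in the preview, but judging by the imports it computes whether a date is the last day of its month. A minimal, hypothetical reconstruction of that check using `calendar.monthrange` (the original UDF's name and signature are unknown):

```python
import calendar
from datetime import date

def is_last_day_of_month(d: date) -> bool:
    """True when d is the final day of its month.
    monthrange returns (weekday_of_first_day, days_in_month)."""
    return d.day == calendar.monthrange(d.year, d.month)[1]

print(is_last_day_of_month(date(2024, 2, 29)))  # True (leap year)
print(is_last_day_of_month(date(2024, 2, 28)))  # False
print(is_last_day_of_month(date(2023, 2, 28)))  # True
```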

Latest Reply
Dp15
Contributor
  • 2 kudos

Thank you @Kaniz for your detailed explanation

1 More Reply
Kroy
by Contributor
  • 2104 Views
  • 8 replies
  • 1 kudos

Resolved! What is the difference between streaming and a streaming live table

Can anyone explain in layman's terms the difference between streaming and a streaming live table?

Latest Reply
CharlesReily
New Contributor III
  • 1 kudos

Streaming, in a broad sense, refers to the continuous flow of data over a network. It allows you to watch or listen to content in real-time without having to download the entire file first.  A "Streaming Live Table" might refer to a specific type of ...

7 More Replies