Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

cszczotka
by New Contributor III
  • 1279 Views
  • 0 replies
  • 0 kudos

Ephemeral storage: how to create/mount

Hi, I'm looking for information on how to create/mount ephemeral storage on the Databricks driver node in Azure. Does anyone have experience working with ephemeral storage? Thanks,

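For orientation: on Azure Databricks, the VM's ephemeral (local SSD) storage is typically already exposed on the driver's local filesystem, so there is usually nothing to mount yourself. A minimal sketch, assuming the common /local_disk0 path (the exact path and size depend on the VM type):

```python
# A sketch only: /local_disk0 is an assumption, not a guaranteed path.
import os
import shutil

scratch_dir = "/local_disk0/tmp/my_scratch"  # hypothetical scratch directory
os.makedirs(scratch_dir, exist_ok=True)

total, used, free = shutil.disk_usage("/local_disk0")
print(f"Free ephemeral space: {free / 1e9:.1f} GB")

# Anything written here is lost when the cluster terminates.
with open(os.path.join(scratch_dir, "scratch.bin"), "wb") as f:
    f.write(b"\x00" * 1024)
```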
174817
by New Contributor III
  • 1631 Views
  • 2 replies
  • 3 kudos

Databricks Rust client and/or OpenAPI spec

Hi, I'm looking for a Databricks client for Rust. I could only find these SDK implementations. Alternatively, I would be very happy with the OpenAPI spec. Clearly one exists: the Go SDK implementation contains code to generate itself from such a spec...

Data Engineering
openapi
rust
sdk
unity
Latest Reply
feiyun0112
Honored Contributor
  • 3 kudos

Databricks REST API reference: This reference contains information about the Databricks application programming interfaces (APIs). Each API reference page is presented primarily from a representational state transfer (REST) perspective. Databricks REST...

1 More Replies
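Building on the reply: until an official Rust SDK exists, any language can call the documented REST endpoints directly over HTTP. A minimal sketch in Python (host, token, and endpoint are placeholders; the same request maps one-to-one onto a Rust HTTP client such as reqwest):

```python
# Hedged sketch: call a documented REST endpoint directly when no official
# SDK exists for your language.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://adb-123.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]  # personal access token

resp = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
for cluster in resp.json().get("clusters", []):
    print(cluster["cluster_id"], cluster["state"])
```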
my_super_name
by New Contributor II
  • 1566 Views
  • 2 replies
  • 2 kudos

Auto Loader Schema Hint Behavior: Addressing Nested Field Errors

Hello, I'm using Auto Loader to stream a table of data and have added schema hints to specify field values. I've observed that when my initial data file is missing fields specified in the schema hint, Auto Loader correctly identifies this and ad...

Latest Reply
Mathias_Peters
Contributor
  • 2 kudos

Hi, we are having similar issues with schema hints formulated in fully qualified DDL, e.g. "a STRUCT<b INT>" etc. Did you find a solution? Also, did you specify the schema hint using the dot-notation, e.g. "a.b INT" before ingesting any data or after...

1 More Replies
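For illustration, a minimal sketch of the dot-notation hint discussed above, using Auto Loader's cloudFiles.schemaHints option (paths are hypothetical; assumes a notebook where `spark` is defined):

```python
# "a.b INT" targets just the nested field, whereas a DDL hint like
# "a STRUCT<b INT>" pins the whole struct.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/tmp/schemas/events")  # hypothetical
    .option("cloudFiles.schemaHints", "a.b INT")
    .load("/tmp/landing/events")                                 # hypothetical
)
```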
RiyazAli
by Valued Contributor II
  • 1268 Views
  • 1 reply
  • 0 kudos

Unable to create a record_id column via DLT Auto Loader

Hi Community, I'm trying to load data from the landing zone to the bronze layer via DLT Auto Loader. I want to add a record_id column to the bronze table while I fetch my data. I'm also using a file arrival trigger in the workflow to update my table inc...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hey @Retired_mod - could you or anybody from the community team help me here, please? I've been stuck for quite some time now.

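One common pattern for this (a sketch, not a confirmed resolution of the thread): generate the column inside the DLT table function. Table and path names are hypothetical:

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/landing/events")  # hypothetical landing path
        # uuid() assigns a fresh id per ingested row; use a hash of stable
        # columns instead if the id must survive reprocessing.
        .withColumn("record_id", F.expr("uuid()"))
    )
```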
Aidonis
by New Contributor III
  • 18355 Views
  • 3 replies
  • 2 kudos

Resolved! Flatten Deep Nested Struct

Hi All, I have a deeply nested Spark DataFrame struct, something similar to below:
 |-- id: integer (nullable = true)
 |-- lower: struct (nullable = true)
 |    |-- field_a: integer (nullable = true)
 |    |-- upper: struct (containsNull = true)
 |    |    ...

Latest Reply
Praveen-bpk21
New Contributor II
  • 2 kudos

@Aidonis You can try this as well: flatten-spark-dataframe · PyPI. This also allows flattening to a specific level.

2 More Replies
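For reference, a self-contained sketch of the recursive flattening approach this thread resolves (the naming scheme is a choice, not the accepted answer verbatim):

```python
# Arrays of structs are left untouched; they need explode() before this helps.
from pyspark.sql import DataFrame, functions as F
from pyspark.sql.types import StructType

def flatten(df: DataFrame, sep: str = "_") -> DataFrame:
    # Expand one level of structs per pass until none remain.
    while any(isinstance(f.dataType, StructType) for f in df.schema.fields):
        cols = []
        for field in df.schema.fields:
            if isinstance(field.dataType, StructType):
                cols.extend(
                    F.col(f"{field.name}.{c.name}").alias(f"{field.name}{sep}{c.name}")
                    for c in field.dataType.fields
                )
            else:
                cols.append(F.col(field.name))
        df = df.select(cols)
    return df
```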
SPres
by New Contributor
  • 1179 Views
  • 1 reply
  • 0 kudos

Passing Parameters from Azure Synapse

Hey Community! Just curious if anyone has tried using Azure Synapse for orchestration and passing parameters from Synapse to a Databricks Notebook. My team is testing out Databricks, and I'm replacing Synapse Notebooks with Databricks Notebooks, but I...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @SPres, you can definitely pass these parameters to a Databricks notebook as well. Please refer to the docs below: Run a Databricks Notebook with the activity - Azure Data Factory | Microsoft Learn

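A minimal sketch of the receiving side referenced in the reply: the Synapse/ADF Notebook activity passes values as base parameters, which surface in Databricks as widgets ("run_date" is a hypothetical parameter name):

```python
dbutils.widgets.text("run_date", "")        # default used when not passed
run_date = dbutils.widgets.get("run_date")
print(f"run_date = {run_date}")
```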
Chengzhu
by New Contributor
  • 737 Views
  • 0 replies
  • 0 kudos

Databricks Model Registry Notification

Hi community, currently I am training models on a Databricks cluster and using MLflow to log and register models. My goal is to receive a notification when a new version of a registered model appears (if the new run achieves some model performance baselin...

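One way to approximate this (a hedged sketch, not a confirmed answer from the thread): poll the registry with the MLflow client and notify when a version clears the baseline. Model name, metric, and webhook URL are placeholders; Databricks model registry webhooks are a push-based alternative:

```python
# A real job would also remember the last version it alerted on.
import requests
from mlflow.tracking import MlflowClient

MODEL, THRESHOLD = "my_model", 0.9
SLACK_WEBHOOK = "https://hooks.slack.com/services/..."  # placeholder

client = MlflowClient()
for mv in client.search_model_versions(f"name = '{MODEL}'"):
    metrics = client.get_run(mv.run_id).data.metrics
    acc = metrics.get("val_accuracy")  # hypothetical metric name
    if acc is not None and acc >= THRESHOLD:
        requests.post(SLACK_WEBHOOK, json={
            "text": f"{MODEL} v{mv.version} registered, val_accuracy={acc:.3f}"
        })
```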
dilkushpatel
by New Contributor II
  • 1906 Views
  • 2 replies
  • 0 kudos

Databricks connecting to Azure SQL DW - confused between PolyBase and COPY INTO

I see two articles in the Databricks documentation:
https://docs.databricks.com/en/archive/azure/synapse-polybase.html#language-python
https://docs.databricks.com/en/connect/external-systems/synapse-analytics.html#service-principal
The PolyBase one is legacy o...

Data Engineering
azure
Copy
help
Polybase
Synapse
Abhi0607
by New Contributor II
  • 1278 Views
  • 2 replies
  • 0 kudos

Variables passed from ADF to a Databricks notebook are not accessible in try-catch

Dear Members, I need your help with the scenario below. I am passing a few parameters from an ADF pipeline to a Databricks notebook. If I execute the ADF pipeline to run my Databricks notebook and use these variables as-is in my code (Python), it works fine. But as s...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @Abhi0607, can you clarify whether you are reading or defining these parameter values outside the try-catch or inside it?

1 More Replies
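A sketch of the distinction the reply is asking about ("env" is a hypothetical parameter name): reading the widget before the try block makes a missing or misnamed ADF parameter fail loudly instead of being swallowed by a broad except:

```python
env = dbutils.widgets.get("env")

try:
    df = spark.read.table(f"bronze_{env}.events")  # hypothetical table
except Exception as err:
    print(f"Processing failed for env={env}: {err}")
    raise  # re-raise so ADF marks the activity as failed
```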
fuselessmatt
by Contributor
  • 8245 Views
  • 4 replies
  • 1 kudos

Accidentally removing the service principal that owns the view seems to put the Unity Catalog in an illegal state. Can you fix this?

I renamed our service principal in Terraform, which forces a replacement where the old service principal is removed and a new principal with the same permissions is recreated. The Terraform apply succeeds, but when I try to run dbt, which creates tab...

Latest Reply
fuselessmatt
Contributor
  • 1 kudos

This is also true for removing groups before unassigning them (removing and unassigning in Terraform):
│ Error: cannot update grants: Could not find principal with name <My Group Name>

3 More Replies
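A possible recovery sketch for this situation, assuming a metastore admin and that reassigning ownership is acceptable (object and principal names are placeholders; not a confirmed fix from the thread):

```python
# Hand the orphaned object to an existing principal, then re-grant.
spark.sql("ALTER VIEW main.reporting.my_view SET OWNER TO `new-service-principal`")
spark.sql("GRANT SELECT ON VIEW main.reporting.my_view TO `analysts`")
```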
JeanT
by New Contributor
  • 2476 Views
  • 1 reply
  • 0 kudos

Help with Identifying and Parsing Varying Date Formats in Spark DataFrame

Hello Spark Community, I'm encountering an issue with parsing dates in a Spark DataFrame due to inconsistent date formats across my datasets. I need to identify and parse dates correctly, irrespective of their format. Below is a brief outline of my p...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

How about not specifying the format? This will already match common formats. When you still have nulls, you can use your list with known exotic formats. Another solution is working with regular expressions, looking for 2-digit numbers not larger than...

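A sketch of the "known formats" approach from the reply: to_date returns NULL for strings that don't match the given pattern, so coalesce keeps the first successful parse (the format list is illustrative; ambiguous values match the first format that parses):

```python
from pyspark.sql import functions as F

formats = ["yyyy-MM-dd", "dd/MM/yyyy", "MM-dd-yyyy"]
df = spark.createDataFrame(
    [("2024-04-17",), ("17/04/2024",), ("04-17-2024",)], ["raw"]
)
df = df.withColumn(
    "parsed", F.coalesce(*[F.to_date("raw", fmt) for fmt in formats])
)
df.show()
```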
AnkithP
by New Contributor
  • 2181 Views
  • 1 reply
  • 1 kudos

Infer schema eliminating leading zeros.

Upon reading a CSV file with schema inference enabled, I've noticed that a column originally designated as string datatype contains numeric values with leading zeros. However, upon reading the data into a PySpark DataFrame, it undergoes automatic conver...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

If you set .option("inferSchema", "false"), all columns will be read as strings. You will have to cast all the other columns to their appropriate types, though. So passing a schema seems easier to me.

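A sketch of the suggested fix, passing an explicit schema so numeric-looking codes stay strings (column names and path are hypothetical):

```python
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

schema = StructType([
    StructField("account_code", StringType()),  # leading zeros preserved
    StructField("amount", IntegerType()),
])
df = spark.read.csv("/tmp/input.csv", header=True, schema=schema)
```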
PrebenOlsen
by New Contributor III
  • 1655 Views
  • 2 replies
  • 0 kudos

Job stuck while utilizing all workers

Hi! I started a job yesterday. It was iterating over data, two months at a time, and writing to a table. It did this successfully for 4 out of 6 time periods. The 5th time period, however, got stuck 5 hours in. I can find one failed stage that reads ...

Data Engineering
job failed
Job froze
need help
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

As Spark is lazily evaluated, using only small clusters for reads and large ones for writes is not something that will happen. The data is read when you apply an action (e.g. write). That being said: I have no knowledge of a bug in Databricks on clusters...

1 More Replies
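A small illustration of the lazy-evaluation point in the reply (placeholder paths and names): nothing is read until the action at the end, so the scan and the write run on the same cluster:

```python
df = spark.read.parquet("/tmp/source")   # builds a plan, no I/O yet
filtered = df.where("amount > 100")      # still no I/O

# Both the scan and the filter execute only when the action below fires:
filtered.write.mode("overwrite").saveAsTable("my_table")
```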
laurenskuiper97
by New Contributor
  • 1388 Views
  • 1 reply
  • 0 kudos

JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters

Hi everybody, I'm trying to set up a connection between Databricks notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this works perfectly fine. However, when this is run on a multi-node cluster, this co...

Data Engineering
clusters
JDBC
spark
SSH
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

I doubt it is possible. The driver runs the program and sends tasks to the executors. But since creating the SSH tunnel is not a Spark task, I don't think it will be established on any executor.

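Consistent with the reply, a hedged sketch that keeps the whole round trip on the driver instead of using spark.read.jdbc (requires the third-party sshtunnel and psycopg2 packages; every host, port, and credential below is a placeholder):

```python
from sshtunnel import SSHTunnelForwarder
import psycopg2

with SSHTunnelForwarder(
    ("bastion.example.com", 22),
    ssh_username="user",
    ssh_pkey="/dbfs/keys/id_rsa",
    remote_bind_address=("postgres.internal", 5432),
) as tunnel:
    conn = psycopg2.connect(
        host="127.0.0.1", port=tunnel.local_bind_port,
        dbname="mydb", user="dbuser", password="<password>",
    )
    # Executors cannot see this tunnel, so fetch on the driver and only then
    # hand small result sets to Spark, e.g. spark.createDataFrame(rows).
    with conn, conn.cursor() as cur:
        cur.execute("SELECT id, name FROM customers LIMIT 1000")
        rows = cur.fetchall()
```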
Jotav93
by New Contributor II
  • 1456 Views
  • 2 replies
  • 1 kudos

Move a Delta table from a non-UC metastore to a UC metastore preserving history

Hi, I am using Azure Databricks and we recently enabled UC in our workspace. We have some tables in our non-UC metastore that we want to move to a UC-enabled metastore. Is there any way we can move these tables without losing the Delta table history...

Data Engineering
delta
unity
Latest Reply
ThomazRossito
Contributor
  • 1 kudos

Hello, it is possible to get the expected result with dbutils.fs.cp("origin location", "destination location", True) and then creating the table with the LOCATION set to the destination location. Hope this helps.

1 More Replies
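A sketch of the approach in the reply (paths and table names are placeholders): copying the whole table directory brings _delta_log along, which is where the history lives; DEEP CLONE, by contrast, starts a fresh history:

```python
dbutils.fs.cp(
    "dbfs:/mnt/legacy/tables/sales",                             # origin
    "abfss://data@<account>.dfs.core.windows.net/tables/sales",  # destination
    True,  # recurse
)
spark.sql("""
    CREATE TABLE main.silver.sales
    USING DELTA
    LOCATION 'abfss://data@<account>.dfs.core.windows.net/tables/sales'
""")
```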

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group