Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AFox
by Contributor
  • 8043 Views
  • 3 replies
  • 3 kudos

databricks-connect: PandasUDFs importing local packages: ModuleNotFoundError

databricks-connect==14.1.0

Related to other posts:
https://community.databricks.com/t5/data-engineering/modulenotfounderror-serializationerror-when-executing-over/td-p/14301
https://stackoverflow.com/questions/59322622/how-to-use-a-udf-defined-in-a-sub-...

Latest Reply
AFox
Contributor
  • 3 kudos

There is a way to do this! spark.addArtifact(src_zip_path, pyfile=True). Some things of note:
- This only works on single-user (non-shared) clusters.
- src_zip_path must be a POSIX-style string (i.e. forward slashes) even on Windows (drop C: and replace t...
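Putting the workaround above together, here is a minimal sketch (package and directory names are hypothetical, and the `spark.addArtifact` call itself requires a databricks-connect session against a single-user cluster, so it is shown as usage only):

```python
import shutil
from pathlib import Path


def to_posix_zip_path(path: str) -> str:
    """spark.addArtifact needs a POSIX-style path string even on Windows:
    use forward slashes and drop the drive letter (e.g. C:)."""
    posix = path.replace("\\", "/")
    if len(posix) > 1 and posix[1] == ":":
        posix = posix[2:]
    return posix


def zip_package(pkg_dir: str, out_dir: str) -> str:
    """Zip a local package directory (e.g. ./src) so its modules can be
    imported inside pandas UDFs running on the cluster."""
    pkg = Path(pkg_dir)
    archive = shutil.make_archive(
        str(Path(out_dir) / pkg.name), "zip",
        root_dir=str(pkg.parent), base_dir=pkg.name,
    )
    return to_posix_zip_path(archive)


# On a single-user (non-shared) cluster:
# from databricks.connect import DatabricksSession
# spark = DatabricksSession.builder.getOrCreate()
# spark.addArtifact(zip_package("src", "."), pyfile=True)
```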

amitdatabricksc
by Databricks Partner
  • 13783 Views
  • 4 replies
  • 2 kudos

how to zip a dataframe

How do I zip a dataframe so that I get a zipped CSV output file? Please share the command. Only one dataframe is involved, not multiple.

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

Writing to a local directory does not work. See this topic: https://community.databricks.com/s/feed/0D53f00001M7hNlCAJ
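The thread doesn't show a full command; one hedged sketch (not from the thread): Spark has no `zip` output codec, only gzip/bzip2 via the `compression` option, so a true `.zip` has to be created after the CSV is written (to DBFS or a volume, not a local directory, per the reply above):

```python
import zipfile
from pathlib import Path


def zip_csv(csv_path: str) -> str:
    """Wrap a single CSV file in a .zip archive. Spark itself only offers
    gzip/bzip2 codecs via .option("compression", ...), not zip."""
    out = Path(csv_path).with_suffix(".zip")
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=Path(csv_path).name)
    return str(out)


# Writing a single CSV part file with Spark first (gzip shown as the
# built-in alternative; paths are placeholders):
# df.coalesce(1).write.option("header", True) \
#   .option("compression", "gzip").csv("dbfs:/tmp/out")
```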

harvey-c
by New Contributor III
  • 2200 Views
  • 0 replies
  • 0 kudos

How to manage data reload in DLT

Hi Community members, I had a situation where I needed to reload some data via a DLT pipeline. All data is stored in a landing storage account and has been loaded on a daily basis, for example from 1 Nov to 30 Nov. For some reason, I need to reload the data of 25/...
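The thread has no reply, but one hedged option is a targeted full refresh via the Pipelines API, which reprocesses only the selected tables rather than the whole pipeline (table name, host, token, and pipeline id below are placeholders):

```python
def full_refresh_payload(tables) -> dict:
    """Request body for POST /api/2.0/pipelines/{pipeline_id}/updates that
    triggers a full refresh of only the listed DLT tables."""
    return {"full_refresh_selection": list(tables)}


# Sending it (credentials and ids are hypothetical):
# import requests
# requests.post(f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
#               headers={"Authorization": f"Bearer {token}"},
#               json=full_refresh_payload(["silver_daily_loads"]))
```

Note this re-ingests everything currently in the landing path for those tables; for a single day's correction the source files for that day would need to be replaced first.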

AkifCakir
by New Contributor II
  • 30017 Views
  • 3 replies
  • 4 kudos

Resolved! Why Spark Save Modes , "overwrite" always drops table although "truncate" is true ?

Hi Dear Team, I am trying to import data from Databricks to Exasol DB. I am using the following code, with Spark version 3.0.1:

dfw.write \
  .format("jdbc") \
  .option("driver", exa_driver) \
  .option("url", exa_url) \
  .option("db...

Latest Reply
Gembo
New Contributor III
  • 4 kudos

@AkifCakir, were you able to find a way to truncate without dropping the table using the .write function? I am facing the same issue as well.
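For reference, a hedged sketch of the relevant writer options (connection values are placeholders): the JDBC `truncate` option is only honored in overwrite mode, and Spark still falls back to DROP/CREATE when the dialect doesn't support TRUNCATE TABLE or the written schema differs from the existing table, which matches the behavior described in this thread.

```python
def jdbc_truncate_overwrite_options(url: str, driver: str, table: str) -> dict:
    """Writer options for an overwrite that issues TRUNCATE TABLE instead of
    dropping the table. 'truncate' must be paired with mode("overwrite")."""
    return {
        "url": url,
        "driver": driver,
        "dbtable": table,
        "truncate": "true",  # string "true", not a Python bool
    }


# df.write.format("jdbc") \
#   .options(**jdbc_truncate_overwrite_options(exa_url, exa_driver, "S.TARGET")) \
#   .mode("overwrite").save()
```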

feed
by New Contributor III
  • 23286 Views
  • 4 replies
  • 2 kudos

OSError: No wkhtmltopdf executable found: "b''"

OSError: No wkhtmltopdf executable found: "b''"
If this file exists please check that this process can read it, or you can pass the path to it manually in the method call; check the README. Otherwise please install wkhtmltopdf - https://github.com/JazzCore/python-...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, when did you receive this error? Running code inside a notebook, on a cluster, or in a job? Also, please tag @Debayan in your next response, which will notify me. Thank you!
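The error means pdfkit can't find the wkhtmltopdf binary on PATH. A hedged sketch of the usual remedy (install paths are assumptions; `pdfkit.configuration` is the library's documented hook for pointing at the binary):

```python
import shutil


def find_wkhtmltopdf(fallback: str = "/usr/bin/wkhtmltopdf") -> str:
    """Locate the wkhtmltopdf executable; pdfkit raises the OSError above
    when it is missing from PATH and no explicit path is configured."""
    return shutil.which("wkhtmltopdf") or fallback


# On a Databricks cluster the binary usually has to be installed first, e.g.:
#   %sh sudo apt-get update && sudo apt-get install -y wkhtmltopdf
# then passed explicitly:
# import pdfkit
# config = pdfkit.configuration(wkhtmltopdf=find_wkhtmltopdf())
# pdfkit.from_string("<h1>hello</h1>", "/tmp/out.pdf", configuration=config)
```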

george_ognyanov
by New Contributor III
  • 7869 Views
  • 5 replies
  • 3 kudos

Resolved! Terraform Azure Databricks Unity Catalogue - Failed to check metastore quota limit for region

I am trying to create a metastore via the Terraform Azure databricks_metastore resource, but I keep getting the error shown in the screenshots. This is the exact code I am using to create the resource. I have tried using both my Databricks account and a service principal appli...

[Screenshots: the Terraform resource code and the quota-limit error]
Latest Reply
george_ognyanov
New Contributor III
  • 3 kudos

Hi @Retired_mod, as far as I understand, one region can have one metastore. I am able to create a metastore in the same region if I log into the Databricks GUI and do it there. Alternatively, if I already have a metastore created and try to execute the ...
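For comparison, a minimal sketch of the resource (all names are hypothetical; the provider alias must be authenticated at the account level). Since Azure allows one Unity Catalog metastore per region, the quota error typically means one already exists in that region under the account:

```hcl
resource "databricks_metastore" "this" {
  provider      = databricks.account
  name          = "primary-westeurope"
  region        = "westeurope"
  storage_root  = "abfss://metastore@mystorageaccount.dfs.core.windows.net/"
  force_destroy = true
}
```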

Nathant93
by New Contributor III
  • 3851 Views
  • 1 reply
  • 0 kudos

SQL Server OUTPUT clause alternative

After a merge or insert has happened, I am looking to get the records inserted in that batch by either method, much like the OUTPUT clause in SQL Server. Does anyone have any suggestions? The only thing I can think of is to add a time...

Latest Reply
Nathant93
New Contributor III
  • 0 kudos

I've managed to do it like this:

qry = spark.sql("DESCRIBE HISTORY <table_name> LIMIT 1").collect()
current_version = int(qry[0][0])
prev_version = current_version - 1

Then do an EXCEPT statement between the versions.
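The EXCEPT step above can be sketched as a Delta time-travel query (table name is a placeholder; note EXCEPT is set-based, so it won't distinguish rows that already existed as exact duplicates):

```python
def batch_inserts_query(table: str, current_version: int) -> str:
    """Rows present in the latest Delta table version but not the previous
    one, approximating SQL Server's OUTPUT ... INSERTED."""
    prev = current_version - 1
    return (
        f"SELECT * FROM {table} VERSION AS OF {current_version} "
        f"EXCEPT SELECT * FROM {table} VERSION AS OF {prev}"
    )


# With a live session:
# v = int(spark.sql("DESCRIBE HISTORY my_table LIMIT 1").collect()[0][0])
# new_rows = spark.sql(batch_inserts_query("my_table", v))
```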

KNYSJOA
by New Contributor
  • 5072 Views
  • 4 replies
  • 0 kudos

SDK Workspace client HTTP Connection Pool

Hello. Do you know how to solve an issue with the HTTPSConnectionPool when using the SDK WorkspaceClient in a notebook via a workflow? I would like to trigger a job when some conditions are met. These conditions are checked using Python. I am using the SDK to trigge...

Latest Reply
Dribka
New Contributor III
  • 0 kudos

It seems like the issue you're facing with the HTTPSConnectionPool in the SDK WorkspaceClient when using it within a workflow may be related to the environment variables or credentials not being propagated correctly. When running the notebook manuall...
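Building on that suggestion, one hedged sketch is to pass the workspace host and a token explicitly instead of relying on ambient notebook credentials (secret scope and job id below are hypothetical; `WorkspaceClient` and `jobs.run_now` are standard databricks-sdk calls):

```python
def sdk_client_kwargs(host: str, token: str) -> dict:
    """Explicit auth kwargs for databricks.sdk.WorkspaceClient. Inside a
    scheduled workflow the notebook's ambient credentials are not always
    propagated, which can surface as HTTPSConnectionPool errors."""
    if not host.startswith("https://"):
        host = f"https://{host}"
    return {"host": host, "token": token}


# In the notebook:
# from databricks.sdk import WorkspaceClient
# w = WorkspaceClient(**sdk_client_kwargs(
#     spark.conf.get("spark.databricks.workspaceUrl"),
#     dbutils.secrets.get("my-scope", "sdk-token")))
# w.jobs.run_now(job_id=123)
```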

Deexith
by New Contributor
  • 6906 Views
  • 3 replies
  • 0 kudos

Getting a StatusLogger error in logs, "unable to locate configured LoggerContextFactory", though I am able to connect with the Databricks DB and retrieve the data for a MuleSoft integration

ERROR StatusLogger Unable to locate configured LoggerContextFactory org.mule.runtime.module.launcher.log4j2.MuleLog4jContextFactory
ERROR StatusLogger Unable to load class org.apache.logging.log4j.core.config.xml.XmlConfigurationFactory
java.lang.Class...

Latest Reply
DataBricks1565
New Contributor II
  • 0 kudos

Hi @Uppala Deexith, any update on how you fixed this issue would be greatly appreciated.

CKBertrams
by New Contributor III
  • 2700 Views
  • 2 replies
  • 2 kudos

Resolved! Stream failure notifications

Hi all, I have a job running three consecutive streams; when just one of them fails I want to get notified. The notification only triggers when all tasks have failed or been skipped/canceled. Does anyone have a suggestion on how to implement this?

Latest Reply
deng_dev
New Contributor III
  • 2 kudos

Hi! You can add notifications directly on tasks.
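Concretely, a hedged sketch of the Jobs API 2.1 job settings (names and addresses are hypothetical): each task can carry its own `email_notifications`, so a single failing stream alerts immediately instead of waiting for every task to fail or be skipped.

```json
{
  "name": "three-streams",
  "tasks": [
    {
      "task_key": "stream_a",
      "notebook_task": { "notebook_path": "/Repos/streaming/stream_a" },
      "email_notifications": { "on_failure": ["oncall@example.com"] }
    }
  ]
}
```

The same setting is exposed per task in the Workflows UI under Notifications.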

Kayla
by Valued Contributor II
  • 3241 Views
  • 1 reply
  • 0 kudos

Clusters Suddenly Failing - java.lang.RuntimeException: abort: DriverClient destroyed

Clusters that we've been using without issue for weeks are now randomly failing. We're able to run a handful of cells and then get an error, "java.lang.RuntimeException: abort: DriverClient destroyed". Has anyone run into this before? Edit: I ...

nag_kanchan
by New Contributor III
  • 1432 Views
  • 0 replies
  • 0 kudos

Applying SCD in DLT using 3 different tables at source

My organization has recently started using Delta Live Tables in Databricks for data modeling. One of the dimensions I am trying to model takes data from 3 existing tables in the data lake and needs to be a slowly changing dimension (SCD Type 1). This a...
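One hedged sketch of the usual DLT pattern (all table and column names are hypothetical): `dlt.apply_changes` takes a single source, so the three lake tables would first be joined into one view, then applied as SCD Type 1.

```python
def scd1_kwargs(target: str, source: str, keys, sequence_col: str) -> dict:
    """Arguments for dlt.apply_changes maintaining an SCD Type 1 dimension
    from one pre-joined source view."""
    return {
        "target": target,
        "source": source,
        "keys": list(keys),
        "sequence_by": sequence_col,
        "stored_as_scd_type": 1,
    }


# Inside the DLT pipeline:
# import dlt
# @dlt.view
# def dim_customer_source():
#     t1 = spark.read.table("lake.customers")
#     t2 = spark.read.table("lake.addresses")
#     t3 = spark.read.table("lake.segments")
#     return t1.join(t2, "customer_id").join(t3, "customer_id")
#
# dlt.create_streaming_table("dim_customer")
# dlt.apply_changes(**scd1_kwargs("dim_customer", "dim_customer_source",
#                                 ["customer_id"], "updated_at"))
```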

Magnus
by Contributor
  • 5200 Views
  • 1 replies
  • 1 kudos

FIELD_NOT_FOUND when selecting field not part of original schema

Hi, I'm implementing a DLT pipeline using Auto Loader to ingest JSON files. The JSON files contain an array called Items whose records include two fields that weren't part of the original schema but have been added later. Auto Loa...

Data Engineering
Auto Loader
Delta Live Tables
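A hedged sketch of the Auto Loader options that usually address this (the nested field names and paths are hypothetical): `schemaEvolutionMode` set to `addNewColumns` lets later-added fields flow through, and `schemaHints` can type the nested Items fields up front instead of relying on the originally inferred schema.

```python
def autoloader_json_options(schema_location: str) -> dict:
    """Auto Loader options so fields added after the initial schema
    inference are picked up rather than failing with FIELD_NOT_FOUND."""
    return {
        "cloudFiles.format": "json",
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
        "cloudFiles.schemaHints": "Items array<struct<Id:string,Quantity:long>>",
    }


# spark.readStream.format("cloudFiles") \
#     .options(**autoloader_json_options("dbfs:/pipelines/orders/schema")) \
#     .load("dbfs:/landing/orders")
```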