Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Salman1
by New Contributor
  • 1094 Views
  • 0 replies
  • 0 kudos

Cannot find UDF on subsequent job runs on same cluster.

Hello, I am trying to run jobs with a JAR task type using Databricks on AWS on an all-purpose cluster. The issue I'm facing is that the job completes the first run successfully, but any subsequent run fails. I have to restart my cluste...

chari
by Contributor
  • 3606 Views
  • 2 replies
  • 0 kudos

Fatal error when writing a big pandas DataFrame

Hello DB community, I was trying to write a pandas DataFrame containing 100,000 rows as Excel. Moments into the execution I received a fatal error: "Python kernel is unresponsive." However, I am constrained from increasing the number of clusters or other...
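
A minimal sketch of one common workaround, assuming the xlsxwriter package is available on the cluster; the file path and column names below are made up. xlsxwriter's constant_memory mode flushes rows to disk as they are written instead of holding the whole workbook in driver memory:

    import pandas as pd

    # Hypothetical stand-in for the poster's 100,000-row DataFrame.
    df = pd.DataFrame({"id": range(100_000), "value": range(100_000)})

    # constant_memory flushes each row to disk as it is written, so the
    # workbook is never held in memory all at once.
    with pd.ExcelWriter(
        "/tmp/output.xlsx",
        engine="xlsxwriter",
        engine_kwargs={"options": {"constant_memory": True}},
    ) as writer:
        df.to_excel(writer, sheet_name="data", index=False)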

Data Engineering
Databricks
excel
python
Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @chari, thanks for bringing up your concerns, always happy to help! We understand that you are facing the following error while writing a pandas DataFrame containing 100,000 rows to Excel. As per the error >>> Fatal error: The Python kernel ...

1 More Replies
Yaacoub
by New Contributor
  • 9507 Views
  • 2 replies
  • 1 kudos

[UDF_MAX_COUNT_EXCEEDED] Exceeded query-wide UDF limit of 5 UDFs

In my project I defined a UDF:

    @udf(returnType=IntegerType())
    def ends_with_one(value, bit_position):
        if bit_position + len(value) < 0:
            return 0
        else:
            return int(value[bit_position] == '1')

    spark.udf.register("ends_with_one"...
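
Where the limit cannot be raised, one commonly suggested workaround is to express the check with built-in column functions so no Python UDF is involved at all. A minimal sketch under that assumption (the sample data is made up):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical data standing in for the poster's DataFrame.
    df = spark.createDataFrame([("0101",), ("1100",)], ["bits"])

    bit_position = -1  # negative index from the end, as in the original UDF

    # Column.substr accepts Column arguments, so the position can depend on
    # the string length; with no Python UDF, the UDF limit never applies.
    pos = F.length(F.col("bits")) + F.lit(bit_position + 1)
    df = df.withColumn(
        "ends_with_one",
        (F.col("bits").substr(pos, F.lit(1)) == "1").cast("int"),
    )
    df.show()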

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Hi @Yaacoub, Just a friendly follow-up. Have you had a chance to review my colleague's reply? Please inform us if it contributes to resolving your query.

1 More Replies
abelian-grape
by New Contributor III
  • 7758 Views
  • 4 replies
  • 0 kudos

Intermittent error: Databricks job kept running

Hi, I have the following error, but the job kept running. Is that normal?

    {
        "message": "The service at /api/2.0/jobs/runs/get?run_id=899157004942769 is temporarily unavailable. Please try again later. [TraceId: -]",
        "error_code": "TEMPORARILY_U...
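
TEMPORARILY_UNAVAILABLE is a transient service-side condition rather than a job failure, so a common client-side pattern is to retry the poll with exponential backoff. A rough sketch, with placeholder host and token:

    import time
    import requests

    HOST = "https://<workspace-url>"    # placeholder
    TOKEN = "<personal-access-token>"   # placeholder
    RUN_ID = 899157004942769

    def get_run_state(run_id, max_retries=5):
        """Poll /api/2.0/jobs/runs/get, backing off on transient errors."""
        for attempt in range(max_retries):
            resp = requests.get(
                f"{HOST}/api/2.0/jobs/runs/get",
                headers={"Authorization": f"Bearer {TOKEN}"},
                params={"run_id": run_id},
            )
            if resp.ok:
                return resp.json().get("state")
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
        resp.raise_for_status()

    print(get_run_state(RUN_ID))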

Latest Reply
abelian-grape
New Contributor III
  • 0 kudos

@Ayushi_Suthar also, whenever it happens the job status does not change to "failed"; it just keeps running. Is that normal?

3 More Replies
joao_vnb
by New Contributor III
  • 62964 Views
  • 7 replies
  • 11 kudos

Resolved! Automate the Databricks workflow deployment

Hi everyone, do you know if it's possible to automate the Databricks workflow deployment through Azure DevOps (like what we do with the deployment of notebooks)?
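
One hedged sketch of the idea (the replies below point to Databricks Asset Bundles and Brickflow as the more complete options): keep the job definition as JSON in the repo and push it through the Jobs API from a pipeline script step. Host, token, and file name are placeholders:

    import json
    import requests

    HOST = "https://<workspace-url>"   # placeholder
    TOKEN = "<pipeline-secret>"        # placeholder, e.g. a DevOps secret

    # Job definition versioned alongside the notebooks.
    with open("job_spec.json") as f:
        spec = json.load(f)

    resp = requests.post(
        f"{HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=spec,
    )
    resp.raise_for_status()
    print("Created job", resp.json()["job_id"])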

Latest Reply
asingamaneni
New Contributor II
  • 11 kudos

Did you get a chance to try Brickflows - https://github.com/Nike-Inc/brickflow You can find the documentation here - https://engineering.nike.com/brickflow/v0.11.2/ Brickflow uses Databricks Asset Bundles (DAB) under the hood but provides a Pythonic w...

6 More Replies
isaac_gritz
by Databricks Employee
  • 8272 Views
  • 1 reply
  • 2 kudos

Change Data Capture with Databricks

How to leverage Change Data Capture (CDC) from your databases to Databricks. Change Data Capture allows you to ingest and process only changed records from database systems to dramatically reduce data processing costs and enable real-time use cases suc...
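
For changes that land in Delta tables, one concrete way to consume them is Delta's Change Data Feed. A minimal sketch, assuming a Databricks notebook where `spark` is defined and a table with delta.enableChangeDataFeed = true; the table name and starting version are placeholders:

    # Read the row-level changes recorded since table version 1.
    changes = (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", 1)
        .table("my_catalog.my_schema.my_table")
    )
    # Each row carries _change_type and related change-metadata columns.
    changes.show()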

Latest Reply
prasad95
New Contributor III
  • 2 kudos

Hi @isaac_gritz, can you provide any reference resource for achieving AWS DynamoDB CDC to Delta tables? Thank you.

DatBoi
by Contributor
  • 6548 Views
  • 2 replies
  • 1 kudos

Resolved! What happens to table created with CTAS statement when data in source table has changed

Hey all - I am sure this has been documented / answered before but what happens to a table created with a CTAS statement when data in the source table has changed? Does the sink table reflect the changes? Or is the data stored when the table is defin...

Latest Reply
SergeRielau
Databricks Employee
  • 1 kudos

CREATE TABLE AS (CTAS) is a "one and done" kind of statement. The new table retains no memory of how it came to be, and is therefore oblivious to changes in the source. Views, as you say, are stored queries; no data is persisted, and therefore the query...
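
A short sketch of the behaviour described above (table and view names are made up): the CTAS copy is frozen at creation time, while the view re-runs its query on every read.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    spark.sql("CREATE TABLE source_tbl (id INT) USING delta")
    spark.sql("INSERT INTO source_tbl VALUES (1), (2)")

    # CTAS materializes the data once; no link back to the source remains.
    spark.sql("CREATE TABLE t_snapshot AS SELECT * FROM source_tbl")
    # A view is just a stored query, re-evaluated on every read.
    spark.sql("CREATE VIEW v_live AS SELECT * FROM source_tbl")

    spark.sql("INSERT INTO source_tbl VALUES (3)")

    spark.sql("SELECT COUNT(*) FROM t_snapshot").show()  # still 2
    spark.sql("SELECT COUNT(*) FROM v_live").show()      # now 3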

1 More Replies
Dhruv-22
by New Contributor III
  • 10756 Views
  • 4 replies
  • 1 kudos

Resolved! Managed table overwrites existing location for delta but not for oth

I am working on Azure Databricks, with Databricks Runtime version 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12). I am facing the following issue. Suppose I have a view named v1 and a database f1_processed created from the following comman...

Latest Reply
Red_blue_green
New Contributor III
  • 1 kudos

Hi, this is how the Delta format works. With overwrite you are not deleting the files in the folder or replacing them; Delta creates a new file with the overwritten schema and data. This way you are also able to return to former versions of the del...
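
A minimal sketch of that versioning behaviour (the table name is made up): each overwrite adds a new table version, and older versions remain readable via time travel.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    spark.range(5).write.format("delta").mode("overwrite").saveAsTable("demo_tbl")
    spark.range(10).write.format("delta").mode("overwrite").saveAsTable("demo_tbl")

    # The first write's files still exist; read them back with time travel.
    v0 = spark.read.option("versionAsOf", 0).table("demo_tbl")
    print(v0.count())  # 5, not 10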

3 More Replies
Accn
by New Contributor
  • 1293 Views
  • 1 reply
  • 0 kudos

Dashboard from Notebook - How to schedule

A notebook is created with insights, and I have created a dashboard (not a SQL dashboard) from it. I need to schedule this. I have tried scheduling via a workflow, but it only takes you to the notebook; even the schedule from the dashboard takes me to the notebook and not the dashbo...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Accn, thanks for bringing up your concerns, always happy to help! We understand your concern, but right now the only way to refresh a notebook dashboard is via scheduled jobs. To schedule a dashboard to refresh at a specified interval, click...

chrisf_sts
by New Contributor II
  • 8398 Views
  • 1 reply
  • 1 kudos

Resolved! After moving mounted s3 bucket under unity catalog control, python file paths no longer work

I have been using a mounted external S3 bucket with JSON files up until a few days ago, when my company changed to having all file mounts under the control of Unity Catalog. Suddenly I can no longer run a command like: with open("/mnt/my_files/my_json....

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @chrisf_sts, thanks for bringing up your concerns, always happy to help! May I know which cluster access mode you are using to run the notebook commands? Can you please try to run the below command in Single User cluster access mode? "with open(...
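
Separately from the access-mode question, the commonly suggested replacement for /mnt paths under Unity Catalog is a volume path; a sketch with a made-up path:

    # Unity Catalog volumes expose cloud storage under
    # /Volumes/<catalog>/<schema>/<volume>, which plain Python file APIs
    # can open directly (the path below is hypothetical).
    with open("/Volumes/my_catalog/my_schema/my_files/my_json.json") as f:
        data = f.read()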

brickster_2018
by Databricks Employee
  • 13196 Views
  • 3 replies
  • 6 kudos

Resolved! How to add custom logging in Databricks

I want to add custom logs that are redirected to the Spark driver logs. Can I use the existing logger classes to get my application logs or progress messages into the Spark driver logs?
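
A commonly used pattern for this from PySpark is to grab Spark's own Log4j logger through the JVM gateway, so messages land in the driver log alongside Spark's; a sketch (the logger name is arbitrary):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Reuse the JVM-side Log4j that Spark itself logs through.
    log4j = spark._jvm.org.apache.log4j
    logger = log4j.LogManager.getLogger("my_app")

    logger.info("custom progress message")   # shows up in the driver log
    logger.warn("something worth noticing")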

Latest Reply
Kaizen
Valued Contributor
  • 6 kudos

1) Is it possible to save all the custom logging to its own file? Currently it is being logged with all the other cluster logs (see image). 2) Also, it seems like Databricks is creating a lot of blank files for this. Is this a bug? This include...

2 More Replies
sha
by New Contributor
  • 1368 Views
  • 1 reply
  • 0 kudos

Importing data from S3 to Azure Databricks Cluster with Unity Catalog in Shared Mode

Environment details: Databricks on Azure, 13.3 LTS, Unity Catalog, Shared cluster mode. Currently in the environment I'm in, we run imports from S3 with code like: spark.read.option('inferSchema', 'true').json(s3_path). When running on a cluster in Sha...

Latest Reply
BR_DatabricksAI
Contributor
  • 0 kudos

Hello Sha, we usually get such errors while working with shared cluster mode. Assuming this is your dev environment, just to avoid the errors, please use a different cluster. However, as an alternative solution, in case you would like to keep the shared cluster, the...

Dhruv-22
by New Contributor III
  • 5592 Views
  • 4 replies
  • 0 kudos

CREATE TABLE does not overwrite location whereas CREATE OR REPLACE TABLE does

I am working on Azure Databricks, with Databricks Runtime version 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12). I am facing the following issue. Suppose I have a view named v1 and a database f1_processed created from the following comman...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 0 kudos

Hi @Dhruv-22, based on the information you shared above, the "CREATE OR REPLACE" and "CREATE" commands in Databricks do have different behaviours, particularly when it comes to handling tables with specific target locations. The "CREATE OR REPLACE"...
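
A sketch of the contrast under discussion, per this thread's findings (catalog, schema, and storage location are placeholders, and `spark` is assumed to be a notebook session): CREATE TABLE fails when the target location already holds data, while CREATE OR REPLACE TABLE replaces the table definition and its data.

    # Fails if data already exists at the target location:
    spark.sql("""
        CREATE TABLE f1_processed.circuits
        USING delta
        LOCATION 'abfss://processed@<storage-account>.dfs.core.windows.net/circuits'
        AS SELECT * FROM v1
    """)

    # Replaces the existing table definition and data in place:
    spark.sql("""
        CREATE OR REPLACE TABLE f1_processed.circuits
        USING delta
        LOCATION 'abfss://processed@<storage-account>.dfs.core.windows.net/circuits'
        AS SELECT * FROM v1
    """)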

3 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group