Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Oliver_Angelil
by Valued Contributor II
  • 7364 Views
  • 8 replies
  • 1 kudos

How to use the git CLI in Databricks?

After making some changes in my feature branch, I have committed and pushed (to Azure DevOps) some work (note I have not yet raised a PR or merged to any other branch). Many of the files I committed are data files, and so I would like to reverse the co...

Latest Reply
Kayla
Valued Contributor
  • 1 kudos

I'm also curious about this question - does anyone have an answer? Being able to use the full repertoire of git commands inside Databricks would be quite useful.

7 More Replies
samur
by New Contributor II
  • 1434 Views
  • 2 replies
  • 1 kudos

DBR 14.1 - foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog.

I am getting this error on DBR 14.1: AnalysisException: [UC_COMMAND_NOT_SUPPORTED.WITHOUT_RECOMMENDATION] The command(s): foreachBatch in Spark Connect Shared Clusters are not supported in Unity Catalog. This is the code: wstream = df.writeStream.foreac...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @samur, The error message you’re encountering in DBR 14.1 indicates that the command foreachBatch within Spark Connect Shared Clusters is not supported in the Unity Catalog. This limitation is specific to the version you’re using. If you need to w...
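For context, here is a minimal sketch of the pattern that triggers this error; the table and checkpoint names are hypothetical, and df stands for the streaming DataFrame from the original post. Single-user clusters, or newer runtimes where this restriction is relaxed, are the usual ways around it.

    # Minimal sketch of the failing pattern (table/checkpoint names hypothetical).
    def upsert_batch(batch_df, batch_id):
        # This per-micro-batch callback is what DBR 14.1 Spark Connect
        # shared clusters reject under Unity Catalog.
        batch_df.write.mode("append").saveAsTable("main.default.target")

    wstream = (
        df.writeStream
          .foreachBatch(upsert_batch)
          .option("checkpointLocation", "/Volumes/main/default/checkpoints/demo")
          .start()
    )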

1 More Reply
Iam
by New Contributor II
  • 1282 Views
  • 2 replies
  • 0 kudos

CANNOT_RENAME_ACROSS_SCHEMA message error

Hello... We enabled Unity Catalog and we are migrating schemas. When I ran the command SYNC SCHEMA catalog01.schema01 FROM hive_metastore.schema01 DRY RUN, I got the error CANNOT_RENAME_ACROSS_CATALOG; reviewing your documentation, it only said CANNO...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @YoSoy, The error CANNOT_RENAME_ACROSS_CATALOG occurs when you try to rename a table or schema across catalogs, which is not allowed. This is because renaming would involve moving all the files from one location to another, which is not supported ...
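Since renaming across catalogs is ruled out, one alternative worth considering (an assumption, not a fix confirmed in this thread) is to copy each managed Delta table into the target catalog instead, for example with a deep clone. The catalog and schema names below follow the original post; the table name is hypothetical.

    # Copy rather than rename: clone a table from the Hive metastore schema
    # into the Unity Catalog schema (table name hypothetical).
    spark.sql("""
        CREATE TABLE IF NOT EXISTS catalog01.schema01.my_table
        DEEP CLONE hive_metastore.schema01.my_table
    """)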

1 More Reply
PetitLepton
by New Contributor II
  • 6815 Views
  • 1 reply
  • 0 kudos

List parameter in Python SQL connector 3.0.1

Hi, until recently, with version 2.9.3 of the Python SQL connector, I was using a list as a parameter in the cursor.execute(operation, parameters) method without any trouble. It seems that this is no longer possible in version 3.0.1, as the parsing of par...

Latest Reply
PetitLepton
New Contributor II
  • 0 kudos

I should have read the documentation more carefully: https://github.com/databricks/databricks-sql-python/blob/v3.0.0/docs/parameters.md
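For later readers: the 3.x connector's native parameters no longer accept a Python list as a single bound value, so one approach, sketched here under that assumption and with placeholder connection details, is to expand the list into individual named markers.

    from databricks import sql

    values = [1, 2, 3]  # the list that was passed straight through in 2.9.3
    markers = ", ".join(f":v{i}" for i in range(len(values)))  # ":v0, :v1, :v2"
    params = {f"v{i}": v for i, v in enumerate(values)}

    with sql.connect(server_hostname="...", http_path="...", access_token="...") as conn:
        with conn.cursor() as cursor:
            cursor.execute(f"SELECT * FROM my_table WHERE id IN ({markers})", params)
            rows = cursor.fetchall()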

dcardenas
by New Contributor
  • 419 Views
  • 0 replies
  • 0 kudos

Retrieving Logs with Job API Get-outputs service

Hello, I would like to retrieve the logs of some jobs that were launched using the Jobs REST API 2.0. I see in the docs that this can be done with the get-output service; however, each time I call the service, I just get the metadata part of the response but ...
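In case it helps later readers, a sketch of calling that endpoint follows (host, token, and run_id are placeholders). Note that for notebook tasks the output field is only populated when the notebook calls dbutils.notebook.exit(...), and the run_id must be a task run rather than the parent multi-task run, either of which can make the response look like metadata only.

    import requests

    resp = requests.get(
        "https://<workspace-host>/api/2.0/jobs/runs/get-output",
        headers={"Authorization": "Bearer <personal-access-token>"},
        params={"run_id": 12345},  # a task run id, not the parent job run id
    )
    # Empty unless the notebook called dbutils.notebook.exit(...).
    print(resp.json().get("notebook_output"))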

ken2
by New Contributor II
  • 1253 Views
  • 3 replies
  • 0 kudos

How to convert entity_id to notebook name or job

Hi, Databricks developers! I use system.access.table_lineage, referring to this page. It's difficult for us to recognize which notebook is indicated by the entity_id. How do I get a table to convert entity_ids to job names or notebook names?

Latest Reply
mlamairesse
New Contributor II
  • 0 kudos

Workflows system tables are coming very soon. 
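Once those land, the mapping should be a simple join; a sketch follows, assuming a jobs system table such as system.lakeflow.jobs with job_id and name columns (the jobs table and column names are assumptions here).

    # Map lineage entity_ids to job names (jobs table/columns assumed).
    job_lineage = spark.sql("""
        SELECT l.entity_id, j.name AS job_name, l.target_table_full_name
        FROM system.access.table_lineage AS l
        JOIN system.lakeflow.jobs AS j
          ON l.entity_id = j.job_id
        WHERE l.entity_type = 'JOB'
    """)
    job_lineage.show(truncate=False)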

2 More Replies
JordanYaker
by Contributor
  • 2693 Views
  • 2 replies
  • 0 kudos

Batch Doesn't Exist Failure

I have a job that's been working perfectly fine since I deployed it earlier this month. Last night, however, one of the tasks within the job started failing with the following error: java.lang.IllegalStateException: batch 4 doesn't exist at org.apac...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @JordanYaker, The error message java.lang.IllegalStateException: batch 4 doesn't exist is thrown when Apache Spark™’s Structured Streaming job tries to access a batch that doesn’t exist in the metadata. This can happen for various reasons, such as...
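One common recovery path, offered here as an assumption rather than the thread's confirmed fix, is to restart the stream from a fresh checkpoint location so Structured Streaming rebuilds its offset and commit metadata, accepting that the source may be reprocessed. Paths and names below are illustrative.

    # Restart the failing stream with a new checkpoint so the batch
    # metadata is rebuilt from scratch (expect source reprocessing).
    (
        df.writeStream
          .option("checkpointLocation", "/Volumes/main/default/checkpoints/job_v2")
          .toTable("main.default.target")
    )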

1 More Reply
leelee3000
by New Contributor III
  • 694 Views
  • 1 reply
  • 0 kudos

Dynamic Filtering Criteria for Data Streaming

One of the potential uses for DLT is a scenario where I have a large input stream of data and need to create multiple smaller streams based on dynamic and adjustable filtering criteria. The challenge is to allow non-engineering individuals to adjust ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @leelee3000, I can provide a high-level approach to create a Spark DataFrame for streaming reads using Avro schemas from the Kafka schema registry. Here's a general approach: Retrieve the Avro schema: You can retrieve the Avro schema from the ...
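A sketch of that approach follows, assuming a Confluent-style registry (the URL, topic, and broker are placeholders) and the standard Confluent wire format, which prefixes each Kafka message with a 5-byte header.

    from confluent_kafka.schema_registry import SchemaRegistryClient
    from pyspark.sql.avro.functions import from_avro
    from pyspark.sql.functions import expr

    # 1) Retrieve the Avro schema for the topic's value subject.
    registry = SchemaRegistryClient({"url": "https://registry.example.com"})
    avro_schema = registry.get_latest_version("events-value").schema.schema_str

    # 2) Stream from Kafka and decode, skipping the 5-byte Confluent header.
    events = (
        spark.readStream.format("kafka")
             .option("kafka.bootstrap.servers", "broker.example.com:9092")
             .option("subscribe", "events")
             .load()
             .select(from_avro(expr("substring(value, 6, length(value) - 5)"),
                               avro_schema).alias("event"))
    )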

leelee3000
by New Contributor III
  • 1238 Views
  • 1 reply
  • 0 kudos

Parameterizing DLT Jobs

I have observed the use of advanced configuration and creating a map as a way to parameterize notebooks, but these appear to be cluster-wide settings. Is there a recommended best practice for directly passing parameters to notebooks running on a DLT ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @leelee3000, In Databricks workflows, you can pass parameters to tasks that reference notebooks. For example, you can use the dbutils.jobs.taskValues.set function to register a parameter in the first task and then reference it in subsequent tasks....
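A sketch of that handoff follows; the task name and key are hypothetical. For values used inside a DLT pipeline itself, the pipeline's configuration map (read via spark.conf.get) is the usual alternative.

    # In the upstream task's notebook:
    dbutils.jobs.taskValues.set(key="run_date", value="2024-01-01")

    # In a downstream task's notebook (taskKey is the upstream task's name):
    run_date = dbutils.jobs.taskValues.get(
        taskKey="setup_task", key="run_date", default="1970-01-01"
    )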

Geoff
by New Contributor II
  • 1111 Views
  • 1 reply
  • 1 kudos

Bizarre Delta Tables pipeline error: ModuleNotFound

I received the following error when trying to import a function defined in a .py file into a .ipynb file. I would add code blocks, but the message keeps getting rejected for invalid HTML. # test_lib.py (same directory, in a subfolder) def square(x): ret...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Geoff, The error message ModuleNotFoundError: No module named 'test_lib' indicates that Python cannot find the module test_lib. This could be due to several reasons: File Location: The Python file test_lib.py needs to be in the same directory a...
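When the module lives in a subfolder, the usual fix is to make that folder importable before the import runs; a sketch follows, with the folder name hypothetical and square taken from the original post.

    import os
    import sys

    # Make the subfolder importable, then import normally.
    sys.path.append(os.path.abspath("./subfolder"))

    from test_lib import square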

cg3
by New Contributor
  • 348 Views
  • 0 replies
  • 0 kudos

Define VIEW in Databricks Asset Bundles?

Is it possible to define a Unity Catalog VIEW in a Databricks Asset Bundle, or specify in the bundle that a specific notebook gets run once per deployment?

erigaud
by Honored Contributor
  • 4931 Views
  • 1 reply
  • 1 kudos

Resolved! Dynamically specify pivot column in SQL

Hello everyone! I am looking for a way to dynamically specify pivot columns in a SQL query, so it can be used in a view. However, we don't want to hard-code the values that need to become columns, and would rather extract them from another table. I've se...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @erigaud, In Databricks SQL, you can’t use a dynamic list of columns directly in the PIVOT clause. However, there is a workaround using DataFrames in PySpark. This approach allows you to pivot on the mapping column dynamically. The distinct ...
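A sketch of that workaround follows, with hypothetical table and column names: collect the distinct values from the mapping table, then pass them to pivot so nothing is hard-coded.

    # Pull the pivot values from another table instead of hard-coding them.
    pivot_values = [
        r["category"]
        for r in spark.table("mapping_table").select("category").distinct().collect()
    ]

    result = (
        spark.table("facts")
             .groupBy("id")
             .pivot("category", pivot_values)  # dynamic column list
             .agg({"value": "first"})
    )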

stef2
by New Contributor III
  • 7891 Views
  • 13 replies
  • 5 kudos

Resolved! 2023-03-22 10:29:23 | Error 403 | https://customer-academy.databricks.com/

I would like to know why I am getting this error when trying to earn badges for Lakehouse Fundamentals. I can't access the quiz page. Can you please help with this?

Latest Reply
dkn_data
New Contributor II
  • 5 kudos

Log in with your Gmail account at customer-academy.databricks.com, search for the Lakehouse short course, and enroll for free.

12 More Replies
Kishan1003
by New Contributor
  • 2229 Views
  • 2 replies
  • 0 kudos

Merge Operation is very slow for S/4 Table ACDOCA

Hello, we have a scenario in Databricks where every day we get 60-70 million records, and it takes a lot of time to merge the data into the 28 billion records already sitting there. The time taken to rewrite the affected files is too ...

Latest Reply
177991
New Contributor II
  • 0 kudos

Hi @Kishan1003, did you find something helpful? I'm dealing with a similar situation: the ACDOCA table on my side is around 300M records (fairly smaller), and the incoming daily data is usually around 1M. I have tried partitioning by period, like the fiscyearper column, zo...
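For later readers, one commonly cited mitigation, stated here as an assumption rather than a fix confirmed in this thread, is to narrow the MERGE condition to the fiscal periods present in the incoming batch so Delta can skip untouched files. The fiscyearper column comes from the reply above; the other names are hypothetical.

    # updates = the day's incoming records as a DataFrame.
    updates.createOrReplaceTempView("updates")

    periods = [r["fiscyearper"] for r in updates.select("fiscyearper").distinct().collect()]
    period_list = ", ".join(f"'{p}'" for p in periods)

    spark.sql(f"""
        MERGE INTO acdoca AS t
        USING updates AS s
          ON t.doc_key = s.doc_key
         AND t.fiscyearper IN ({period_list})
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)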

1 More Reply
costi9992
by New Contributor III
  • 3206 Views
  • 6 replies
  • 0 kudos

Resolved! Add policy init_scripts.*.volumes.destination for DLT not working

Hi, I tried to create a policy to use for DLTs that are run with shared clusters, but when I run the DLT with this policy I get an error. The init script is added to Allowed JARs/Init Scripts. DLT events error: Cluster scoped init script /Volumes/main/...

Latest Reply
ayush007
New Contributor II
  • 0 kudos

@costi9992 I am facing the same issue with a UC-enabled cluster on Databricks Runtime 13.3. I have uploaded the init shell script to a Volume, with that particular init script allowed by the metastore admin. But I get the same error as you stated. When I looked in clus...

5 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group