Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

NSJ
by New Contributor II
  • 5827 Views
  • 3 replies
  • 2 kudos

Setup learning environment failed: Configuration dbacademy.library.version is not available.

Using "1.3 Getting Started with the Databricks Platform Lab" for self-learning. When I run DE 2.1 to set up the environment, I get the following error: Configuration dbacademy.library.version is not available. Following is the code in the common setup. specified_ve...

Latest Reply
Luipiu
New Contributor III
  • 2 kudos

Hi, I resolved it by adding some instructions to the _common notebook, which you can find inside the Includes folder. Put these at the beginning: %pip install git+https://github.com/databricks-academy/dbacademy@v3.0.70 and then %python dbutils.library.restartPython(). After this...
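
For reference, a minimal sketch of the fix described above, as two cells at the top of the _common notebook (the v3.0.70 tag is the one quoted in the reply; other lab versions may pin a different tag):

    # Cell 1: install the dbacademy helper library from GitHub at a pinned tag
    %pip install git+https://github.com/databricks-academy/dbacademy@v3.0.70

    # Cell 2: restart the Python interpreter so the fresh install is picked up
    dbutils.library.restartPython()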

2 More Replies
SanSam
by New Contributor
  • 595 Views
  • 1 reply
  • 0 kudos

Geometry Point and WKB based on latitude and longitude

Hi, What is the best method to generate a Geometry Point and WKB based on latitude and longitude stored in a Databricks table? Thanks, Sam

Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

Hi, Spark has functions for working with geospatial data, for instance ST_GeomFromWKB, which you can use to convert WKB into a human-readable form. You can also create UDFs if something is missing. In my project I stored latitude and longitude as separate columns.
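
A minimal sketch of the UDF route mentioned above, assuming the shapely package is installed and a hypothetical table points with latitude and longitude columns:

    from shapely.geometry import Point
    from pyspark.sql.functions import udf
    from pyspark.sql.types import BinaryType

    @udf(returnType=BinaryType())
    def to_wkb(lon, lat):
        # WKB conventions put X (longitude) first, then Y (latitude)
        if lon is None or lat is None:
            return None
        return Point(lon, lat).wkb

    df = spark.table("points").withColumn("geom_wkb", to_wkb("longitude", "latitude"))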

palak_agarwala
by New Contributor
  • 670 Views
  • 1 reply
  • 0 kudos

Rename columns in Delta Live Tables

I want to explore the option of renaming a column in the SILVER layer of a DLT pipeline. Requesting suggestions. 

Latest Reply
MariuszK
Valued Contributor III
  • 0 kudos

A full reload will rename the column if the change is caused by a column rename in a source file.
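
One hedged way to do the rename in the silver layer of a Python DLT pipeline; bronze_orders, old_name, and new_name below are hypothetical:

    import dlt

    @dlt.table(name="silver_orders")
    def silver_orders():
        # Rename in the silver projection; the bronze table keeps its schema
        return dlt.read_stream("bronze_orders").withColumnRenamed("old_name", "new_name")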

MariuszK
by Valued Contributor III
  • 3139 Views
  • 2 replies
  • 0 kudos

Changes to deletion behavior of Materialized Views and Streaming Tables defined by Delta Live Tables

Hi, Some time ago I got a message that there will be a change (starting 01/31/2025) in the "deletion behavior of Materialized Views and Streaming Tables defined by Delta Live Tables", but when I remove a DLT pipeline, it also removes the related tables, will...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @MariuszK, users will need to explicitly call DROP MATERIALIZED VIEW to delete MVs and DROP TABLE to delete STs when deleting DLT pipelines. https://home.databricks.com/account-alert-deletion-behavior-change-for-materialized-view-and-streamin...
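
A minimal sketch of that explicit cleanup, with hypothetical three-level names, to run after deleting the pipeline:

    # Materialized views and streaming tables now outlive the DLT pipeline
    spark.sql("DROP MATERIALIZED VIEW IF EXISTS my_catalog.my_schema.my_mv")
    spark.sql("DROP TABLE IF EXISTS my_catalog.my_schema.my_st")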

1 More Replies
muir
by New Contributor II
  • 1278 Views
  • 3 replies
  • 2 kudos

Resolved! Instance Pool Usage

We have instance pools set up with a maximum capacity and are looking at ways to monitor usage to help with our capacity planning. I have been using the system tables to track how many nodes are being used within a pool at a point in time, but it ap...

Latest Reply
TuckerGage
New Contributor II
  • 2 kudos

I am also using it and it's working properly.
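
A hedged sketch of the system-tables approach from the question, assuming the system.compute.node_timeline table; the exact schema may differ in your workspace, and mapping clusters back to a specific pool is left as an assumption:

    # Count distinct nodes per cluster per minute over the last 7 days
    nodes = spark.sql("""
        SELECT cluster_id,
               date_trunc('minute', start_time) AS minute,
               COUNT(DISTINCT instance_id)      AS node_count
        FROM system.compute.node_timeline
        WHERE start_time >= current_date() - INTERVAL 7 DAYS
        GROUP BY cluster_id, date_trunc('minute', start_time)
    """)
    display(nodes)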

2 More Replies
Ruby8376
by Valued Contributor
  • 1309 Views
  • 1 reply
  • 2 kudos

Tableau analytics integration with Databricks Delta Lake

Hi there! Currently, we are exploring options for reporting on Salesforce. We extract data from Salesforce via Databricks and store it in Delta Lake. Is there a connector by which data can be pulled from Databricks into Tableau/CRM Analytics? I know ...

Latest Reply
emillion25
New Contributor III
  • 2 kudos

Hello @ruby, were you able to resolve this? I know it's been a while, but I believe we now have multiple ways to connect Tableau and Databricks. 1. Use the native Databricks connector for Tableau: Tableau has a built-in Databricks connector, making it ea...

tonykun_sg
by New Contributor II
  • 1619 Views
  • 5 replies
  • 0 kudos

Delta sharing for an external table to external users who have no access to external storage?

We used Delta Sharing (authentication type: token) to generate the config.share file and shared it with external users not from our organisation. The users faced a "FileNotFoundError" while using the Python "delta_sharing.load_as_pandas" method to re...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @tonykun_sg, It looks like ADLS Gen2 might be restricting access to the data through an ACL, which is why Databricks allows access but the underlying files remain protected. Could you check with your team to temporarily enable access for testing...
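
For context, a minimal sketch of the reader side described in the question, assuming the delta-sharing Python package and a hypothetical share/schema/table name:

    import delta_sharing

    profile = "/path/to/config.share"  # token-based profile shared with the external user
    table_url = profile + "#my_share.my_schema.my_table"

    # The sharing server authorizes the request, but the presigned file URLs it
    # returns can still be blocked by storage-level ACLs on ADLS Gen2, which is
    # one way to end up with a FileNotFoundError here.
    df = delta_sharing.load_as_pandas(table_url)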

4 More Replies
ggsmith
by Contributor
  • 3084 Views
  • 8 replies
  • 6 kudos

Resolved! Workflow SQL Task Query Showing Empty

I am trying to create a SQL task in Workflows. I have my query, which executes successfully in the SQL editor, and it is saved in a repo. However, when I try to execute the task, the below error shows: Query text can not be empty: BAD_REQUEST: Query tex...

Latest Reply
ggsmith
Contributor
  • 6 kudos

It ended up being that the query wasn't actually saved. Once I manually clicked save, the query preview showed and the task ran successfully. I'm really surprised that was the reason. I had moved the query around to different folders and closed and r...

7 More Replies
nguyenthuymo
by New Contributor III
  • 876 Views
  • 2 replies
  • 0 kudos

My query works with an all-purpose cluster but returns NULL with a SQL warehouse

Hi, (1) On a SQL warehouse, I created a table in Unity Catalog from the data source file vw_businessmetrics_1000.json in ADLS blob storage: USE CATALOG `upreport`; USE SCHEMA `test_genie`; -- Create the external table from the JSON file CREATE EXTERNAL TABLE IF NOT EXI...

Latest Reply
nguyenthuymo
New Contributor III
  • 0 kudos

Hi @Ayushi_Suthar, thank you very much. I tried with Classic and Pro and it did not work. My solution was to drop the table and recreate it as a Delta table, then load the data from JSON into the Delta table. Now it works. Probably, the SQL warehouse only su...
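
A minimal sketch of that fix, reusing the catalog and schema from the question but with a hypothetical table name and storage path:

    # Drop the external JSON-backed table
    spark.sql("DROP TABLE IF EXISTS upreport.test_genie.businessmetrics")

    # Recreate it as a managed Delta table by loading the JSON through Spark
    raw = spark.read.json("abfss://<container>@<account>.dfs.core.windows.net/vw_businessmetrics_1000.json")
    raw.write.format("delta").saveAsTable("upreport.test_genie.businessmetrics")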

1 More Replies
ankitmit
by New Contributor III
  • 1827 Views
  • 5 replies
  • 0 kudos

How to specify a path while creating tables using DLT

Hi All, I am trying to create a table using DLT and would like to specify the path where all the files should reside. I am trying something like this: dlt.create_streaming_table( name="test", schema="""product_id STRING NOT NULL PRIMARY KEY, ...

Data Engineering
Databricks
dlt
Unity Catalog
Latest Reply
joma
New Contributor II
  • 0 kudos

I have the same issue; I don't like saving under a random name inside __unitystorage. java.lang.IllegalArgumentException: Cannot specify an explicit path for a table when using Unity Catalog. Remove the explicit path:
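
A hedged sketch of what the error is asking for: drop the path argument and let Unity Catalog place the files under the schema's managed location (which you can control with MANAGED LOCATION on the catalog or schema). Names below are hypothetical:

    import dlt

    dlt.create_streaming_table(
        name="test",
        schema="product_id STRING NOT NULL PRIMARY KEY",
        # no path= argument: UC-managed tables live under the catalog/schema location
    )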

4 More Replies
Sunflower7500
by New Contributor II
  • 3974 Views
  • 4 replies
  • 2 kudos

Databricks PySpark error: OutOfMemoryError: GC overhead limit exceeded

I have a Databricks PySpark query that has been running fine for the last two weeks, but I am now getting the following error despite no changes to the query: OutOfMemoryError: GC overhead limit exceeded. I have done some research on possible solutions a...

Latest Reply
loic
Contributor
  • 2 kudos

When you say "I have a Databricks PySpark query that has been running fine for the last two weeks but am now getting the following error despite no changes to the query: OutOfMemoryError: GC overhead limit exceeded", can you tell us how you execut...

3 More Replies
g96g
by New Contributor III
  • 890 Views
  • 2 replies
  • 0 kudos

Streaming with Medallion Architecture and Star Schema: Help

What are the best practices for implementing non-stop streaming in a Medallion Architecture with a Star Schema? Use case: We have operational data and need to enable near real-time reporting in Power BI, with a maximum latency of 3 minutes. No Delta li...

Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

@g96g I've set up a near real-time (30-minute latency) streaming solution that ingests data from SQL Server into Delta Lake. Changes in the source SQL Server tables are captured using Change Data Capture (CDC) and written to CSV files in a data lake. A ...
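
A hedged sketch of the CSV-landing step described above using Databricks Auto Loader; all paths and table names are hypothetical:

    (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "csv")
          .option("cloudFiles.schemaLocation", "/Volumes/main/cdc/_schemas")
          .option("header", "true")
          .load("/Volumes/main/cdc/landing")   # CDC extracts land here as CSV
          .writeStream
          .option("checkpointLocation", "/Volumes/main/cdc/_checkpoints")
          .trigger(availableNow=True)          # or a processingTime trigger for continuous runs
          .toTable("main.cdc.bronze_orders"))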

1 More Replies
ac567
by New Contributor III
  • 2473 Views
  • 3 replies
  • 0 kudos

Resolved! com.databricks.backend.common.rpc.DriverStoppedException

com.databricks.backend.common.rpc.DriverStoppedException: Driver down cause: driver state change (exit code: 143). Facing this cluster issue while I deploy and run my workflow through an asset bundle. I have tried everything to update in the Spark configurati...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Awesome, good to hear!

2 More Replies
Kayla
by Valued Contributor II
  • 3496 Views
  • 14 replies
  • 6 kudos

New error: middleware.base:exception while intercepting server message

We started getting a very weird error at random from Databricks. This is from cells that routinely work, and after it happens once, it will happen on every cell. It appears to include the full text of a .py file we're importing, which I've had to remo...

Latest Reply
Kayla
Valued Contributor II
  • 6 kudos

@TKr "Hey everybody - sorry that you experienced these issues. We identified the issue and reverted the feature causing it. Things should be back to normal already." I'm glad to hear that. Are you a Databricks employee? Referring to your question, we did...

13 More Replies
cdn_yyz_yul
by New Contributor III
  • 1695 Views
  • 3 replies
  • 4 kudos

Resolved! Should data in Raw/Bronze be in the Catalog?

Hello, What are the benefits of not "registering" raw data into Unity Catalog when the data in Raw will be in its original format, such as .csv, .json, .parquet, etc.? An example scenario could be: Data arrives at Landing as .zip; the zip will be verifie...

Latest Reply
cdn_yyz_yul
New Contributor III
  • 4 kudos

Thanks @Rjdudley, I meant to say the scenario is: data arrives at Landing as .zip; the zip will be verified for correctness and then unzipped; the extracted files will be saved to Raw as-is, in a pre-defined folder structure. Unity Catalog will not...
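
As a hedged middle ground for this scenario, the Raw folder can be governed without table registration by exposing it as a Unity Catalog external volume; catalog, schema, volume, and storage path names below are hypothetical:

    spark.sql("""
        CREATE EXTERNAL VOLUME IF NOT EXISTS main.landing.raw_files
        LOCATION 'abfss://raw@<account>.dfs.core.windows.net/raw'
    """)

    # Files are then addressable with governed /Volumes paths:
    files = dbutils.fs.ls("/Volumes/main/landing/raw_files")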

2 More Replies
