Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

bohemiaRDX
by New Contributor II
  • 208 Views
  • 1 reply
  • 3 kudos

Resolved! Not able to read data from Delta External table in catalog

spark.conf.set(
    "fs.azure.account.key.sa02flexflowinpp01prod.dfs.core.windows.net",
    dbutils.secrets.get(scope="OpenScope", key="sa02StorageAccessKey")
)
I created an external table using this configuration. I am able to query the data only when I ...

Latest Reply
Ayushi_Suthar
Databricks Employee

Hi @bohemiaRDX, greetings! Generally, this error occurs when the path has not been added as an external location with storage credentials. Here the cluster could be trying to access storage that doesn’t have UC storage credentials set nor any non...
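
Under Unity Catalog, the fix described here means registering the container as an external location backed by a storage credential. A minimal SQL sketch, where the location name, credential name, container, and grantee group are placeholders:

CREATE EXTERNAL LOCATION IF NOT EXISTS flexflow_landing
URL 'abfss://mycontainer@sa02flexflowinpp01prod.dfs.core.windows.net/'
WITH (STORAGE CREDENTIAL my_storage_credential);
-- Grant the querying principal read access on the location.
GRANT READ FILES ON EXTERNAL LOCATION flexflow_landing TO `data_engineers`;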

RohitKumar7
by New Contributor II
  • 133 Views
  • 1 reply
  • 0 kudos

Scanning Unity Catalog Schema and sample data

Hey guys, we need to expose the complete schema present in Unity Catalog to an external user or group of users without onboarding them onto our platform. Is there a way we can expose these details to them? Additionally, can we expose the sample dataset...

Latest Reply
Ayushi_Suthar
Databricks Employee

Hi @RohitKumar7, greetings! Looking at your request, I can confirm that it is possible to use the Delta Sharing feature. Delta Sharing lets you share data and AI assets with users outside your organization, whether or not...
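
A rough SQL sketch of that approach, with placeholder share, recipient, and table names (a token-based recipient does not need a Databricks account):

CREATE SHARE IF NOT EXISTS external_catalog_share;
ALTER SHARE external_catalog_share ADD TABLE my_catalog.my_schema.sample_table;
-- A recipient created without a sharing identifier gets token-based (open) access.
CREATE RECIPIENT IF NOT EXISTS partner_org;
GRANT SELECT ON SHARE external_catalog_share TO RECIPIENT partner_org;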

Kjetil
by Contributor
  • 260 Views
  • 2 replies
  • 2 kudos

Unity Catalog and environment set up

We are implementing the Databricks medallion architecture (bronze, silver, gold). We have three different environments/workspaces in Databricks: Dev, Test and Prod. Each catalog in Unity Catalog points to a specific place in the Azure Data Lake. It t...

Latest Reply
Kjetil
Contributor

Thanks, yes, that is indeed an option. The issue there is that we lose some flexibility, in the sense that we can't define other sub-schemas under gold, silver, and bronze, as it would then be of the form prod.gold.<table-name> instead of gold_dev.<schema-na...
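
For comparison, the catalog-per-environment layout under discussion would look roughly like this (names are illustrative):

-- One catalog per environment, one schema per medallion layer.
CREATE CATALOG IF NOT EXISTS prod;
CREATE SCHEMA IF NOT EXISTS prod.bronze;
CREATE SCHEMA IF NOT EXISTS prod.silver;
CREATE SCHEMA IF NOT EXISTS prod.gold;
-- Tables then resolve as prod.gold.<table-name>, leaving no free level for
-- sub-schemas, which is the flexibility trade-off described above.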

1 More Reply
NSJ
by New Contributor II
  • 1183 Views
  • 3 replies
  • 1 kudos

Setup learning environment failed: Configuration dbacademy.library.version is not available.

Using "1.3 Getting Started with the Databricks Platform Lab" for self-learning. When I run DE 2.1 to set up the environment, I got the following error: Configuration dbacademy.library.version is not available. Following is the code in the common setup. specified_ve...

Latest Reply
Luipiu
New Contributor III

Hi, I resolved it by adding some instructions to the _common notebook, which you can find inside the Includes folder. Put these at the beginning:
%pip install git+https://github.com/databricks-academy/dbacademy@v3.0.70
%python
dbutils.library.restartPython()
After this...

2 More Replies
SanSam
by New Contributor
  • 147 Views
  • 1 reply
  • 0 kudos

Geometry Point and WKB based on latitude and longitude

Hi, what is the best method to generate a geometry Point and WKB based on latitude and longitude stored in a Databricks table? Thanks, Sam

Latest Reply
MariuszK
Contributor III

Hi, Spark has functions for working with geospatial data, for instance ST_GeomFromWKB, which you can use to convert WKB to a human-readable form. You can also create UDFs if something is missing. In my project I stored latitude and longitude as separate columns.
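
Going the UDF route, a minimal sketch that builds both WKT and WKB from separate latitude/longitude columns, assuming the shapely package is installed on the cluster and hypothetical table and column names:

from shapely.geometry import Point
from pyspark.sql import functions as F
from pyspark.sql.types import BinaryType, StringType

# Shapely points take (x, y), i.e. (longitude, latitude).
to_wkb = F.udf(lambda lon, lat: Point(lon, lat).wkb if lon is not None and lat is not None else None, BinaryType())
to_wkt = F.udf(lambda lon, lat: Point(lon, lat).wkt if lon is not None and lat is not None else None, StringType())

df = spark.table("my_catalog.my_schema.locations")  # hypothetical table
df = df.withColumn("geom_wkb", to_wkb("lon", "lat")).withColumn("geom_wkt", to_wkt("lon", "lat"))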

palak_agarwala
by New Contributor
  • 168 Views
  • 1 reply
  • 0 kudos

Rename columns in Delta Live Tables

I want to explore the option of renaming a column in the SILVER layer of a DLT pipeline. Requesting suggestions. 

Latest Reply
MariuszK
Contributor III

A full reload will rename the column if the change is caused by a column rename in a source file.
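
If the rename should instead happen in the transformation itself, a minimal DLT sketch using the standard dlt Python API (the bronze table name and column names are hypothetical):

import dlt

@dlt.table(name="orders_silver")
def orders_silver():
    # Rename the column as part of the silver transformation.
    return dlt.read("orders_bronze").withColumnRenamed("old_name", "new_name")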

umahesb3
by New Contributor
  • 519 Views
  • 0 replies
  • 0 kudos

Facing issues with Databricks Asset Bundles: all jobs are getting deployed into specified targets instead

Facing issues with Databricks Asset Bundles: all jobs are getting deployed into every specified target instead of the defined target. The following are the files I am using (the resources YAML and the databricks.yml file). I am using Databricks CLI v0.240.0, and I am using databricks b...
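
For reference, a minimal databricks.yml sketch with per-target workspaces (hosts and names are placeholders); `databricks bundle deploy -t dev` should then deploy only the dev target:

bundle:
  name: my_bundle

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://adb-1111111111111111.11.azuredatabricks.net
  prod:
    mode: production
    workspace:
      host: https://adb-2222222222222222.22.azuredatabricks.net

resources:
  jobs:
    nightly_job:
      name: nightly_job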

MariuszK
by Contributor III
  • 242 Views
  • 2 replies
  • 0 kudos

Changes to deletion behavior of Materialized View and Streaming Tables defined by Delta Live Table

Hi, some time ago I got a message that there would be a change (starting from 01/31/2025) in the "deletion behavior of Materialized Views and Streaming Tables defined by Delta Live Tables", but when I remove a DLT pipeline, it also removes the related tables. Will...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @MariuszK, users will need to explicitly call DROP MATERIALIZED VIEW to delete MVs and DROP TABLE to delete STs when deleting DLT pipelines. https://home.databricks.com/account-alert-deletion-behavior-change-for-materialized-view-and-streamin...
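
Under that behavior, cleanup after deleting a pipeline would look roughly like this (table names are placeholders):

DROP MATERIALIZED VIEW IF EXISTS my_catalog.my_schema.daily_summary_mv;
DROP TABLE IF EXISTS my_catalog.my_schema.orders_streaming_table;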

1 More Reply
muir
by New Contributor II
  • 404 Views
  • 3 replies
  • 2 kudos

Resolved! Instance Pool Usage

We have instance pools set up with a maximum capacity and are looking at ways to monitor usage to help with our capacity planning. I have been using the system tables to track how many nodes are being used within a pool at a point in time, but it ap...
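
For this kind of tracking, a rough SQL sketch against the system tables (assumes system tables are enabled; verify the column names in your workspace, and note that system.compute.clusters keeps one row per cluster change, so you may need to deduplicate on the latest change_time):

SELECT t.start_time,
       c.worker_instance_pool_id AS pool_id,
       COUNT(DISTINCT t.instance_id) AS nodes_in_use
FROM system.compute.node_timeline t
JOIN system.compute.clusters c
  ON t.cluster_id = c.cluster_id
WHERE c.worker_instance_pool_id IS NOT NULL
GROUP BY t.start_time, c.worker_instance_pool_id
ORDER BY t.start_time DESC;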

Latest Reply
TuckerGage
New Contributor II

I am also using it and it's working properly.

2 More Replies
Ruby8376
by Valued Contributor
  • 868 Views
  • 1 reply
  • 2 kudos

Tableau analytics integration with Databricks Delta Lake

Hi there! Currently, we are exploring options for reporting on Salesforce. We extract data from Salesforce via Databricks and store it in Delta Lake. Is there a connector by which data can be pulled from Databricks into Tableau/CRM Analytics? I know ...

Latest Reply
emillion25
New Contributor III

Hello @ruby, were you able to resolve this? I know it's been a while, but I believe we now have multiple ways to connect Tableau and Databricks. 1. Use the native Databricks connector for Tableau: Tableau has a built-in Databricks connector, making it ea...

tonykun_sg
by New Contributor II
  • 491 Views
  • 5 replies
  • 0 kudos

Delta sharing for external table to external users who have no access to external storage?

We used Delta Sharing (authentication type: token) to generate the config.share file and shared it with external users not from our organisation. The users faced a "FileNotFoundError" while using the Python "delta_sharing.load_as_pandas" method to re...

Latest Reply
Isi
Contributor

Hello @tonykun_sg, it looks like ADLS Gen2 might be restricting access to the data through an ACL, which is why Databricks allows access but the underlying files remain protected. Could you check with your team to temporarily enable access for testing...
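
For anyone reproducing this on the consumer side, a minimal sketch with the delta-sharing Python library (share, schema, and table names are placeholders); the load_as_pandas call is where the FileNotFoundError surfaces when the underlying files are blocked:

import delta_sharing

# Profile file downloaded from the provider (token-based authentication).
profile = "/path/to/config.share"

# "<share>.<schema>.<table>" addresses one shared table.
df = delta_sharing.load_as_pandas(profile + "#my_share.my_schema.my_table")
print(df.head())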

4 More Replies
ggsmith
by Contributor
  • 678 Views
  • 8 replies
  • 3 kudos

Resolved! Workflow SQL Task Query Showing Empty

I am trying to create a SQL task in Workflows. I have my query, which executes successfully in the SQL editor, and it is saved in a repo. However, when I try to execute the task, the below error shows: Query text can not be empty: BAD_REQUEST: Query tex...

Latest Reply
ggsmith
Contributor

It ended up being that the query wasn't actually saved. Once I manually clicked save, the query preview showed and the task ran successfully. I'm really surprised that was the reason. I had moved the query around to different folders and closed and r...

7 More Replies
nguyenthuymo
by New Contributor II
  • 222 Views
  • 2 replies
  • 0 kudos

My query works with an all-purpose cluster but returns NULL with a SQL warehouse

Hi, (1) on a SQL warehouse, I created a table in Unity Catalog from the data source file vw_businessmetrics_1000.json in ADLS blob storage.
USE CATALOG `upreport`;
USE SCHEMA `test_genie`;
-- Create the external table from the JSON file
CREATE EXTERNAL TABLE IF NOT EXI...

Latest Reply
nguyenthuymo
New Contributor II

Hi @Ayushi_Suthar, thank you very much. I tried with Classic and Pro and it did not work. My solution was to drop the table and recreate it as a Delta table, then load the data from JSON into the Delta table. Now it works. Probably the SQL warehouse only su...
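
A sketch of that workaround, assuming a hypothetical ADLS path and table name (read the JSON with the read_files table-valued function, then materialize it as a Delta table):

DROP TABLE IF EXISTS upreport.test_genie.businessmetrics;
CREATE TABLE upreport.test_genie.businessmetrics AS
SELECT * FROM read_files(
  'abfss://container@storageaccount.dfs.core.windows.net/path/vw_businessmetrics_1000.json',
  format => 'json'
);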

1 More Reply
ankitmit
by New Contributor III
  • 643 Views
  • 5 replies
  • 0 kudos

How to specify path while creating tables using DLT

Hi all, I am trying to create a table using DLT and would like to specify the path where all the files should reside. I am trying something like this:
dlt.create_streaming_table(
    name="test",
    schema="""product_id STRING NOT NULL PRIMARY KEY, ...

Data Engineering
Databricks
dlt
Unity Catalog
Latest Reply
joma
New Contributor II

I have the same issue; I don't like saving with a random name inside __unitystorage. java.lang.IllegalArgumentException: Cannot specify an explicit path for a table when using Unity Catalog. Remove the explicit path:
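
As the exception suggests, under Unity Catalog the call has to drop the explicit path, since storage is governed by the schema's managed location. A minimal sketch (schema text shortened for illustration):

import dlt

# Under Unity Catalog, omit the `path` argument entirely; the table lives in
# the catalog/schema's managed storage location.
dlt.create_streaming_table(
    name="test",
    schema="product_id STRING NOT NULL",
)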

4 More Replies
Sunflower7500
by New Contributor II
  • 638 Views
  • 4 replies
  • 2 kudos

Databricks PySpark error: OutOfMemoryError: GC overhead limit exceeded

I have a Databricks PySpark query that has been running fine for the last two weeks, but I am now getting the following error despite no changes to the query: OutOfMemoryError: GC overhead limit exceeded. I have done some research on possible solutions a...

Latest Reply
loic
New Contributor III

When you say "I have a Databricks PySpark query that has been running fine for the last two weeks but am now getting the following error despite no changes to the query: OutOfMemoryError: GC overhead limit exceeded", can you tell us how you execut...

3 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.
