Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

eballinger
by Contributor
  • 4842 Views
  • 2 replies
  • 1 kudos

Resolved! How to grant all tables in schema except 1

Hi guys, I am trying to grant access to all tables in a schema to a user group in Databricks. The only catch is that there is one table I do not want granted. I am currently granting schema access to the group, so the benefit is that as tables are added in the fu...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

What you are seeing is caused by privilege inheritance: https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/upgrade-privilege-model.html I would say this is by design, but please feel free to suggest it as an idea here - https://do...

1 More Replies
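
Because a privilege granted on a schema is inherited by every table in it, one possible workaround for the question above is to grant on each table individually and skip the table to exclude. The following is a minimal sketch, not a confirmed solution from the thread: the catalog, schema, table, and group names are hypothetical, and the function only builds the SQL strings. The group would still need USE CATALOG and USE SCHEMA on the parents, and per-table grants do not cover tables added later, so the statements would have to be regenerated when the schema changes.

```python
def build_grants(catalog, schema, tables, excluded, principal, privilege="SELECT"):
    """Return one GRANT statement per table, skipping the excluded table."""
    statements = []
    for table in sorted(tables):
        if table == excluded:
            continue  # the one table that must not be granted
        statements.append(
            f"GRANT {privilege} ON TABLE `{catalog}`.`{schema}`.`{table}` TO `{principal}`"
        )
    return statements


# Hypothetical names, for illustration only.
grants = build_grants(
    catalog="main",
    schema="sales",
    tables=["orders", "customers", "salaries"],
    excluded="salaries",
    principal="analysts",
)
for stmt in grants:
    print(stmt)  # in a notebook, each string could be passed to spark.sql(stmt)
```

In practice the table list would come from the schema itself (for example from `SHOW TABLES`) rather than a hard-coded list.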
jspehar
by New Contributor
  • 1054 Views
  • 2 replies
  • 0 kudos

JDBC Error Trying to Connect erwin Data Modeler to Databricks

I am trying to connect erwin Data Modeler to Databricks to reverse engineer a physical data model. I am trying to connect manually per the erwin and Databricks instructions, but I am getting the following error: [Databricks][DatabricksJDBCDriver][500593] C...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

I hope you referred to https://docs.databricks.com/en/partners/data-governance/erwin.html. It could also be a library issue; I hope you are using the Databricks JDBC driver.

1 More Replies
AlexCancioBedon
by New Contributor II
  • 697 Views
  • 1 reply
  • 1 kudos
Latest Reply
Advika_
Databricks Employee
  • 1 kudos

Congratulations, @AlexCancioBedon! This is a great milestone that showcases your expertise in Data engineering with Databricks. We’d love to have you share your insights with the community, whether by sharing best practices or helping others. Keep up...

Sayeed
by New Contributor II
  • 1064 Views
  • 1 reply
  • 0 kudos

Missing dbc for databricks associate engineer certification

Hi, I am unable to find the dbc for https://customer-academy.databricks.com/learn/courses/2963/data-ingestion-with-delta-lake/lessons/25622/demo-set-up-and-load-delta-tables or anything related to the Databricks associate engineer certification. Any help ...

Sayeed_0-1738320038441.png
Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello @Sayeed! I see that you're currently going through a self-paced course, which does not include hands-on labs (dbc files). To access the labs, you can either purchase the ILT course, which will grant you access to the labs for 7 days, or get the...

SaraCorralLou
by New Contributor III
  • 24778 Views
  • 3 replies
  • 2 kudos

Resolved! Differences between lit(None) or lit(None).cast('string')

I want to define a column with null values in my dataframe using pyspark. This column will later be used for other calculations. What is the difference between creating it in these two different ways? df.withColumn("New_Column", lit(None)) df.withColumn...

Latest Reply
shadowinc
New Contributor III
  • 2 kudos

For me, df.withColumn("New_Column", lit(None).cast(StringType())) didn't work. I used this instead: df.withColumn("New_Column", lit(null).cast(StringType))

2 More Replies
jeremy98
by Honored Contributor
  • 2283 Views
  • 5 replies
  • 1 kudos

Set serverless compute environment for a task of a job

Hi Community, I want to set the environment of a task inside a job using DABs, but I got this error. I can achieve my goal if I manually set the task to use environment 2, because I need Python 3.11. How can I do it through DABs?

jeremy98_0-1738149373540.png
Latest Reply
jeremy98
Honored Contributor
  • 1 kudos

Hi, it seems that this can be set for spark_python_task:
resources:
  jobs:
    New_Job_Jan_29_2025_at_11_48_AM:
      name: New Job Jan 29, 2025 at 11:48 AM
      tasks:
        - task_key: test-py-version2
          spark_python_task:
            pyth...

4 More Replies
panganibana
by New Contributor II
  • 917 Views
  • 1 reply
  • 0 kudos

Resolved! Inconsistency on Dataframe queried from External Data Source

We have a Catalog pointing to an External Data Source (Google BigQuery).
1) In a notebook, create a cell that runs a query to populate a DataFrame. Display results.
2) Create another cell below and display the same DataFrame.
3) I get different resu...

Data Engineering
externaldata
Latest Reply
crystal548
New Contributor III
  • 0 kudos

@panganibana wrote: We have a Catalog pointing to an External Data Source (Google BigQuery).
1) In a notebook, create a cell that runs a query to populate a DataFrame. Display results.
2) Create another cell below and display the same DataFrame.
3) I...

markbaas
by New Contributor III
  • 11805 Views
  • 9 replies
  • 0 kudos

DBFS_DOWN

I have an Azure Databricks workspace with Unity Catalog set up, using VNet and private endpoints. Serverless works great; however, the regular clusters have problems showing large results: Failed to store the result. Try rerunning the command. Failed ...

Latest Reply
markbaas
New Contributor III
  • 0 kudos

The DBFS (dbstorage) resource in the managed Azure resource group needs private endpoints to your virtual network. You can create them manually or through IaC (Bicep/Terraform).

8 More Replies
sdes10
by New Contributor II
  • 2551 Views
  • 3 replies
  • 0 kudos

DLT apply_as_deletes not working on existing data with full refresh

I have an existing DLT pipeline that works on a modified medallion architecture. Data is sent from Debezium to Kafka and lands in a bronze table. From the bronze table, it goes to a silver table where it is schematized, and finally to a gold table where I ...

Latest Reply
sdes10
New Contributor II
  • 0 kudos

@Sidhant07 how do I use skipChangeCommits? The idea is that I have bronze, silver and gold tables already built. Now I am enabling deletes on the gold table in the apply_changes API. An operation column (values c, u, r, d) has been added to the silver table. I di...

2 More Replies
Abdurrahman
by New Contributor II
  • 1495 Views
  • 3 replies
  • 0 kudos

How can I save a large spark table (~88.3Mn rows) to a delta lake table

I am trying to add a column to an existing Delta Lake table by adding the column and saving the result as a new table. The Spark driver is getting overloaded. I have a Databricks notebook to work with (and decent compute as well, a g5.12xlarge) and have...

Latest Reply
Amit_Dass
New Contributor III
  • 0 kudos

Hi @Abdurrahman, in addition to Sidhant07's reply: assuming you will use the new column in queries, use both OPTIMIZE and ZORDER. ZORDER (Highly Recommended): Even more important than just OPTIMIZE for adding columns eff...

2 More Replies
clentin
by Contributor
  • 4117 Views
  • 6 replies
  • 0 kudos

Import Py File

How do I import a .py file in a Databricks environment? Any help will be appreciated.

Latest Reply
fifata
New Contributor II
  • 0 kudos

@filipniziol @tejaswi24 Sorry to bring this up again, but I'm facing a similar problem. We have a Databricks Repo that is a copy of a GitHub repository. The GitHub repository contains only .py files, but when copied to Databricks, they all get converted to ...

5 More Replies
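
For the original question in the thread above, one common approach is to put the directory that contains the .py file on `sys.path` and import the file as a regular module. This is a minimal sketch under stated assumptions: the module name `my_helpers_demo` and its `add_one` function are hypothetical, and the file is written to a temporary directory only so the example is self-contained. In a workspace, the directory would be wherever the file actually lives.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a hypothetical module file so the example is self-contained.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "my_helpers_demo.py").write_text(
    "def add_one(x):\n"
    "    return x + 1\n"
)

# Make the directory importable.
if str(module_dir) not in sys.path:
    sys.path.append(str(module_dir))

# The file was created at runtime, so refresh the import system's caches first.
importlib.invalidate_caches()

# Import the module by its file name (without the .py suffix) and use it.
my_helpers_demo = importlib.import_module("my_helpers_demo")
result = my_helpers_demo.add_one(41)
print(result)
```

This only works when the file is a plain Python source file; it does not address the file-conversion behavior described in the latest reply.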
Splush_
by New Contributor III
  • 10030 Views
  • 5 replies
  • 6 kudos

Cannot cast Decimal to Double

Hey, I'm trying to save the contents of a database table to a Databricks Delta table. The schema straight from the database returns the number fields as decimal(38, 10). At least one of the values is too large for this data type. So I try to convert it usi...

Latest Reply
Splush_
New Contributor III
  • 6 kudos

Hey guys, thanks a lot for your help. Since this has already taken days, I have asked the application owners of the database to delete these values for me. Apparently they are weights in grams for some products, so the problematic rows are heav...

4 More Replies
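
To make the decimal(38, 10) limit from the post above concrete: 38 total digits with 10 after the decimal point leave at most 28 digits before it. The sketch below is a plain-Python illustration with the standard `decimal` module, not Spark code and not the fix used in the thread. The sample values are hypothetical. It checks whether a value fits and shows that a double can hold a larger magnitude but only about 15 to 17 significant digits.

```python
from decimal import Decimal

PRECISION, SCALE = 38, 10  # decimal(38, 10), as in the post


def fits(value: Decimal, precision: int = PRECISION, scale: int = SCALE) -> bool:
    """True if the integer part of value fits decimal(precision, scale)."""
    limit = Decimal(10) ** (precision - scale)  # 10**28 for decimal(38, 10)
    # copy_abs() avoids the context rounding that abs() would apply.
    return value.copy_abs() < limit


small = Decimal("12345.6789")      # hypothetical weight that fits
huge = Decimal("1" + "0" * 30)     # 31 integer digits, too large

print(fits(small))  # the integer part is well under 28 digits
print(fits(huge))   # exceeds the 28 integer digits available

# A double accepts the large magnitude...
as_double = float(huge)
print(as_double)

# ...but cannot keep every digit of a long exact value.
precise = Decimal("1234567890123456789.123")
lossy = Decimal(float(precise)) != precise
print(lossy)
```

Whether the precision loss of a double is acceptable depends on how the column is used downstream.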
susanne
by Databricks Partner
  • 1711 Views
  • 2 replies
  • 2 kudos

Resolved! Views in DLT with Private Preview feature Direct Publish

Hi everyone, I am building a DLT pipeline that uses the Direct Publish feature, which is still in Private Preview. While it works well to create streaming tables and write them to a schema other than the DLT default schema, I ge...

Latest Reply
susanne
Databricks Partner
  • 2 kudos

Hi Sidhan, thanks a lot for your reply. It works very well to write materialized views to a schema other than the default schema. Thanks for your guidance! Best regards, Susanne

1 More Replies
AlexVB
by New Contributor III
  • 4795 Views
  • 2 replies
  • 0 kudos

Catalogue global UDF's

The current UDF implementation stores UDFs in a catalogue.schema location. This requires referencing the UDF by its location, for example `select my_catalogue.my_schema.my_udf()`, or executing the SQL from that schema. In Snowflake, UDFs are globally a...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @AlexVB , The current UDF implementation in Databricks requires referencing the UDF location with select my_catalogue.my_schema.my_udf() or executing SQL from that schema because Databricks organizes database objects using a three-tier hierarchy: ...

1 More Replies
messiah
by Databricks Partner
  • 3350 Views
  • 3 replies
  • 0 kudos

Unable to Read Data from S3 in Databricks (AWS Free Trial)

Hey Community, I recently signed up for a Databricks free trial on AWS and created a workspace using the quickstart method. After setting up my cluster and opening a notebook, I tried to read a Parquet file from S3 using: spark.read.parquet("s3://<bu...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi @messiah , This occurs due to the lack of AWS credentials or IAM roles necessary to access the S3 bucket. Can you please check the AWS Credentials, IAM Roles and IAM Permissions: Make sure the IAM role associated with the instance profile has......

2 More Replies