Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

billfoster
by New Contributor II
  • 24569 Views
  • 9 replies
  • 7 kudos

How can I learn Databricks?

I am currently enrolled in a data engineering boot camp. We cover various technologies: Azure, PySpark, Airflow, Hadoop, NoSQL, SQL, and Python, but nothing like Databricks. I am in contact with lots of recent graduates who landed a job. Almo...

Latest Reply
Ali23
New Contributor II
  • 7 kudos

I'd be glad to help you on your journey to learning Databricks! Whether you're a beginner or aiming to advance your skills, here's a comprehensive guide: Foundations: Solid understanding of core concepts: Begin with foundational knowledge in big data,...

8 More Replies
avidex180899
by New Contributor III
  • 14371 Views
  • 4 replies
  • 4 kudos

Resolved! UUID/GUID Datatype in Databricks SQL

Hi all, I am trying to create a table with a GUID column. I have tried using GUID and UUID, but neither of them works. Can someone help me with the syntax for adding a GUID column? Thanks!

Latest Reply
rswarnkar5
New Contributor II
  • 4 kudos

> What ANSI SQL data structure to use for UUID or GUID?
I had a similar question. The answer was `STRING`.
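
For anyone landing here later, a minimal sketch of that approach (table and column names are placeholders, not from this thread): Databricks SQL has no dedicated GUID/UUID type, so the column is declared as STRING and values are generated with the built-in uuid() function.

    # Minimal sketch, assuming a Delta table with placeholder names: store the GUID
    # as STRING and generate values with Spark SQL's built-in uuid() function.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS demo.customers (
            customer_guid STRING,   -- UUID/GUID stored as text
            customer_name STRING
        )
    """)

    spark.sql("""
        INSERT INTO demo.customers
        SELECT uuid(), 'example name'   -- uuid() returns a new random UUID string
    """)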

3 More Replies
Anonymous
by Not applicable
  • 14694 Views
  • 4 replies
  • 1 kudos

Cluster in Pending State for long time

The cluster stays pending for a long time at the stage “Finding instances for new nodes, acquiring more instances if necessary”. How can this be fixed?

Latest Reply
rswarnkar5
New Contributor II
  • 1 kudos

I faced a similar situation yesterday, so I kept waiting instead of locking my system or closing the tabs. After some time it all went fine.

3 More Replies
Akshay_Petkar
by Contributor III
  • 708 Views
  • 2 replies
  • 1 kudos

How to Use BladeBridge for Redshift to Databricks Migration?

Hi all, I have Redshift queries that I need to migrate to Databricks using BladeBridge, but I have never used BladeBridge before and can’t find any clear documentation or steps on how to use it within the Databricks environment. If anyone has already...

Latest Reply
ddharma
New Contributor II
  • 1 kudos

Dear @lingareddy_Alva, thank you so much for sharing these steps and specifics. Much appreciated! Context: I have just started exploring BladeBridge for AWS Redshift to Databricks migration. "BladeBridge operates as a code translation framework" and it sup...

1 More Replies
User16826988857
by Databricks Employee
  • 3367 Views
  • 1 reply
  • 0 kudos

How to allow table deletion without requiring ownership on the table?

How to allow table deletion without requiring ownership on the table? Problem description: In DBR 6 (and earlier), a non-admin user can delete a table that the user doesn't own, as long as the user has ownership on the table's parent database (perhaps throu...

Latest Reply
abueno
Contributor
  • 0 kudos

I am having the same issue, but on Python 3.10.12. I need to be able to give another user "manage" access to a table in Unity Catalog. We both have write access to the schema.
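
Not from the thread, just a rough sketch of the usual workarounds when a second user needs broad rights on a Unity Catalog table; the exact privileges available depend on your workspace version, and the table and user names below are placeholders.

    # Rough sketch with placeholder names; treat as a starting point, not the fix.
    # Grant broad table privileges to the other user:
    spark.sql("GRANT ALL PRIVILEGES ON TABLE main.sales.orders TO `other.user@example.com`")

    # Or transfer ownership so the other user can manage grants on the table themselves:
    spark.sql("ALTER TABLE main.sales.orders SET OWNER TO `other.user@example.com`")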

Rosty
by New Contributor
  • 328 Views
  • 1 reply
  • 0 kudos

DBT task status update gets delayed for several minutes

Hi, our team has recently begun experiencing a several-minute delay between Databricks DBT tasks finishing the computations and the subsequent status update from running state to success.  The DBT project is part of the workspace git repo. In the fir...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi Rosty, how are you doing today? Thanks for sharing the detailed context. I agree, it definitely sounds frustrating to have DBT tasks showing delays even after finishing the actual work. Based on what you've described, the delay is likely happening ...

zychoo
by New Contributor
  • 189 Views
  • 1 reply
  • 0 kudos

Move large SQL data into Databricks

Hello, I have a large on-prem SQL database (~15 TB). It heavily utilizes the sql_variant datatype. I would like to move it into a Databricks bronze layer and have it synchronized as close to 'live' as possible. What could be the solution? It seems like...

Latest Reply
WiliamRosa
New Contributor II
  • 0 kudos

Hi @zychoo, I would consider a near-real-time solution into Databricks Bronze, something like:
- A log-based CDC tool (Qlik / Debezium / HVR) captures changes from SQL Server.
- The tool serializes sql_variant to JSON or string + type metadata.
- Writes to S3...
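
A rough sketch of the landing step only (not taken from the reply itself): once the CDC tool drops JSON files in S3, Auto Loader can stream them into a Bronze Delta table. The bucket paths, table name, and availableNow trigger are assumptions for illustration.

    # Rough sketch, assuming the CDC tool writes JSON files to S3; all paths and
    # table names are placeholders. Auto Loader incrementally picks up new files
    # and appends them to a Bronze Delta table.
    from pyspark.sql import functions as F

    bronze_stream = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/orders_cdc")
        .load("s3://my-bucket/cdc/orders/")              # files emitted by the CDC tool
        .withColumn("_ingested_at", F.current_timestamp())
    )

    (bronze_stream.writeStream
        .option("checkpointLocation", "s3://my-bucket/_checkpoints/orders_bronze")
        .trigger(availableNow=True)                       # or processingTime for near-live
        .toTable("main.bronze.orders_cdc"))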

de2298
by New Contributor
  • 630 Views
  • 1 reply
  • 1 kudos

AWS Databricks and Fabric OneLake integration

A bit of a weird scenario, and I wanted to hear from the experts in this community. Let's say I have a Fabric Lakehouse (OneLake) and I want to read that data into Databricks (AWS) Unity Catalog to play with that data. What is the recommended mechanism to ...

Latest Reply
WiliamRosa
New Contributor II
  • 1 kudos

Hi @de2298, Currently, Microsoft Fabric does not offer a built-in connector that allows direct querying or exposure of Delta Share tables from AWS Databricks into a Fabric Warehouse. The Unity Catalog mirroring feature is supported only with Azure Da...

ismaelhenzel
by Contributor
  • 293 Views
  • 4 replies
  • 0 kudos

Resolved! Declarative Pipelines with datacontracts

I'm wondering if anyone has successfully integrated data contracts with declarative pipelines in Databricks. Specifically, I want to reuse the quality checks and schema definitions from the contract directly within the pipeline's stages. I haven't fo...

Latest Reply
WiliamRosa
New Contributor II
  • 0 kudos

Suggested steps:
1. Define the data contract. Create a YAML/JSON file containing:
   - Schema (column names, data types, required fields)
   - Data quality rules (null checks, ranges, regex patterns, allowed value lists)
   - Governance metadata (e.g., data sensitivity, LGP...
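
A minimal sketch of what reusing those contract rules can look like in a DLT notebook; the contract file path, YAML keys ("table_name", "quality_rules"), and source table are invented for illustration.

    # Minimal sketch, assuming a YAML contract whose "quality_rules" key maps
    # expectation names to SQL predicates; the rules are reused directly as
    # DLT expectations. All names and paths are placeholders.
    import dlt
    import yaml

    with open("/Workspace/contracts/orders_contract.yml") as f:
        contract = yaml.safe_load(f)

    # e.g. contract["quality_rules"] == {"valid_id": "order_id IS NOT NULL", ...}
    @dlt.table(name=contract["table_name"])
    @dlt.expect_all(contract["quality_rules"])   # record violations without dropping rows
    def contracted_table():
        return spark.read.table("main.bronze.orders")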

3 More Replies
sahil_s_jain
by New Contributor III
  • 1671 Views
  • 7 replies
  • 2 kudos

Issue: NoSuchMethodError in Spark Job While Upgrading to Databricks 15.5 LTS

Problem description: I am attempting to upgrade my application from Databricks runtime version 12.2 LTS to 15.5 LTS. During this upgrade, my Spark job fails with the following error: java.lang.NoSuchMethodError: org.apache.spark.scheduler.SparkListenerA...

Latest Reply
ameerafi
Databricks Employee
  • 2 kudos

@DBonomo @sahil_s_jain We can write a separate getSchema method inside the BulkCopyUtils.scala file and call that method instead of referencing it from Spark. You can add the function below to BulkCopyUtils.scala and build it locally. You can th...

6 More Replies
ManojkMohan
by Contributor III
  • 281 Views
  • 3 replies
  • 3 kudos

Ingest into bronze Table: getting error Delta Live Tables (DLT) is not supported on this cluster

Use case description: Manually uploading orders data into Databricks, then moving it into a bronze layer using the Python code below. Getting error: The Delta Live Tables (DLT) module is not supported on this cluster. You should eith...

Latest Reply
ilir_nuredini
Honored Contributor
  • 3 kudos

Hello @ManojkMohan, if you try to run a DLT pipeline on (e.g.) an all-purpose compute cluster, it will fail. DLT pipelines require a DLT job compute cluster. To run it, create a new pipeline in the Databricks UI, assign your DLT notebook to it, and s...

2 More Replies
183530
by New Contributor III
  • 2378 Views
  • 4 replies
  • 2 kudos

How to search an array of words in a text field

Example:
TABLE1 (FIELD_TEXT):
- I like salty food and Italian food
- I have Italian food
- bread, rice and beans
- mexican foods
- coke, sprite
ARRAY: ['italia', 'mex', 'coke']
Match TABLE1 x ARRAY. Results:
- I like salty food and Italian food
- I have Italian food
- mexican foods
is ...

Latest Reply
ihopmenu
New Contributor II
  • 2 kudos

Yes, it’s possible to search an array of words in a text field using SQL with LIKE clauses or regex functions, while PySpark provides higher scalability with functions like rlike and array_contains (Wikipedia explains that SQL is a domain-specific la...
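
To make the rlike route concrete, a small sketch using the table and words from the example above (the table and column names are assumptions, not confirmed from the thread).

    # Small sketch of the rlike approach: build one case-insensitive "any of these
    # words" pattern and filter the text column with it. Names are placeholders
    # matching the example above.
    from pyspark.sql import functions as F

    keywords = ["italia", "mex", "coke"]
    pattern = "(?i)" + "|".join(keywords)     # e.g. "(?i)italia|mex|coke"

    matches = spark.table("table1").filter(F.col("field_text").rlike(pattern))
    matches.show(truncate=False)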

3 More Replies
SCPablo
by New Contributor
  • 256 Views
  • 1 reply
  • 1 kudos

Resolved! Enable Classic (Non-Serverless) Clusters on Free Trial

Hi Databricks community, I’m using a Free Trial cloud account and I currently need to create classic clusters for Spark exercises. Is there a way to enable Standard/Classic clusters in a trial workspace, or any workaround for Free Trial users? Any guid...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @SCPablo, if you are referring to the 14-day free trial account (link: https://docs.databricks.com/aws/en/getting-started/free-trial), you can create compute clusters and experiment with them. But if you are referring to the Databricks Free E...

kanikvijay9
by New Contributor II
  • 371 Views
  • 2 replies
  • 1 kudos

Performance Issues with Writing Large DataFrames to Managed Tables in Databricks (3.5B+ Rows)

Hi Community, I'm working on a large-scale data processing job in Databricks and facing performance and stability issues during the write operations. Here's a detailed breakdown of my use case and environment: Use case overview: Primary DataFrames: Firs...

[Three screenshots attached]
Latest Reply
kanikvijay9
New Contributor II
  • 1 kudos

I found the solution. Please refer to the links below: LinkedIn post: https://www.linkedin.com/posts/activity-7363497408925745154-LaaL?utm_source=share&utm_medium=member_desktop&rcm=ACoAACTtno0BU78QJcWz-X3GHtKRvhXxf5fod90 Medium blog: h...
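
Since the fix itself lives behind those links, here is only a generic, hedged sketch of the kind of write tuning such large-write posts usually land on; the DataFrame, column, partition count, and table name are made up and are not the author's linked solution.

    # Generic sketch, not the author's linked solution: control shuffle parallelism
    # and output layout explicitly when writing billions of rows to a managed table.
    # final_df stands in for the large DataFrame produced earlier in the job.
    (final_df
        .repartition(2000, "event_date")          # spread work evenly before the write
        .write
        .mode("overwrite")
        .partitionBy("event_date")                # partition on the common filter column
        .saveAsTable("main.analytics.events_large"))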

1 More Replies
Fikrat
by New Contributor III
  • 335 Views
  • 7 replies
  • 1 kudos

Resolved! Lakebridge Transpile paths

Hi there, what kind of source and target paths can I use in the Transpile command? I'm trying to run the command: databricks labs lakebridge transpile --source-dialect tsql --input-source and I get the error: ERROR [src/databricks/labs/lakebridge.transpile] ValueErr...

Latest Reply
Fikrat
New Contributor III
  • 1 kudos

Also, what cloud source locations are available for Transpile: DBFS, Unity Catalog volumes, etc.?

6 More Replies
