Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

ilarsen
by Contributor
  • 1769 Views
  • 2 replies
  • 1 kudos

Auto Loader and source file structure optimisation

Hi.  I have a question, and I've not been able to find an answer.  I'm sure there is one...I just haven't found it through searching and browsing the docs. How much does it matter (if it is indeed that simple) if source files read by auto loader are ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @ilarsen, according to the Azure Databricks documentation, Auto Loader incrementally and efficiently processes new data files as they arrive in cloud storage. Auto Loader can load data files from Azure Data Lake Storage Gen2 (ADLS Gen2) using hier...
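As a rough illustration of the incremental ingestion described in this reply, here is a minimal sketch of an Auto Loader stream. The option keys (`cloudFiles.format`, `cloudFiles.schemaLocation`, `cloudFiles.useNotifications`) are standard Auto Loader options; the helper names, paths, and file format are hypothetical and are not taken from the thread.

```python
from typing import Dict


def autoloader_options(file_format: str,
                       schema_location: str,
                       use_notifications: bool = False) -> Dict[str, str]:
    """Build an Auto Loader option map (hypothetical helper, not a Databricks API)."""
    return {
        # Format of the incoming files, e.g. "json", "csv", "parquet".
        "cloudFiles.format": file_format,
        # Where Auto Loader stores the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # False = directory listing mode, True = file notification mode.
        "cloudFiles.useNotifications": str(use_notifications).lower(),
    }


def read_incrementally(spark, source_path: str, options: Dict[str, str]):
    """Return a streaming DataFrame; needs a Databricks runtime with a SparkSession."""
    return (spark.readStream
                 .format("cloudFiles")
                 .options(**options)
                 .load(source_path))


# Assumed values for illustration only.
opts = autoloader_options("json", "/tmp/hypothetical/_schemas")
```

On a cluster this would be called as `read_incrementally(spark, "<source path>", opts)`. Whether directory listing or file notification mode suits a given source layout depends on file counts and folder structure, which the truncated thread does not spell out.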

1 More Replies
Rubini_MJ
by New Contributor
  • 8440 Views
  • 1 replies
  • 0 kudos

Resolved! Other memory of the driver is high even in a newly spun cluster

Hi Team Experts, I am seeing high memory consumption in the "other" category of the memory utilization chart in the Metrics tab. Right now I am not running any jobs, but out of 8 GB of driver memory almost 6 GB is taken by "other" and only 1.5 GB is t...

Latest Reply
User16539034020
Contributor II
  • 0 kudos

Hello, Thanks for contacting Databricks Support. It seems you are concerned about high memory consumption in the "other" category on the driver node of a Spark cluster. As no logs or detailed information were provided, I can only address several potentia...

houstonamoeba
by New Contributor III
  • 3579 Views
  • 7 replies
  • 1 kudos

Resolved! examples on python sdk for install libraries

Hi everyone, I'm planning to use the Databricks Python CLI function "install_libraries". Can someone please post examples of the install_libraries function? https://github.com/databricks/databricks-cli/blob/main/databricks_cli/libraries/api.py

Latest Reply
Loop-Insist
New Contributor II
  • 1 kudos

Here you go, using the Python SDK:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute
w = WorkspaceClient(host="yourhost", token="yourtoken")
# Create an array of Library objects to be installed
libraries_to_install = [compute...
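Since the reply above is cut off, the following is a minimal sketch of installing PyPI libraries on a cluster, not the original poster's full code. It shows two routes: the JSON body for the Libraries REST endpoint (`POST /api/2.0/libraries/install`) and the equivalent call through the `databricks-sdk` package. The cluster ID, package pin, and helper names are hypothetical.

```python
from typing import Dict, List


def build_install_payload(cluster_id: str, pypi_packages: List[str]) -> Dict:
    """Request body for POST /api/2.0/libraries/install (helper name is hypothetical)."""
    return {
        "cluster_id": cluster_id,
        "libraries": [{"pypi": {"package": pkg}} for pkg in pypi_packages],
    }


def install_with_sdk(cluster_id: str, pypi_packages: List[str]) -> None:
    """Same request through databricks-sdk; requires `pip install databricks-sdk`."""
    # Imported lazily so the payload helper above works without the SDK installed.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service import compute

    # With no arguments the client reads credentials from the environment
    # (DATABRICKS_HOST / DATABRICKS_TOKEN) or a config profile.
    w = WorkspaceClient()
    libraries = [
        compute.Library(pypi=compute.PythonPyPiLibrary(package=pkg))
        for pkg in pypi_packages
    ]
    w.libraries.install(cluster_id=cluster_id, libraries=libraries)


# Assumed cluster ID and package, for illustration only.
payload = build_install_payload("0000-000000-hypothetical", ["requests==2.31.0"])
```

Installation is asynchronous on the cluster side, so a script that depends on the library would typically check the cluster's library status before using it.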

6 More Replies
JVesely
by New Contributor III
  • 1225 Views
  • 1 replies
  • 0 kudos

Resolved! DLT CDC SCD-1 pipeline not showing stats when reading from parquet file

Hi, I followed the tutorial here: https://docs.databricks.com/en/delta-live-tables/cdc.html#how-is-cdc-implemented-with-delta-live-tables. The only change I made is that data is not appended to a table but is read from a parquet file. In practice this me...

Latest Reply
JVesely
New Contributor III
  • 0 kudos

My bad - waiting a bit and doing a proper screen refresh does show the numbers. 

Anonymous
by Not applicable
  • 5157 Views
  • 8 replies
  • 2 kudos
Latest Reply
djhs
New Contributor III
  • 2 kudos

I also tried to leverage this endpoint (inferred from devtools): https://<workspace_id>.cloud.databricks.com/sql/api/dashboards/import with the exported dashboard (the dbdash file) in the request payload. It returns a 200 but nothing happens. Maybe s...

7 More Replies
rt-slowth
by Contributor
  • 3123 Views
  • 5 replies
  • 1 kudos

Resolved! CTAS in @dlt

The Delta table created from the DataFrame returned by @dlt.create_table is confirmed to be overwritten when checked with the DESCRIBE HISTORY command. I want this to be handled as a CTAS (CREATE TABLE AS SELECT), but how can I do this in python...

Latest Reply
siddhathPanchal
New Contributor III
  • 1 kudos

Hi @rt-slowth You can review this open source code base of Delta to know more about the DeltaTableBuilder's implementation in Python.  https://github.com/delta-io/delta/blob/master/python/delta/tables.py

4 More Replies
msj50
by New Contributor III
  • 9522 Views
  • 11 replies
  • 1 kudos

Spark Running Really slow - help required

My company urgently needs help: we are having severe performance problems with Spark and will have to switch to a different solution if we don't get to the bottom of it. We are on 1.3.1, using Spark SQL, ORC files with partitions, and caching in me...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @msj50 , Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your ...

10 More Replies
alj_a
by New Contributor III
  • 1841 Views
  • 3 replies
  • 2 kudos

Resolved! Delta Live Table - not reading the changed record from cloud file

Hi, I am trying to ingest data from cloud files into a bronze table. DLT works the first time and loads the data into the bronze table, but when I add a new record and change a field in an existing record, the DLT pipeline succeeds but it should be inserted...

Data Engineering
Databricks Delta Live Table
Latest Reply
alj_a
New Contributor III
  • 2 kudos

Thank you Emil. I tried all the suggestions. .read works fine; it picks up the new or changed data. But my problem is that the target is a bronze table, and in this case my bronze table ends up with duplicate records. However, let me look at the other options to ...

2 More Replies
bradleyjamrozik
by New Contributor III
  • 2155 Views
  • 4 replies
  • 1 kudos

DLT pipelines in the same job sharing compute

If I have a job like this that orchestrates N DLT pipelines, what setting do I need to adjust so that they use the same compute resources between steps rather than spinning up and shutting down for each individual pipeline? 

[Attached screenshot: bradleyjamrozik_0-1698343161181.png]
Latest Reply
shan_chandra
Esteemed Contributor
  • 1 kudos

@bradleyjamrozik - in the DLT pipeline settings, all the notebooks can be listed together in a single pipeline. It will then deploy a single compute resource for all the tasks.
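One way to read this suggestion is as a single pipeline whose settings list every notebook under `libraries`, instead of N separate pipelines. The sketch below builds such a settings document; the `libraries` / `notebook` / `path` keys follow the DLT pipeline settings JSON, while the pipeline name and notebook paths are hypothetical.

```python
import json
from typing import Dict, List


def combined_pipeline_settings(name: str, notebook_paths: List[str]) -> Dict:
    """Settings for one DLT pipeline that runs several notebooks (hypothetical helper)."""
    return {
        "name": name,
        # Each notebook becomes one entry; all of them share the pipeline's compute.
        "libraries": [{"notebook": {"path": path}} for path in notebook_paths],
    }


# Assumed names and paths, for illustration only.
settings = combined_pipeline_settings(
    "hypothetical_combined_pipeline",
    ["/Repos/hypothetical/bronze", "/Repos/hypothetical/silver"],
)
print(json.dumps(settings, indent=2))
```

The trade-off is that the notebooks then form one pipeline graph and one update, so they can no longer be scheduled or retried as independent job tasks.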

3 More Replies
giuseppegrieco
by New Contributor III
  • 14230 Views
  • 5 replies
  • 6 kudos

Workflow owned by a service principal can't check out Git repository

I am trying to deploy a workflow whose owner is a service principal, and I am using Git integration (backed by Azure DevOps). When I run the workflow, it says that it doesn't have permission to check out the repo. Run failed with error message F...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Giuseppe Grieco, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us ...

4 More Replies
