Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Madalian
by New Contributor III
  • 2242 Views
  • 2 replies
  • 0 kudos

Download CSV files from Delta Lake

We have around 1800 tables in Parquet format (Delta Lake). These tables are very large, and all of them have been registered as tables. However, we have a requirement to download them as CSV (from Power BI or any other reporting tool). Cu...

Latest Reply
Madalian
New Contributor III
  • 0 kudos

Dear Kaniz, thank you. One doubt: 1) Converting the table data into CSV means saving it again in one more storage layer. Is there any way to convert these tables into CSVs on the fly and export them into Power BI? Also, I see that Power BI has limitations around >...

1 More Replies
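
For the CSV question above, a minimal sketch of the store-again approach the thread mentions (writing a table out as CSV files) might look like this in PySpark. The function name `export_table_to_csv`, the table name, and the output path are hypothetical, and `spark` is assumed to be an existing SparkSession such as the one a Databricks notebook provides. It does not address the on-the-fly export to Power BI that the reply asks about.

```python
def export_table_to_csv(spark, table_name, output_dir):
    """Write one table to a directory of CSV part files.

    `spark` is assumed to be an existing SparkSession (for example the
    one predefined in a Databricks notebook). `table_name` and
    `output_dir` are hypothetical placeholders supplied by the caller.
    """
    df = spark.table(table_name)  # read the registered (Delta) table
    (
        df.write
        .mode("overwrite")          # replace any previous export
        .option("header", "true")   # emit column names as the first row
        .csv(output_dir)            # Spark writes one part file per partition
    )
    return output_dir

# Hypothetical usage inside a notebook:
# export_table_to_csv(spark, "my_schema.my_table", "dbfs:/tmp/exports/my_table")
```

With around 1800 tables, the same function could be called in a loop over the table names, at the cost of the extra storage layer the poster is worried about.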
jenshumrich
by Contributor
  • 4873 Views
  • 3 replies
  • 1 kudos

Filter not using partition

I have the following code:
spark.sparkContext.setCheckpointDir("dbfs:/mnt/lifestrategy-blob/checkpoints")
result_df.repartitionByRange(200, "IdStation")
result_df_checked = result_df.checkpoint(eager=True)
unique_stations = result_df.select("IdStation...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

It seems a filter is being applied, according to this: Filter (isnotnull(IdStation#2678) AND (IdStation#2678 = 1119844)). I would like to share the following notebook, which covers this topic in detail, in case you would like to check it out h...

2 More Replies
EhsanSaba
by New Contributor
  • 7262 Views
  • 1 replies
  • 0 kudos

RocksDB results in empty stream/stream joins dataframe

Since we enabled RocksDB in our spark.conf, stream-to-stream joins/unions result in an empty dataframe. Does anyone else have the same experience? It is on AWS. spark.conf.set("spark.sql.streaming.stateStore.providerClass","com.databricks.sql.streaming...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Did you also update the checkpoint? You might need to use a new checkpoint after you enable the RocksDB state store.
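
A minimal sketch of the suggestion above, assuming a PySpark Structured Streaming job: set the state store provider, then start the query against a checkpoint location that no earlier run has used. The helper name, the `provider_class` argument (the class name in the question is truncated, so the caller has to supply it), the checkpoint path, and the target table are all hypothetical.

```python
def start_with_new_checkpoint(spark, stream_df, provider_class,
                              checkpoint_dir, target_table):
    """Enable a state store provider and start a stream on a fresh checkpoint.

    Assumptions: `spark` is an existing SparkSession, `stream_df` is a
    streaming DataFrame, and `checkpoint_dir` is a path that no earlier
    run of this query has written to. All names are hypothetical.
    """
    # Same configuration key as in the question; the class name is passed
    # in because the original post truncates it.
    spark.conf.set("spark.sql.streaming.stateStore.providerClass", provider_class)

    return (
        stream_df.writeStream
        .option("checkpointLocation", checkpoint_dir)  # new, empty location
        .toTable(target_table)                         # starts the query
    )
```

Whether a new checkpoint actually resolves the empty join result is not confirmed in the thread; the sketch only shows how the two settings fit together.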

Brammer88
by New Contributor III
  • 3996 Views
  • 5 replies
  • 2 kudos

Trying to run Databricks Academy labs, but execution fails because the method to clear the cache is not whitelisted

Hi there, I'm trying to run DE 2.1 - Querying Files Directly on my workspace with the default cluster configuration found below, but I cannot seem to run this file (or any other labs) as it gives me this error message: Resetting the learning environme...

Latest Reply
Brammer88
New Contributor III
  • 2 kudos

Hi @Retired_mod and Databricks team, have you already found another solution for this? Thanks, Bram

4 More Replies
chloeh
by New Contributor II
  • 1272 Views
  • 0 replies
  • 0 kudos

Chaining window aggregations in SQL

In my SQL data transformation pipeline, I'm doing chained/cascading window aggregations: for example, I want to do average over the last 5 minutes, then compute average over the past day on top of the 5 minute average, so that my aggregations are mor...
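
The chaining described above can be illustrated outside SQL with a small pure-Python sketch. It assumes tumbling (non-overlapping) 5-minute windows and calendar-day grouping, which may differ from the poster's actual window definitions; the function names and sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

def five_minute_averages(readings):
    """Average (timestamp, value) pairs per 5-minute tumbling window.

    Returns a dict mapping each window start to its average.
    """
    buckets = defaultdict(list)
    for ts, value in readings:
        # Floor the timestamp to the start of its 5-minute window.
        window_start = ts.replace(minute=ts.minute - ts.minute % 5,
                                  second=0, microsecond=0)
        buckets[window_start].append(value)
    return {start: mean(values) for start, values in buckets.items()}

def daily_average_of_windows(window_averages):
    """Second stage: average the 5-minute averages per calendar day."""
    per_day = defaultdict(list)
    for window_start, avg in window_averages.items():
        per_day[window_start.date()].append(avg)
    return {day: mean(avgs) for day, avgs in per_day.items()}

# Hypothetical sample data: three readings falling into two windows.
readings = [
    (datetime(2024, 1, 1, 0, 1), 10.0),
    (datetime(2024, 1, 1, 0, 3), 20.0),
    (datetime(2024, 1, 1, 0, 7), 30.0),
]
windows = five_minute_averages(readings)    # 00:00 -> 15.0, 00:05 -> 30.0
daily = daily_average_of_windows(windows)   # 2024-01-01 -> 22.5
```

The second stage is an average of averages: here it yields 22.5, while the mean of the three raw readings is 20.0, so the chained result weights each window equally regardless of how many readings it holds.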

Fresher
by New Contributor II
  • 1411 Views
  • 0 replies
  • 0 kudos

Users are deleted/unsynced from Azure AD to Databricks

In Azure AD, it shows that users are synced to Databricks. But in Databricks, it shows that the user is not part of the group. The user is missing from only one group; he is still part of the remaining groups. All the syncing worked fine until yesterday. I don't know ...

Darian
by New Contributor II
  • 1998 Views
  • 2 replies
  • 0 kudos

Delta Live Tables pipeline gets a garbage collection error after running for a few days

Hi, I am using Delta Live Tables in continuous mode for a real-time streaming data pipeline. After the pipeline has run for 2-3 days, I get this garbage collection error: Driver/10.15.0.73 paused the JVM process 68 seconds during the past 120 se...

Latest Reply
Darian
New Contributor II
  • 0 kudos

Here are the metrics: The size/type: Thanks!

1 More Replies
al_joe
by Contributor
  • 14810 Views
  • 5 replies
  • 3 kudos

Resolved! Split a code cell at cursor position? Add a cell above/below?

In JupyterLab notebooks, we can do the following: in edit mode, press Ctrl+Shift+Minus to split the current cell into two at the cursor position; in command mode, press A or B to add a cell above or below the current cell. Are there equivalent shortcuts...

Latest Reply
DavidKxx
Contributor
  • 3 kudos

What's the status of the ctrl-alt-minus shortcut for splitting a cell?  That keyboard combination does absolutely nothing in my interface (running Databricks via Chrome on GCP).

4 More Replies
Lazloo
by New Contributor III
  • 18283 Views
  • 6 replies
  • 4 kudos

databricks-connect version 13: spark-class2.cmd not found

I installed the newest version, "databricks-connect==13.0.0". Now I get this issue: Command C:\Users\Y\AppData\Local\pypoetry\Cache\virtualenvs\X-py3.9\Lib\site-packages\pyspark\bin\spark-class2.cmd"" not found (German system message: "could not be found"). Traceback...

Latest Reply
Susumu_Asaga
New Contributor II
  • 4 kudos

Use this code:
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()

5 More Replies
jitesh
by Databricks Partner
  • 1226 Views
  • 0 replies
  • 0 kudos

Code reusability for silver table transformations

How should Databricks notebooks be organized, and how many should be created, to populate multiple silver Delta tables that all have different and complex transformations? What's the best practice: 1. Create one notebook per silver table? 2. Push SQL transformation logic ...

Ruby8376
by Valued Contributor
  • 2174 Views
  • 1 replies
  • 0 kudos

Databricks SQL warehouse has Serverless compute as a public preview.

There is a risk flagged by infosec because it is processed in the control plane shared with other Azure clients. Is there any control to mitigate the risk?

Latest Reply
PL_db
Databricks Employee
  • 0 kudos

You can find more information on that topic here. "With Databricks, your serverless workloads are protected by multiple layers of security. These security layers form the foundation of Databricks’ commitment to providing a secure and reliable environ...

astrobil
by New Contributor II
  • 1455 Views
  • 1 replies
  • 0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Which DBR version are you using? Which web browser are you using?

kartikmnc
by New Contributor
  • 1512 Views
  • 1 replies
  • 1 kudos

Exam suspended midway without any reason

Hi Team, my Databricks Certified Data Engineer Associate exam was suspended on 17th December and is still in an in-progress state. I was continuously in front of the camera when an alert suddenly appeared, and the support person asked me to show the desk and ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Adding @Retired_mod for visibility on this request
