Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Stokholm
by New Contributor III
  • 15014 Views
  • 9 replies
  • 1 kudos

Pushdown of datetime filter to date partition.

Hi everybody, I have 20 years of data, 600M rows. I have partitioned it by year and month to generate a file size that seems reasonable (128 MB). All data is queried using a timestamp, as all queries need to filter on the exact hours. So my requirement...

Latest Reply
Stokholm
New Contributor III
  • 1 kudos

Hi guys, thanks for your advice. I found a solution: we upgraded the Databricks Runtime to 12.2, and now the pushdown of the partition filter works. The documentation said that 10.4 would be adequate, but evidently it wasn't.

8 More Replies
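
On runtimes where a timestamp filter is not pushed down to the year/month partitions, one possible workaround is to add explicit partition predicates next to the timestamp filter. The following is a minimal sketch, not the poster's solution (the thread was resolved by upgrading to Databricks Runtime 12.2). The column names `year`, `month`, and `ts` and both function names are hypothetical, since the post does not show the table schema.

```python
from datetime import datetime


def month_partitions(start: datetime, end: datetime):
    """Return every (year, month) pair touched by the closed range [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    pairs = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        pairs.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return pairs


def partition_predicate(start: datetime, end: datetime, ts_col: str = "ts") -> str:
    """Build a SQL predicate that names the partitions explicitly.

    Assumes partition columns called `year` and `month` (hypothetical names).
    """
    months_by_year = {}
    for year, month in month_partitions(start, end):
        months_by_year.setdefault(year, []).append(month)
    clauses = [
        f"(year = {y} AND month IN ({', '.join(str(m) for m in months)}))"
        for y, months in months_by_year.items()
    ]
    # Keep the exact timestamp bounds so rows outside the requested hours are dropped.
    ts_filter = (
        f"{ts_col} >= '{start:%Y-%m-%d %H:%M:%S}' "
        f"AND {ts_col} <= '{end:%Y-%m-%d %H:%M:%S}'"
    )
    return f"({' OR '.join(clauses)}) AND {ts_filter}"
```

The resulting string could be passed to a `WHERE` clause or to `DataFrame.filter`. Whether the extra predicates are needed at all depends on the runtime version in use.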
nolanlavender00
by New Contributor
  • 1278 Views
  • 1 reply
  • 0 kudos

Garbage Collection on AutoLoader

Once a week, I get very long run times with Auto Loader. The Spark job says it is done, but garbage collection keeps rising on the driver. I assume this is because of the backfill interval that I am using with the file notification type. I have this set to...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @nolanlavender008 Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...

Shubham039
by New Contributor III
  • 13632 Views
  • 8 replies
  • 6 kudos

Databricks notebook ipywidgets not working as expected (button click issue)

I am working in Azure Databricks (IDE). I wanted to create a button that takes a text value as input and, on click, runs a function that prints the value entered. For that I created this code: from IPython.display import disp...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Shubham Ringne Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

7 More Replies
jlgr
by New Contributor II
  • 4243 Views
  • 2 replies
  • 0 kudos

How to disable disk cache in SQL Warehouse (Azure Databricks)?

Hi! I want to disable the disk cache for a SQL Warehouse in Azure Databricks, but it seems that is not possible. Is that correct? You can't use this configuration for SQL Warehouse (https://learn.microsoft.com/en-US/azure/databricks/optimizations/disk-cache#-...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @jlgr jlgr Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

1 More Replies
T__V__K__Hanuma
by New Contributor II
  • 7735 Views
  • 4 replies
  • 0 kudos

I am struggling to optimize my Spark Application Code. Is there someone who can assist me in optimizing it? I am using Spark over Hadoop Yarn.

I will elaborate on my problem. I am using a 6-node Spark cluster over Hadoop Yarn, in which one node acts as the master and the other 5 act as worker nodes. I am running my Spark application on the cluster. After completion, when I check th...

[Attached screenshots: 01_Jobs, 02_DAG_and_Metrics, 03_Event_Timeline, 04_Tasks]
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @T. V. K. Hanuman Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...

3 More Replies
EDDatabricks
by Contributor
  • 1468 Views
  • 2 replies
  • 3 kudos

SQL endpoint increased response times

We have observed that an SQL endpoint has increased response times after being idle for a long time. This endpoint is always running and does not terminate. Are there any checks or overheads caused by idleness that could impact performance?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @EDDatabricks EDDatabricks Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, pleas...

1 More Replies
140015
by New Contributor III
  • 2065 Views
  • 2 replies
  • 1 kudos

PySpark 3.3.0 exceptAll working on 11.3 LTS but not locally

Hello, I'm currently in the process of upgrading the DBR version in my jobs to 11.3 LTS. After upgrading PySpark to 3.3.0 on my local machine, I found that the exceptAll function is broken (it looks like others have a similar problem). It throws ...

[Attached screenshot: Local error]
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Jacek Dembowiak Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us ...

1 More Replies
Himanshu_90
by New Contributor III
  • 6223 Views
  • 8 replies
  • 7 kudos

Databricks SQL not able to evaluate expression current_user

Hi, I have a table as below: create table default.test_user (ID bigint NOT NULL GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1), usr1 varchar(255) NOT NULL, ts1 timestamp NOT NULL, usr2 varchar(255) NOT NULL, ts2 timestamp NOT NULL) USING Del...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Himanshu Agrawal Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us...

7 More Replies
Gaurav_784295
by New Contributor III
  • 1928 Views
  • 2 replies
  • 1 kudos

Querying a Delta table does not show previous partitions, whereas reading the data as Parquet shows the whole partition data column

When I query a Delta table, I am unable to see previous partitions, whereas reading the data using the Parquet file format shows the whole partition data column. Delta format: spark.read.format("delta").load("") Parquet format: spark.read.parquet("...

[Attached screenshots: While reading through parquet, Delta_Table_Screenshot]
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Gaurav Rawat Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

1 More Replies
gentresh
by New Contributor III
  • 1230 Views
  • 2 replies
  • 0 kudos

Is it possible to generate Databricks tokens using an Azure Service Principal?

Our organization has set up a Databricks service on top of Azure (that is, the Azure-managed service). These are all defined with Terraform. Our intention is to use an Azure service principal (with correct permissions) to be able to generate tokens, p...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Gent Reshtani Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

1 More Replies
193801
by New Contributor
  • 5753 Views
  • 2 replies
  • 0 kudos

Autoloader and json

Hello, I am looking for help with Auto Loader. I have a few questions. My goal is to read the files in an S3 location and get the filename, fileDate, and file content into one table, and in another table I want to convert the file content to a JSON struct and read to 1...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Neeharika Andavarapu Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best ...

1 More Replies
Sandy84
by New Contributor II
  • 5054 Views
  • 3 replies
  • 2 kudos

Need help skipping previously executed cells in a failed Databricks job calling a notebook with multiple SQL cells

In Azure Databricks, I have a job that calls a notebook containing multiple cells with SQL queries. If any cell fails and we restart the Databricks job, how can we skip the cells that already ran and start only from the failed cell? ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sandip Rath Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

2 More Replies
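
One generic way to resume a sequence of SQL statements from the point of failure is to record each completed step in a small checkpoint file and skip recorded steps on the next run. The following is a minimal sketch of that pattern, not a built-in Databricks feature. The function name `run_steps`, the step names, and the checkpoint path are hypothetical, and `execute` stands in for whatever runs a statement (in a notebook this might be `spark.sql`, which is an assumption about the poster's setup).

```python
import json
from pathlib import Path


def run_steps(steps, checkpoint_path, execute):
    """Run (name, statement) pairs in order, skipping names already recorded as done.

    `execute` is any callable that runs one statement and raises on failure.
    Returns the list of completed step names.
    """
    path = Path(checkpoint_path)
    done = json.loads(path.read_text()) if path.exists() else []
    for name, statement in steps:
        if name in done:
            continue  # finished in an earlier run, so skip it
        execute(statement)  # an exception here leaves the checkpoint unchanged
        done.append(name)
        path.write_text(json.dumps(done))  # persist progress after each step
    return done
```

With this design, a restarted job re-reads the checkpoint and begins at the first step that is not recorded. The checkpoint file would need to be deleted or reset before a genuinely fresh run, and each step should be safe to re-run in case a failure happens after the statement succeeds but before the checkpoint is written.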
Kenny92
by New Contributor III
  • 9931 Views
  • 2 replies
  • 1 kudos

Resolved! How does Auto Loader ingest data?

I have recently completed the Data Engineering with Databricks v3 course on the Partner Academy. Some of the quiz questions have me mixed up. Specifically, I am wondering about this question from the "Build Data Pipelines with Delta Live Tables and Sp...

[Attached screenshot: Which of the following correctly describes how Auto Loader ingests data? Select one response.]
Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Kenny Shaevel Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

1 More Replies
Pawelski
by New Contributor
  • 1424 Views
  • 1 reply
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Paweł Tomczyk Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

