Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

dukebaslangic
by New Contributor II
  • 2558 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks performance related documentation/books

Hi, do you know any good resources on Databricks performance improvements (such as improving query performance and monitoring/resolving performance bottlenecks)? Thanks

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ömer Özsakarya, we haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to ...

  • 3 kudos
2 More Replies
Ram443
by New Contributor III
  • 42671 Views
  • 9 replies
  • 5 kudos

Resolved! I created a data frame but was not able to see the data

Code to create a data frame:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("oracle_queries").master("local[4]") \
    .config("spark.sql.warehouse.dir", "C:\\softwares\\git\\pyspark\\hive").getOrCreate()
from pyspark.sql.functions ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

@ramanjaneyulu kancharla, can you please select my answer as the best answer?

  • 5 kudos
8 More Replies
Paul_Seattle
by New Contributor
  • 7469 Views
  • 1 replies
  • 0 kudos

A Quick Question on Running a job from CLI

Could anyone tell me what could be wrong with my command to submit a Spark job with params? (If I don't have --spark-submit-params, it's fine.) Please see the attached snapshot.

image
Latest Reply
User16539034020
Databricks Employee
  • 0 kudos

Yes, there is no need for --spark-submit-params. Run: databricks jobs run-now --job-id ***
Reference: https://docs.databricks.com/dev-tools/cli/jobs-cli.html

  • 0 kudos
RonanStokes_DB
by Databricks Employee
  • 3158 Views
  • 1 replies
  • 1 kudos

Can you apply a specific cluster policy when launching a Databricks job via Azure Data Factory

When using Azure Data Factory to coordinate the launch of Databricks jobs, can you specify which cluster policy to apply to the job, either explicitly or implicitly?

Latest Reply
mvandeborne
New Contributor II
  • 1 kudos

You can, but not from ADF's UI. You need to edit the JSON of the linked service, adding a 'policyId' parameter to the 'typeProperties' object that points to the cluster policy ID from Databricks (which you can find in the Databricks URL).

  • 1 kudos
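
The edit described in the reply above can be sketched in Python as a small helper that adds 'policyId' under 'typeProperties'. This is only an illustration: the linked-service name, the other properties, and the placeholder values are assumptions, and the real definition exported from ADF should be used instead.

```python
import copy
import json


def with_cluster_policy(linked_service: dict, policy_id: str) -> dict:
    """Return a copy of a linked-service definition with policyId set."""
    updated = copy.deepcopy(linked_service)  # leave the caller's dict untouched
    type_props = updated.setdefault("properties", {}).setdefault("typeProperties", {})
    type_props["policyId"] = policy_id
    return updated


# Hypothetical linked-service JSON; every value here is a placeholder.
linked_service = {
    "name": "AzureDatabricksLinkedService",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://<workspace-url>",
            "newClusterVersion": "<runtime-version>",
        },
    },
}

patched = with_cluster_policy(linked_service, "<cluster-policy-id>")
print(json.dumps(patched, indent=2))
```

The printed JSON is what would be pasted back into the linked service's code view; the policy ID placeholder has to be replaced with the ID taken from Databricks.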
pcriado
by New Contributor III
  • 7723 Views
  • 2 replies
  • 1 kudos

Resolved! Requested array size exceeds VM limit when saving to feature table

Hi, I'm trying to process a small dataset (less than 300 MB) composed of five queries that run with Spark. The end result of those queries is parsed using Python and merged into a data frame. Then I try to write this to a Delta Lake table using featu...

Latest Reply
pcriado
New Contributor III
  • 1 kudos

Hello, we have recently found that it's my user in particular that causes the memory issue. Two other users in my organization can run the same notebook without problems, but my user consistently consumes all available RAM and crashes the cluster... a...

  • 1 kudos
1 More Replies
gustavomcarmo-h
by New Contributor III
  • 4279 Views
  • 5 replies
  • 2 kudos

Resolved! Is there a way to list the DLT maintenance jobs through the API?

After creating the Delta pipeline, I would like to get details of the DLT maintenance job automatically created by Databricks, such as the scheduled time when the DLT maintenance tasks will be executed. However, it seems the Jobs API 2.1 doesn't cover ...

Latest Reply
gustavomcarmo-h
New Contributor III
  • 2 kudos

Hi @Debayan Mukherjee, actually the Databricks Jobs API documentation has not been fixed yet. The parameter `job_type` should be included in the list endpoint request documentation. Please do this in order to avoid unnecessary questions here in the ...

  • 2 kudos
4 More Replies
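
For readers trying the `job_type` parameter mentioned in the reply above, here is a minimal sketch of how the list request URL could be built. It assumes the parameter is accepted as a query parameter on the Jobs API 2.1 list endpoint, as the reply suggests. The value `EXAMPLE_TYPE` and the workspace URL are placeholders, not documented values; the preview does not show which value returns the maintenance jobs.

```python
from urllib.parse import urlencode


def jobs_list_url(workspace_url: str, job_type: str, limit: int = 25) -> str:
    """Build a Jobs API 2.1 list URL with a job_type filter."""
    query = urlencode({"job_type": job_type, "limit": limit})
    return f"{workspace_url.rstrip('/')}/api/2.1/jobs/list?{query}"


# Placeholder workspace URL and job_type value (assumptions for illustration).
url = jobs_list_url("https://example.cloud.databricks.com/", "EXAMPLE_TYPE")
print(url)
```

The resulting URL would be sent as a GET request with the usual bearer token in the Authorization header.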
iptkrisna
by New Contributor III
  • 5777 Views
  • 2 replies
  • 1 kudos

Clear Cache From a Notebook, not from a Cluster

Hi, I'm running all my jobs on one big cluster. Is there a way to clear the cache created by a notebook at the end of the job, once it's done, so that it doesn't cause memory problems from one job to another, ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @krisna math, we haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to...

  • 1 kudos
1 More Replies
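
One possible way to do what the question asks, sketched under assumptions: call a small cleanup helper in the last cell of the notebook. `release_notebook_cache` is a hypothetical helper name; `DataFrame.unpersist()` and `spark.catalog.clearCache()` are standard PySpark calls. The replies in this thread are not shown in full, so this is not necessarily the approach they recommend.

```python
def release_notebook_cache(spark, cached_frames=()):
    """Unpersist the given DataFrames, then clear the remaining cache."""
    for df in cached_frames:
        df.unpersist()           # drop the cached blocks of this DataFrame
    spark.catalog.clearCache()   # remove any remaining cached tables/views


# In a notebook's final cell (df_a and df_b are hypothetical cached DataFrames):
# release_notebook_cache(spark, [df_a, df_b])
```

Unpersisting only the DataFrames the notebook cached is the narrower option. `clearCache()` removes all cached tables and may also affect other notebooks attached to the same cluster, so it may be worth dropping that call on a shared cluster.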
sree1567
by New Contributor II
  • 1386 Views
  • 1 replies
  • 1 kudos

Azure-EventHub Schema Registry with Spark-Scala

Hi all, is there a way to consume the schemas from the schema registry defined in Azure EventHub using Apache Spark and Scala?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @sreeranjani thevan, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
PearceR
by New Contributor III
  • 7684 Views
  • 1 replies
  • 2 kudos

try and except in DLT pipelines

Good morning, I am having some issues with my DLT pipeline. I have a scenario where I am loading in bronze-silver tables programmatically from a SQL database (each row corresponds to a table to create). This leaves me in a situation where sometimes onl...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Robert Pearce​  Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
Kaijser
by New Contributor II
  • 2348 Views
  • 1 replies
  • 2 kudos

Installing private python Azure DevOps repository without revealing personal access token in pyproject.toml

I want to install a .whl file on my Databricks cluster which includes a private Azure DevOps repository as a dependency in its pyproject.toml file, i.e.:
[project]
name = "test"
description = "test_description."
version = "0.1.0"
authors = [ { name ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Aaron Kaijser​  Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
iptkrisna
by New Contributor III
  • 1432 Views
  • 1 replies
  • 2 kudos

Jobs Data Pipeline Runtime Increased Significantly

Hi, I am facing an issue where one of my jobs has been taking much longer since a certain date. Previously it needed less than 1 hour to run a batch job that loads JSON data and does a truncate-and-load to a Delta table, but since June 2nd, it has become so long that...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @krisna math​  Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
Rajesh_P
by New Contributor II
  • 1474 Views
  • 1 replies
  • 1 kudos

Are JDBC and ODBC connections to a Databricks Cluster or SQL Warehouse encrypted?

Hello, our consumers (Dell Boomi and other apps) need data from Databricks. Databricks provides JDBC and ODBC drivers. Are JDBC and ODBC connections to a Databricks Cluster or SQL Warehouse encrypted? I am talking about the data in transit between Dat...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Rajesh Paul​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
opt
by New Contributor
  • 1499 Views
  • 1 replies
  • 1 kudos

How to execute "Build your Chat Bot with Dolly Demo" in my own VM?

I am trying to execute the Build your Chat Bot with Dolly Demo using my own VM. In the first steps they execute this command: %run ./_resources/00-init $catalog=hive_metastore $db=dbdemos_llm which is, as I understand it, calling another Python script...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @alaa migdady​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 1 kudos
Sandesh87
by New Contributor III
  • 3550 Views
  • 1 replies
  • 2 kudos

Apply a function across multiple smaller dataframes created from one big dataframe in Scala

The dataframe 'big_df' looks like the below:
| id | index | timestamp |
|:---- |:------:| -----:|
| abc | 1 | 11:00:00 |
| abc | 1 | 11:00:10 |
| abc | 1 | 11:00:20 |
| abc | 1 | 11:00:30 |
| abc | 1 | 11:00:40 |
| abc | 1 | 11:00:50 |
| abc | 2 | 11:01:00 |
| abc | 2 | 11:01:10 |
| abc | ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sandesh Puligundla​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
spabba
by New Contributor II
  • 1552 Views
  • 1 replies
  • 2 kudos

How to change the subject line of Databricks email alert notifications?

Currently, the subject of our error notification emails has the format below: <[AWS Account]> Error in run <RUN ID> of <Job Name>. In our current Databricks environment we have jobs for multiple environments, such as Dev/QA/UAT/Prod, and it is very hard for us to...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Sanath Pabba​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 2 kudos
