Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Yash_542965
by New Contributor II
  • 2243 Views
  • 1 reply
  • 0 kudos

DLT aggregation problem

I'm using SQL to perform aggregation operations in the gold layer of a DLT pipeline. However, I'm encountering an error when running the pipeline while attempting to return a DataFrame using spark.sql. Could anyone please assist me with the SQL...

Latest Reply
lucasrocha
Databricks Employee
  • 0 kudos

Hello @Yash_542965, I hope this message finds you well. Could you please share a sample of the code you are using so that we can check it further? Best regards, Lucas Rocha

vijaykumarbotla
by New Contributor III
  • 2375 Views
  • 1 reply
  • 0 kudos

Databricks Notebook error : Analysis Exception with multiple datasets

Hi All, I am getting the below error when trying to execute the code. AnalysisException: Column Is There a PO#17748 are ambiguous. It's probably because you joined several Datasets together, and some of these Datasets are the same. This column points to ...

Latest Reply
lucasrocha
Databricks Employee
  • 0 kudos

Hello @vijaykumarbotla, I hope you're doing well. This is probably because both DataFrames contain a column with the same name, and Spark is unable to determine which one you are referring to in the select statement. To resolve this issue, you can u...

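A minimal sketch of the disambiguation the reply describes: keep a reference to each DataFrame and select columns through those references, so Spark knows which side of the join a same-named column comes from. Table and column names here are hypothetical, and the function only does real work where a Spark session exists (e.g. a Databricks notebook).

```python
# Hypothetical sketch of the aliasing approach; not the poster's actual code.
def disambiguated_join(spark):
    orders = spark.table("orders").alias("o")
    details = spark.table("order_details").alias("d")
    joined = orders.join(details, orders["po_number"] == details["po_number"])
    # Selecting via orders[...] / details[...] avoids the ambiguous-column error
    return joined.select(orders["po_number"], details["quantity"])
```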
User16752244127
by Databricks Employee
  • 1699 Views
  • 1 reply
  • 0 kudos
Latest Reply
lucasrocha
Databricks Employee
  • 0 kudos

Hello @User16752244127, I hope this message finds you well. Delta Live Tables supports loading data from any data source supported by Databricks. You can find the supported data sources under Connect to data sources, and JDBC is one of them. You can a...

Sambit_S
by New Contributor III
  • 1398 Views
  • 1 reply
  • 0 kudos

Exceptions are Not Getting Handled In Autoloader Write Stream

I have the logic below implemented with Databricks Auto Loader. The Auto Loader write stream calls a forEachBatch function to write into the respective catalog table for each data type, and uses a checkpoint to keep track of processed files. try: ##Observe raw ...

Latest Reply
raphaelblg
Databricks Employee
  • 0 kudos

Hello @Sambit_S, in your scenario there is a merge failure, and your query won't be able to progress because the problematic batch can't be committed to the sink. Even if you handle the exception in a try/catch block, it's impossible for the Auto Loader to update...

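One common pattern for this situation (an assumption based on general streaming practice, not necessarily the resolution the reply goes on to give): catch failures inside the foreachBatch function itself, since a try/except wrapped around writeStream.start() cannot intercept per-batch errors. `merge_into_target` and `quarantine` are hypothetical callables standing in for the real merge and dead-letter logic.

```python
# Handle per-batch failures inside foreachBatch so the stream can keep advancing.
def process_batch(batch_df, batch_id, merge_into_target, quarantine):
    try:
        merge_into_target(batch_df)
    except Exception as exc:  # e.g. a failed Delta MERGE
        # Divert the bad batch aside so the checkpoint can advance
        quarantine(batch_df, batch_id, str(exc))

# On Databricks the wiring would look like:
# (stream.writeStream
#        .foreachBatch(lambda df, i: process_batch(df, i, merge_fn, quarantine_fn))
#        .option("checkpointLocation", checkpoint_path)
#        .start())
```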
tgen
by New Contributor II
  • 2754 Views
  • 1 reply
  • 0 kudos

Increase stack size Databricks

Hi everyone, I'm currently running a shell script in a notebook and encountering a segmentation fault due to the stack size limitation. I'd like to increase the stack size using ulimit -s unlimited, but I'm facing issues with setting this...

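This thread has no reply, but the usual cluster-level approach (an assumption from common practice, not a confirmed answer here) is a cluster init script that runs `ulimit -s unlimited` before the offending process starts, since each `%sh` cell runs in its own fresh shell. From the Python side of a notebook, the per-process equivalent is the stdlib `resource` module, sketched below.

```python
# Raise the soft stack limit for the current process (and its children) up to the
# hard limit; no special privileges are needed for this direction of change.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
resource.setrlimit(resource.RLIMIT_STACK, (hard, hard))
new_soft, _ = resource.getrlimit(resource.RLIMIT_STACK)
```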
satishnavik
by New Contributor II
  • 14751 Views
  • 5 replies
  • 0 kudos

How to connect Databricks Database with Springboot application using JPA

Facing an issue integrating our Spring Boot JPA application with Databricks. Below are the steps and settings we used for the integration. When we start the Spring Boot application we get a warning: HikariPool-1 - Driver doe...

Latest Reply
172036
New Contributor II
  • 0 kudos

Was there any resolution to this?  Is Spring datasource supported now?

4 More Replies
djburnham
by New Contributor III
  • 5989 Views
  • 2 replies
  • 1 kudos

Resolved! How to get a list of workspace users who have the "unrestricted cluster create" entitlement ?

Hello - I'm hoping somebody can help me with this. I have hundreds of users configured with access to a workspace, and I want to write a report on whether any of them have the "unrestricted cluster create" entitlement in the workspace. This i...

Latest Reply
djburnham
New Contributor III
  • 1 kudos

Many thanks for your help @Yeshwanth - it put me on the right track. The API does have a filter option that appears to comply with RFC 7644, but my attempts to use it were rather hit and miss - I suspect that, as the API is in preview, it is not fully imp...

1 More Replies
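The filtered SCIM query the thread discusses can be sketched as below, using the preview SCIM Users endpoint and an RFC 7644 filter expression. The workspace URL and token are placeholders, and only the request-building helper runs here; the commented lines show how it would be used.

```python
# Build a SCIM request listing users holding the "allow-cluster-create" entitlement.
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_entitlement_query(host, token):
    params = urlencode({
        "filter": 'entitlements.value eq "allow-cluster-create"',
        "attributes": "userName,entitlements",
    })
    return Request(
        f"{host}/api/2.0/preview/scim/v2/Users?{params}",
        headers={"Authorization": f"Bearer {token}"},
    )

# req = build_entitlement_query("https://<workspace-url>", "<token>")
# users = json.load(urlopen(req)).get("Resources", [])
```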
Anonymous
by Not applicable
  • 10454 Views
  • 11 replies
  • 2 kudos

Sql Serverless Option is missing when using Azure Databricks Workspace with No Public IP and VNET Injection

Hello, after creating a Databricks workspace in Azure with No Public IP and VNet injection, I'm unable to use DBSQL Serverless because the option to enable it in the SQL warehouse settings is missing. Is it by design? Is it a limitation when using Privat...

Latest Reply
RomanLegion
New Contributor III
  • 2 kudos

Fixed, go to Profile -> Compute->  SQL Server Serverless -> On -> Save. For some reason this has been disabled for us.

10 More Replies
jenshumrich
by Contributor
  • 5375 Views
  • 1 reply
  • 0 kudos

Resolved! R install - cannot open URL

Neither the standard nor a non-standard repo seems available. Any idea how to debug/fix this? %r install.packages("gghighlight", lib="/databricks/spark/R/lib", repos = "http://cran.us.r-project.org") Warning: unable to access index for repository http://cra...

Latest Reply
jenshumrich
Contributor
  • 0 kudos

It was a network issue. The command %sh nc -zv cran.us.r-project.org 80 proved it, and the network administrators had to open the IPs.

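The same reachability check as `nc -zv host 80` can be done from Python with the stdlib, which helps on clusters where shell access is restricted. A small sketch:

```python
# TCP reachability probe, equivalent in spirit to `nc -zv host port`.
import socket

def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# can_reach("cran.us.r-project.org", 80)  # False until the network team opens the IPs
```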
BobBubble2000
by New Contributor II
  • 5643 Views
  • 4 replies
  • 0 kudos

Delta Live Tables with Common Data Model as source

Hi, I'm investigating whether it's possible to use the Common Data Model (CDM) - in particular the CSV and .cdm files exported by Dynamics 365 - as a Delta Live Tables data source. Can someone point me in the right direction? Thanks!

Latest Reply
Suryanarayan
New Contributor II
  • 0 kudos

Using Delta Live Tables with the Common Data Model (CDM) as a source in Databricks: I'm investigating the use of Delta Live Tables (DLT) to process CDM files exported from Dynamics 365, and I found a solution that works well. Here's a q...

3 More Replies
Jackson1111
by New Contributor III
  • 1365 Views
  • 3 replies
  • 1 kudos

get job detail API

Hello, is there an API interface for passing in batches of run_id to obtain job running details?

Latest Reply
mhiltner
Databricks Employee
  • 1 kudos

Maybe this could help. It's not batch, but you can get the run_id details: https://docs.databricks.com/en/workflows/jobs/jobs-2.0-api.html#runs-get-output

2 More Replies
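Since there is no bulk endpoint, one way to cover a batch of run_ids is simply to loop over the runs/get call. The sketch below uses the 2.1 Jobs API path (the linked docs describe 2.0); host and token are placeholders, and only the request-building helper is exercised here.

```python
# Fetch job run details for many run_ids by looping over /jobs/runs/get.
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_run_request(host, token, run_id):
    query = urlencode({"run_id": run_id})
    return Request(
        f"{host}/api/2.1/jobs/runs/get?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )

def get_run_details(host, token, run_ids):
    # One HTTP call per run_id; the API has no batch variant
    return [json.load(urlopen(build_run_request(host, token, r))) for r in run_ids]
```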
eva_mcmf
by New Contributor II
  • 1660 Views
  • 1 reply
  • 0 kudos

Autoloader with SQLite db files

Hi Everyone, Is it possible to ingest SQLite db files with Databricks Autoloader? Is it currently supported? If so, could you please share an example?

Data Engineering
autoloader
azure
ingestion
sqlite
Latest Reply
lucasrocha
Databricks Employee
  • 0 kudos

Hello @eva_mcmf, I hope this message finds you well. As per the documentation, Auto Loader incrementally and efficiently processes new data files as they arrive in cloud storage. Auto Loader can load data files from AWS S3, Azure Data Lake Storage G...

Chengcheng
by New Contributor III
  • 4002 Views
  • 1 reply
  • 0 kudos

The default location of temporary file in Azure Synapse Connector(com.databricks.spark.sqldw)

Hi everyone, I'm trying to query data in an Azure Synapse Dedicated SQL Pool according to the documentation, using .format("com.databricks.spark.sqldw") (Query data in Azure Synapse Analytics). It says that an abfss temporary location is needed. But I found that...

Data Engineering
Azure Synapse Connector
Data Ingestion
JDBC
Latest Reply
lucasrocha
Databricks Employee
  • 0 kudos

Hello @Chengcheng, I hope this message finds you well. As per the documentation, the tempDir parameter is required and has no default value. Databricks Synapse connector options reference: https://docs.databricks.com/en/connect/ext...

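A sketch of what a read through the Synapse connector looks like with the required tempDir supplied; every value below is a placeholder to adapt, not a working endpoint, and the actual read only runs on a cluster where `spark` is defined.

```python
# Options for the Azure Synapse connector; tempDir is required and has no default.
synapse_options = {
    "url": "jdbc:sqlserver://<server>.sql.azuresynapse.net:1433;database=<db>",
    "tempDir": "abfss://<container>@<account>.dfs.core.windows.net/synapse-tmp",
    "forwardSparkAzureStorageCredentials": "true",
    "dbTable": "<schema>.<table>",
}

# On a cluster:
# df = (spark.read
#            .format("com.databricks.spark.sqldw")
#            .options(**synapse_options)
#            .load())
```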
PabloCSD
by Valued Contributor II
  • 3225 Views
  • 4 replies
  • 1 kudos

Resolved! My Libraries are not being installed in dbx-pipelines

Hello, I have some libraries on Azure Artifacts, but when I'm using notebooks they are unreachable even though I'm explicitly adding the pip extra-url option (I have validated the tokens). So I had to install them manually by downloading the wheel f...

Data Engineering
Databricks
dbx
Latest Reply
PabloCSD
Valued Contributor II
  • 1 kudos

@shan_chandra we solved it - it was an issue with the DevOps key-vault token associated with the artifacts token.

3 More Replies
AH
by New Contributor III
  • 1056 Views
  • 1 replies
  • 0 kudos

Resolved! Delta Lake Table Daily Read and Write job optimization

I have created 7 jobs, one per business system, to extract product data from each Postgres source and write all of it into one data lake Delta table [raw_product]. Each business system's product table has around 20 GB of data. Do the same thing for 15...

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@AH - we can try out the config. If the read/fetch from Postgres is slow, we can increase fetchsize and numPartitions (to increase parallelism). Kindly do a df.count() to check on the slowness. https://spark.apache.org/docs/latest/sql-data-sou...

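The tuning shan_chandra suggests can be sketched as the JDBC options below. The connection details, partition column, and bounds are placeholders to adapt per source table; the actual read only runs on a cluster.

```python
# JDBC read options for a Postgres source, tuned for parallelism and fetch speed.
jdbc_options = {
    "url": "jdbc:postgresql://<host>:5432/<db>",
    "dbtable": "public.product",
    "fetchsize": "10000",             # rows per round trip; raise if fetch is slow
    "numPartitions": "8",             # parallel read tasks
    "partitionColumn": "product_id",  # numeric column to split on (hypothetical)
    "lowerBound": "1",
    "upperBound": "10000000",
}

# On a cluster:
# df = spark.read.format("jdbc").options(**jdbc_options).load()
# df.count()  # quick end-to-end read-speed check
```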
