Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

sabooalex
by New Contributor II
  • 681 Views
  • 0 replies
  • 0 kudos

SCD type2 snowflake

I have monthly files that arrive in an S3 bucket. I want to implement SCD Type 2 in Snowflake. I am OK with reading the new files and cleaning them. My question is about comparing what I have read from the files with what is already stored in the Snowflake table (milli...

Anonymous
by Not applicable
  • 881 Views
  • 3 replies
  • 0 kudos

The Next Databricks Office Hours: Our next Office Hours session is scheduled for January 25, 2022 - 8:00 am PDT. Do you have questions about how to set u...

The Next Databricks Office Hours: Our next Office Hours session is scheduled for January 25, 2022 - 8:00 am PDT. Do you have questions about how to set up or use Databricks? Do you want to get best practices for deploying your use case or tips on data ar...

Latest Reply
Hanna0805050
New Contributor II
  • 0 kudos

Thank you for the opportunity to communicate. I work at https://www.eliteimagingsystems.com/ and know how important it is for our customers to be able to communicate with us 24/7.

2 More Replies
jwilliam
by Contributor
  • 2607 Views
  • 4 replies
  • 4 kudos

Resolved! What is the maximum of concurrent streaming jobs for a cluster?

What is the maximum number of concurrent streaming jobs for a cluster? How can I choose the right number of concurrent streaming jobs for different cluster configurations? Should I use multiple clusters for different jobs or combine them into one big cluster to hand...

Latest Reply
Kaniz_Fatma
Community Manager
  • 4 kudos

Hi @John William, we haven't heard from you since the last response from @Prabakar, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to others. Al...

3 More Replies
DipakBachhav
by New Contributor III
  • 8930 Views
  • 5 replies
  • 1 kudos

How to store SQL query result data on a local disk?

I am new to Databricks and am trying to write results to an Excel/CSV file using the command below, but I get a 'DataFrame' object has no attribute 'to_csv' error while executing. I am using a notebook to execute my SQL queries and now want to s...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi there @Dipak Bachhav, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from yo...

4 More Replies
Ruby8376
by Valued Contributor
  • 3076 Views
  • 9 replies
  • 1 kudos

Resolved! Anti pattern : moving data from cloud to on-prem

Hi there, in my current project: Current status: Azure Databricks streaming jobs move JSON files from Kafka to the raw layer (Parquet files), then parsing logic is applied and 8 tables are created in the raw standardized layer. Requirement: The business team wants to...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Ruby Rubi, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please do share it with the community as it can be helpful to ...

8 More Replies
MrT
by New Contributor II
  • 2335 Views
  • 3 replies
  • 3 kudos

Python databricks-sql-connector TLS issue - client tries to negotiate v1 which fails many times then randomly tries to negotiate v1.3 which works

Oddly, this issue occurs only on an Azure Windows 10 VM. I don't have it on my workstation or my personal computer, so it seems to be related to host configuration. On the VM where the issue occurs, I have a simple Python script that connects to the Azure Databricks SQL en...

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hello @Wayne Theron, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

2 More Replies
Karthe
by New Contributor III
  • 7127 Views
  • 4 replies
  • 2 kudos

I would like to access S3 data in databricks

Hi all, I am new to Databricks. I am trying to get data from S3. The video tutorials on streaming platforms access it via an access ID and secret access key. However, Databricks is showing different options. I don't know what to fill...

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @Karthikeyan Palanisamy, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...

3 More Replies
Cano
by New Contributor III
  • 11158 Views
  • 15 replies
  • 0 kudos

Connecting Databricks Spark Cluster to Postgresql RDS Instance

I am trying to connect my Spark cluster to a PostgreSQL RDS instance. The Python notebook code I used is shown below: df = ( spark.read \ .format("jdbc") \ .option("url", "jdbc:postgresql://<connection-string>:5432/database") \ .option("dbt...

Latest Reply
User16873043099
Contributor
  • 0 kudos

"Caused by: java.net.SocketTimeoutException: connect timed out" indicates that the network connection between the Databricks cluster and the PostgreSQL database on port 5432 was not established and eventually timed out. As a first step, please ensure the connect...
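
One way to check whether the cluster can reach the database port at all is a plain TCP probe from a notebook cell. The following is only a minimal sketch: `can_connect` is a hypothetical helper written for illustration (not part of any Databricks or PostgreSQL API), and the host in the usage comment is a placeholder you would replace with your own RDS endpoint.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


# Hypothetical usage from a notebook cell (placeholder host, assumed port 5432):
# print(can_connect("<postgres-host>", 5432))
```

If this probe fails from the cluster but succeeds from another machine, the cause is more likely network configuration (for example security groups, routing, or firewall rules) than the JDBC options themselves.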

14 More Replies
aj19
by New Contributor
  • 3385 Views
  • 2 replies
  • 0 kudos

How to trigger Azure Logic App from Azure Databricks?

I have an Azure Logic App that triggers whenever an HTTP POST request is received. I want to send this request from a notebook in my Azure Databricks workspace using Scala and Spark. Is it possible? If yes, then please guide me on how to do it. T...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hey there @Ayushri Jain, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from yo...

1 More Replies
andrej
by New Contributor II
  • 1921 Views
  • 4 replies
  • 1 kudos

Partition pruning with generated columns

I have a large table that contains a date_time column. The table also contains two generated columns, year and month, which are extracted from the date_time values and are used for partitioning. I have the following question. If I run the query SELECT * FROM tab...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Andrej Znidarsic, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

3 More Replies
VikasSinha
by New Contributor
  • 4102 Views
  • 2 replies
  • 0 kudos

Which is better - Azure Databricks or GCP Databricks?

Which cloud hosting environment is best to use for Databricks? My question boils down to the fact that there must be some difference in latency, throughput, result consistency, and reproducibility between the different cloud hosting environments of ...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Vikas Sinha, does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
Vidyasankar
by New Contributor
  • 1643 Views
  • 3 replies
  • 1 kudos
Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Vidya sankar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...

2 More Replies
Vignesh2806
by New Contributor II
  • 8574 Views
  • 2 replies
  • 3 kudos

I would like to get the below error solved Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero

I am trying to run a Databricks cluster, but at times the cluster takes a long time to start, and after some time it throws the error below: Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero. The init scr...

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hi @Vignesh Ravichandran, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from y...

1 More Replies
chandan_a_v
by Valued Contributor
  • 8778 Views
  • 2 replies
  • 4 kudos

Best way to run the Databricks notebook in a parallel way

Hi all, I need to run a Databricks notebook in parallel for different arguments. I tried a threading approach, but only the first 2 threads execute the notebook successfully and the rest fail. Please let me know if there is any best way to...

Latest Reply
Vidula
Honored Contributor
  • 4 kudos

Hey there @Chandan Angadi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
lizou
by Contributor II
  • 2398 Views
  • 4 replies
  • 2 kudos

call saved query in sql warehouse

In Python, with cursor.execute, can you call a saved query with a parameter, like calling a stored procedure in a relational database? https://docs.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector#cursor-method

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @lizou, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
