Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ruby8376
by Valued Contributor
  • 5432 Views
  • 8 replies
  • 1 kudos

Resolved! Anti-pattern: moving data from cloud to on-prem

Hi there, In my current project: Current status: Azure Databricks streaming jobs migrate JSON files from Kafka to the raw layer (Parquet files), then parsing logic is applied and 8 tables are created in the raw standardized layer. Requirement: Business team wants to...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

You could indeed use ADF to copy the data from cloud to on-prem. However, depending on the size of the data, this can take a while. I use the same pattern, but for aggregated processed data, which is not an issue at all. You could also look at Azure Syn...

  • 1 kudos
7 More Replies
MrT
by New Contributor II
  • 4358 Views
  • 3 replies
  • 3 kudos

Python databricks-sql-connector TLS issue: client tries to negotiate v1, which fails many times, then randomly tries to negotiate v1.3, which works

This issue oddly occurs only on an Azure Windows 10 VM. I don't have it on my workstation or my personal computer, so it seems to be related to host configuration. On the VM where the issue occurs, I have a simple Python script that connects to the Azure Databricks SQL en...
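
One way to narrow down a host-specific TLS problem like the one described above is to check, outside the connector, which protocol version the host's Python negotiates. The following is a minimal diagnostic sketch using only the standard library. It is not part of databricks-sql-connector, the function names are made up for illustration, and the hostname in the usage comment is a placeholder.

```python
import socket
import ssl


def make_tls_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Build a client context that refuses protocol versions below `minimum`."""
    context = ssl.create_default_context()
    context.minimum_version = minimum
    return context


def negotiated_tls_version(host, port=443, timeout=10.0):
    """Connect to host:port and return the negotiated protocol name."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.version()


# Usage (placeholder hostname, replace with the workspace host):
# print(negotiated_tls_version("<workspace-hostname>"))
```

If this handshake fails or reports an older protocol only on the affected VM, that would point toward host configuration rather than the script itself.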

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hello @Wayne Theron, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Th...

  • 3 kudos
2 More Replies
Karthe
by New Contributor III
  • 23847 Views
  • 4 replies
  • 2 kudos

I would like to access S3 data in Databricks

Hi all, I am new to Databricks. I am trying to get data from S3. The video tutorials on the streaming platforms access it via an access ID and secret access key. However, Databricks is showing different options. I don't know what to fill...

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @Karthikeyan Palanisamy​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

  • 2 kudos
3 More Replies
Cano
by New Contributor III
  • 18736 Views
  • 15 replies
  • 0 kudos

Connecting Databricks Spark Cluster to PostgreSQL RDS Instance

I am trying to connect my Spark cluster to a PostgreSQL RDS instance. The Python notebook code that was used is shown below: df = ( spark.read \ .format("jdbc") \ .option("url", "jdbc:postgresql://<connection-string>:5432/database") \ .option("dbt...

Latest Reply
User16873043099
Databricks Employee
  • 0 kudos

"Caused by: java.net.SocketTimeoutException: connect timed out" indicates that the network connection between the Databricks cluster and the PostgreSQL database on port 5432 was not established and eventually timed out. As a first step, please ensure the connect...
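
A quick way to test the reachability described in this reply is a plain TCP probe run from a notebook cell on the cluster. This is a minimal sketch using the standard library. The function name and the host placeholder are illustrative, and a successful probe only shows that the port is reachable, not that the JDBC credentials or database name are correct.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


# Usage (placeholder host, replace with the RDS endpoint):
# print(can_connect("<rds-endpoint>", 5432))
```

A result of False from the cluster, while the same probe succeeds elsewhere, would be consistent with a network path or firewall issue between the cluster and the database.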

  • 0 kudos
14 More Replies
aj19
by New Contributor
  • 5607 Views
  • 1 reply
  • 0 kudos

How to trigger Azure Logic App from Azure Databricks?

I have an Azure Logic App that triggers whenever an HTTP POST request is received. I want to send this request from a notebook in my Azure Databricks workspace using Scala and Spark. Is it possible? If yes, please guide me on how to do it. T...
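
Sending an HTTP POST from a notebook does not require Spark itself, since it is an ordinary HTTP call made from the driver. The poster asked about Scala. The sketch below shows the same idea in Python with the standard library only. The function names, payload fields, and URL are hypothetical, and the real trigger URL would be the one generated for the Logic App's HTTP request trigger.

```python
import json
import urllib.request


def build_trigger_request(url, payload):
    """Create a JSON POST request for an HTTP-triggered workflow endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger(url, payload, timeout=30.0):
    """Send the request and return the HTTP status code."""
    request = build_trigger_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Usage (placeholder URL and payload):
# status = trigger("<logic-app-trigger-url>", {"job": "nightly-load", "state": "done"})
```

Because the trigger URL usually embeds an access signature, it is better kept in a secret store than hard-coded in the notebook.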

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hey there @Ayushri Jain​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

  • 0 kudos
andrej
by New Contributor II
  • 3759 Views
  • 4 replies
  • 1 kudos

Partition pruning with generated columns

I have a large table that contains a date_time column. The table contains 2 generated columns, year and month, which are extracted from the date_time values and are used for partitioning. I have the following question. If I run the query SELECT * FROM tab...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Andrej Znidarsic, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. T...

  • 1 kudos
3 More Replies
Vidyasankar
by New Contributor
  • 3195 Views
  • 2 replies
  • 1 kudos
Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Vidya sankar, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...

  • 1 kudos
1 More Replies
Vignesh2806
by New Contributor II
  • 10878 Views
  • 2 replies
  • 3 kudos

I would like to get the below error solved: Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero

I am trying to run the Databricks cluster, but at times the cluster takes a long time to get set up, and after some time it throws the below error: Cluster scoped init script dbfs:/FileStore/tables/***.sh failed: Script exit status is non-zero. The init scr...

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hi @Vignesh Ravichandran​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

  • 3 kudos
1 More Replies
chandan_a_v
by Valued Contributor
  • 10214 Views
  • 2 replies
  • 4 kudos

Best way to run a Databricks notebook in parallel

Hi All, I need to run a Databricks notebook in parallel for different arguments. I tried the threading approach, but only the first 2 threads successfully execute the notebook and the rest fail. Please let me know if there is any best way to...
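
One common pattern for this kind of fan-out is a bounded thread pool that collects failures per argument set instead of stopping at the first one. The sketch below is a generic helper under that assumption. `run_in_parallel` is a made-up name, and it does not explain why the poster's threads failed. Inside a Databricks notebook the callable would typically wrap `dbutils.notebook.run`, as shown in the trailing comment with a placeholder path and timeout.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_parallel(run_one, argument_sets, max_workers=4):
    """Call run_one(arguments) for each set; return (results, errors) keyed by index."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(run_one, arguments): index
            for index, arguments in enumerate(argument_sets)
        }
        for future in as_completed(futures):
            index = futures[future]
            try:
                results[index] = future.result()
            except Exception as exc:
                # Record the failure and keep going so one error does not hide the rest.
                errors[index] = exc
    return results, errors


# In a Databricks notebook, where `dbutils` is predefined (placeholder path and timeout):
# run_one = lambda arguments: dbutils.notebook.run("/path/to/notebook", 3600, arguments)
# results, errors = run_in_parallel(run_one, [{"day": "1"}, {"day": "2"}], max_workers=2)
```

Keeping `max_workers` small and inspecting `errors` per index makes it easier to see whether failures depend on concurrency or on specific arguments.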

Latest Reply
Vidula
Honored Contributor
  • 4 kudos

Hey there @Chandan Angadi, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 4 kudos
1 More Replies
lizou
by Contributor III
  • 3857 Views
  • 4 replies
  • 2 kudos

Call a saved query in a SQL warehouse

In Python cursor.execute, can you call a saved query with a parameter, like calling a stored procedure in a relational database? https://docs.microsoft.com/en-us/azure/databricks/dev-tools/python-sql-connector#cursor-method

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @lizou, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

  • 2 kudos
3 More Replies
BeginnerBob
by New Contributor III
  • 25977 Views
  • 4 replies
  • 2 kudos

Flatten a complex JSON file and load into a delta table

Hi, I am loading a JSON file into Databricks by simply doing the following: from pyspark.sql.functions import * from pyspark.sql.types import * bronze_path="wasbs://....../140477.json" df_incremental = spark.read.option("multiline","true").json(bronze_pat...
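
The core idea behind flattening is to walk the nested structure and turn each path into a single column name. The sketch below shows that idea in plain Python on an already-parsed record, not in Spark. The function name, separator, and sample fields are illustrative only. In PySpark the analogous steps are usually selecting struct fields by dotted path and applying `explode` to array columns.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single level; lists are kept as values."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the column-name prefix.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Usage with a made-up record:
# flatten({"id": 1, "meta": {"source": {"name": "api"}, "tags": ["a", "b"]}})
```

Arrays are deliberately left untouched here, because turning them into rows changes the row count and is a separate decision from renaming nested fields.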

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hi @Lloyd Vickery, does @Werner Stinckens' response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

  • 2 kudos
3 More Replies
Dan-K
by New Contributor III
  • 33043 Views
  • 6 replies
  • 6 kudos

Resolved! How to display markdown output in databricks notebook from a python cell

With IPython/Jupyter it's possible to output markdown using the IPython display module and its `Markdown` class. Question: How can I accomplish this with Azure Databricks? What I tried: Databricks `display`: Tried using Databricks' display with the IPython M...

Latest Reply
Debayan
Databricks Employee
  • 6 kudos

Hi, thanks for reaching out to community.databricks.com. In a notebook cell, type "%md" followed by some markdown and it will render. Please refer to: https://community.databricks.com/s/question/0D53f00001HKHhNCAX/markup-in-databricks-notebook

  • 6 kudos
5 More Replies
al_joe
by Contributor
  • 2414 Views
  • 1 reply
  • 0 kudos

Can we have a better UI for navigating Workspace and Repos?

Navigating through multiple vertical panes of information as we navigate deeper into a folder structure is not very convenient -- we lose the context of the parent folder and sibling folders very soon. Can we not have a simple tree view (similar to VS Cod...

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hey there @Al Jo, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

  • 0 kudos
dbrick
by New Contributor II
  • 1971 Views
  • 1 reply
  • 1 kudos

Multiple Jobs with different resource requirements on the same cluster

I have a big cluster with auto-scaling (min: 1, max: 25) enabled. I want to run multiple jobs on that cluster with different values of Spark properties (`--executor-cores` and `--executor-memory`), but I don't see any option to specify the sam...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Neelesh databricks, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell ...

  • 1 kudos
BananaHotSauce
by New Contributor III
  • 1336 Views
  • 1 reply
  • 3 kudos

Can I use PrivateLink and Customer Managed Policy for Cross Account Role

Hello, I'm trying to enable PrivateLink on my AWS Databricks quickstart. I use the customer-managed VPC policy for the cross-account role and supply it in the template. I'm getting an error that it cannot create a VPC endpoint. Do I need to change the...

Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi @Chris Joshua Manuel, thanks for reaching out to Community.databricks.com. Cross-account VPC access works. Please refer to: https://tomgregory.com/cross-account-vpc-access-in-aws/ Also, please let us know in case of any further clarification n...

  • 3 kudos
