Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Osky_Rosky
by New Contributor II
  • 10161 Views
  • 2 replies
  • 0 kudos

Combine Python + R in data manipulation in Databricks Notebook

Want to combine Py + R
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("CreateDataFrame").getOrCreate()
# Create a sample DataFrame
data = [("Alice", 25), ("Bob", 30), ("Charlie", 35), ("Oscar",36), ("Hiromi",41), ("Alejandro", ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Oscar CENTENO MORA: To combine Py and R in a Databricks notebook, you can use the magic commands %python and %r to switch between Python and R cells. Here's an example of how to create a Spark DataFrame in Python and then use it in R: from pyspark.sq...

  • 0 kudos
1 More Replies
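
The reply above is cut off, so the following is only a minimal sketch of one common way to hand a DataFrame from a %python cell to an %r cell: registering it as a temporary view. The helper name `publish_view`, the column names `name` and `age`, and the view name `people` are assumptions made for illustration; `spark` is the session a Databricks notebook already provides.

```python
# Cell 1 (%python): build a DataFrame and publish it as a temporary view.
def publish_view(spark, rows, columns, view_name):
    """Create a Spark DataFrame from rows and register it under view_name."""
    df = spark.createDataFrame(rows, columns)
    df.createOrReplaceTempView(view_name)  # visible to other languages in the same session
    return df

# In a notebook, `spark` already exists; the sample rows come from the question:
# publish_view(spark, [("Alice", 25), ("Bob", 30), ("Charlie", 35)],
#              ["name", "age"], "people")

# Cell 2 (%r): read the same view with SparkR (kept as comments because it is R code).
# %r
# library(SparkR)
# people <- sql("SELECT name, age FROM people WHERE age > 28")
# head(people)
```

The temporary view lives in the shared Spark session, which is why the R cell can query it without any file export.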
Phani1
by Valued Contributor II
  • 1329 Views
  • 3 replies
  • 0 kudos

Performance issue while loading bulk data into a PostgreSQL DB from Databricks.

We are facing a performance issue while loading bulk data into a PostgreSQL DB from Databricks. We are using Spark JDBC connections to move the data. However, the transfer rate is very low, which is causing a performance bottleneck. Is there any better...

Latest Reply
User16502773013
Contributor II
  • 0 kudos

Hello @Janga Reddy, @Daniel Sahal, and @Vidula Khanna, To enhance performance in general, we need to design for more parallelism. In the Spark JDBC context, this is controlled by the number of partitions of the data to be written. The example here shows how t...

  • 0 kudos
2 More Replies
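
The example the reply refers to is truncated, so here is a hedged sketch of the idea it describes: repartition the DataFrame before a JDBC write so that several connections insert in parallel. `numPartitions` and `batchsize` are standard Spark JDBC options; the helper names, the default values, and the connection details are assumptions and would need tuning against what the PostgreSQL server can absorb.

```python
def jdbc_write_options(url, table, user, password, partitions=8, batch_size=10000):
    """Build Spark JDBC writer options for a parallel, batched write."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
        "numPartitions": str(partitions),  # upper bound on concurrent JDBC connections
        "batchsize": str(batch_size),      # rows sent per insert round trip
    }


def write_to_postgres(df, options):
    """Repartition to the requested parallelism, then append over JDBC."""
    partitions = int(options["numPartitions"])
    (df.repartition(partitions)
       .write.format("jdbc")
       .options(**options)
       .mode("append")
       .save())

# Hypothetical usage in a notebook (URL, table, and credentials are placeholders):
# opts = jdbc_write_options("jdbc:postgresql://host:5432/db", "public.target",
#                           "user", "secret", partitions=16)
# write_to_postgres(df, opts)
```

More partitions only help until the database becomes the bottleneck, so the partition count is worth increasing gradually.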
Avvar2022
by Contributor
  • 2509 Views
  • 2 replies
  • 2 kudos

Resolved! I am new to Databricks. Setting up a workspace for a NON-prod environment: separate workspaces for DEV and QA, or just one workspace for NON-prod?

What I learned from learning materials, documents, etc.: for Databricks, it is good practice to set up one non-prod workspace but separate clusters for Dev, QA, SIT, etc. Is it best practice to set up only one NON-PROD workspace instead of separate ...

Databricks non-prod workspace set up options
Latest Reply
Avvar2022
Contributor
  • 2 kudos

Thank you. This helps.

  • 2 kudos
1 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 909 Views
  • 1 replies
  • 5 kudos

Exciting news for #azure users! The #databricks runtime 12.2 has been officially released as a long-term support (LTS) version, providing a stable and...

Exciting news for #azure users! The #databricks runtime 12.2 has been officially released as a long-term support (LTS) version, providing a stable and reliable platform for users to build and deploy their applications. As part of this release, the en...

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @Hubert Dudek, We express appreciation for the informative content you've contributed to our community. Your posts have sparked engaging discussions and proven invaluable resources for our members. You've truly made a difference in our community, an...

  • 5 kudos
Lu_Wang_SA_DBX
by New Contributor III
  • 875 Views
  • 1 replies
  • 2 kudos

We will host the first Databricks Bay Area User Group meeting in the Databricks Mountain View office on March 14 2:30-5:00pm PT. We'll have Dave Ma...

We will host the first Databricks Bay Area User Group meeting in the Databricks Mountain View office on March 14 2:30-5:00pm PT. We'll have Dave Mariani - CTO & Founder at AtScale, and Riley Phillips - Enterprise Solution Engineer at Matillion to shar...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Lu Wang, Thank you for your outstanding efforts in sharing important announcements within our community. Your proactive approach to communication has helped maintain a well-informed and cohesive work environment for our team. Your dedication to k...

  • 2 kudos
Hubert-Dudek
by Esteemed Contributor III
  • 1199 Views
  • 1 replies
  • 6 kudos

Exciting news for #azure users! The #databricks runtime 12.2 has been officially released as a long-term support (LTS) version, providing a stable and...

Exciting news for #azure users! The #databricks runtime 12.2 has been officially released as a long-term support (LTS) version, providing a stable and reliable platform for users to build and deploy their applications. As part of this release, the en...

Latest Reply
jose_gonzalez
Moderator
  • 6 kudos

Thank you for sharing @Hubert Dudek​ !!!

  • 6 kudos
Hubert-Dudek
by Esteemed Contributor III
  • 1277 Views
  • 1 replies
  • 7 kudos

Starting from #databricks 12.2 LTS, the explode function can be used in the FROM clause to manipulate data in new and powerful ways. This function ...

Starting from #databricks 12.2 LTS, the explode function can be used in the FROM clause to manipulate data in new and powerful ways. This function takes an array column as input and returns a new row for each element in the array, offering new pos...

Latest Reply
jose_gonzalez
Moderator
  • 7 kudos

Thank you for sharing @Hubert Dudek​ 

  • 7 kudos
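
As a rough illustration of the explode post above (the original animation is not preserved), the sketch below pairs a SQL statement that calls explode in the FROM clause with a plain-Python model of what it returns. The array literal and the helper names are assumptions; the SQL is meant to be run through the notebook's `spark` session on Databricks Runtime 12.2 LTS or later.

```python
# SQL using explode as a table-valued function in the FROM clause.
EXPLODE_IN_FROM = "SELECT * FROM explode(array(10, 20, 30))"


def explode_model(values):
    """Plain-Python model of explode: one single-column row per array element."""
    return [(value,) for value in values]


def run_explode(spark):
    """Run the query in a notebook where `spark` is available."""
    return spark.sql(EXPLODE_IN_FROM).collect()


# The model shows the shape of the result without needing a cluster.
rows = explode_model([10, 20, 30])
```

In a notebook, `run_explode(spark)` would be expected to return three rows, one per array element, matching the shape of `rows`.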
Lu_Wang_SA_DBX
by New Contributor III
  • 3670 Views
  • 1 replies
  • 3 kudos

We will host the first Databricks Bay Area User Group meeting in the Databricks Mountain View office on March 14 2:45-5:00 pm PT. We'll have Dave M...

We will host the first Databricks Bay Area User Group meeting in the Databricks Mountain View office on March 14 2:45-5:00 pm PT. We'll have Dave Mariani - CTO & Founder at AtScale, and Riley Phillips - Enterprise Solution Engineer at Matillion to sha...

Dave Mariani - CTO, AtScale; Riley Phillips - Enterprise Solution Engineer, Matillion
Latest Reply
amitabharora
New Contributor II
  • 3 kudos

Looking forward.

  • 3 kudos
andrcami1990
by New Contributor II
  • 5054 Views
  • 2 replies
  • 2 kudos

Resolved! Connect GraphQL to Databricks

Hi, I am new to Databricks; however, I need to expose data in the Delta Lake directly to GraphQL so that several applications can query it. Is there a GraphQL connector or something similar that works with Databricks?

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Andrew Camilleri, Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feed...

  • 2 kudos
1 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 884 Views
  • 1 replies
  • 5 kudos

Starting from #databricks runtime 12.2 LTS, implicit lateral column aliasing is now supported. This feature enables you to reuse an expression defined...

Starting from #databricks runtime 12.2 LTS, implicit lateral column aliasing is now supported. This feature enables you to reuse an expression defined earlier in the same SELECT list, thus avoiding repetition of the same calculation. For instance, in ...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Thanks for sharing this with the Databricks community.

  • 5 kudos
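
The post's own example is truncated, so the following is only a sketch of what lateral column aliasing looks like: the second SELECT expression reuses the alias `total` defined just before it. The column names, the inline `VALUES` rows, and the helper names are assumptions; the query targets Databricks Runtime 12.2 LTS or later.

```python
def lateral_alias_query():
    """SQL in which `doubled` reuses the alias `total` from the same SELECT list."""
    return (
        "SELECT price * quantity AS total, "
        "total * 2 AS doubled "
        "FROM VALUES (10, 3), (4, 5) AS t(price, quantity)"
    )


def repeated_expression_query():
    """Equivalent SQL without lateral aliasing: the calculation is written twice."""
    return (
        "SELECT price * quantity AS total, "
        "price * quantity * 2 AS doubled "
        "FROM VALUES (10, 3), (4, 5) AS t(price, quantity)"
    )


def expected_rows(pairs):
    """Plain-Python model of both queries: compute total once, then reuse it."""
    result = []
    for price, quantity in pairs:
        total = price * quantity
        result.append((total, total * 2))
    return result

# In a notebook: spark.sql(lateral_alias_query()).show()
```

Both queries describe the same result; the lateral alias version simply avoids repeating `price * quantity`.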
Rishabh-Pandey
by Esteemed Contributor
  • 1258 Views
  • 2 replies
  • 5 kudos

"Hey everyone, it seems like there's some confusion about enhanced autoscaling in Databricks lately. If you're feeling lost or unsure abo...

"Hey everyone, it seems like there's some confusion about enhanced autoscaling in Databricks lately. If you're feeling lost or unsure about how it works, don't worry - you're not" Enhanced autoscaling is a feature in Databricks that enables dynamic sc...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Very informative. Thanks for sharing!

  • 5 kudos
1 More Replies
Mado
by Valued Contributor II
  • 2838 Views
  • 2 replies
  • 0 kudos

Overwriting the existing table in Databricks: mechanism and history?

Hi, Assume that I have a Delta table stored on an Azure storage account. When new records arrive, I repeat the transformation and overwrite the existing table. (DF.write.format("delta").mode("overwrite").option("...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

The overwrite will add new files, keep the old ones, and keep track in a log of which data is current and which is old. If the overwrite fails, you will get an error message in the Spark program, and the data to be overwritten will still be the cur...

  • 0 kudos
1 More Replies
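
To make the mechanism in the reply above concrete, here is a toy model in plain Python. It is not Delta Lake's implementation, only an illustration of the described behaviour: an overwrite writes new files, leaves old ones in place, and a log records which files are current. The class name, file naming, and the `fail_before_commit` flag are all assumptions.

```python
class ToyDeltaTable:
    """Toy model: an overwrite never deletes files; the log decides what is current."""

    def __init__(self):
        self.files = {}  # file name -> rows (everything physically on storage)
        self.log = []    # one entry per committed version: names of current files

    def overwrite(self, rows, fail_before_commit=False):
        version = len(self.log)
        name = f"part-{version}.parquet"
        self.files[name] = list(rows)  # step 1: write the new data file
        if fail_before_commit:
            # A failure here leaves the log untouched, so readers still see old data.
            raise RuntimeError("write failed before commit")
        self.log.append([name])        # step 2: commit, making only the new file current
        return version

    def read(self, version=None):
        """Read the latest version, or an older one by version number."""
        if not self.log:
            return []
        current = self.log[-1 if version is None else version]
        return [row for name in current for row in self.files[name]]


table = ToyDeltaTable()
table.overwrite(["a", "b"])  # version 0
table.overwrite(["c"])       # version 1; the version 0 file is still on storage

# With a real Delta table, the history and an older version can be inspected with:
# spark.sql("DESCRIBE HISTORY delta.`/path/to/table`")
# spark.read.format("delta").option("versionAsOf", 0).load("/path/to/table")
```

The path `/path/to/table` in the comments is a placeholder. Because old files stay on storage until they are cleaned up, earlier versions remain readable after an overwrite.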
Hubert-Dudek
by Esteemed Contributor III
  • 675 Views
  • 1 replies
  • 5 kudos

Exciting news for Databricks users! #databricks launched a new feature that allows users to run job workflows continuously. Setting up a continuous jo...

Exciting news for Databricks users! #databricks launched a new feature that allows users to run job workflows continuously. Setting up a continuous job workflow is straightforward: create a job and select the continuous trigger option in the scheduli...

Latest Reply
jose_gonzalez
Moderator
  • 5 kudos

Thank you for sharing!!!

  • 5 kudos
Hubert-Dudek
by Esteemed Contributor III
  • 1681 Views
  • 3 replies
  • 7 kudos

Starting from #databricks runtime 12.2 LTS, implicit lateral column aliasing is now supported. This feature enables you to reuse an expression defined...

Starting from #databricks runtime 12.2 LTS, implicit lateral column aliasing is now supported. This feature enables you to reuse an expression defined earlier in the same SELECT list, thus avoiding repetition of the same calculation. For instance, in ...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 7 kudos

Informative. Thanks for sharing.

  • 7 kudos
2 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 1812 Views
  • 1 replies
  • 6 kudos

Variable explorer in Databricks: With Databricks Runtime 12.1 and above, you can directly observe current Python variables in the notebook UI. To open t...

Variable explorer in Databricks: With Databricks Runtime 12.1 and above, you can directly observe current Python variables in the notebook UI. To open the variable explorer, click in the right sidebar. The variable explorer opens, showing the value and ...

Latest Reply
jose_gonzalez
Moderator
  • 6 kudos

Thank you for sharing

  • 6 kudos