Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826992666
by Valued Contributor
  • 1004 Views
  • 1 reply
  • 0 kudos

Where can I find the tables I created in my Delta Live Tables pipeline?

I created several tables in my DLT pipeline but didn't specify a location to save them on creation. The pipeline seems to have run, but I don't know where the tables actually are. How can I find them?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Check out the storage configuration under settings. If you didn't specify the storage setting, the system defaults to a location under dbfs:/pipelines/
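A quick way to see what's there (a minimal sketch, assuming the documented default root; each subfolder is named for a pipeline's ID):

# List the default DLT storage root in a Databricks notebook.
for entry in dbutils.fs.ls("dbfs:/pipelines/"):
    print(entry.path)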

User16826987838
by Contributor
  • 1166 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

Yes, in your write stream you can save it as a table in the Delta format without a problem. In DBR 8, the default table format is Delta. See this code; please note that the "..." is supplied to show that additional options may be required: df.writeSt...
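A minimal sketch of that pattern (not the reply's exact code; df is assumed to be a streaming DataFrame, and the checkpoint path and table name are placeholders):

(df.writeStream
   .format("delta")                                            # the default on DBR 8+
   .option("checkpointLocation", "/tmp/checkpoints/my_table")  # placeholder path
   .outputMode("append")
   .toTable("my_table"))                                       # placeholder table name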

User16826992666
by Valued Contributor
  • 1955 Views
  • 1 reply
  • 0 kudos

When using Delta Live Tables, how do I set a table to be incremental vs complete using Python?

When using SQL, I can use the Create Live Table command and the Create Incremental Live Table command to set the run type I want the table to use. But I don't seem to have the same syntax for Python. How can I set this table type while using Python?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

The documentation at https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-user-guide.html#mixing-complete-tables-and-incremental-tables has an example: the first two functions load data incrementally and the last one loads...
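In short, a hedged sketch of the pattern in those docs (source path and column names are made up): a function that returns a streaming read is treated as incremental, while one that returns a batch read is treated as complete.

import dlt

@dlt.table
def events_incremental():
    # A streaming read => the live table is updated incrementally.
    return spark.readStream.format("json").load("/data/events")  # placeholder source

@dlt.table
def events_summary_complete():
    # A batch read via dlt.read => the table is fully recomputed on each update.
    return dlt.read("events_incremental").groupBy("event_type").count()  # placeholder column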

User16826992666
by Valued Contributor
  • 1983 Views
  • 1 reply
  • 0 kudos

Is it possible to disable the maintenance job associated with a Delta Live Table?

After creating my Delta Live Table and running it once, I notice that the maintenance job that was created along with it continues to run at the scheduled time. I have not made any updates to the DLT, so the maintenance job theoretically shouldn't ha...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

You could change the table properties of the associated tables to disable automatic scheduled optimizations. More details at https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-language-ref.html#table-properties
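In a Python pipeline it would look roughly like this (a sketch; the property key is the one described in those docs, so verify it there):

import dlt

@dlt.table(
    table_properties={
        # Disables the scheduled auto-optimization for this table.
        "pipelines.autoOptimize.managed": "false"
    }
)
def my_table():
    return spark.read.table("source_table")  # placeholder source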

User16790091296
by Contributor II
  • 2498 Views
  • 1 reply
  • 1 kudos

Secrets in databricks

I created a secret on Databricks using the Secrets API. Code:

Scope_name = {"scope": "dbtest", "initial_manage_principal": "user"}
Resp = requests.post('https://instancename.net/mynoteid/api/2.0/secrets/scopes/create', json=Scope_name)

In a similar way, I adde...

Latest Reply
aladda
Databricks Employee
  • 1 kudos

You'll have to specify the scope and the key in the format below to get the value: dbutils.secrets.get(scope="dbtest", key="user"). Probably a good idea to review the Secret Management documentation for details on how to get this set up the right way - ...
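Putting the whole flow together (a hedged sketch based on the post; the host and token are placeholders): secrets are written through the REST API, but read back in a notebook via dbutils.

import requests

host = "https://instancename.net"              # placeholder from the post
headers = {"Authorization": "Bearer <token>"}  # hypothetical personal access token

# Store a secret named "user" in the "dbtest" scope.
requests.post(f"{host}/api/2.0/secrets/put",
              json={"scope": "dbtest", "key": "user", "string_value": "..."},
              headers=headers)

# Read it back from a notebook.
value = dbutils.secrets.get(scope="dbtest", key="user")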

User16826992666
by Valued Contributor
  • 1567 Views
  • 1 reply
  • 0 kudos

Resolved! MLflow Model Serving latency expectations

What kind of latency should I expect when using the built-in model serving capability in MLflow? Evaluating whether it would be a good fit for our use case.

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

What are your throughput requirements in addition to latency? Currently this is in private preview, and Databricks recommends it only for low-throughput and non-critical applications. However, as it moves towards GA, this would change. Please get in...

brickster_2018
by Databricks Employee
  • 1159 Views
  • 1 reply
  • 1 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

val oldestVersionAvailable = 
val newestVersionAvailable = 
val pathToDeltaTable = ""
val pathToFileName = ""
(oldestVersionAvailable to newestVersionAvailable).map { version =>
  var df1 = spark.read.json(f"$pathToDeltaTable/_delta_log/$version%0...

User16826992666
by Valued Contributor
  • 2638 Views
  • 1 reply
  • 1 kudos

Trying to write my dataframe out as a tab separated .txt file but getting an error

When I try to save my file I get: org.apache.spark.sql.AnalysisException: Text data source supports only a single column, and you have 2 columns. Is there any way to save a dataframe with more than one column to a .txt file?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

Would pyspark.sql.DataFrameWriter.csv work? You could specify the separator (sep) as tab: df.write.csv(os.path.join(tempfile.mkdtemp(), 'data'), sep='\t')
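A self-contained sketch of that suggestion (the sample dataframe and output path are made up):

import os
import tempfile

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
(df.write
   .option("sep", "\t")     # tab separator
   .option("header", True)  # optional header row
   .mode("overwrite")
   .csv(os.path.join(tempfile.mkdtemp(), "data")))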

brickster_2018
by Databricks Employee
  • 1517 Views
  • 1 reply
  • 1 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

%scala
display(
  spark.read.json("//path-to-delta-table/_delta_log/0000000000000000000x.json")
    .where("add is not null")
    .select("add.path")
)

jason_mcdonald
by New Contributor
  • 1632 Views
  • 2 replies
  • 0 kudos

Is there a way to set DBU or cost limits so I don't get an unexpected bill?

I'm wondering if there's a way to set a monthly budget and have my workloads stop running if I hit it.

Latest Reply
aladda
Databricks Employee
  • 0 kudos

Cluster Policies would help with this, not only from a cost-management perspective but also for standardization of resources across the organization, as well as simplification for a better user experience. You can find Best Practices on leveraging cluster pol...
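As a rough illustration (a sketch using the Cluster Policies API; the workspace URL, token, and limit are placeholders), a policy can cap the DBU rate of any cluster created under it. Note that this caps the spend rate per cluster rather than enforcing a literal monthly budget.

import json
import requests

policy = {
    # Caps any cluster created under this policy at 10 DBUs per hour.
    "dbus_per_hour": {"type": "range", "maxValue": 10}
}
requests.post("https://<workspace>/api/2.0/policies/clusters/create",
              headers={"Authorization": "Bearer <token>"},
              json={"name": "cost-capped", "definition": json.dumps(policy)})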

User16826992666
by Valued Contributor
  • 1642 Views
  • 1 reply
  • 0 kudos

What is the default location where dataframes are written if I don't specify a location?

If I save a dataframe without specifying a location, where will it end up?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

You can't save a dataframe without specifying a location. If you are using the saveAsTable API, the table will be created in the Hive warehouse location. The default location is /user/hive/warehouse.
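To see where a saved table actually landed (a minimal sketch; the table name is a placeholder):

df = spark.range(5)  # toy dataframe
df.write.mode("overwrite").saveAsTable("my_table")  # lands under the warehouse dir

# Inspect the resolved storage location (look for the "Location" row).
spark.sql("DESCRIBE TABLE EXTENDED my_table").show(truncate=False)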

User16826992666
by Valued Contributor
  • 1335 Views
  • 1 reply
  • 0 kudos

Why would I make a deep clone of a Delta table vs reading the table and writing a copy to a new location?

It seems like with both techniques I would end up with a copy of my table. Trying to understand when I should be using a deep clone.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

A deep clone is the recommended way, as it copies the table's metadata along with the data. A DEEP CLONE is also faster than the read-write approach.
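For reference, the SQL form run from Python (a sketch; the table names are placeholders):

spark.sql("""
  CREATE TABLE IF NOT EXISTS analytics.events_clone
  DEEP CLONE analytics.events
""")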

User16826992666
by Valued Contributor
  • 1408 Views
  • 1 reply
  • 0 kudos

How can I run OPTIMIZE on a table if I am streaming to it 24/7?

I have a table that I need to be continuously streaming into. I know it's best practice to run Optimize on my tables periodically. But if I never stop writing to the table, how and when can I run OPTIMIZE against it?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

If the streaming job is making blind appends to the Delta table, then it's perfectly fine to run an OPTIMIZE query in parallel. However, if the streaming job is performing MERGE or UPDATE, then it can conflict with the OPTIMIZE operations. In such cases w...
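So for the append-only case, something like this can run from a separate scheduled job while the stream keeps writing (a sketch; the table name is a placeholder):

# Runs safely alongside a blind-append stream; conflicts arise only with concurrent MERGE/UPDATE.
spark.sql("OPTIMIZE events_table")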

Anonymous
by Not applicable
  • 1664 Views
  • 1 reply
  • 0 kudos

DBFS Permissions

Is there permission control on the folder/file level in DBFS? E.g., if a team member uploads a file to /Filestore/Tables/TestData/testfile, could we mask permissions on TestData and/or testfile?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

DBFS does not support ACLs at this point.

