Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Prototype998
by New Contributor III
  • 6384 Views
  • 5 replies
  • 2 kudos

Resolved! reading multiple csv files using pathos.multiprocessing

I'm using PySpark and Pathos to read numerous CSV files and create many DataFrames, but I keep getting this problem. Code for the same:
from pathos.multiprocessing import ProcessingPool
def readCsv(path):
    return spark.read.csv(path, header=True)
csv_file_list =...
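A minimal sketch of one possible workaround. The excerpt is truncated, so the actual error is not shown; this assumes it is the common failure where the `SparkSession` cannot be sent to separate worker processes. Running the reads on driver-side threads avoids that. The names `read_all` and `read_one` are hypothetical and do not come from the post.

```python
from concurrent.futures import ThreadPoolExecutor


def read_all(paths, read_one, max_workers=4):
    """Apply read_one to every path using driver-side threads.

    Threads share the driver's SparkSession, so nothing has to be
    pickled and shipped to another process. Returns {path: result}.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its result.
        results = list(pool.map(read_one, paths))
    return dict(zip(paths, results))


# In a notebook where `spark` already exists (assumption), usage could be:
#   dfs = read_all(csv_file_list, lambda p: spark.read.csv(p, header=True))
# If a single combined DataFrame is enough, spark.read.csv also accepts
# a list of paths: spark.read.csv(csv_file_list, header=True)
```

Separate DataFrames per file are kept here because the post asks for "many DF"; passing the whole list to one `spark.read.csv` call is simpler when the files share a schema.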

dbx_error
Latest Reply
Prototype998
New Contributor III
  • 2 kudos

@Ajay Pandey​ @Rishabh Pandey​ 

ratnakarsinha
by New Contributor II
  • 25089 Views
  • 3 replies
  • 0 kudos

How to get full result using DataFrame.Display method

Hi, the DataFrame display method in a Databricks notebook fetches only 1000 rows by default. Is there a way to change this default to display and download the full result (more than 1000 rows) in Python? Thanks, Ratnakar.

Latest Reply
ramravi
Contributor II
  • 0 kudos

The display method doesn't have an option to choose the number of rows. Use the show method instead. Its output is not as neat, and you can't do visualizations or downloads with it.
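The suggestion above could look roughly like this. It is a sketch, not the replier's code: `show_rows` is a hypothetical helper, and the row limit of 5000 is an arbitrary illustration value.

```python
def show_rows(df, max_rows=5000):
    """Print up to max_rows rows of a Spark DataFrame as plain text.

    DataFrame.show takes the row count as `n`; truncate=False keeps
    long cell values from being cut off.
    """
    df.show(n=max_rows, truncate=False)


# Usage in a notebook (assumes a DataFrame named `df` exists):
#   show_rows(df, max_rows=2000)
```

As the reply notes, this prints text only, so the chart and download controls of `display` are not available.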

Trodenn
by New Contributor III
  • 10437 Views
  • 4 replies
  • 1 kudos

How to merge two separate DELTA LIVE TABLE?

So I have two Delta Live Tables: a master table that contains all the prior data, and another table that contains all the new data for that specific day. I want to be able to merge those two tables so that the master table contains would...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

@Rishabh Pandey​ 

Mahesh_789
by New Contributor II
  • 1247 Views
  • 0 replies
  • 1 kudos

While accessing the data on recipient side using delta_sharing.load_table_changes_as_spark(), it shows data of all versions.

When I tried to access a specific version's data by setting the argument values to a specific number, I got data for all versions.
data1 = delta_sharing.load_table_changes_as_spark(table_url, starting_version=1, ending_version=1)
data2 = delta_sharing.load_table...
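A possible recipient-side workaround, sketched under assumptions: change-feed results normally carry a `_commit_version` column, so the returned DataFrame can be filtered explicitly. Whether that column is present for this share is not confirmed by the post, and `only_versions` is a hypothetical helper.

```python
def only_versions(changes_df, start, end):
    """Keep only change rows whose commit version lies in [start, end].

    Assumes the DataFrame has a `_commit_version` column, as change
    data feed results usually do.
    """
    start, end = int(start), int(end)  # guard against non-numeric input
    return changes_df.filter(f"_commit_version BETWEEN {start} AND {end}")


# Usage (assumes `data1` was loaded as in the post):
#   v1_only = only_versions(data1, 1, 1)
```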

kmckee
by New Contributor II
  • 1583 Views
  • 0 replies
  • 1 kudos

Trouble Displaying Full Size Images from Spark Dataframe

Hi, I have followed this guide (https://learn.microsoft.com/en-us/azure/databricks/_static/notebooks/image-data-source.html) to successfully load some image data into a spark df and display it as a thumbnail. I would like to display a single image fr...

weldermartins
by Honored Contributor
  • 5269 Views
  • 3 replies
  • 6 kudos

Resolved! Function When + Dictionary.

Hey everyone, I'm trying to avoid repeating the when function 12 times, so I thought of using a dictionary. I don't know whether it's a limitation of the Spark function or a logic error. Does the function allow this concatenation?

Latest Reply
weldermartins
Honored Contributor
  • 6 kudos

Hello everyone, I found this alternative to reduce repeated code:
custoDF = (custoDF.withColumn('month', col('Nummes').cast('string'))
    .replace(months, subset=['month']))
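For readers following along, here is a self-contained sketch of the same idea. The contents of `months` are not shown in the post, so the mapping below (string month numbers to English month names) is an assumption, and `add_month_name` is a hypothetical wrapper.

```python
import calendar

# Assumed mapping: "1".."12" -> month name. Keys are strings because the
# numeric column is cast to string before the replacement.
months = {str(i): calendar.month_name[i] for i in range(1, 13)}


def add_month_name(df, source_col="Nummes", target_col="month"):
    """Add target_col holding the month name for the number in source_col."""
    # Imported lazily so the mapping above can be built without Spark installed.
    from pyspark.sql.functions import col

    return (
        df.withColumn(target_col, col(source_col).cast("string"))
        # DataFrame.replace swaps whole values, so "1" does not touch "11".
        .replace(months, subset=[target_col])
    )


# Usage (assumes `custoDF` has a numeric column named 'Nummes'):
#   custoDF = add_month_name(custoDF)
```

A single dictionary lookup replaces the twelve chained `when` calls, which is the reduction in repeated code the reply describes.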

sfalquier
by New Contributor II
  • 3584 Views
  • 3 replies
  • 0 kudos

HTTP 403 on git-credentials API

Hi, I am trying to set Git credentials for my service principal. I followed the process described here, but I get a 403 error when making the POST request to ${DATABRICKS_HOST}/api/2.0/git-credentials with the service principal token. By the way, I also canno...
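For reference, a sketch of how such a request can be assembled. The endpoint path comes from the post; the body field names (`git_provider`, `git_username`, `personal_access_token`) follow the Git Credentials API as I understand it and should be checked against the current documentation. The function name and all example values are hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_git_credentials_request(host, token, provider, username, git_pat):
    """Build (but do not send) a POST to /api/2.0/git-credentials."""
    body = json.dumps(
        {
            "git_provider": provider,
            "git_username": username,
            "personal_access_token": git_pat,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/git-credentials",
        data=body,
        method="POST",
        headers={
            # Token of the service principal, as described in the post.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Sending it (a 403 would surface as urllib.error.HTTPError):
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read())
```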

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 0 kudos

Hi @Sébastien FALQUIER, it works for me; there are no restrictions. Maybe the PAT you generated for the service principal has expired. Can you generate a new token and try to run the GET /git-credentials API? How are you creating PAT for service prin...

martcerv
by New Contributor II
  • 4524 Views
  • 4 replies
  • 2 kudos

Cloud provider launch failure

When I try to create a cluster, I get this error message:
Details
AWS API error code: InvalidGroup.NotFound
AWS error message: The security group 'sg-0ded75eefd66bf421' does not exist in VPC 'vpc-0ec7da3d5977f6ec9'
And when I inspect the security groups ...

Latest Reply
AminChad_22427
New Contributor II
  • 2 kudos

Hi, I am running into a similar issue, but in my case the security group was deleted by mistake. Is there a way to make Databricks recreate the missing group? @Kaniz Fatma, where can the CreateSecurityGroup command be run? Does it change the securi...

sudhanshu1
by New Contributor III
  • 1076 Views
  • 0 replies
  • 0 kudos

Structured Streaming

I need a solution for the problem below. We have a set of JSON files that keep arriving in AWS S3; these files contain details for a property. Please note that one property can have 10-12 rows in this JSON file. Attached is a sample JSON file. We need to read...

KVNARK
by Honored Contributor II
  • 5241 Views
  • 4 replies
  • 13 kudos

Resolved! To practice Databricks SQL

Is there any kind of sandbox where we can get hands-on practice with Databricks SQL or run notebooks attached to clusters, apart from the free trial provided by Databricks?

Latest Reply
Harun
Honored Contributor
  • 13 kudos

The Databricks SQL workspace is available only with the Databricks Premium service. If you have an Azure Pass subscription, you can use it to get access for practice.

Jyo777
by Contributor
  • 2345 Views
  • 4 replies
  • 0 kudos

Hi, Has anyone cleared professional DE? please advise on professional data engineer exam. will advance DE learning path be sufficient? Or need to fol...

Hi, has anyone cleared the professional DE exam? Please advise on the professional data engineer exam. Will the advanced DE learning path be sufficient, or do I need to follow some other resources as well?

Latest Reply
youssefmrini
Databricks Employee
  • 0 kudos

Hello, have a look at this link: http://msdatalab.net/how-to-pass-the-professional-databricks-data-engineering/

architect
by New Contributor
  • 2306 Views
  • 1 replies
  • 0 kudos

Does Databricks provide a mechanism to have rate limiting for receivers?

from pyspark.sql import SparkSession

scala_version = '2.12'
spark_version = '3.3.0'

packages = [
    f'org.apache.spark:spark-sql-kafka-0-10_{scala_version}:{spark_version}',
    'org.apache.kafka:kafka-clients:3.2.1'
]

spark = SparkSession.bui...
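One option that may be relevant, offered as a sketch rather than a confirmed answer: for the Structured Streaming Kafka source that the snippet pulls in, Spark documents a `maxOffsetsPerTrigger` option that caps how many offsets are read per micro-batch. It is not a receiver-level limit in the older DStream sense. The helper name and the example values are hypothetical.

```python
def kafka_source_options(bootstrap_servers, topic, max_offsets_per_trigger):
    """Return reader options that cap the records pulled per micro-batch."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Upper bound on offsets processed per trigger, split across partitions.
        "maxOffsetsPerTrigger": str(max_offsets_per_trigger),
    }


# Usage (assumes `spark` was built with the Kafka packages from the post):
#   opts = kafka_source_options("broker:9092", "events", 10000)
#   stream_df = spark.readStream.format("kafka").options(**opts).load()
```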

Latest Reply
Rajani
Contributor II
  • 0 kudos

Hi @Software Architect, I don't think so.

  • 0 kudos
Pranjan
by New Contributor II
  • 4702 Views
  • 7 replies
  • 1 kudos

Resolved! Badge Not Received for - Databricks Lakehouse Fundamentals Accreditation (V2)

Hi Team, I passed the Databricks Lakehouse Fundamentals Accreditation (V2) on Dec 8th. I still have not received the badge in Credentials or any email about it. Please have a look. @Kaniz Fatma

Latest Reply
Tromen026
New Contributor II
  • 1 kudos

I wonder how much attempt you set to create this type of excellent informative web site. marco's pizza starr ave domino's pizza price

Smitha1
by Valued Contributor II
  • 3340 Views
  • 3 replies
  • 2 kudos

December exam free voucher for Databricks Certified Associate Developer for Apache Spark 3.0 exam.

Dear @Vidula Khanna, I hope you're having a great day. This is of HIGH priority for me; I have to schedule the exam in December before the slots are full. I took the Databricks Certified Associate Developer for Apache Spark 3.0 exam on 30th Nov but missed by one perc...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Hey @Smitha Nelapati, you can attend the webinars below and get 75% off in Jan.
