Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Nazar
by New Contributor II
  • 4441 Views
  • 5 replies
  • 5 kudos

Resolved! Incremental write

Hi All, I have a daily Spark job that reads and joins 3-4 source tables and writes the df in a parquet format. This data frame consists of 100+ columns. As this job runs daily, our deduplication logic identifies the latest record from each of source t...

Latest Reply
Nazar
New Contributor II
  • 5 kudos

Thanks werners

  • 5 kudos
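The accepted answer isn't shown in full above, but a common pattern for this kind of daily incremental job on Databricks is to write to a Delta table and MERGE in only the latest record per key, instead of rewriting the full parquet output. A minimal sketch, assuming the poster's joined dataframe is df and using hypothetical record_id/updated_at columns and a placeholder target path:

from delta.tables import DeltaTable
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Keep only the latest record per business key from today's batch.
w = Window.partitionBy("record_id").orderBy(F.col("updated_at").desc())
latest = (df.withColumn("rn", F.row_number().over(w))
            .filter("rn = 1")
            .drop("rn"))

# Upsert into the target Delta table instead of rewriting everything.
target = DeltaTable.forPath(spark, "/mnt/target/table")  # assumed path
(target.alias("t")
       .merge(latest.alias("s"), "t.record_id = s.record_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())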
4 More Replies
William_Scardua
by Valued Contributor
  • 5925 Views
  • 5 replies
  • 3 kudos

Resolved! Read just the new file ???

Hi guys, how can I read just the new file in a batch process? Can you help me, please? Thank you

Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 3 kudos

What type of file? Is the file stored in a storage account? Typically, you would read and write data with something like the following code:

# read a parquet file
df = spark.read.format("parquet").load("/path/to/file")

# write the data as a file
df...

  • 3 kudos
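Beyond a plain spark.read, the usual Databricks answer to "process only new files" is Auto Loader (or, on Spark 3.1+, the modifiedAfter batch option). A minimal Auto Loader sketch with assumed paths and table name; the checkpoint location is what tracks which files were already read:

# Auto Loader picks up only files not seen before (tracked in the checkpoint).
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "parquet")
      .load("/mnt/landing/"))          # assumed source directory

(df.writeStream
   .option("checkpointLocation", "/mnt/chk/landing")  # assumed path
   .trigger(availableNow=True)   # run batch-style: process what's there, then stop
   .toTable("bronze_table"))     # assumed target table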
4 More Replies
Meaz10
by New Contributor III
  • 1233 Views
  • 4 replies
  • 2 kudos

Resolved! Current DBR is not yet available to this notebook

Anyone have an idea why I am getting this error: "The current DBR is not yet available to this notebook. Give it a second and try again!"

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Meysam az - Thank you for letting us know that the issue has been resolved and for the extra information.

  • 2 kudos
3 More Replies
Kaniz_Fatma
by Community Manager
  • 1833 Views
  • 2 replies
  • 1 kudos
Latest Reply
dazfuller
Contributor III
  • 1 kudos

A class is the definition, and you can create many instances of it, just like classes in any other language. An object is the instance of the class, a singleton, and can be used to create features you might recognise as static methods. Often when wri...

  • 1 kudos
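For readers more familiar with Python than Scala, a rough analogy (not the thread's own example): a Scala class corresponds to an ordinary Python class, while a Scala object behaves like a single shared instance created once at module level:

class Counter:
    # A class: each Counter() call creates an independent instance.
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

# Module-level singleton, playing the role of a Scala `object`.
GLOBAL_COUNTER = Counter()

a, b = Counter(), Counter()    # two independent instances
a.increment()                  # a.count == 1, b.count is still 0
GLOBAL_COUNTER.increment()     # every importer shares this one instance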
1 More Replies
Kaniz_Fatma
by Community Manager
  • 3024 Views
  • 1 replies
  • 1 kudos
Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

We can check using this method:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
try:
    s3.head_object(Bucket='bucket_name', Key='file_path')
except ClientError:
    # Not found
    pass

  • 1 kudos
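A slightly more defensive variant of the same check (bucket and key names are placeholders): treat only a 404 as "object missing" and re-raise anything else, such as a 403 from missing permissions:

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')

def object_exists(bucket: str, key: str) -> bool:
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as e:
        # head_object reports a missing key as error code "404"
        if e.response["Error"]["Code"] == "404":
            return False
        raise  # permission or other errors should surface

print(object_exists('bucket_name', 'file_path'))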
User15787040559
by New Contributor III
  • 1099 Views
  • 2 replies
  • 0 kudos

What subset of MySQL SQL syntax do we support in Spark SQL?

https://spark.apache.org/docs/latest/sql-ref-syntax.html

Latest Reply
brickster_2018
Esteemed Contributor
  • 0 kudos

Spark 3 has experimental support for ANSI SQL compliance. Read more here: https://spark.apache.org/docs/3.0.0/sql-ref-ansi-compliance.html

  • 0 kudos
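As a quick illustration (assuming a Spark 3.x session where spark is available), ANSI mode is controlled by a single config and changes how strict Spark SQL is about things like invalid casts:

# Off by default in Spark 3.0; turning it on makes Spark SQL stricter.
spark.conf.set("spark.sql.ansi.enabled", "true")

# With ANSI mode on, this raises an error instead of returning NULL:
spark.sql("SELECT CAST('not_a_number' AS INT)").show()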
1 More Replies
HafidzZulkifli
by New Contributor II
  • 12681 Views
  • 8 replies
  • 0 kudos

How to import data and apply multiline and charset UTF8 at the same time?

I'm running Spark 2.2.0 at the moment. Currently I'm facing an issue when importing data of Mexican origin, where fields can contain special characters and multiline values in certain columns. Ideally, this is the command I'd like to run: T_new_...

Latest Reply
DianGermishuize
New Contributor II
  • 0 kudos

You could also potentially use the .withColumns() function on the data frame, and use the pyspark.sql.functions.encode function to convert the characterset to the one you need. Convert the Character Set/Encoding of a String field in a PySpark DataFr...

  • 0 kudos
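For the original question itself, on Spark 2.4+ the CSV reader accepts both options on the same read (option names from the built-in CSV source; the path is a placeholder). Older versions, including the 2.2.0 the poster is running, did not reliably honor the encoding option in multiLine mode, which may be the issue here:

df = (spark.read
      .option("header", "true")
      .option("multiLine", "true")   # allow quoted newlines inside fields
      .option("encoding", "UTF-8")   # decode the file as UTF-8 ("charset" is an alias)
      .csv("/path/to/file.csv"))     # assumed path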
7 More Replies
AndreStarker
by New Contributor III
  • 1608 Views
  • 3 replies
  • 2 kudos

Certification status

I passed the "Databricks Certified Associate Developer for Apache Spark 3.0 - Scala" certification exam on 7/17/2021. The Webassessor record says I should receive certification status from Databricks within a week. I have not received any communi...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Andre Starker - Congratulations!!!

  • 2 kudos
2 More Replies