Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Phani1
by Valued Contributor II
  • 3290 Views
  • 2 replies
  • 3 kudos

Resolved! Terminated with exception: Could not initialize class org.rocksdb.Options

Problem Statement: When running Delta Live Tables, it fails with the error below. Error Message: Could not initialize class org.rocksdb.Options org.apache.spark.sql.streaming.StreamingQueryException: Query cpicpg_us_tgt_amz_bronze [id = a42eec82-0ee8-41b4-9...

Latest Reply
Phani1
Valued Contributor II
  • 3 kudos

Hi Team, thanks for your response. I faced this issue while executing the Delta Live Tables pipeline. Initially I chose the product edition as Core and attached 4 notebooks to the pipeline, and each notebook has Bronze and Silver table creation. duri...

1 More Reply
Phani1
by Valued Contributor II
  • 5323 Views
  • 1 reply
  • 0 kudos

Execute tasks in parallel to process multiple files in parallel

Hi all, if we have multiple tasks under a job, how do we invoke a specific task under the job? Do we have any API to invoke a job's specific tasks instead of the whole job? Use case: When we receive multiple messages from the event hub, each underlying task in ...

Latest Reply
Phani1
Valued Contributor II
  • 0 kudos

Thanks for your response. My question is: if we have multiple tasks in a job, how can we invoke a specific task? I can see an API to invoke the job, but not a particular task in it. Kindly find the attachment for your reference.

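A note for readers: the Jobs REST API triggers whole jobs, not individual tasks. A common workaround is to pass a job parameter on run-now and have each task check it to decide whether to do any work. Below is a hedged sketch of that trigger call; the host, token, job_id, and parameter name are all placeholders, not values from this thread.

```python
# Hedged sketch: trigger a job run via the Jobs run-now API, passing a
# parameter that tasks can inspect. All identifiers are placeholders.
import requests

HOST = "https://<databricks-instance>"   # workspace URL placeholder
TOKEN = "<personal-access-token>"        # auth token placeholder

resp = requests.post(
    f"{HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "job_id": 12345,  # placeholder job id
        "notebook_params": {"target_task": "task_a"},  # hypothetical routing param
    },
)
resp.raise_for_status()
print("Started run:", resp.json()["run_id"])
```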
klllmmm
by New Contributor II
  • 4305 Views
  • 3 replies
  • 1 kudos

'No such file' error when reading a CSV file using pandas

I'm trying to read a CSV file saved in DBFS using the pandas read_csv function, but it gives a 'No such file' error. %fs ls /FileStore/tables/ df = pd.read_csv('/dbfs/FileStore/tables/CREDIT_1.CSV') df = pd.read_csv('/dbfs:/FileStore/tables/CREDIT_1.CSV')...

Latest Reply
klllmmm
New Contributor II
  • 1 kudos

Thanks to @Werner Stinckens for the answer. I understood that I have to use Spark to read data on the cluster.

2 More Replies
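For readers hitting the same error, the fix boils down to the path scheme: pandas is a local-file API, so it needs the /dbfs fuse path, while Spark takes the dbfs:/ URI. A minimal sketch (file path taken from the question, assuming a standard Databricks cluster where DBFS is fuse-mounted):

```python
# Minimal sketch: two working ways to read the file from the question.
import pandas as pd

# Option 1: pandas through the local fuse mount ('/dbfs/...', not 'dbfs:/...')
df = pd.read_csv("/dbfs/FileStore/tables/CREDIT_1.CSV")

# Option 2: Spark with the 'dbfs:/' URI, converting to pandas if needed
sdf = spark.read.csv("dbfs:/FileStore/tables/CREDIT_1.CSV", header=True)
df2 = sdf.toPandas()
```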
yopbibo
by Contributor II
  • 5165 Views
  • 3 replies
  • 4 kudos

Resolved! Column name, starting with a number

Hi, I see it is possible to start a column name with a number, like `123_test`, and store it in a Hive table with a Delta location. In this documentation https://www.stitchdata.com/docs/destinations/databricks-delta/reference#transformations--column-nami...

Latest Reply
yopbibo
Contributor II
  • 4 kudos

Ha ha, yes, I'm trying to find that page in the Databricks documentation again. If you have it, please share.

2 More Replies
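For anyone landing here: identifiers that start with a digit are legal in Spark SQL as long as they are backtick-quoted. A small sketch (table name and data are made up, assuming a Databricks notebook session):

```python
# Small sketch: a column named 123_test works, but needs backticks in SQL.
df = spark.createDataFrame([(1,), (2,)], ["123_test"])
df.write.format("delta").mode("overwrite").saveAsTable("numeric_col_demo")

spark.sql("SELECT `123_test` FROM numeric_col_demo").show()
```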
auser85
by New Contributor III
  • 2470 Views
  • 2 replies
  • 4 kudos

Resolved! Cache Select on Temp Table?

How might I cache a temp table? The documentation suggests it is possible: https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-cache.html Consider the following on DBR 10.5 and Spark 3.2.1: ```%python df.createOrReplaceTempView("chan...

Latest Reply
auser85
New Contributor III
  • 4 kudos

Thank you! The newer documentation does indeed work for me.

1 More Reply
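The pattern from the newer docs amounts to caching the temp view by name with SQL. A minimal sketch (the view name is made up, since the one in the thread is truncated):

```python
# Minimal sketch: cache a temp view by name (view name is hypothetical).
df = spark.range(1000)
df.createOrReplaceTempView("my_view")

spark.sql("CACHE TABLE my_view")                  # materialize in the cache
spark.sql("SELECT COUNT(*) FROM my_view").show()  # now served from the cache
spark.sql("UNCACHE TABLE my_view")                # release when done
```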
Vibhor
by Contributor
  • 5845 Views
  • 5 replies
  • 2 kudos

Get the current date as a string in Databricks using Scala

I want to get the current date in Scala as a string. For example, today's date is 3rd Jan, and I want to store it dynamically in a new variable as below; how do I get it? val currdate: String = "20220103" When I am using val currdate = Calendar.getInstance.ge...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hey @Vibhor Sethi, hope you are well! Thank you for posting your question and letting us know that you were able to resolve the issue. Would you be happy to mark it as the best solution? It would be really helpful for the other members too. Cheers!

4 More Replies
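For the record, the java.time API yields the requested string directly; a short Scala sketch (variable name taken from the question):

```scala
// Short sketch: format today's date as "yyyyMMdd" using java.time.
import java.time.LocalDate
import java.time.format.DateTimeFormatter

val currdate: String = LocalDate.now.format(DateTimeFormatter.ofPattern("yyyyMMdd"))
// e.g. "20220103" on 3rd Jan 2022
```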
SailajaB
by Valued Contributor III
  • 3056 Views
  • 2 replies
  • 5 kudos

An error occurred while calling o303.mount: Operation failed: "This request is not authorized to perform this operation"

Hi Team, we are unable to mount a storage container in the below scenario: we created a Gen2 storage account using a VNet and added firewall restrictions (i.e. allow trusted sources), and deployed the Databricks workspace without VNet injection. Is it possible to add Databricks pub...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hey @Sailaja B, hope everything is great! Does Hubert's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Thanks!

1 More Reply
sheree
by New Contributor III
  • 2733 Views
  • 3 replies
  • 1 kudos

Resolved! I can't access my account.

I can't access my account. This account was created today (not Community; after the 14-day trial it will be chargeable). When I try to access my account it gives me 'Invalid email address or password. Note: Emails/usernames are case-sensitive'. I tried to reset ...

Latest Reply
sheree
New Contributor III
  • 1 kudos

I got a reset link from the community. Actually, the problem was with my username: it did not recognize a character within my username, which was my email id.

2 More Replies
oussamak
by New Contributor II
  • 3120 Views
  • 1 reply
  • 2 kudos

How to install JAR libraries from ADLS? I'm having an error

I mounted ADLS to my Azure Databricks resource and I keep getting this error when I try to install a JAR from a container: Library installation attempted on the driver node of cluster 0331-121709-buk0nvsq and failed. Please refer to the followi...

chandan_a_v
by Valued Contributor
  • 14599 Views
  • 6 replies
  • 6 kudos

Resolved! Spark Driver Out of Memory Issue

Hi, I am executing a simple job in Databricks for which I am getting the error below. I increased the driver size but still face the same issue. Spark config: from pyspark.sql import SparkSession spark_session = SparkSession.builder.appName("Demand Forecasting...

Latest Reply
chandan_a_v
Valued Contributor
  • 6 kudos

I am getting the above issue while writing a Spark DF as a parquet file to AWS S3. I am not actually doing any broadcast join.

5 More Replies
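For context, a common first step with driver OOMs is raising the driver-side limits in the Spark config. This is a hedged sketch only: on Databricks the driver heap is bounded by the driver node type, so these settings belong in the cluster configuration rather than the notebook, and the values here are illustrative.

```python
# Hedged sketch: driver-side memory settings (values are illustrative).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("DemandForecasting")                # app name from the thread
    .config("spark.driver.memory", "16g")        # driver heap (cluster-level on Databricks)
    .config("spark.driver.maxResultSize", "4g")  # cap on results collected to the driver
    .getOrCreate()
)
```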
William_Scardua
by Valued Contributor
  • 2241 Views
  • 1 reply
  • 2 kudos

Resolved! Best way to encrypt PII data

Hi guys, I have around 600GB per load. In your opinion, what is the best way to encrypt PII data in terms of performance (lib, cluster type, etc.)? Thank you, William

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

Hello @William Scardua, please check if this blog helps you: https://databricks.com/blog/2020/11/20/enforcing-column-level-encryption-and-avoiding-data-duplication-with-pii.html

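The linked blog's approach is column-level encryption with a Fernet-based UDF. A condensed sketch under stated assumptions: the column name and sample data are made up, and in practice the key should come from a secret scope rather than being generated inline.

```python
# Condensed sketch of a Fernet-UDF encryption pattern (assumptions noted).
from cryptography.fernet import Fernet
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

key = Fernet.generate_key()  # placeholder; use dbutils.secrets.get(...) in practice

@udf(returnType=StringType())
def encrypt(plaintext):
    return Fernet(key).encrypt(plaintext.encode()).decode()

df = spark.createDataFrame([("alice@example.com",)], ["email"])  # made-up data
df_encrypted = df.withColumn("email", encrypt("email"))
```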
rahul3
by New Contributor
  • 2738 Views
  • 1 reply
  • 1 kudos

Facing a mount/unmount issue while running the same job in parallel with Scala

Using the above configuration in the cluster, when I run a Databricks job in parallel with multiple requests at the same time, I get a mount/unmount issue. For example: when I make three requests to the Databricks job, it runs 3 jobs in parallel, but somet...

Latest Reply
Prabakar
Databricks Employee
  • 1 kudos

Hi @rahul upadhyay, are you using the same mount path /mnt/rahul in all 3 jobs? Could you please add the full error message?

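A common mitigation for this kind of race is to make the mount idempotent: check dbutils.fs.mounts() before mounting instead of unmounting and remounting on every run. A hedged sketch; the mount point comes from the thread, while the source URL and auth configs are placeholders:

```python
# Hedged sketch: idempotent mounting so concurrent job runs don't race.
mount_point = "/mnt/rahul"  # path mentioned in the thread
configs = {}                # auth settings elided; fill in for your storage

if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source="abfss://<container>@<account>.dfs.core.windows.net/",  # placeholder
        mount_point=mount_point,
        extra_configs=configs,
    )
```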
Devarsh
by Contributor
  • 8389 Views
  • 3 replies
  • 7 kudos

Resolved! Getting the error 'No such file or directory' when trying to access the JSON file

I am trying to write to my Google Sheet through Databricks, but when it comes to reading the JSON file containing the credentials, I get the error that no such file or directory exists. import gspread gc = gspread.service_account(filename='...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 7 kudos

Hi @Devarsh Shah, the issue is not with the JSON file but with the location you are specifying while reading. As suggested by @Werner Stinckens, please start using the Spark API to read the JSON file as below: spark.read.format("json").load("testjson") Please check ...

2 More Replies
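To make the other common fix concrete: local-file libraries like gspread read through the /dbfs fuse mount, not the dbfs:/ URI. A one-line sketch with an assumed filename:

```python
# Sketch: gspread needs the local '/dbfs/...' path (filename is an assumption).
import gspread

gc = gspread.service_account(filename="/dbfs/FileStore/tables/credentials.json")
```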
BhagS
by New Contributor II
  • 5219 Views
  • 2 replies
  • 5 kudos

Resolved! Write Empty Delta file in Datalake

Hi all, currently I am trying to write an empty Delta file to the data lake. To do this I am doing the following: reading a parquet file from my landing zone (this file consists only of the schema of SQL tables): df=spark.read.format('parquet').load(landingZ...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 5 kudos

Hi @bhagya s, since your source file is empty, there is no data file inside the centralizedZonePath directory, i.e. a .parquet file is not created in the target location. However, _delta_log is the transaction log that holds the metadata of the delta for...

1 More Reply
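The behaviour described above is easy to reproduce. A small sketch (both paths are placeholders): writing a schema-only DataFrame to Delta creates a _delta_log directory but no .parquet data files.

```python
# Small sketch: an empty, schema-only DataFrame written to Delta.
df = spark.read.format("parquet").load("/mnt/landing/schema_only/")    # placeholder path
df.write.format("delta").mode("overwrite").save("/mnt/central/table/") # placeholder path

display(dbutils.fs.ls("/mnt/central/table/"))  # expect only _delta_log/
```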
