Data Engineering

Forum Posts

by Matt_Johnston • New Contributor III

01-04-2022 8:04:19 AM

2466 Views
4 replies
4 kudos

Resolved! Disk Type in Azure Databricks

Hi there, how are the disk tiers determined in Azure Databricks? We are currently using a pool of Standard DS3 v2 virtual machines, all with Premium SSD disks. Is there a way to change the tier of the disks? Thanks

Latest Reply

Atanu
Esteemed Contributor

01-11-2022 8:10:21 PM

4 kudos

I don't think there is an option to change the disk type at the moment. If you are an Azure Databricks user, I would suggest raising a feature request through Azure support. If you are on AWS, you can do the same from https://docs.databricks.com/res...

3 More Replies

by Shridhar • New Contributor

10-17-2018 6:24:35 PM

12244 Views
2 replies
2 kudos

Resolved! Load multiple csv files into a dataframe in order

I can load multiple csv files by doing something like:

paths = ["file_1", "file_2", "file_3"]
df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load(paths)

But this doesn't seem to preserve the...

Latest Reply

Jaswanth_Saniko
New Contributor III

01-12-2022 4:43:10 AM

2 kudos

val diamonds = spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/FileStore/tables/11.csv", "/FileStore/tables/12.csv", "/FileStore/tables/13.csv")
display(diamonds)

This is working for me @Shridhar
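
On the ordering part of the question: as far as I know, Spark does not promise that rows come back in the order the paths were listed. One possible workaround, sketched here in PySpark under that assumption, is to read each file separately, tag its rows with the file's position in the list, union the pieces, and sort on the tag. The names `indexed_paths`, `load_in_order`, and the `file_order` column are hypothetical, and the sketch only orders by file; it does not guarantee row order within a file.

```python
from functools import reduce


def indexed_paths(paths):
    """Pair each path with its position in the list."""
    return list(enumerate(paths))


def load_in_order(spark, paths):
    """Read each CSV separately, tag rows with the file's position, union, sort.

    `paths` must contain at least one entry.
    """
    # Imported here so indexed_paths() can be used without PySpark installed.
    from pyspark.sql import DataFrame, functions as F

    frames = [
        spark.read.option("header", "true").csv(path).withColumn("file_order", F.lit(i))
        for i, path in indexed_paths(paths)
    ]
    # Assumes all files share the same column names.
    combined = reduce(DataFrame.unionByName, frames)
    return combined.orderBy("file_order")
```

In a notebook this would be called as `df = load_in_order(spark, ["file_1", "file_2", "file_3"])`; the `file_order` column can be dropped once the order no longer matters.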

1 More Replies

by Reza • New Contributor III

01-05-2022 8:34:29 AM

2167 Views
2 replies
0 kudos

Resolved! Can we order the widgets?

I have two text widgets (dbutils.widgets.text). One is called "start date" and the other "end date". When I create them, they are shown in alphabetical order (end_date, start_date). Is there any way that we can set the order when we create the...

Latest Reply

Atanu
Esteemed Contributor

01-11-2022 8:16:22 PM

0 kudos

@Reza Rajabi, I think all the available options are listed at https://docs.databricks.com/notebooks/widgets.html, but we can cross-check.
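
Since the question reports that widgets are displayed in alphabetical order of their names, one workaround is to give the names a sortable prefix and keep the readable text in the label. This is an assumption on my side, not something the linked page is quoted as recommending. `prefixed_names` and `create_text_widgets` are hypothetical helpers; `dbutils.widgets.text(name, defaultValue, label)` is the notebook call already mentioned in the question.

```python
def prefixed_names(names):
    """Prefix names with zero-padded positions so alphabetical order matches list order."""
    width = len(str(len(names)))
    return [f"{i:0{width}d}_{name}" for i, name in enumerate(names, start=1)]


def create_text_widgets(dbutils, labels_by_name):
    """Create one text widget per dict entry; dict order decides the prefix."""
    names = prefixed_names(list(labels_by_name))
    for widget_name, label in zip(names, labels_by_name.values()):
        # Empty default value; the label is what the user sees above the box.
        dbutils.widgets.text(widget_name, "", label)
    return names
```

For example, `create_text_widgets(dbutils, {"start_date": "Start date", "end_date": "End date"})` would create `1_start_date` and `2_end_date`, and the values are then read back with `dbutils.widgets.get("1_start_date")`.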

1 More Replies

by timothy_uk • New Contributor III

12-13-2021 8:19:34 AM

1437 Views
4 replies
0 kudos

Resolved! Zombie .Net Spark Databricks Job (CoarseGrainedExecutorBackend)

Hi all, Environment: Nodes: Standard_E8s_v3; Databricks Runtime: 9.0; .NET for Apache Spark 2.0.0. I'm invoking spark submit to run a .Net Spark job hosted in Azure Databricks. The job is written in C#.Net with its only transformation and action, reading a C...

Latest Reply

jose_gonzalez
Moderator

01-05-2022 5:22:23 PM

0 kudos

Hi @Timothy Lin, I recommend not using spark.stop() or System.exit(0) in your code: they explicitly stop the Spark context, but the graceful shutdown and handshake with the Databricks job service do not happen.

3 More Replies

by Braxx • Contributor II

01-10-2022 7:07:19 AM

3053 Views
4 replies
3 kudos

Resolved! spark.read excel with formula

For some reason Spark is not reading the data correctly from an xlsx file in a column with a formula. I am reading it from blob storage. Consider this simple data set. The column "color" has formulas for all the cells, like =VLOOKUP(A4,C3:D5,2,0). In case...

Latest Reply

-werners-
Esteemed Contributor III

01-10-2022 7:45:44 AM

3 kudos

The formula itself is probably what is actually stored in the Excel file. Excel translates this to NA. I only know of setErrorCellsToFallbackValues, but I doubt it is applicable in your case. You could use a matching function (a regexp, for example) to d...

3 More Replies

by chandan_a_v • Valued Contributor

01-07-2022 5:21:17 AM

1840 Views
8 replies
4 kudos

Resolved! Spark Error : RScript (1243) terminated unexpectedly: Cannot call r___RBuffer__initialize().

grid_slice %>% sdf_copy_to( sc = sc, name = "grid_slice", overwrite = TRUE ) %>% sdf_repartition( partitions = min(n_executors * 3, NROW(grid_slice)), partition_by = "variable" ) %>% spark_apply( f = slice_data_wrapper, columns = c( variable...

Latest Reply

chandan_a_v
Valued Contributor

01-10-2022 6:12:38 AM

4 kudos

Hi @Kaniz Fatma, did you find any solution? Please let us know.

7 More Replies

by RiyazAli • Valued Contributor

01-04-2022 5:26:00 AM

2625 Views
4 replies
3 kudos

Resolved! Where do the files downloaded with wget get stored in Databricks?

Hey team! All I'm trying to do is download a csv file stored on S3 and read it using Spark. Here's what I mean:

!wget https://s3.amazonaws.com/nyc-tlc/trip+data/yellow_tripdata_2020-01.csv

If I download this "yellow_tripdata_2020-01.csv" where exactly it wo...

Latest Reply

RiyazAli
Valued Contributor

01-10-2022 10:56:13 PM

3 kudos

Hi @Kaniz Fatma, thanks for the reminder. Hey @Hubert Dudek, thank you very much for your prompt response. Initially, I was using urllib3 to 'GET' the data residing at the URL, so I wanted an alternative to that. Unfortunately, requests libr...
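
As another alternative to `wget` or urllib3, here is a minimal sketch using only the Python standard library. It downloads to an explicit directory, so there is no guessing about where the file landed. It assumes the code runs on the driver node and that a driver-local file is addressed in Spark with a `file:` prefix; `download_to` and `to_spark_uri` are hypothetical helper names.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_to(url, dest_dir):
    """Download url into dest_dir and return the absolute local path."""
    os.makedirs(dest_dir, exist_ok=True)
    # Use the last path segment of the URL as the local file name.
    filename = os.path.basename(urlparse(url).path)
    local_path = os.path.abspath(os.path.join(dest_dir, filename))
    urllib.request.urlretrieve(url, local_path)
    return local_path


def to_spark_uri(local_path):
    """Address a driver-local file explicitly instead of the default file system."""
    return "file:" + os.path.abspath(local_path)
```

With the URL from the question, `path = download_to(url, "/tmp/nyc")` followed by `spark.read.csv(to_spark_uri(path), header=True)` would read the file from the driver's disk. On a multi-node cluster the executors may not see a driver-local path, so copying the file to shared storage first (for example with `dbutils.fs.cp`) is probably safer.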

3 More Replies

by TheDataDexter • New Contributor III

01-07-2022 1:39:04 AM

2051 Views
3 replies
3 kudos

Resolved! Single-Node cluster works but Multi-Node clusters do not read data.

I am currently working with a VNet-injected Databricks workspace. At the moment I have mounted the Databricks cluster on an ADLS Gen2 resource. When running notebooks on a single node that read, transform, and write data we do not encounter any probl...

Latest Reply

TheDataDexter
New Contributor III

01-10-2022 11:38:43 PM

3 kudos

@Werner Stinckens thank you for your reply. I will take a look at the network configuration today.

2 More Replies

by GlenLewis • New Contributor III

01-09-2022 4:48:26 PM

2563 Views
3 replies
0 kudos

Resolved! Markup and table of contents is no longer working on Notebooks

Around 2 days ago, Markdown in our notebooks stopped working (the %md tag isn't visible, but the headings appear as #Heading1). In addition, there is no longer a table of contents on any of my workbooks. Trying a different instance in Microsoft Az...

Latest Reply

Anonymous
Not applicable

01-10-2022 8:07:42 AM

0 kudos

@Glen Lewis - Thank you for coming to the community with this. Would you be happy to mark your answer as best so other members can find the solution more readily?

2 More Replies

by saltuk • Contributor

01-09-2022 2:10:21 PM

843 Views
0 replies
0 kudos

Using Parquet, passing Partition on Insert Overwrite. Partition parenthesis includes an equation and it gives an error.

I am new to Spark SQL; we are migrating from Cloudera to Databricks. A lot of the SQL is done, and only a few queries are still in progress. We are having some trouble passing an argument and using it in an equation in the Partition section. LOGDATE is an argu...

by Oricus_semicon • New Contributor

01-08-2022 8:10:39 PM

225 Views
0 replies
0 kudos

oricus-semicon.com

Oricus Semicon Solutions is an innovative Semiconductor Tools manufacturing company who, with almost 100 years of collective expertise, craft high tech bespoke tooling solutions for the global Semiconductor Assembly and Test industry.https://oricus-s...

by SankaraiahNaray • New Contributor II

12-24-2016 1:01:28 AM

19852 Views
10 replies
6 kudos

Resolved! Not able to read text file from local file path - Spark CSV reader

We are using the Spark CSV reader to read a csv file into a DataFrame, and we are running the job in yarn-client mode; it works fine in local mode. We are submitting the Spark job from an edge node. But when we place the file in local file path instead...

Latest Reply

Kaniz
Community Manager

01-07-2022 9:23:14 PM

6 kudos

Hi @Sankaraiah Narayanasamy, this seems to be a bug in the spark-shell command when reading a local file, but there is a workaround: when running the spark-submit command, specify --conf "spark.authenticate=false" in the command. See SPARK-23476 for reference.

9 More Replies

by chaitanya • New Contributor II

08-13-2021 8:27:54 AM

2285 Views
3 replies
4 kudos

Resolved! While loading Data from blob to delta lake facing below issue

I'm calling the stored proc, storing the result in a pandas DataFrame, and then creating a list. While creating the list I get the error below: Databricks execution failed with error state Terminated. For more details please check the run page url: path An error occurred w...

Latest Reply

shan_chandra
Honored Contributor III

01-06-2022 2:37:31 PM

4 kudos

@chaitanya, could you please try disabling Arrow optimization and see if this resolves the issue?

spark.sql.execution.arrow.enabled false
spark.sql.execution.arrow.pyspark.enabled false
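
The two settings in the reply can also be applied from a notebook session rather than the cluster configuration. A minimal sketch, assuming an active `spark` session: `ARROW_OFF` and `apply_conf` are hypothetical names, while `spark.conf.set` and `spark.conf.get` are the standard runtime-config calls. As far as I know, `spark.sql.execution.arrow.enabled` is the older name of the setting and `spark.sql.execution.arrow.pyspark.enabled` the Spark 3.x one, so setting both should cover either version.

```python
# The two keys come from the reply above; values are passed as strings.
ARROW_OFF = {
    "spark.sql.execution.arrow.enabled": "false",
    "spark.sql.execution.arrow.pyspark.enabled": "false",
}


def apply_conf(spark, settings):
    """Apply each key/value pair to the session and return what the session reports."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return {key: spark.conf.get(key) for key in settings}
```

Calling `apply_conf(spark, ARROW_OFF)` at the top of the notebook, before the pandas conversion runs, should have the same effect as the cluster-level entries.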

2 More Replies

by sanjoydas6 • New Contributor III

11-16-2021 4:52:41 AM

4059 Views
15 replies
3 kudos

Resolved! Problem faced while trying to Reset my Community Edition Password

I have forgotten my Databricks Community Edition password and am trying to reset it using the Forgot Password link. It says that an email will be sent with the link to reset the password, but the email is not arriving. However Databricks mail...

Latest Reply

Kaniz
Community Manager

12-22-2021 9:51:12 AM

3 kudos

Hi @Sanjoy Das, we could not find an account associated with your email. Did you pass the correct email, or did you delete your account? Can you please create a CE account or send us by mail the correct email address with which you used to browse the C...

14 More Replies

by maranBH • New Contributor III

11-24-2021 6:21:55 AM

1113 Views
4 replies
1 kudos

Resolved! Trained model artifact, CI/CD and Databricks without MLFlow.

Hi all, We are constructing our CI/CD pipelines with the Repos feature following this guide: https://databricks.com/blog/2021/09/20/part-1-implementing-ci-cd-on-databricks-using-databricks-notebooks-and-azure-devops.html I'm trying to implement my pipes...

Latest Reply

sean_owen
Honored Contributor II

01-05-2022 7:14:39 PM

1 kudos

So you are managing your models with MLflow, and want to include them in a git repository? You can do that in a CI/CD process; it would run the mlflow CLI to copy the model you want (e.g. models:/my_model/production) to a git checkout and then commit i...

3 More Replies
