Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

guangyi
by Contributor III
  • 2445 Views
  • 4 replies
  • 0 kudos

Resolved! What is the correct way to measure the performance of a Databricks notebook?

Here is my code for converting one column field of a data frame to time data type:  col_value = df.select(df.columns[0]).first()[0] start_time = time.time() col_value = datetime.strftime(col_value, "%Y-%m-%d %H:%M:%S") \ if isinstance(co...
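
A side note on the timing approach in the excerpt: timing a single conversion with time.time() is dominated by clock noise. A minimal sketch of a steadier measurement, using time.perf_counter and repetition (the function name and run count here are made up for illustration):

```python
import time
from datetime import datetime

# Hypothetical helper: repeat the operation many times and time the batch
# with a high-resolution monotonic clock instead of one time.time() sample.
def time_conversion(value, runs=10_000):
    """Time formatting `value` (a datetime) as a string, `runs` times."""
    start = time.perf_counter()
    for _ in range(runs):
        formatted = datetime.strftime(value, "%Y-%m-%d %H:%M:%S")
    elapsed = time.perf_counter() - start
    return formatted, elapsed

formatted, elapsed = time_conversion(datetime(2024, 1, 2, 3, 4, 5))
print(formatted)  # 2024-01-02 03:04:05
```

Note that in Spark this only measures the driver-side Python call; transformations are lazy, so query performance should be measured around an action (e.g. a count or write), not around building the plan.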

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

How many columns do you have?

3 More Replies
Vetrivel
by Contributor
  • 4685 Views
  • 7 replies
  • 2 kudos

Connection Challenges with Azure Databricks and SQL Server on a VM in Serverless Compute

We have established an Azure Databricks workspace within our central subscription, which hosts all common platform resources. Additionally, we have a SQL Server running on a virtual machine in a separate sandbox subscription, containing data that nee...

Latest Reply
Vetrivel
Contributor
  • 2 kudos

@Mo I have tried it and got the error below: "Private access to resource type 'Microsoft.Compute/virtualMachines' is not supported with group id 'sqlserver'." It seems this is supported only when the destination is Blob, ADLS, or Azure SQL.

6 More Replies
Erik_L
by Contributor II
  • 7284 Views
  • 4 replies
  • 4 kudos

Resolved! Support for Parquet brotli compression or a work around

Spark 3.3.1 supports the Brotli compression codec, but when I use it to read Parquet files from S3, I get: INVALID_ARGUMENT: Unsupported codec for Parquet page: BROTLI. Example code: df = (spark.read.format("parquet") .option("compression", "brotli")...

Latest Reply
Erik_L
Contributor II
  • 4 kudos

Given the new information I appended, I looked into Delta caching and I can disable it with .option("spark.databricks.io.cache.enabled", False). This works as a workaround while I read these files in to save them locally in DBFS, but does it have perfo...

3 More Replies
Mystagon
by New Contributor III
  • 5105 Views
  • 4 replies
  • 3 kudos

Performance Issues with Unity Catalog

Hey, I need some help / suggestions troubleshooting this. I have two Databricks workspaces, Common and Lakehouse. The major differences between them are: Lakehouse is using Unity Catalog; Lakehouse is using External Locations, whereas cre...

Latest Reply
arjun_kr
Databricks Employee
  • 3 kudos

- Listing directories in Common is at least 4-8 times faster than in the Lakehouse environment. Are you able to replicate the issue using a simple dbutils list operation (dbutils.fs.ls) or by performing a sample file (say 100 MB file) copy using dbutils.f...

3 More Replies
Brad
by Contributor II
  • 5468 Views
  • 2 replies
  • 1 kudos

Resolved! How to disable all cache

Hi, I'm trying to test some SQL performance. I run the following first: spark.conf.set('spark.databricks.io.cache.enabled', False). However, the second run of the same query is still much faster than the first run. Is there a way to make the query start from a clean...

Latest Reply
Brad
Contributor II
  • 1 kudos

Thanks @VZLA. How do I run spark.sparkContext.getPersistentRDDs.values.foreach(_.unpersist()) from a Databricks notebook?

1 More Replies
farbodr
by New Contributor II
  • 6657 Views
  • 5 replies
  • 1 kudos

Shapley Progressbar

The Shapley progress bar, or tqdm progress bars in general, don't show in notebooks. Do I need to set something special to get these or other similar widgets to work?

Latest Reply
richk7
New Contributor II
  • 1 kudos

I think you're looking for tqdm.notebook:
from time import sleep
from tqdm.notebook import tqdm
for _ in tqdm(range(20)):
    sleep(5)

4 More Replies
JacobLi_LN
by New Contributor II
  • 4168 Views
  • 1 reply
  • 1 kudos

Resolved! Where can I find those delta table log files?

I created a Delta table with the SQL command CREATE TABLE and inserted several records into it with INSERT statements. Now it can be seen in the catalog. But I want to understand how Delta works and would like to see where those log files are stored. Even...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

To locate the log files for your Delta table, please first note that Delta Lake stores its transaction log files in a specific directory within the table's storage location. These log files are for maintaining the ACID properties and enabling feature...
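
As a rough illustration of that layout (a local mock, not an actual Delta table — the directory and file contents here are simplified stand-ins): each commit to a Delta table writes a zero-padded, 20-digit JSON file under the table location's _delta_log directory.

```python
import json
import os
import tempfile

# Mock a Delta table's on-disk layout: <table location>/_delta_log/.
# On a real table the location would be a cloud path (abfss://, s3://, ...),
# but the naming scheme of the commit files is the same.
root = tempfile.mkdtemp()
log_dir = os.path.join(root, "my_table", "_delta_log")
os.makedirs(log_dir)

# Commit files are named by version, zero-padded to 20 digits.
for version in range(3):
    name = f"{version:020d}.json"
    with open(os.path.join(log_dir, name), "w") as f:
        json.dump({"commitInfo": {"operation": "WRITE"}}, f)

commits = sorted(os.listdir(log_dir))
print(commits[0])  # 00000000000000000000.json
```

On Databricks, you can find the real location with DESCRIBE DETAIL <table> and then list <location>/_delta_log with dbutils.fs.ls.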

Terraformuser
by New Contributor
  • 1670 Views
  • 1 reply
  • 0 kudos

Azure Databricks - Terraform errors while using workspace level provider

Hello all, I have a question about deploying Azure Databricks with Terraform. Does Databricks have any API call limits? I can deploy an external location and a storage credential, and it's tested and confirmed working. But when I try to deploy 2 additional e...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hello @Terraformuser, could you try enabling debug output while applying your Terraform? That would give you more context on the failure: TF_LOG=DEBUG DATABRICKS_DEBUG_TRUNCATE_BYTES=250000 terraform apply -no-color 2>&1 | tee tf-debug.log

TamD
by Contributor
  • 2161 Views
  • 1 reply
  • 1 kudos

TIME data type

Our business does a LOT of reporting and analysis by time-of-day and clock times, independent of day or date. As far as I can see, Databricks does not support the TIME data type. If I attempt to import data recorded as a time (e.g., 02:59:59.000)...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @TamD, basically it's just like you've written. There is no TIME data type, so you have the 2 options you already mentioned:
- you can use the Timestamp data type and ignore its date part
- store it as a string and do a conversion each time you need it
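
A minimal sketch of the string option in plain Python (in Spark SQL the analogous step would be a to_timestamp / date_format round trip; the function and sample values here are illustrative, not from the thread):

```python
from datetime import datetime, time

# Store clock times as 'HH:MM:SS.mmm' strings and convert on demand so
# they become comparable, independent of any date.
def parse_clock(s: str) -> time:
    """Parse an 'HH:MM:SS.mmm' clock string into a comparable time object."""
    return datetime.strptime(s, "%H:%M:%S.%f").time()

cutoff = parse_clock("02:59:59.000")
samples = ["01:30:00.000", "03:15:42.500", "02:59:59.000"]
after_cutoff = [s for s in samples if parse_clock(s) > cutoff]
print(after_cutoff)  # ['03:15:42.500']
```

The conversion cost is why the reply suggests it only "each time you need it"; for heavy time-of-day analytics, the timestamp-with-fixed-date option avoids re-parsing on every comparison.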

Phani1
by Databricks MVP
  • 2644 Views
  • 2 replies
  • 0 kudos

Code Review tools

Could you kindly recommend any Code Review tools that would be suitable for our Databricks tech stack?

Data Engineering
code review
Latest Reply
Phani1
Databricks MVP
  • 0 kudos

You can explore SonarQube.

1 More Replies
TinasheChinyati
by New Contributor III
  • 3494 Views
  • 3 replies
  • 1 kudos

Resolved! Retention window from DLT created Delta tables

Hi guys, I am working with data ingested from Azure Event Hubs using Delta Live Tables in Databricks. Our data architecture follows the medallion approach. Our current requirement is to retain only the most recent 14 days of data in the silver layer. To...

Data Engineering
data engineer
Delta Live Tables
Latest Reply
TinasheChinyati
New Contributor III
  • 1 kudos

Hi @MuthuLakshmi, thank you for sharing the configurations. Here is a bit more clarity on our current workflow. DELETE and VACUUM workflow: our workflow involves the following: 1. DELETE operation: we delete records matching a specific predicate to mark th...
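
A minimal sketch of the 14-day cutoff step of such a workflow, computed driver-side (the table and column names are hypothetical; on Databricks the resulting statement would be run with spark.sql, with VACUUM removing the underlying files later, once they age past the retention threshold):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table/column names; only the 14-day window comes from the thread.
def retention_delete_sql(table: str, ts_col: str, days: int, now: datetime) -> str:
    """Build a DELETE statement removing rows older than `days` days."""
    cutoff = now - timedelta(days=days)
    return (f"DELETE FROM {table} "
            f"WHERE {ts_col} < '{cutoff.strftime('%Y-%m-%d %H:%M:%S')}'")

now = datetime(2024, 6, 15, 12, 0, 0, tzinfo=timezone.utc)
sql = retention_delete_sql("silver.events", "event_ts", 14, now)
print(sql)  # DELETE FROM silver.events WHERE event_ts < '2024-06-01 12:00:00'
```

Note that the DELETE only logically removes rows; storage is reclaimed by VACUUM, subject to the table's deleted-file retention setting.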

2 More Replies
sathya08
by New Contributor III
  • 5027 Views
  • 9 replies
  • 4 kudos

Resolved! Trigger queries to SQL warehouse from Databricks notebook

Hello, I am trying to explore triggering SQL queries from a Databricks notebook against a serverless SQL warehouse, along with the nest-asyncio module. Both of these are very new to me and I need help with them. For triggering the API from the notebook, I am using...

Latest Reply
sathya08
New Contributor III
  • 4 kudos

Thank you, it really helped.

8 More Replies
ashap551
by New Contributor II
  • 4021 Views
  • 2 replies
  • 1 kudos

Best practices for code organization in large-scale Databricks ETL projects: Modular vs. Scripted

I’m curious about data engineering best practices for a large-scale data engineering project using Databricks to build a Lakehouse architecture (Bronze -> Silver -> Gold layers). I’m presently comparing two approaches to writing code to engineer the s...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @ashap551, I would vote for the modular approach, which lets you reuse code and write unit tests in a simpler manner. For me, notebooks are only "clients" of these shared modules. You can take a look at the official documentation, where they're following a simila...
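
As a tiny sketch of that split (the module and function names are made up): transformations live as plain functions in a shared module, a notebook merely imports and calls them, and the same functions can be unit tested without a cluster.

```python
# transforms.py -- hypothetical shared module; a notebook would just
# `from transforms import add_total`. Pure-Python stand-in for a DataFrame
# transformation so it stays unit-testable.
def add_total(rows):
    """Add a 'total' = quantity * price field to each row dict."""
    return [{**r, "total": r["quantity"] * r["price"]} for r in rows]

# A notebook "client" (or a unit test) then simply calls the function:
rows = [{"quantity": 2, "price": 5.0}, {"quantity": 1, "price": 3.5}]
result = add_total(rows)
print(result[0])  # {'quantity': 2, 'price': 5.0, 'total': 10.0}
```

The same shape works with PySpark: keep DataFrame-in, DataFrame-out functions in modules and test them against a local SparkSession.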

1 More Replies
aahil824
by New Contributor
  • 1154 Views
  • 3 replies
  • 0 kudos

How to read zip folder that contains 4 .csv files

Hello Community, I have uploaded one zip archive, "dbfs:/FileStore/tables/bike_sharing.zip". I was trying to unzip it and read the 4 .csv files, but was unable to do it. Any help from your side will be greatly appreciated!

Latest Reply
SenthilRT
New Contributor III
  • 0 kudos

Hope this link helps. You can use a shell cell command within a notebook to unzip (assuming you have access to the path where you want to unzip the file): https://stackoverflow.com/questions/74196011/databricks-reading-from-a-zip-file
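
Alongside the shell approach, a pure-Python sketch with the stdlib zipfile module (the archive contents here are made up so the example is self-contained; on Databricks you would open the archive via a /dbfs/... path instead):

```python
import csv
import io
import zipfile

# Build a small in-memory archive standing in for the uploaded zip;
# in practice: zipfile.ZipFile("/dbfs/FileStore/tables/bike_sharing.zip").
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("rides.csv", "id,count\n1,10\n2,20\n")

# Read every .csv member without extracting to disk.
tables = {}
with zipfile.ZipFile(buf) as zf:
    for name in zf.namelist():
        if name.endswith(".csv"):
            with zf.open(name) as f:
                reader = csv.DictReader(io.TextIOWrapper(f, "utf-8"))
                tables[name] = list(reader)

print(tables["rides.csv"][0])  # {'id': '1', 'count': '10'}
```

After extraction (or in-memory reading), each CSV can be handed to spark.read.csv or pandas as usual.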

2 More Replies
brickster_2018
by Databricks Employee
  • 15863 Views
  • 2 replies
  • 0 kudos
Latest Reply
lchari
New Contributor II
  • 0 kudos

Is the limit per "table/dataframe" or for all tables/dataframes put together? The driver collects the data from all executors (those holding the respective table or dataframe) and distributes it to all executors. When will the memory be released in bo...

1 More Replies