Hi all, I need to install the spark-xml package from a notebook cell (hoping it will work on a DLT cluster). Maven package: com.databricks:spark-xml_2.12:0.16.0. Can anyone help me with the command to install it from a notebook cell? Fairly new to all thi...
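For what it's worth, a hedged sketch of one route: %pip only installs PyPI packages, so a Maven coordinate like spark-xml is normally attached as a cluster library rather than installed from a cell. One way to script that is the Libraries REST API; the host, token, and cluster_id below are placeholders you would supply yourself, and note that DLT clusters manage their own libraries, so this may not carry over there.

import os
import requests

# Placeholders: set these for your own workspace and cluster.
host = os.environ["DATABRICKS_HOST"]      # e.g. https://adb-1234.azuredatabricks.net
token = os.environ["DATABRICKS_TOKEN"]    # a personal access token
cluster_id = "0123-456789-abcdefgh"       # hypothetical cluster id

# Ask the workspace to attach the Maven library to the running cluster.
resp = requests.post(
    f"{host}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "cluster_id": cluster_id,
        "libraries": [
            {"maven": {"coordinates": "com.databricks:spark-xml_2.12:0.16.0"}}
        ],
    },
)
resp.raise_for_status()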
Databricks Community Edition - unable to clone a public Git repository, as the 'Repository' tab that should appear below the 'Workspace' tab on the portal does not appear, and I am not aware of any alternate method. I have referred to some documents on th...
Currently, I manually copy and paste code from an Excel sheet into a Databricks notebook, run it for results, then copy and paste the results back into the same workbook. I'm sure there's a faster way to do it. The only solutions I can find is u...
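One hedged sketch, assuming the workbook can be uploaded somewhere the cluster can read it (the path and sheet names below are hypothetical): pandas can read the sheet and write the results back, skipping the copy-paste round trip.

import pandas as pd

# Hypothetical path to the uploaded workbook and illustrative sheet names.
path = "/dbfs/FileStore/workbook.xlsx"
inputs = pd.read_excel(path, sheet_name="inputs")

# Stand-in for whatever the notebook currently computes on the pasted values.
results = inputs.assign(total=inputs.sum(axis=1, numeric_only=True))

# Append the results as a new sheet in the same workbook (needs openpyxl).
with pd.ExcelWriter(path, mode="a", if_sheet_exists="replace") as writer:
    results.to_excel(writer, sheet_name="results", index=False)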
We have just started working with Databricks in one of my university modules, and the lecturers gave us a set of commands to practice saving data in the FileStore. One of the commands was the following: dbutils.fs.cp("/databricks-datasets/weathh...
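The command in the post is truncated, but a hedged reconstruction of what such an exercise usually looks like is below; the source dataset path is illustrative, not necessarily the one from the module.

# Illustrative only: copy a sample dataset folder into the FileStore.
dbutils.fs.cp(
    "/databricks-datasets/weather/high_temps",  # hypothetical source path
    "/FileStore/tables/high_temps",             # destination in the FileStore
    recurse=True,                               # copy the whole directory tree
)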
Hi, I am using the Lasv3 VM family, which incorporates an NVMe SSD. I would like to take advantage of this huge amount of space, but I cannot find where this disk is mounted. Does someone know where this disk is mounted and if it can be used as a local dri...
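A quick way to check from a notebook, as a hedged sketch: on Databricks clusters the local scratch disk is commonly exposed at /local_disk0, but it is worth confirming on your own VM type.

import subprocess

# List filesystems and their mount points to locate the NVMe device.
print(subprocess.run(["df", "-h"], capture_output=True, text=True).stdout)

# /local_disk0 is the usual local scratch location on Databricks nodes.
print(subprocess.run(["ls", "-la", "/local_disk0"], capture_output=True, text=True).stdout)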
I'm not able to get the SET command to work when using SQL in a DLT pipeline. I am copying the code from this documentation: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-sql-ref.html#sql-spec (relevant code below). When I ru...
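In case it helps while the SQL route is debugged, a hedged Python sketch of the same idea: values you would pass with SET in DLT SQL can instead be supplied in the pipeline's configuration and read with spark.conf.get. The key name and table names below are illustrative.

import dlt
from pyspark.sql import functions as F

@dlt.table
def filtered_data():
    # "startDate" would be defined under the pipeline's configuration settings.
    start_date = spark.conf.get("startDate")
    return (
        spark.read.table("source_table")
        .where(F.col("date") > F.lit(start_date))
    )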
I want to fetch new data from a Kinesis source every minute. I'm using the "minFetchPeriod" option and specified 60s, but this doesn't seem to be working. Streaming query:

spark \
  .readStream \
  .format("kinesis") \
  .option("streamName", kinesis_stream_...
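A hedged sketch of one way to get a strict one-minute cadence: drive the micro-batches with an explicit processing-time trigger rather than relying on minFetchPeriod alone. The stream name, region, and paths below are placeholders.

# Sketch: pair the Kinesis source options with a 60-second trigger.
df = (
    spark.readStream
    .format("kinesis")
    .option("streamName", "my-stream")        # placeholder stream name
    .option("region", "us-east-1")            # placeholder region
    .option("minFetchPeriod", "60s")
    .load()
)

(
    df.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/kinesis_demo")
    .trigger(processingTime="60 seconds")     # run one micro-batch per minute
    .start("/tmp/tables/kinesis_demo")
)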
Hello. Following an older question, "SQL Declare Variable equivalent in databricks", we managed to find, through the article "Converting Stored Procedures to Databricks" by Ryan Chynoweth (Medium, Dec 2022), a way to declare more complicate...
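The post is truncated, but a hedged sketch of the usual workaround: compute the "variable" in Python first and substitute it into the SQL text. The table and column names are illustrative.

# Sketch: emulate a SQL DECLARE by computing the value in Python.
max_date = spark.sql("SELECT max(order_date) AS d FROM orders").first()["d"]

# Substitute the computed value into the follow-up query.
result = spark.sql(f"SELECT * FROM orders WHERE order_date = '{max_date}'")
display(result)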
This article rightly suggests installing `ray` with `%pip`, although it fails to mention that installing it as a cluster library won't work. The reason, I think, is that `setup_ray_cluster` will use `sys.executable` (i.e. `/local_disk0/.ephemeral_nfs/en...
Ugly, but this seems to work for now:

import sys
import os
import shutil
from ray.util.spark import setup_ray_cluster, shutdown_ray_cluster

# Copy the `ray` launcher script from the cluster-library install location
# next to the current Python executable so setup_ray_cluster can find it.
shutil.copy(
    "/local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/ray",
    os.path.dirname(sys.executable),
)
Suppose the query is a long one and it's commented out due to some issues, and later I wanted to run that cell. Is there any shortcut to enable the entire cell? The cell has 800 lines of code, each line commented with a # symbol, and I want to enable i...
@KVNARK​ Hi, if I understand correctly, you want to either comment or uncomment your entire cell using a shortcut. If that's the case, for the whole cell: press Ctrl + A to select everything, then Ctrl + / on Windows. It will add # to all your...
It looks like there are other issues. I saved the model generated with the code above in MLflow. When I try to reload it with this code:

import mlflow
model = mlflow.spark.load_model('runs:/cb6ff62587a0404cabeadd47e4c9408a/model')

It works in a notebook...
Just updating: it is possible this issue has now been addressed. As before, working on Azure Databricks 11.3 DBR. Inserting into a managed table: [screenshot not preserved in this copy]. It also appears to be addressed for Auto Loader insertion into an unmanaged table.
Starting from Databricks 12.2 LTS, the explode function can be used in the FROM statement to manipulate data in new and powerful ways. This function takes an array column as input and returns a new row for each element in the array, offering new poss...
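A small sketch of the new syntax, run through spark.sql from Python; the literal array stands in for a real array column.

# Sketch: explode as a table-valued function in the FROM clause (DBR 12.2+).
df = spark.sql("SELECT col FROM explode(array(10, 20, 30))")
df.show()
# +---+
# |col|
# +---+
# | 10|
# | 20|
# | 30|
# +---+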
Hi, assume that I have a Delta table stored on an Azure storage account. When new records arrive, I repeat the transformation and overwrite the existing table:

(DF.write
    .format("delta")
    .mode("overwrite")
    .option("...
The overwrite will add new files and keep the old ones, and the transaction log keeps track of which files are current data and which are old. If the overwrite fails, you will get an error message in the Spark program, and the data to be overwritten will still be the cur...
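Because the old files stay behind the transaction log, the pre-overwrite state also remains readable; a hedged sketch with an illustrative path:

# Sketch: after an overwrite, earlier versions remain readable via time travel.
path = "abfss://container@account.dfs.core.windows.net/tables/my_table"  # illustrative

current = spark.read.format("delta").load(path)
previous = spark.read.format("delta").option("versionAsOf", 0).load(path)  # earlier version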