Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
When I run my readStream command using .option("cloudFiles.useNotifications", "true"), it starts reading the files from Azure Blob (please note that I did not provide configuration like subscription ID, client ID, connection string and so on while...
Hi, I would like to share the following docs that might be able to help you with this issue: https://docs.databricks.com/ingestion/auto-loader/file-notification-mode.html#required-permissions-for-configuring-file-notification-for-adls-gen2-and-azure-b...
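For reference, file notification mode needs the storage and service-principal details passed as Auto Loader options. A rough sketch is below; the option names follow the linked docs, and all IDs, secrets and paths are placeholders.
df = (
    spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")                 # placeholder source format
        .option("cloudFiles.useNotifications", "true")
        # credentials Auto Loader uses to set up the Event Grid subscription and queue
        .option("cloudFiles.subscriptionId", subscription_id)
        .option("cloudFiles.tenantId", tenant_id)
        .option("cloudFiles.clientId", client_id)
        .option("cloudFiles.clientSecret", client_secret)
        .option("cloudFiles.resourceGroup", resource_group)
        .load("abfss://container@account.dfs.core.windows.net/input/")
)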
I have two GitHub repos configured in the Databricks Repos folder. repo_1 is run using a job, and repo_2 is run/called from repo_1 using the dbutils.notebook.run command: dbutils.notebook.run("/Repos/repo_2/notebooks/notebook", 0, args). I am getting the follo...
I am having a similar issue... ecw_staging_nb_List = ['/Workspace/Repos/PRIMARY/UVVC_DATABRICKS_EDW/silver/nb_UPSERT_stg_ecw_insurance', '/Repos/PRIMARY/UVVC_DATABRICKS_EDW/silver/nb_UPSERT_stg_ecw_facilitygroups'] Adding workspace d...
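For anyone hitting the same error, here is a minimal sketch of the call pattern discussed in this thread: calling a notebook in a second repo from the first by its absolute workspace path. The paths and arguments below are placeholders.
args = {"run_date": "2024-01-01"}                      # illustrative parameters
result = dbutils.notebook.run(
    "/Repos/repo_2/notebooks/notebook",                # absolute workspace path to the child notebook
    0,                                                 # timeout in seconds, as in the post above
    args,
)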
Greetings, when trying to run the following command:
%sh
mlflow sagemaker build-and-push-container
I get the following error: /databricks/python3/lib/python3.9/site-packages/click/core.py:2309: UserWarning: Virtualenv support is still experimental and ...
I am working on a Databricks notebook and trying to display a map using Folium, and I keep getting this error: Command result size exceeds limit: Exceeded 20971520 bytes (current = 20973510). How can I increase the memory limit? I already reduced the...
Hi, I have the same problem with keplergl, and the save-to-disk option, whilst helpful, isn't super practical... So how does one plot large datasets in Kepler? Any thoughts welcome.
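As a concrete version of the save-to-disk workaround mentioned above (the map and file path are illustrative), the map can be written to a file so the cell output stays under the 20 MB result limit, and only a trimmed-down map is rendered inline if needed.
import folium

m = folium.Map(location=[47.6, -122.3], zoom_start=10)   # placeholder map
m.save("/dbfs/FileStore/maps/my_map.html")                # write the HTML out instead of rendering it
# displayHTML(m._repr_html_())                            # only for maps small enough to render inline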
Is there a way to delete files recursively using a command in notebooks? In the below directory I have many combinations of files like .txt, .png, .jpg, but I only want to delete files with .csv, for example dbfs:/FileStore/.csv*
Hi @Rakesh Reddy Gopidi, you can use the os module to iterate over a directory. By looping over the directory, you can check what each file ends with using .endswith(".csv"). After fetching all the files, you can remove them. Hope this helps. Cheers.
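A small sketch of that approach (the directory is an example; DBFS paths are reachable under /dbfs when using local file APIs):
import os

target_dir = "/dbfs/FileStore"
for root, _, files in os.walk(target_dir):       # walk the directory tree recursively
    for name in files:
        if name.endswith(".csv"):                # keep only .csv files
            os.remove(os.path.join(root, name))  # delete the file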
I have a command that is running notebooks in parallel using threading. I want the command to fail whenever one of the notebooks that is running fails; right now it just continues to run. Below is the command that I'm currently ru...
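One way to get that behaviour, sketched below with placeholder notebook paths, is concurrent.futures: calling .result() on each future re-raises any exception thrown inside dbutils.notebook.run, so the cell fails as soon as a child notebook fails.
from concurrent.futures import ThreadPoolExecutor

notebooks = ["/Repos/project/nb_a", "/Repos/project/nb_b"]    # placeholder paths

def run_notebook(path):
    return dbutils.notebook.run(path, 0)                      # raises if the child notebook fails

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(run_notebook, p) for p in notebooks]
    results = [f.result() for f in futures]                   # .result() re-raises worker exceptions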
I wish to run a Scala command, which I believe would normally be run from a Scala command line rather than from within a notebook. It happens to be: scala [-cp scalatest-<version>.jar:...] org.scalatest.tools.Runner [arguments] (scalatest_2.12__3.0.8.j...
Hi @David Vardy, hope all is well! Just wanted to check in to see if you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...
Hello Team, I am using the df.write command and the table is getting created. If you refer to the screenshot below, the table got created in the Tables folder in the dedicated SQL pool, but I need it in the External Tables folder. Regards, RK
If you actually write into Synapse, it is not an external table; the data resides in Synapse. If you want an external table, write the data to your data lake in Parquet/Delta Lake format and then create an external table on that location in s...
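A rough sketch of that suggestion (the storage path is a placeholder): land the data in the lake with Spark, then define the external table in Synapse over that location rather than loading the rows into the dedicated pool.
(
    df.write.format("delta")          # or "parquet"
      .mode("overwrite")
      .save("abfss://lake@account.dfs.core.windows.net/external/my_table")
)
# The external table (or a serverless SQL view) is then created in Synapse
# pointing at that storage path, instead of copying the data into the pool.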
Hi all, I'm trying to run some functions from another notebook (data_process_notebook) in my main notebook, using the %run command. When I run the command %run ../path/to/data_process_notebook, it is able to complete successfully, no path, pe...
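For context, a minimal sketch of the pattern being described (function and path names are illustrative): the helper is defined in data_process_notebook, and after %run it is available in the main notebook's namespace.
# cell in data_process_notebook
def clean_columns(df):
    return df.toDF(*[c.strip().lower() for c in df.columns])

# cell in the main notebook (the %run magic must be the only content of its cell):
#   %run ../path/to/data_process_notebook
# clean_columns(my_df) can then be called directly from the main notebook.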
Hi, since yesterday, for no known reason, some commands that used to run daily are now stuck in a "Running command" state. Commands such as: dataframe.toPandas(), dataframe.show(n=1), dataframe.description(), dataframe.write.format("csv").save(location), ge...
Hi @Luiz Carneiro, could you split your Spark actions across more cells (paragraphs) and run them one at a time to check where the extra time is being spent? Also, Pandas only runs on your driver. Have you tried using the Python or Scala APIs instead? In ca...
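A small sketch of that advice, reusing the names from the post above: keep the heavy work on the executors and only bring a bounded amount of data back to the driver.
display(dataframe.limit(1))                      # inspect a row without collecting the whole DataFrame

sample_pdf = dataframe.limit(1000).toPandas()    # bound what is pulled back to the driver

dataframe.write.format("csv").mode("overwrite").save(location)   # write with Spark's writer, not pandas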
Hi all, So far I have been successfully using the CLI interface to upload files from my local machine to DBFS/FileStore/tables. Specifically, I have been using my terminal and the following command: databricks fs cp -r <MyLocalDataset> dbfs:/FileStor...
Hi @Ignacio Castineiras, if Arjun.kr's reply fully answered your question, would you be happy to mark their answer as best so that others can quickly find the solution? Please let us know if you are still having this issue.
Is there any reason this command works well:
%sql
SELECT * FROM datanase.table WHERE salary > 1000
returning 2 rows, while the one below:
%sql
DELETE FROM datanase.table WHERE salary > 1000
gives the error: Error in SQL statement: AssertionError: assertion failed:...
DELETE FROM (and similarly UPDATE) isn't supported on Parquet files right now on Databricks; it's supported for the Delta format. You can convert your Parquet files into Delta using CONVERT TO DELTA, and then this command will work for you.
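A minimal sketch of that conversion, reusing the table name from the question (the identifiers are just the ones in the post):
spark.sql("CONVERT TO DELTA datanase.`table`")                   # one-time conversion of the Parquet table
spark.sql("DELETE FROM datanase.`table` WHERE salary > 1000")    # DELETE is supported once the table is Delta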