Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shiva1212
by New Contributor II
  • 889 Views
  • 1 reply
  • 0 kudos

Clone/copy Python file

We are using Databricks extensively in our company. We found that we can’t clone/copy *.py files using the UI. We can clone notebooks but not Python files. If we clone a folder, we only clone the notebooks inside it, not the Python files.

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 0 kudos

@shiva1212 When working with Databricks and managing Python files, it's true that the UI can be restrictive here. You can use the Databricks CLI or the REST API for file management and copying.
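
For example, here is a minimal sketch using the Databricks Python SDK (databricks-sdk); the paths are placeholders, and the same export/import round trip is available through the CLI and REST API:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ExportFormat, ImportFormat, Language

w = WorkspaceClient()  # reads host/token from the environment or a config profile

src = "/Users/me@example.com/utils.py"       # hypothetical source path
dst = "/Users/me@example.com/utils_copy.py"  # hypothetical destination

# Export the file as base64-encoded source, then re-import it at the new path.
exported = w.workspace.export(src, format=ExportFormat.SOURCE)
w.workspace.import_(
    dst,
    content=exported.content,
    format=ImportFormat.SOURCE,
    language=Language.PYTHON,
    overwrite=True,
)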

Smriti1
by New Contributor
  • 1447 Views
  • 1 reply
  • 0 kudos

How to pass parameters to different notebooks in a workflow?

I have three notebooks: Notebook-1, Notebook-2, and Notebook-3, with a workflow dependency sequence of 1 -> 2 -> 3. Notebook-1 dynamically receives parameters, such as entity-1 and entity-2. Since these parameters change with each run, how can I pass t...

Latest Reply
adbooth01
New Contributor II
  • 0 kudos

As long as the parameter names are the same in all Notebooks, whatever value you trigger the workflow with will automatically be sent to all the notebooks. For example, if all three notebooks in the workflow have parameters entity-1 and entity-2, and...
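
A hedged sketch of both halves (the job ID and values are placeholders): each notebook reads the shared parameter names with widgets, and the run is triggered with job-level parameters, which recent versions of the Python SDK expose on run_now:

# Inside each notebook: read the shared parameters by name.
entity_1 = dbutils.widgets.get("entity-1")
entity_2 = dbutils.widgets.get("entity-2")

# When triggering the workflow programmatically:
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
w.jobs.run_now(job_id=123, job_parameters={"entity-1": "acme", "entity-2": "globex"})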

mailsathesh
by New Contributor II
  • 1086 Views
  • 2 replies
  • 1 kudos

Databricks Cluster Start and Stop

I want to send out an email if the cluster fails to start. I used to start the cluster using the Databricks CLI and then terminate it. In some cases, my cluster is not starting at all and there are some errors. My use case is to send an email using dat...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @mailsathesh, you can write a script that uses the Databricks CLI to start the cluster. You can use the --timeout flag to set the maximum amount of time to reach the RUNNING state. If this amount is exceeded, or if there is any error, you can then send an email with...
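
A rough sketch of the same idea using the Python SDK instead of the raw CLI (the cluster ID and SMTP details are placeholders, not values from the thread):

import datetime
import smtplib
from email.message import EmailMessage
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
try:
    # Block until the cluster reaches RUNNING; raise on error or after 20 minutes.
    w.clusters.start("0123-456789-abcdefgh").result(timeout=datetime.timedelta(minutes=20))
except Exception as exc:
    msg = EmailMessage()
    msg["Subject"] = "Databricks cluster failed to start"
    msg["From"] = "alerts@example.com"
    msg["To"] = "oncall@example.com"
    msg.set_content(str(exc))
    with smtplib.SMTP("smtp.example.com") as server:
        server.send_message(msg)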

1 More Replies
CarterM
by New Contributor III
  • 6976 Views
  • 3 replies
  • 2 kudos

Resolved! Why is Spark Streaming from S3 returning thousands of files when there are only 9?

I am attempting to stream JSON endpoint responses from an S3 bucket into a Spark DLT. I have been very successful with this pattern previously, but the difference this time is that I am storing the responses from multiple endpoints in the same S3 buck...

[Attachments: endpoint response structure; 9 endpoint responses in the same S3 bucket]
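
The fix isn't quoted in this excerpt, but a common way to avoid picking up other endpoints' files is to scope the stream to a single endpoint's prefix rather than the whole bucket; a minimal Auto Loader/DLT sketch with placeholder names:

import dlt

@dlt.table(name="soccer_endpoint_bronze")
def soccer_endpoint_bronze():
    # Load only this endpoint's prefix so files written by other
    # endpoints in the same bucket are never matched.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3://my-bucket/soccer-endpoint/")
    )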
Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Carter Mooring Thank you SO MUCH for coming back to provide a solution to your thread! Happy you were able to figure this out so quickly. And I am sure that this will help someone in the future with the same issue.

2 More Replies
Mangeysh
by New Contributor
  • 736 Views
  • 2 replies
  • 0 kudos

Converting Databricks query output to JSON and creating an endpoint

Hello, I am very new to Databricks and am building a UI where I need to show data from a Databricks table. Unfortunately, I am not being given access to the Delta Sharing feature by the administrator. I am planning to develop my own API and expose an endpoint with JSON output. I am sure th...

Latest Reply
menotron
Valued Contributor
  • 0 kudos

Hi @Mangeysh, you could achieve this using the Databricks SQL Statement Execution API. I would recommend going through the docs, looking at the functionality and limitations, and seeing if it serves your need before planning to develop your own APIs.
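
A hedged sketch of calling that API directly (host, token, and warehouse ID are placeholders); the response body is already JSON, so it can back a simple endpoint:

import requests

host = "https://adb-xxxx.azuredatabricks.net"
resp = requests.post(
    f"{host}/api/2.0/sql/statements/",
    headers={"Authorization": "Bearer <token>"},
    json={
        "warehouse_id": "<warehouse-id>",
        "statement": "SELECT * FROM my_catalog.my_schema.my_table LIMIT 100",
        "wait_timeout": "30s",  # wait synchronously for up to 30 seconds
    },
)
print(resp.json())  # statement status plus result rows once finished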

1 More Replies
Kayla
by Valued Contributor II
  • 4601 Views
  • 12 replies
  • 22 kudos

Resolved! Environment magic command?

Noticed something strange in Databricks version control: I'm getting hidden commands with %environment that I can only see in the UI for version control. No idea what it's from; it's just a minor nuisance, and I'm curious if anyone can shed light on it. + %env...

Latest Reply
justinbreese
Databricks Employee
  • 22 kudos

Hello all, this is Justin from the PM team at Databricks. We are sorry about the friction this caused you. This was a feature related to the new way that we are doing dependency management in our serverless offerings - Environments. We are going to r...

11 More Replies
PraveenReddy21
by New Contributor III
  • 759 Views
  • 2 replies
  • 0 kudos

How to create a catalog

Hi, I am trying to create a catalog and database, but Databricks is not allowing it. Please suggest. Here is my code: base_dir = "/mnt/files" spark.sql(f"CREATE CATALOG IF NOT EXISTS dev") spark.sql(f"CREATE DATABASE IF NOT EXISTS dev.demo_db") First I ne...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @PraveenReddy21, what error do you get? Do you use Unity Catalog or the Hive metastore?
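
For reference, CREATE CATALOG is a Unity Catalog statement; it fails on a workspace that only has the legacy Hive metastore. A minimal sketch, assuming a Unity Catalog metastore is attached and you have CREATE CATALOG privileges:

# Runs only on a Unity Catalog-enabled workspace.
spark.sql("CREATE CATALOG IF NOT EXISTS dev")
spark.sql("CREATE SCHEMA IF NOT EXISTS dev.demo_db")  # DATABASE and SCHEMA are synonyms
spark.sql("USE dev.demo_db")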

1 More Replies
jonathan-dufaul
by Valued Contributor
  • 3834 Views
  • 6 replies
  • 6 kudos

Why is writing to MS SQL Server 12.0 so slow directly from Spark but nearly instant when I write to a CSV and read it back?

I have a dataframe that inexplicably takes forever to write to an MS SQL Server, even though other dataframes, even much larger ones, write nearly instantly. I'm using this code: my_dataframe.write.format("jdbc") .option("url", sqlsUrl) .optio...

Latest Reply
plondon
New Contributor II
  • 6 kudos

Had a similar issue. I can do 1-4 million rows in 1 minute via SSIS ETL on SQL Server. The table is 15 fields wide. Looking at your code, it seems you have many fields, but nothing like the 300-400 fields that can affect performance. You can check SQL Server ...
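
If it helps, a hedged variant of the question's write with the standard Spark JDBC tuning options added (values are illustrative; sqlsUrl and the table name come from the original post):

(my_dataframe.write.format("jdbc")
    .option("url", sqlsUrl)
    .option("dbtable", "dbo.my_table")
    .option("batchsize", 10000)    # rows per JDBC batch insert
    .option("numPartitions", 8)    # parallel JDBC connections
    .mode("append")
    .save())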

5 More Replies
yopbibo
by Contributor II
  • 21381 Views
  • 4 replies
  • 4 kudos

How can I connect to an Azure SQL db from a Databricks notebook?

I know how to do it with Spark and read/write tables (like https://docs.microsoft.com/en-gb/azure/databricks/data/data-sources/sql-databases#python-example). But this time, I need to only update a field of a specific row in a table. I do not think I ...

Latest Reply
yopbibo
Contributor II
  • 4 kudos

Thanks for the link. I may be wrong, but they describe how to connect with Spark. They do not provide a connection engine that we could use directly (like with pyodbc) or an engine that we could use in pandas, for example.
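
A rough sketch of the pyodbc route for a single-row UPDATE, assuming the Microsoft ODBC driver is installed on the cluster (e.g. via an init script) and pyodbc is pip-installed; server and credential values are placeholders:

import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;DATABASE=mydb;"
    "UID=myuser;PWD=mypassword"
)
with conn.cursor() as cur:
    # Parameterized single-row update; no Spark involved.
    cur.execute("UPDATE dbo.my_table SET status = ? WHERE id = ?", "done", 42)
conn.commit()
conn.close()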

3 More Replies
sunnyday
by New Contributor
  • 1457 Views
  • 0 replies
  • 0 kudos

Naming jobs in the Spark UI in Databricks Runtime 15.4

I am asking almost the same question as https://community.databricks.com/t5/data-engineering/how-to-improve-spark-ui-job-description-for-pyspark/td-p/48959. I would like to know how to improve the readability of the Spark UI by naming jobs. I am...
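
No reply yet in this thread, but the standard PySpark hook is setJobDescription; a short sketch (the table name is illustrative, and whether this surfaces identically on Runtime 15.4 serverless may vary):

spark.sparkContext.setJobDescription("load_orders: read + dedupe")
df = spark.read.table("main.sales.orders").dropDuplicates(["order_id"])
df.count()  # the job triggered here shows the description above in the Spark UI
spark.sparkContext.setJobDescription(None)  # reset so later jobs aren't mislabeled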

doodateika
by New Contributor III
  • 1824 Views
  • 4 replies
  • 1 kudos

Resolved! How to execute stored procedures on synapse sql pool from databricks

In the current version of Databricks, previous methods to execute stored procedures seem to fail. spark.sparkContext._gateway.jvm.java.sql.DriverManager / spark._sc._gateway.jvm.java.sql.DriverManager returns that it is JVM-dependent and will not work...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Can you create a connection to external data in Unity Catalog, and then: USE <connectiondb>; EXEC <sp>

3 More Replies
pinaki1
by New Contributor III
  • 433 Views
  • 1 reply
  • 0 kudos

Performance improvement of Databricks Spark job

Hi, I need a performance improvement for a Databricks job in my project. Here are the steps being done in the project: 1. Read CSV/JSON files of small size (100 MB, 50 MB) from multiple locations in S3. 2. Write the data to the bronze layer in Delta/Parquet form...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

In case of performance issues, always look for 'expensive' operations, mainly wide operations (shuffles) and collecting data to the driver. Start by checking how long the bronze part takes, then silver, etc. Pinpoint where it starts to get slow, then d...
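
One quick way to spot those expensive operations before running anything (names are illustrative): inspect the physical plan and look for Exchange (shuffle) operators:

df = spark.read.table("bronze.events")
agg = df.groupBy("user_id").count()  # wide operation -> shuffle
agg.explain()  # 'Exchange' nodes in the printed plan mark the shuffles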

BricksGuy
by New Contributor III
  • 1827 Views
  • 7 replies
  • 0 kudos

Watermark error while joining with multiple stream tables

I am creating an ETL pipeline where I am reading multiple stream tables into temp tables, and at the end I am trying to join those tables to feed the output into another live table. For that I am using the method below, where I am giving a list of tables as...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

It is necessary for the join, so if the dataframe has a watermark, that's enough. No need to define it multiple times.
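
A minimal sketch of that point (table and column names are illustrative): declare the watermark once per stream dataframe, and the join simply inherits it:

left = (spark.readStream.table("stream_a")
        .withWatermark("event_time", "10 minutes"))
right = (spark.readStream.table("stream_b")
         .withWatermark("event_time", "10 minutes"))
# The watermarks defined above carry into the join; adding a time-range
# condition as well lets Spark clean up old join state.
joined = left.join(right, ["key"])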

6 More Replies
SrinuM
by New Contributor III
  • 615 Views
  • 0 replies
  • 0 kudos

Workspace Client dbutils issue

host = "https://adb-xxxxxx.xx.azuredatabricks.net"
token = "dapxxxxxxx"
We are using Databricks Connect:
from databricks.sdk import WorkspaceClient
dbutil = WorkspaceClient(host=host, token=token).dbutils
files = dbutil.fs.ls("abfss://container-name@storag...

