Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

doodateika (New Contributor III)
  • 2983 Views
  • 4 replies
  • 1 kudos

Resolved! How to execute stored procedures on a Synapse SQL pool from Databricks

In the current version of Databricks, previous methods to execute stored procedures seem to fail: spark.sparkContext._gateway.jvm.java.sql.DriverManager / spark._sc._gateway.jvm.java.sql.DriverManager returns that it is JVM-dependent and will not work....

Latest Reply
-werners- (Esteemed Contributor III)
  • 1 kudos

Can you create a connection to the external data in Unity Catalog, and then: use <connectiondb>; exec <sp>

3 More Replies
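If the JVM DriverManager route no longer works, a minimal sketch of one alternative, assuming the pyodbc package and the Microsoft ODBC Driver for SQL Server are installed on the cluster; the server, database, credentials, and procedure names are placeholders:

    import pyodbc

    # connect directly to the Synapse dedicated SQL pool over ODBC
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=your-workspace.sql.azuresynapse.net;"
        "DATABASE=your_db;UID=your_user;PWD=your_password",
        autocommit=True,  # stored procedures often require autocommit
    )
    conn.cursor().execute("EXEC dbo.your_stored_procedure")
    conn.close()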
pinaki1 (New Contributor III)
  • 747 Views
  • 1 reply
  • 0 kudos

Performance improvement of a Databricks Spark job

Hi, I need to improve the performance of a Databricks job in my project. Here are the steps being done in the project: 1. Read small CSV/JSON files (100 MB, 50 MB) from multiple locations in S3. 2. Write the data to the bronze layer in Delta/Parquet form...

Latest Reply
-werners- (Esteemed Contributor III)
  • 0 kudos

In case of performance issues, always look for 'expensive' operations, mainly wide operations (shuffles) and collecting data to the driver. Start by checking how long the bronze part takes, then silver, etc. Pinpoint where it starts to get slow, then d...

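A minimal sketch of how to time the layers and find the slow one; the paths are placeholders, and setJobDescription labels the work so each layer is easy to spot in the Spark UI:

    import time

    spark.sparkContext.setJobDescription("bronze load")
    t0 = time.time()
    bronze_df = spark.read.json("s3://your-bucket/landing/")  # placeholder path
    bronze_df.write.format("delta").mode("append").save("s3://your-bucket/bronze/")
    print(f"bronze took {time.time() - t0:.1f}s")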
BricksGuy (New Contributor III)
  • 3553 Views
  • 7 replies
  • 0 kudos

Watermark error while joining multiple stream tables

I am creating an ETL pipeline where I read multiple stream tables into temp tables and, at the end, try to join those tables to feed the output into another live table. For that I am using the method below, where I give a list of tables as...

Latest Reply
-werners- (Esteemed Contributor III)
  • 0 kudos

It is necessary for the join, so if the dataframe has a watermark, that's enough. No need to define it multiple times.

6 More Replies
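A minimal sketch of the pattern with hypothetical table and column names: each stream defines its watermark once, and that is enough for the stream-stream join:

    from pyspark.sql.functions import expr

    left = (spark.readStream.table("stream_a")
            .withWatermark("event_time", "10 minutes").alias("l"))
    right = (spark.readStream.table("stream_b")
             .withWatermark("event_time", "10 minutes").alias("r"))

    # time-bounded join condition so Spark can purge old state
    joined = left.join(
        right,
        expr("l.id = r.id AND "
             "l.event_time BETWEEN r.event_time - interval 5 minutes "
             "AND r.event_time + interval 5 minutes"))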
SrinuM (New Contributor III)
  • 3423 Views
  • 0 replies
  • 0 kudos

Workspace Client dbutils issue

We are using Databricks Connect:

    host = "https://adb-xxxxxx.xx.azuredatabricks.net"
    token = "dapxxxxxxx"

    from databricks.sdk import WorkspaceClient
    dbutil = WorkspaceClient(host=host, token=token).dbutils
    files = dbutil.fs.ls("abfss://container-name@storag...

emorgoch (New Contributor II)
  • 15979 Views
  • 1 reply
  • 0 kudos

Passing variables from Python to SQL in a notebook using serverless compute

I've got a notebook that executes some Python code to parse the workspace ID, figure out which of my environments I'm in, and set a value for it. I then want to take that value and pass it through to a code block of...

Latest Reply
emorgoch (New Contributor II)
  • 0 kudos

Thanks Kaniz, this is a great suggestion. I'll look into how it can work for my projects.

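For reference, a minimal sketch of one way this can work on serverless compute, using named parameter markers in spark.sql; the table and variable names are hypothetical:

    env = "dev"  # value derived from the workspace ID, per the question

    # the :env marker is bound from Python, no string interpolation needed
    df = spark.sql(
        "SELECT * FROM my_catalog.my_schema.my_table WHERE env = :env",
        args={"env": env})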
MichaelO (New Contributor III)
  • 3594 Views
  • 1 reply
  • 0 kudos

Terminating cluster programmatically

Is there any Python script that allows me to terminate (not delete) a cluster from a notebook, similar to this R equivalent: terminate_cluster(cluster_id, workspace, token = NULL, verbose = T, ...)

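A minimal Python sketch, assuming the databricks-sdk package; the cluster ID is a placeholder:

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()  # picks up ambient auth inside a Databricks notebook
    # clusters.delete terminates the cluster; permanent_delete would remove it
    w.clusters.delete(cluster_id="0123-456789-abcdefgh")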
PraveenReddy21 (New Contributor III)
  • 2516 Views
  • 7 replies
  • 2 kudos

Resolved! I created an external database but am unable to transfer a table to the storage account (blob container, Gold)

Hi, I have done the Bronze and Silver activities; after that, I am trying to save a table to the Gold container but am unable to store it. I created an external database. I want to store the data as PARQUET, but that is not supported, only DELTA. Only MANAGED LOCATION is supported, but I am unabl...

Latest Reply
PraveenReddy21 (New Contributor III)
  • 2 kudos

Thank you, Rishabh.

6 More Replies
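For reference, a minimal sketch of writing the gold output as Parquet files to an external location instead of a managed Delta table; the container, storage account, and DataFrame names are placeholders:

    gold_path = "abfss://gold@yourstorageaccount.dfs.core.windows.net/sales"

    # write plain Parquet files to the external location
    silver_df.write.mode("overwrite").parquet(gold_path)

    # optionally register them as an external table
    spark.sql(f"CREATE TABLE IF NOT EXISTS gold.sales USING PARQUET LOCATION '{gold_path}'")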
rameshybr (New Contributor II)
  • 1699 Views
  • 4 replies
  • 0 kudos

DQ quality check: what is the best method to validate two Parquet files?

DQ quality check: we have to validate the data between the landing data and the bronze data. Below are the data quality checks: 1. Compare the counts between the two files; if they match, go to point 2. 2. If the counts match, then va...

Latest Reply
Rishabh-Pandey (Esteemed Contributor)
  • 0 kudos

Try this; it covers the second point, assuming the first point (the counts) already matches:

    # Define key columns
    key_columns = ["key_column1", "key_column2"]  # Adjust according to your data schema

    # Perform an outer join to find mismatches
    joined_df = landing_df.ali...

3 More Replies
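A fuller hedged sketch of both checks, assuming landing_df and bronze_df share a schema; the names are placeholders:

    # 1. compare row counts
    if landing_df.count() == bronze_df.count():
        # 2. row-level comparison: rows present on one side but not the other
        only_in_landing = landing_df.exceptAll(bronze_df)
        only_in_bronze = bronze_df.exceptAll(landing_df)
        data_matches = only_in_landing.isEmpty() and only_in_bronze.isEmpty()
        print("data matches:", data_matches)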
CaptainJack (New Contributor III)
  • 2190 Views
  • 1 reply
  • 0 kudos

Upload files from Databricks to Google Drive

Is it possible to upload files from Databricks to Google Drive? How?

Latest Reply
daniel_sahal (Esteemed Contributor)
  • 0 kudos

@CaptainJack You can use Python + Google Drive API. Example: https://medium.com/the-team-of-future-learning/integrating-google-drive-api-with-python-a-step-by-step-guide-7811fcd16c44

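For reference, a minimal upload sketch with the official client, assuming google-api-python-client and google-auth are installed and a service-account key is available; all paths, file names, and the folder ID are placeholders:

    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    creds = service_account.Credentials.from_service_account_file(
        "/dbfs/tmp/service-account.json",
        scopes=["https://www.googleapis.com/auth/drive.file"])
    drive = build("drive", "v3", credentials=creds)

    media = MediaFileUpload("/dbfs/tmp/report.csv", mimetype="text/csv")
    drive.files().create(
        body={"name": "report.csv", "parents": ["<folder-id>"]},
        media_body=media,
        fields="id").execute()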
shan_chandra (Databricks Employee)
  • 4158 Views
  • 0 replies
  • 1 kudos

How to calculate the individual file count, file size and number of rows on a Delta table?

There are instances where we need to know the individual file size or file count of a Delta table rather than the average size. We can use the below query to determine that:

    %sql
    select count(*) as rows, file_path, file_size from (select * ...

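A hedged reconstruction of the idea using the _metadata column that Databricks exposes for file-based tables; the table name is a placeholder:

    # _metadata.file_path and _metadata.file_size identify the underlying file
    spark.sql("""
        SELECT _metadata.file_path AS file_path,
               _metadata.file_size AS file_size,
               count(*) AS rows
        FROM my_catalog.my_schema.my_table
        GROUP BY 1, 2
    """).show(truncate=False)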
rameshybr (New Contributor II)
  • 2829 Views
  • 2 replies
  • 0 kudos

How to get files one by one from blob storage using PySpark/Python

How do I write PySpark/Python code to get the files one by one from blob storage?

Latest Reply
Rishabh-Pandey (Esteemed Contributor)
  • 0 kudos

 
@rameshybr

    # List files in a directory
    files = dbutils.fs.ls("/mnt/<mount-name>/path/to/directory")
    for file in files:
        file_path = file.path
        # Read each file into a DataFrame; if your file format is parquet, for example:
        df = s...

1 More Reply
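A complete hedged sketch of the same loop; the mount path is a placeholder and Parquet is assumed:

    # list files under a mounted blob-storage directory and read them one by one
    files = dbutils.fs.ls("/mnt/<mount-name>/path/to/directory")
    for f in files:
        if f.path.endswith(".parquet"):
            df = spark.read.parquet(f.path)  # one DataFrame per file
            print(f.path, df.count())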
alxsbn (Contributor)
  • 6347 Views
  • 5 replies
  • 7 kudos

How to change the SQL editor / schema browser default catalog / database

In the SQL editor / schema browser, is there a way to change the default catalog / database? Mine is always fixed on my Unity Catalog.

Latest Reply
Debayan (Databricks Employee)
  • 7 kudos

Hi, from the dropdown you can get the data objects: https://docs.databricks.com/sql/user/queries/queries.html#browse-data-objects-in-sql-editor
Please let us know if this helps. Also, please tag @Debayan with your next comment so that I will get notif...

4 More Replies
alexgv12 (New Contributor III)
  • 1065 Views
  • 2 replies
  • 0 kudos

Isolated Databricks cluster call from Synapse or Azure Data Factory

https://learn.microsoft.com/en-us/answers/questions/1919424/isolated-databricks-cluster-call-from-synapses-or
How can I create a job in Databricks with isolated-cluster parameters from Synapse or Azure Data Factory? I cannot find any option that a...

(Attachment: screenshot 2024-08-20 122534.png)
Latest Reply
alexgv12 (New Contributor III)
  • 0 kudos

Hi Werner, thanks for your question. I'm sharing the updated linked service in Synapse. Currently we have a pool in Databricks; what we do with the linked service is create a job and bring up an instance with the resources of our pool, but to uplo...

1 More Reply
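One hedged workaround, since the linked service does not expose cluster access-mode options: define the job (and its cluster spec) on the Databricks side, then have an ADF or Synapse Web activity trigger it through the Jobs 2.1 REST API. Host, token, job ID, and parameters below are placeholders:

    import requests

    resp = requests.post(
        "https://adb-xxxx.azuredatabricks.net/api/2.1/jobs/run-now",
        headers={"Authorization": "Bearer <token>"},
        json={"job_id": 123, "job_parameters": {"env": "dev"}})
    resp.raise_for_status()
    print(resp.json()["run_id"])  # id of the triggered run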
