Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

erigaud
by Honored Contributor
  • 5868 Views
  • 2 replies
  • 3 kudos

Get total number of files of a Delta table

I'm looking to know programmatically how many files a Delta table is made of. I know I can do %sql DESCRIBE DETAIL my_table, but that would only give me the number of files of the current version. I am looking to know the total number of files (basically ...

Latest Reply
ADavid
New Contributor II
  • 3 kudos

What was the solution?

1 More Reply
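A possible approach for the question above (not the thread's confirmed answer): DESCRIBE DETAIL reports numFiles for the current version only, so one way to count every file, including those retained for older versions, is to list the table's storage location recursively. The table path below is hypothetical.

```python
# Minimal sketch: recursively count parquet data files under a Delta table's
# location. This includes files from historical versions that VACUUM has not
# yet removed; the _delta_log directory is skipped.
def count_data_files(path):
    total = 0
    for entry in dbutils.fs.ls(path):
        if entry.name.endswith("/"):               # a directory (e.g. a partition)
            if not entry.name.startswith("_delta_log"):
                total += count_data_files(entry.path)
        elif entry.name.endswith(".parquet"):      # a data file
            total += 1
    return total

print(count_data_files("dbfs:/path/to/my_table"))  # hypothetical table path
```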
Brian-Nowak
by New Contributor II
  • 1368 Views
  • 3 replies
  • 5 kudos

DBR 15.4 LTS Beta Unable to Write Files to Azure Storage Account

Hi there! I believe I might have identified a bug with DBR 15.4 LTS Beta. The basic task of saving data to a Delta table, as well as the even more basic operation of saving a file to cloud storage, is failing on 15.4 but working perfectly fine on 15.3...

Latest Reply
Ricklen
New Contributor III
  • 5 kudos

We have had the same issue since yesterday (6/8/2024), running on DBR 15.3 or 15.4 LTS Beta. It does indeed seem to have something to do with large tables. Tried with multiple partition sizes.

2 More Replies
Ricklen
by New Contributor III
  • 653 Views
  • 1 reply
  • 1 kudos

VSCode Databricks Extension Performance

Hello everyone! I've been using the Databricks extension in VSCode for a while now, and I'm syncing my repository to my Databricks workspace. In the beginning, syncing files to my workspace was basically instant. But now it is starting to take a lot of...

alm
by New Contributor III
  • 513 Views
  • 1 reply
  • 0 kudos

Define SQL table name using Python

I want to control which schema a notebook writes to, and I want it to depend on the user that runs the notebook. For now, the scope is to support the languages Python and SQL. I have written a Python function, `get_path`, that returns the full path of the destina...

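One possible pattern for the question above (an assumption, not the thread's answer): compute the schema name in Python from the current user, expose it as a widget, and reference the widget from SQL cells. The naming rule below is hypothetical.

```python
# Minimal sketch: derive a per-user schema name in Python and publish it as a
# widget so SQL cells in the same notebook can reference it.
user = spark.sql("SELECT current_user()").first()[0]
schema = "dev_" + user.split("@")[0].replace(".", "_")   # hypothetical rule
dbutils.widgets.text("target_schema", schema)
```

A SQL cell can then pick up the widget, for example with `CREATE TABLE IF NOT EXISTS ${target_schema}.my_table (...)`; on newer runtimes, a named parameter marker combined with the IDENTIFIER clause is an alternative.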
rajeevk
by New Contributor
  • 559 Views
  • 1 reply
  • 0 kudos

Is there a %%capture or equivalent possible in Databricks notebooks?

I want to suppress all output of a cell, including text and chart plots. Is it possible to do in Databricks? I am able to do the same in other notebook environments, but exactly the same does not work in Databricks. Any insight or even understandab...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @rajeevk, one way is to use cell hiding: Databricks notebook interface and controls | Databricks on AWS

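Beyond cell hiding, a workaround sketch (not from the reply): capture a cell's printed output with the standard library, since IPython's %%capture may not behave identically across Databricks runtimes. Note this only captures text; charts rendered via display() are a separate mechanism.

```python
# Minimal sketch: swallow stdout/stderr produced inside the block.
import io
from contextlib import redirect_stdout, redirect_stderr

sink = io.StringIO()
with redirect_stdout(sink), redirect_stderr(sink):
    print("this line is captured, not displayed")
    result = 40 + 2        # stand-in for the real, noisy computation

print(result)              # only the output you choose to show
```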
Pawanukey12
by New Contributor
  • 346 Views
  • 1 reply
  • 0 kudos

How to get the details of a notebook, i.e. who is the owner of a notebook?

I am using Azure Databricks. We have the Git version control system along with it. How do I find out who created or owns a particular notebook?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Pawanukey12, there is no direct API to get the owner of a notebook using the notebook path in Databricks. However, you can manually check the owner of the notebook by the notebook name. You can manually go to the folder where the notebook is loca...

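Since the poster mentions the workspace is backed by Git, one additional workaround (an assumption, not from the reply): the repository history records who added each file, e.g. `git log --diff-filter=A -- path/to/notebook.py` prints the commit, author, and date of the commit that first added that notebook file.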
ruoyuqian
by New Contributor II
  • 612 Views
  • 1 reply
  • 0 kudos

Resolved! Delta Live Table run outside of pipeline

I have created a notebook for my Delta Live Table pipeline and it runs without errors. However, if I run the notebook alone on my cluster, it says not allowed and shows this error. Does it mean I can only run Delta Live Tables in the pipeline and canno...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 0 kudos

Hi @ruoyuqian, Delta Live Tables (DLT) have specific execution contexts and dependencies that are managed within their pipeline environment. This is why the code runs successfully only when executed within the pipeline, as DLT creates its own job clus...

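A common structuring sketch for this situation (an assumption, not the reply's code): keep the transformation logic in plain functions and guard the dlt-specific parts, so the same notebook can be exercised interactively even though the DLT tables only materialize inside a pipeline. Whether `import dlt` fails outside a pipeline varies by runtime, so treat this as illustrative.

```python
# Minimal sketch: guard the DLT-specific code so the notebook can still be
# imported and tested outside a pipeline.
try:
    import dlt
    IN_PIPELINE = True
except ImportError:
    IN_PIPELINE = False

def clean(df):
    # Plain function holding the logic; testable on any cluster.
    return df.where("value IS NOT NULL")

if IN_PIPELINE:
    @dlt.table(name="clean_events")                     # hypothetical name
    def clean_events():
        return clean(spark.read.table("raw_events"))    # hypothetical source
```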
ShankarM
by Contributor
  • 808 Views
  • 2 replies
  • 0 kudos

Intelligent source to target mapping

I want to implement source-to-target mapping in such a way that source and target columns are auto-mapped using intelligent AI mapping, reducing mapping effort, especially when there are 100+ columns in a table. Metadata information o...

Latest Reply
ShankarM
Contributor
  • 0 kudos

Can you please reply to my latest follow-up question?

1 More Reply
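As a baseline for the auto-mapping idea above (a sketch, not the thread's solution): name-similarity matching with the standard library gets surprisingly far on 100+ column tables, and the scoring function is the natural place to swap in an LLM or embedding model. The column names below are hypothetical.

```python
# Minimal sketch: map each source column to the closest-named target column.
import difflib

source_cols = ["cust_id", "cust_name", "order_dt"]
target_cols = ["customer_id", "customer_name", "order_date"]

mapping = {}
for src in source_cols:
    match = difflib.get_close_matches(src, target_cols, n=1, cutoff=0.4)
    if match:
        mapping[src] = match[0]

print(mapping)   # {'cust_id': 'customer_id', 'cust_name': 'customer_name', ...}
```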
thiagoawstest
by Contributor
  • 702 Views
  • 1 reply
  • 0 kudos

Add or change roles

Hello, I have a Databricks environment provisioned on AWS. I would like to know if it is possible to add new roles or change existing roles. In my environment, Admin and User appear. I have the following need: how can I have a group, but the users th...

SeyedA
by New Contributor
  • 358 Views
  • 0 replies
  • 0 kudos

Debug UDFs using VSCode extension

I am trying to debug my Python script using the Databricks VSCode extension. I am using udf and pandas_udf in my script. Everything works fine except when the execution gets to the udf and pandas_udf usages. It then complains that "SparkContext or SparkS...

John_Rotenstein
by New Contributor II
  • 16092 Views
  • 8 replies
  • 3 kudos

Retrieve job-level parameters in Python

Parameters can be passed to Tasks and the values can be retrieved with dbutils.widgets.get("parameter_name"). More recently, we have been given the ability to add parameters to Jobs. However, the parameters cannot be retrieved like Task parameters. Quest...

Latest Reply
lprevost
Contributor
  • 3 kudos

The only thing that has worked for me consistently in Python is params = dbutils.widgets.getAll(), where an empty dictionary is returned if I'm in interactive mode and the job/task params are returned if they are present.

7 More Replies
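A small sketch built on the reply above (the parameter names are hypothetical): dbutils.widgets.getAll() returns a dict, so interactive runs can fall back to defaults.

```python
# Minimal sketch: read job-level parameters with defaults for interactive use.
params = dbutils.widgets.getAll()      # {} when run interactively

env = params.get("env", "dev")
batch_date = params.get("batch_date", "1900-01-01")
print(f"env={env}, batch_date={batch_date}")
```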
vadi
by New Contributor
  • 334 Views
  • 1 reply
  • 0 kudos

CSV file processing

What's the best possible solution to process CSV files in Databricks? Please consider scalability, optimization, and QA, and give me the best solution...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @vadi, in my opinion the best way is to use Auto Loader. For performance reasons, it's also beneficial to provide the schema upfront. Assigning the schema manually will also improve performance because it avoids doing schema inference over a huge se...

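A sketch of the reply's suggestion (paths, schema, and table names are hypothetical): Auto Loader over CSV with an explicit schema, so no inference pass over the files is needed.

```python
# Minimal sketch: incremental CSV ingestion with Auto Loader.
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

schema = StructType([
    StructField("id", StringType()),
    StructField("amount", DoubleType()),
])

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("header", "true")
    .schema(schema)                                    # skip schema inference
    .load("s3://my-bucket/landing/")                   # hypothetical source
    .writeStream
    .option("checkpointLocation", "s3://my-bucket/_chk/csv_ingest")
    .trigger(availableNow=True)
    .toTable("bronze.csv_ingest"))                     # hypothetical target
```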
Rajdeepak
by New Contributor
  • 927 Views
  • 0 replies
  • 0 kudos

How to restart failed spark stream job from the failure point

I am setting up an ETL process using PySpark. My input is a Kafka stream and I am writing output to multiple sinks (one to Kafka and another to cloud storage). I am writing checkpoints to cloud storage. The issue I am facing is that whenever m...

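General Structured Streaming guidance for the scenario above (a sketch, not the poster's code): give each sink its own streaming query with its own checkpoint location; on restart, each query then resumes from its own last committed offsets. The broker, topics, and paths are hypothetical.

```python
# Minimal sketch: one source, two sinks, two independent checkpoints.
events = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load())

# Sink 1: back to Kafka, with a dedicated checkpoint.
(events.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "events_out")
    .option("checkpointLocation", "s3://bucket/_chk/kafka_sink")
    .start())

# Sink 2: cloud storage, restartable independently of sink 1.
(events.writeStream
    .format("delta")
    .option("checkpointLocation", "s3://bucket/_chk/storage_sink")
    .start("s3://bucket/events_delta"))
```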
reachrishav
by New Contributor II
  • 924 Views
  • 0 replies
  • 0 kudos

What is the equivalent of "if exists()" in Databricks SQL?

What is the equivalent of the below SQL Server syntax in Databricks SQL? There are cases where I need to execute a block of SQL code on certain conditions. I know this can be achieved with spark.sql, but the problem with spark.sql() is it does not p...

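One workaround sketch (an assumption, not a confirmed answer from the thread): evaluate the condition in Python and run the SQL block only when it holds. The table and statements are hypothetical.

```python
# Minimal sketch: conditional execution of a SQL block from Python.
if spark.catalog.tableExists("main.sales.orders"):
    spark.sql("DELETE FROM main.sales.orders WHERE order_date < '2020-01-01'")
else:
    spark.sql("CREATE TABLE main.sales.orders (order_id INT, order_date DATE)")
```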

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group