Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

tototox
by New Contributor III
  • 8633 Views
  • 3 replies
  • 2 kudos

How to check table size by partition?

I want to check the size of a Delta table by partition. As far as I can tell, only the size of the whole table can be checked, not the size of each partition.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@jin park​: You can use the Databricks Delta Lake SHOW TABLE EXTENDED command to get the size of each partition of the table. Here's an example: %sql SHOW TABLE EXTENDED LIKE '<table_name>' PARTITION (<partition_column> = '<partition_value>') SELECT...
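For reference, a minimal runnable sketch of that command from a notebook cell; the table and partition names are placeholders, and whether a "Partition Statistics" line appears in the output depends on the table format and runtime version:

```python
# Hedged sketch of the reply above; table and partition names are placeholders.
# SHOW TABLE EXTENDED returns an `information` column that, when statistics
# are available, includes a "Partition Statistics: <n> bytes" line.
rows = spark.sql("""
    SHOW TABLE EXTENDED LIKE 'sales'
    PARTITION (event_date = '2023-01-01')
""").collect()

for row in rows:
    print(row.information)
```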

2 More Replies
Yash_542965
by New Contributor II
  • 4800 Views
  • 2 replies
  • 3 kudos

Resolved! Access Excel file in Delta Live Tables pipeline

I'm having an issue accessing an Excel file through a DLT pipeline. The file is in ADLS, and I'm using pandas to read it. It seems pandas is not able to understand the abfss protocol. Is there any way to read Excel with pandas in a DLT pipeline? I'm getting thi...

Latest Reply
Yash_542965
New Contributor II
  • 3 kudos

Thanks for the info. It works; I just needed to install an additional library using "%pip install openpyxl".
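For anyone hitting the same error, a hedged sketch of the working setup under the same assumptions (openpyxl installed via %pip, and the ADLS file reachable through a mount, since pandas cannot parse the abfss:// scheme by itself); the path and table name are illustrative:

```python
# Hedged sketch: read an Excel file with pandas inside a DLT pipeline.
# Assumes %pip install openpyxl has already run in the pipeline notebook
# and the ADLS container is mounted; all paths/names are placeholders.
import dlt
import pandas as pd

@dlt.table(name="excel_bronze")
def excel_bronze():
    # A /dbfs mount path gives pandas a local-style path it understands
    pdf = pd.read_excel("/dbfs/mnt/raw/report.xlsx", engine="openpyxl")
    return spark.createDataFrame(pdf)
```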

1 More Replies
Inna_M
by New Contributor III
  • 1244 Views
  • 1 reply
  • 1 kudos

Resolved! Is there any maintenance (patches, upgrades) from Databricks for VMs created by Databricks on Azure?

We are using Databricks on Azure. Our infra team noticed we have some VMs created in the past for Databricks clusters running Linux (Ubuntu 18.04). Is there any maintenance planned for those, such as an upgrade? Are there any patches for objects created in Azure by...

Latest Reply
Inna_M
New Contributor III
  • 1 kudos

Finally, while I was posting this question, Azure Databricks upgraded the VMs to a supported version (Ubuntu 20.04), though not the latest (22.04). It happened a week after the old version was no longer supported by Microsoft.

CoopCoop
by New Contributor III
  • 7262 Views
  • 6 replies
  • 7 kudos

Resolved! PDF Attachment on an Alert

Currently my Alert is an HTML table using data pointing to an SQL query. I was wondering if it is possible to attach the resulting table from this SQL query as a PDF to the alert email. If anyone has successfully implemented this, please let me know! T...

Latest Reply
Atanu
Esteemed Contributor
  • 7 kudos

OK, understood the concern: basically the issue is with PDF rendering, as far as I understand. Let me know if I am wrong. Let me see if there is any improvement coming from our engineering team on this front.

5 More Replies
Louis_Databrick
by New Contributor II
  • 1010 Views
  • 2 replies
  • 0 kudos

Registering a dataframe coming from a CDC data stream removes the CDC columns from the resulting temporary view, even when explicitly adding a copy of the column to the dataframe.

df_source_records.filter(F.col("_change_type").isin("delete", "insert", "update_postimage")) .withColumn("ROW_NUMBER", F.row_number().over(window)) .filter("ROW_NUMBE...

Latest Reply
Louis_Databrick
New Contributor II
  • 0 kudos

Seems to work now, actually. No idea what changed, as I tried multiple times in exactly this way and it did not work. from pyspark.sql.functions import expr from pyspark.sql.utils import AnalysisException import pyspark.sql.functions as f data = [(...
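For readers, a hedged reconstruction of the pattern from the truncated snippets above; the key and ordering columns in the window are assumptions for illustration:

```python
# Hedged reconstruction of the snippet in the post; the window's key and
# ordering columns are assumptions.
from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Keep the latest change per key from a Delta CDC feed
window = Window.partitionBy("id").orderBy(F.col("_commit_version").desc())

df_latest = (
    df_source_records
    .filter(F.col("_change_type").isin("delete", "insert", "update_postimage"))
    .withColumn("ROW_NUMBER", F.row_number().over(window))
    .filter("ROW_NUMBER = 1")
    .drop("ROW_NUMBER")
)

# Registering the view should preserve the CDC columns such as _change_type
df_latest.createOrReplaceTempView("cdc_latest")
```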

1 More Replies
StuartKindness_
by New Contributor II
  • 1401 Views
  • 4 replies
  • 2 kudos

How to replace the SSO certificate on our workspace?

We have Azure AD SSO set up on our workspace, but the three-year certificate is due to expire on Monday. I have logged onto the Admin Console, Single Sign-on tab. All the options are greyed out and there are no update or edit buttons, as can be seen in ...

[Attachment: Databricks_sso_replacep]
Latest Reply
StuartKindness_
New Contributor II
  • 2 kudos

@Debayan​ our version is branch-3.96-1682169174-f2e2f130, if that helps.

3 More Replies
harraz
by New Contributor III
  • 2761 Views
  • 1 reply
  • 0 kudos

Run result unavailable: run failed with error message Notebook not found:

I'm trying to create a workflow job that fetches the notebook from a remote git repository (Bitbucket Cloud). I tried everything in the Path field and nothing is working. Note that the Bitbucket repo is already connected to Databricks and there are no issues che...

[Attachment: Screen Shot 2023-05-31 at 6.45.47 PM]
Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi @harraz (Customer)​, could you please confirm if Files in Repos has been enabled? https://docs.databricks.com/files/workspace.html#configure-support-for-files-in-repos. You can use the command %sh pwd in a notebook inside a repo to check if Files ...
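As a quick illustration of that check (run in a notebook cell inside the repo; the printed path is indicative):

```python
%sh pwd
# With Files in Repos enabled, this prints a repo path such as
# /Workspace/Repos/<user>/<repo> rather than failing
```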

harraz
by New Contributor III
  • 1058 Views
  • 2 replies
  • 0 kudos

How to set up the path to a remote notebook in Bitbucket to run as a job? I tried everything in the path and nothing is working. I keep getting this error:...

How to set up the path to a remote notebook in Bitbucket to run as a job? I tried everything in the path and nothing is working. I keep getting this error: "Run result unavailable: run failed with error message Notebook not found:" Note that I already connec...

[Attachment: Screen Shot 2023-05-31 at 6.45.47 PM]
Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi @mohamed harraz​, could you please confirm if Files in Repos has been enabled? https://docs.databricks.com/files/workspace.html#configure-support-for-files-in-repos. You can use the command %sh pwd in a notebook inside a repo to check if Files in...

1 More Replies
cmilligan
by Contributor II
  • 629 Views
  • 1 reply
  • 1 kudos

Return notebook path from job that is run remotely from the repo

I want to set up some email alerts for issues in the data as part of a job run, and I want to point the user to the notebook the issue occurred in. I think this would be simple enough, but another layer is that the job is going to be run...

Latest Reply
Debayan
Esteemed Contributor III
  • 1 kudos

Hi, could you please clarify what you mean by returning the file from the remote repo? Please tag @Debayan​ in your next response, which will notify me. Thank you!
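While the thread awaits clarification, one commonly used way to capture the current notebook's path at run time (for example, to embed in an alert email) is via the notebook context; a hedged sketch, as this is not a guaranteed-stable API:

```python
# Hedged sketch: fetch the running notebook's path from the notebook
# context so it can be included in an alert email. Not a stable API.
notebook_path = (
    dbutils.notebook.entry_point.getDbutils().notebook()
    .getContext().notebookPath().get()
)
print(notebook_path)
```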

etsyal1e2r3
by Honored Contributor
  • 6702 Views
  • 2 replies
  • 3 kudos

Resolved! Compiling Flattened Dataframe back to Struct Columns

I have a dataframe with this format of columns: [`first.second.third`, `alpha.bravo.test1`, `alpha.bravo.test2`]. I'd like to get an output dataframe of this: [ `first` | `alpha` ] ---------------...

Latest Reply
etsyal1e2r3
Honored Contributor
  • 3 kudos

I have figured out the solution.
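The poster didn't share their fix, so for readers with the same question, here is one hedged sketch of regrouping dot-separated column names back into structs; the `regroup` helper and the `df` variable are illustrative, not the poster's code:

```python
# Hedged sketch (not the poster's solution): rebuild nested structs from
# flattened, dot-separated column names such as `alpha.bravo.test1`.
from pyspark.sql import functions as F

def regroup(columns):
    # Build a tree: {"alpha": {"bravo": {"test1": "alpha.bravo.test1"}}}
    tree = {}
    for name in columns:
        *parents, leaf = name.split(".")
        node = tree
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = name  # leaves keep the full flattened name

    def build(node):
        return F.struct(*[
            (build(v) if isinstance(v, dict) else F.col(f"`{v}`")).alias(k)
            for k, v in node.items()
        ])

    return [
        (build(sub) if isinstance(sub, dict) else F.col(f"`{sub}`")).alias(top)
        for top, sub in tree.items()
    ]

df_nested = df.select(*regroup(df.columns))
```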

1 More Replies
AdamRink
by New Contributor III
  • 1272 Views
  • 1 reply
  • 0 kudos

Using Repos CLI to create a repo but getting Parent Directory /Repos/develop/ does not exist.

I would like to create the directory develop under Repos as part of the script, then link it to GitHub and update it. How can I do this?

Latest Reply
User16539034020
Contributor II
  • 0 kudos

Hi, Adam: the Repos CLI does not have specific functionality to create directories in Databricks Repos. Please check the following doc for more information: https://docs.databricks.com/dev-tools/cli/repos-cli.html. You could run databricks workspace mkd...
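A hedged sketch of that sequence, calling the CLI from Python; the folder path and repo URL are placeholders, and the exact flags are documented in the Repos CLI doc linked above:

```python
# Hedged sketch: create the parent folder first with the workspace CLI,
# then create the repo under it. Paths and the URL are placeholders.
import subprocess

subprocess.run(
    ["databricks", "workspace", "mkdirs", "/Repos/develop"],
    check=True,
)
subprocess.run(
    ["databricks", "repos", "create",
     "--url", "https://github.com/<org>/<repo>.git",
     "--provider", "gitHub",
     "--path", "/Repos/develop/<repo>"],
    check=True,
)
```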

trang_le
by Contributor
  • 457 Views
  • 0 replies
  • 0 kudos

Announcing a new portfolio of Generative AI learning offerings on Databricks Academy

Today, we launched new Generative AI (including LLMs) learning offerings for everyone from technical and business leaders to data practitioners, such as Data Scientis...

ankris
by New Contributor III
  • 2882 Views
  • 2 replies
  • 0 kudos

Could you please guide us on connecting to ServiceNow data in Databricks

We would like to extract data like ticket info, resolution time, etc., from ServiceNow in Databricks. We're not finding much information in the community and would appreciate your guidance.

Latest Reply
crannow
New Contributor II
  • 0 kudos

ServiceNow offers API capabilities. You can consume the ServiceNow API within a Databricks notebook to extract data from ServiceNow. Following is a suggested prompt to use with ChatGPT for example Python code to connect to ServiceNow's API. PROMPT: ...
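Along those lines, a hedged sketch of pulling incident records from the ServiceNow Table API with `requests`; the instance URL, credentials, and field names are placeholders:

```python
# Hedged sketch: query the ServiceNow Table API for incidents.
# Instance, credentials, and field names are placeholders.
import requests

resp = requests.get(
    "https://<instance>.service-now.com/api/now/table/incident",
    auth=("<api_user>", "<api_password>"),
    headers={"Accept": "application/json"},
    params={
        "sysparm_limit": 100,
        "sysparm_fields": "number,short_description,opened_at,resolved_at",
    },
)
resp.raise_for_status()
records = resp.json()["result"]

# Land the payload in a Spark DataFrame for downstream processing
df = spark.createDataFrame(records)
```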

1 More Replies
Neha_1688
by New Contributor II
  • 1523 Views
  • 2 replies
  • 3 kudos

Resolved! DLT pipeline that reads data from JDBC source

Could you please guide us on how to create a DLT pipeline that directly reads data from JDBC? When I created the DLT pipeline, it gave me an error at "Setting up tables". If I run it interactively in notebooks it runs successfully, but in non-interactive mode...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

What you are trying to do is not possible: DLT uses Auto Loader, not JDBC, and no jars (DLT is SQL/Python only). I'd skip DLT for this scenario and use an ordinary notebook; nothing wrong with that.
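A hedged sketch of that plain-notebook route: read over JDBC and write straight to a Delta table. The connection details and table names are placeholders:

```python
# Hedged sketch of the ordinary-notebook alternative suggested above:
# read over JDBC and persist to Delta. All connection details are
# placeholders.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://<host>:5432/<db>")
    .option("dbtable", "public.orders")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)

df.write.format("delta").mode("overwrite").saveAsTable("bronze.orders")
```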

1 More Replies