Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

griffinw
by New Contributor III
  • 4336 Views
  • 5 replies
  • 3 kudos

Resolved! Unable to import tkinter in notebook

Hello, I am unable to import tkinter (or Tkinter) into a Python notebook. I also tried %pip install tkinter at the top of the notebook. Has anyone else been successful at this, or, if it's impossible, why? Thank you.

Latest Reply
ahmedE_
New Contributor II
  • 3 kudos

Hi @Will Griffin, can you confirm whether this worked for you? I get the message `ERROR: No matching distribution found for python3-tk`.

  • 3 kudos
4 More Replies
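
A possible reading of both failures above: tkinter ships with CPython's standard library rather than on PyPI, and `python3-tk` is the name of a Debian/Ubuntu system package, not a pip distribution, so pip finds nothing to install under either name. The sketch below only checks whether the current interpreter can import tkinter. The helper name `tkinter_available` is made up for illustration, and whether a GUI window could be displayed from a notebook at all is a separate question this does not answer.

```python
import importlib


def tkinter_available() -> bool:
    """Return True if this interpreter can import tkinter, False otherwise."""
    try:
        importlib.import_module("tkinter")
    except ImportError:
        # Usually means the interpreter was built or packaged without Tk support.
        return False
    return True


print("tkinter importable:", tkinter_available())
```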
haylee
by New Contributor II
  • 1689 Views
  • 4 replies
  • 0 kudos

I added a secret scope to the databricks environment, and I get this error when trying to run either of the following:

Commands attempted:
dbutils.secrets.listScopes()
dbutils.secrets.get(scope = "{InsertScope}", key = "{InsertKey}")
Error: "shaded.v245.com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, ...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Hi @Haylee Gaddy, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

  • 0 kudos
3 More Replies
thushar
by Contributor
  • 1807 Views
  • 6 replies
  • 0 kudos

'GeneratedAlwaysAs' along with dataframe.write

Is it possible to use a calculated column definition (as in a Delta table using generatedAlwaysAs) while writing the DataFrame as a Delta file, e.g. with df.write.format("delta")? Are there any options in the dataframe.write method to achieve this...

Latest Reply
pvignesh92
Honored Contributor
  • 0 kudos

Hi @Thushar R, this option is not part of the DataFrame write API, because the GeneratedAlwaysAs feature applies only to the Delta format, while df.write is a common API that handles writes for all formats. If you want to achieve this programmatically, you can still use...

  • 0 kudos
5 More Replies
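
A minimal sketch of one possible workaround, not necessarily the one the truncated reply goes on to describe: declare the generated column in the table DDL first, then append to that table with `df.write`. The table and column names (`events`, `event_ts`, `event_date`) and both helper functions are hypothetical, and the sketch assumes a `spark` session on a runtime whose Delta SQL accepts `GENERATED ALWAYS AS`.

```python
def create_table_with_generated_column(spark, table_name: str) -> str:
    """Create a Delta table whose event_date column is computed from event_ts."""
    ddl = (
        f"CREATE TABLE IF NOT EXISTS {table_name} ("
        "event_ts TIMESTAMP, "
        "event_date DATE GENERATED ALWAYS AS (CAST(event_ts AS DATE))"
        ") USING DELTA"
    )
    spark.sql(ddl)  # the generation rule is stored in the table definition
    return ddl


def append_rows(df, table_name: str) -> None:
    """Append a DataFrame that omits event_date; Delta computes it on write."""
    df.write.format("delta").mode("append").saveAsTable(table_name)
```

Because the rule lives in the table metadata, the write call itself needs no extra option, which fits the reply's point that `df.write` stays format-agnostic.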
Dave_Nithio
by Contributor
  • 2856 Views
  • 5 replies
  • 7 kudos

Resolved! Delta Live Table Schema Comment

I predefined my schema for a Delta Live Table Auto Loader. This included comments for some attributes. When performing a standard readStream, my comments appear, but in Delta Live Tables I get no comments. Is there anything I need to do to get comment...

Latest Reply
Kaniz_Fatma
Community Manager
  • 7 kudos

Hi @Dave Wilson, we haven’t heard from you since the last response from @Debayan Mukherjee and @Hubert Dudek, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be ...

  • 7 kudos
4 More Replies
AmineHY
by Contributor
  • 17042 Views
  • 4 replies
  • 1 kudos

Resolved! How to get rid of "Command result size exceeds limit"

I am working in a Databricks notebook and trying to display a map using Folium, and I keep getting this error:
> Command result size exceeds limit: Exceeded 20971520 bytes (current = 20973510)
How can I increase the memory limit? I already reduced the...

Latest Reply
labromb
Contributor
  • 1 kudos

Hi, I have the same problem with keplergl, and the save-to-disk option, whilst helpful, isn't super practical... So how does one plot large datasets in Kepler? Any thoughts welcome.

  • 1 kudos
3 More Replies
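
The figure in the error, 20971520 bytes, is exactly 20 MiB. As a rough sketch, the size of the rendered output can be measured before it is displayed, so the code can fall back to saving to disk when it is too large. It is an assumption here that the notebook counts roughly the UTF-8 bytes of the rendered HTML; the helper names are made up for illustration.

```python
RESULT_LIMIT_BYTES = 20 * 1024 * 1024  # 20971520, the figure quoted in the error


def rendered_size(html: str) -> int:
    """Size in bytes of the HTML once encoded as UTF-8."""
    return len(html.encode("utf-8"))


def fits_result_limit(html: str, limit: int = RESULT_LIMIT_BYTES) -> bool:
    """True if the rendered output is within the limit (assumed inclusive)."""
    return rendered_size(html) <= limit


small = "<div>map</div>"
large = "x" * (RESULT_LIMIT_BYTES + 1990)  # 20973510 bytes, as in the error
print(fits_result_limit(small), fits_result_limit(large))
```

In the error above the output exceeded the limit by only 1990 bytes, so a small reduction in the plotted data may have been enough in that particular case.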
raj123
by New Contributor II
  • 1477 Views
  • 2 replies
  • 3 kudos

Resolved! Data lineage graph not working

I created the tables below, but when I click the lineage graph I am not able to see the upstream or downstream tables. The + sign goes away after a few seconds, but I am not able to click it. Is anyone else having this issue? CREATE TABLE IF NOT EXISTS lineage_d...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Raj Sharma, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

  • 3 kudos
1 More Replies
Chilangdon
by New Contributor
  • 1145 Views
  • 2 replies
  • 1 kudos

Resolved! How to load multiple xlsx files that are stored in different folders with the same name in blob storage into a Delta table?

Hi, I have a blob storage with multiple unzipped folders with the same suffix: folder_report_name_01_2023_01_02 -> file_name_2023_01_02.xlsx. I want to load all of this data using pandas or PySpark and insert it into my Delta table. I'm trying to use widget...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Fernando Vázquez, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...

  • 1 kudos
1 More Replies
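
One way to collect files laid out like the example in the question is a glob pattern over both the folder and the file name. The sketch below builds a throwaway directory tree so it runs anywhere. The second date (`2023_01_03`) and the helper name `find_reports` are invented for illustration, and it assumes the blob container is reachable as an ordinary file path (for example through a mount).

```python
import tempfile
from pathlib import Path


def find_reports(root: Path) -> list:
    """Return every file_name_*.xlsx inside a folder_report_name_* directory."""
    return sorted(root.glob("folder_report_name_*/file_name_*.xlsx"))


# Build a temporary tree that mimics the layout described in the question.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for day in ("2023_01_02", "2023_01_03"):
        folder = root / f"folder_report_name_01_{day}"
        folder.mkdir()
        (folder / f"file_name_{day}.xlsx").touch()
    found = [p.relative_to(root).as_posix() for p in find_reports(root)]

print(found)
```

Each matched path could then be passed to `pandas.read_excel` (which needs an Excel engine such as openpyxl installed) and the frames concatenated before writing to the Delta table.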
Ajay-Pandey
by Esteemed Contributor III
  • 1475 Views
  • 2 replies
  • 2 kudos

Why does Azure Databricks need to store data in temp storage in Azure before writing to Synapse?

I was following the tutorial about data transformation with Azure Databricks, and it says that before loading data into Azure Synapse Analytics, the data transformed by Azure Databricks is saved to temp storage in Azure Blob Storage first before loa...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Ajay Pandey: Saving the transformed data to temporary storage in Azure Blob Storage before loading into Azure Synapse Analytics provides a number of benefits to ensure that the data is accurate, optimized, and performs well in the target environmen...

  • 2 kudos
1 More Replies
chhavibansal
by New Contributor III
  • 622 Views
  • 1 reply
  • 0 kudos

What is the upper bound for dataSkippingNumIndexedCols, to keep stats in the Delta log file?

Is there an upper bound on the number that I can assign to delta.dataSkippingNumIndexedCols for computing statistics? Is there a tradeoff benchmark available for increasing this number beyond 32?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Chhavi Bansal: The delta.dataSkippingNumIndexedCols configuration property controls the maximum number of columns on which Delta Lake collects statistics for data skipping. By default, this value is set to 32. There is no hard upper bound on th...

  • 0 kudos
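
For reference, the property discussed above is set per table through `TBLPROPERTIES`. The following is a minimal sketch, assuming a `spark` session and an existing Delta table; the helper name, the table name `sales`, and the value 40 are all hypothetical.

```python
def set_num_indexed_cols(spark, table_name: str, num_cols: int) -> str:
    """Set delta.dataSkippingNumIndexedCols on an existing Delta table."""
    stmt = (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '{int(num_cols)}')"
    )
    spark.sql(stmt)  # runs the statement on the supplied session
    return stmt


# Example call (hypothetical table name and value):
# set_num_indexed_cols(spark, "sales", 40)
```

Raising the value means statistics are gathered for more columns on every write, so any benefit for data skipping has to be weighed against the extra write-time work.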
nounou
by New Contributor II
  • 3474 Views
  • 1 reply
  • 1 kudos

How can I export my dashboard in HTML format using the Databricks API?

Hi everyone, I would like to export my dashboard in HTML format and embed it in the body of an email in order to send it to my team. Here is my Python code for the Databricks API, and I got this error. When I put my HTML in the body of my message i...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@mathild noun:
import databricks.workspace as workspace_api
import requests

# set up your Databricks workspace credentials
domain = "<your Databricks workspace domain>"
token = "<your Databricks API token>"

# set up the workspace client
workspac...

  • 1 kudos
VinayEmmadi
by New Contributor
  • 4485 Views
  • 1 reply
  • 0 kudos

How does hash shuffle join work in Spark?

Hi all, I am trying to understand the internals of shuffle hash join. I want to check if my understanding of it is correct. Let’s say I have two tables t1 and t2 joined on column country (8 distinct values). If I set the number of shuffle partitions as ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Vinay Emmadi: In Spark, a hash shuffle join is a type of join that is used when joining two data sets on a common key. The data is first partitioned based on the join key, and then each partition is shuffled and sent to a node in the cluster. The ...

  • 0 kudos
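
The partition-then-join idea described in the reply can be sketched as a toy simulation in plain Python. This is not Spark's implementation: `zlib.crc32` stands in for Spark's Murmur3-based hash partitioning, the sample rows are invented, and the build side is chosen arbitrarily rather than by size.

```python
import zlib
from collections import defaultdict


def partition_id(key: str, num_partitions: int) -> int:
    """Stable stand-in for a hash partitioner (Spark itself uses Murmur3)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def shuffle(rows, num_partitions):
    """Group (key, value) rows into partitions by the hash of the join key."""
    parts = defaultdict(list)
    for key, value in rows:
        parts[partition_id(key, num_partitions)].append((key, value))
    return parts


def shuffle_hash_join(left, right, num_partitions):
    """Inner join: co-partition both sides, then build and probe per partition."""
    left_parts = shuffle(left, num_partitions)
    right_parts = shuffle(right, num_partitions)
    out = []
    for pid in range(num_partitions):
        # Build a hash table from one side of this partition...
        table = defaultdict(list)
        for key, value in right_parts.get(pid, []):
            table[key].append(value)
        # ...then probe it with the rows from the other side.
        for key, lvalue in left_parts.get(pid, []):
            for rvalue in table.get(key, []):
                out.append((key, lvalue, rvalue))
    return sorted(out)


t1 = [("US", "a"), ("IN", "b"), ("FR", "c")]
t2 = [("US", 1), ("US", 2), ("IN", 3), ("DE", 4)]
result = shuffle_hash_join(t1, t2, num_partitions=4)
print(result)
```

Since every row with the same key hashes to the same partition, a join key with 8 distinct values can populate at most 8 partitions, however many shuffle partitions are configured.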
Bartek
by Contributor
  • 2428 Views
  • 1 reply
  • 1 kudos

Save Spark DataFrame to shape file (.shp format)

Hello, I know how to create a .shp file from a GeoPandas dataframe using code similar to this, also mentioned on SO:
gpd_df = geopandas.GeoDataFrame(pandas_df, geometry='geom')
gpd_df.to_file("username/nh.shp")
However, I have .parquet files that I can load...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Bartosz Maciejewski: Spark does not have native support for writing Shapefiles directly. However, you can use a third-party library such as GeoPandas or PyShp to write your Spark DataFrame to a Shapefile. Here's an example of how to use GeoPandas to...

  • 1 kudos
KVNARK
by Honored Contributor II
  • 816 Views
  • 1 reply
  • 4 kudos

Resolved! Query related to Storage account authentication

Use Case: Copy data from SharePoint List to Blob using Power Automate
Short Description: To access the blob storage account from Power Automate, there are three authentication types:
1. Access Key
2. Service Principal
3. Azure AD Integrated
Which authentica...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@KVNARK: It's recommended to use the Azure AD Integrated authentication type. This authentication type allows you to use Azure Active Directory (AD) to authenticate and manage access to Blob Storage resources at the folder or container level using...

  • 4 kudos
aki1
by New Contributor II
  • 1630 Views
  • 2 replies
  • 1 kudos

How to download a file in DBFS that contains multibyte characters in the file path?

I would like to download a file in DBFS using the FileStore endpoint. If the file or folder name contains multibyte characters, the file path cannot be specified due to URL encoding and an error occurs. Question 1: If a file or folder name contains mul...

Latest Reply
Debayan
Esteemed Contributor III
  • 1 kudos

Hi, the Databricks CLI can be used to download a file from DBFS: https://docs.databricks.com/dev-tools/cli/index.html. Also, you can refer to https://stackoverflow.com/questions/49019706/databricks-download-a-dbfs-filestore-file-to-my-local-machine , which ...

  • 1 kudos
1 More Replies
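
On the URL-encoding side of the question, a path containing multibyte characters can be percent-encoded segment by segment as UTF-8 using only the standard library. This is a sketch only: the example path and the helper name `encode_dbfs_path` are hypothetical, and whether the FileStore endpoint accepts the encoded form is an assumption that would need checking against the workspace.

```python
from urllib.parse import quote, unquote


def encode_dbfs_path(path: str) -> str:
    """Percent-encode each path segment as UTF-8, keeping '/' separators."""
    return "/".join(quote(segment, safe="") for segment in path.split("/"))


original = "files/レポート/データ.csv"  # hypothetical path with multibyte names
encoded = encode_dbfs_path(original)
print(encoded)
print(unquote(encoded) == original)
```

Encoding each segment separately keeps the `/` separators literal, while any reserved characters inside a folder or file name are still escaped.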
Tewks
by New Contributor
  • 1483 Views
  • 2 replies
  • 5 kudos

Resolved! Databricks SQL External Connections

Lakehouse architectures seem enticing, especially from the standpoint of querying the data lake directly as it sits (as opposed to first migrating the data to an external data warehouse). While documentation and support seem pretty clear regarding ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

these are really awesome details

  • 5 kudos
1 More Replies