Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

calvinchan_iot
by New Contributor II
  • 1891 Views
  • 3 replies
  • 0 kudos

SparkRuntimeException: Sent message larger than max (10701549 vs. 10485760)

Hey Databricks team, I have been facing a weird error since I upgraded to use Unity Catalog. Where is the limit 10485760 (10 MB) actually coming from? I have spark.sql.autoBroadcastJoinThreshold set to -1 already, and I can't find any other Spark conf...

Latest Reply
calvinchan_iot
New Contributor II
  • 0 kudos

Hi @szymon_dybczak, I did, but the problem persists.

2 More Replies
ashish577
by New Contributor III
  • 5993 Views
  • 4 replies
  • 2 kudos

Databricks asset bundles passing parameters using bundle run which are not declared

Hi, We recently decided to move to Databricks Asset Bundles. One scenario we are dealing with is that we have different parameters passed to the same job which are handled in the notebook. With bundles, when I try to pass parameters at runtime (which ar...

Latest Reply
HrushiM
New Contributor II
  • 2 kudos

The following syntax can be used:

databricks bundle run -t ENV --params Param1=Value1,Param2=Value2 Job_Name

The job definition parameter may look like this.
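For reference, the command shape above pairs with a job-level `parameters` block in the bundle definition. A minimal sketch (target, job, and parameter names here are hypothetical):

```shell
# Pass named parameters at run time; they must be declared on the job.
databricks bundle run -t dev --params region=us,run_date=2024-01-01 my_job
```

```yaml
# databricks.yml (fragment): declare the parameters the job accepts
resources:
  jobs:
    my_job:
      name: my_job
      parameters:
        - name: region
          default: us
        - name: run_date
          default: ""
```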

3 More Replies
Ziy_41
by New Contributor
  • 602 Views
  • 2 replies
  • 0 kudos

Hi, I have uploaded an Excel file in Databricks but it shows a different language.

Hi, I have attached an Excel file in Databricks Community Edition, but unfortunately it shows a different language in the output when I wrote display(df). Below I'm attaching the screenshot, please let me know. Thanking you in advance.

Ziy_41_0-1729504438295.png
Latest Reply
Stefan-Koch
Valued Contributor II
  • 0 kudos

CSV and Excel are not the same data type. You can load the Excel data into a pandas DataFrame and then convert it to a PySpark DataFrame. First, you have to install the openpyxl library: %pip install openpyxl. Then import PySpark Pandas: import pyspark.p...
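The truncated reply above can be sketched end to end for a Databricks notebook (the Volume path is hypothetical; `spark` and `display` are provided by the Databricks runtime, and openpyxl must be installed first):

```python
# %pip install openpyxl   <- run this in its own cell first
import pandas as pd

# Read the Excel file into a pandas DataFrame via the openpyxl engine
pdf = pd.read_excel(
    "/Volumes/my_catalog/my_schema/my_volume/report.xlsx",
    engine="openpyxl",
)

# Convert to a PySpark DataFrame and display it
df = spark.createDataFrame(pdf)
display(df)
```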

1 More Replies
Miguel_Salas
by New Contributor II
  • 1324 Views
  • 1 reply
  • 0 kudos

How to install Pyrfc into AWS Databricks using Volumes

I'm trying to install Pyrfc in a Databricks cluster (already tried r5.xlarge, m5.xlarge, and c6gd.xlarge). I'm following this link: https://community.databricks.com/t5/data-engineering/how-can-i-cluster-install-a-c-python-library-pyrfc/td-p/8118 Bu...

Latest Reply
Miguel_Salas
New Contributor II
  • 0 kudos

More details about the error: Library installation attempted on the driver node of cluster 0000-000000-00000 and failed. Please refer to the following error message to fix the library or contact Databricks support. Error code: DRIVER_LIBRARY_INSTALLATI...

Arpi
by New Contributor II
  • 3853 Views
  • 4 replies
  • 4 kudos

Resolved! Database creation error

I am trying to create database with external location abfss but facing the below error.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs....

Latest Reply
source2sea
Contributor
  • 4 kudos

Changing it to a CLUSTER level for OAuth authentication helped me solve the problem. I wish the notebook AI bot could tell me the solution. Before the change, my configuration was at the notebook level, and it had the below error: AnalysisException: org.apac...
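For context, cluster-level OAuth for ABFSS is usually configured with the standard Hadoop ABFS properties in the cluster's Spark config. A generic sketch, not the poster's exact settings (placeholders in angle brackets):

```
fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<storage-account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<storage-account>.dfs.core.windows.net <application-id>
fs.azure.account.oauth2.client.secret.<storage-account>.dfs.core.windows.net {{secrets/<scope>/<key>}}
fs.azure.account.oauth2.client.endpoint.<storage-account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token
```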

3 More Replies
Kartikb
by New Contributor II
  • 590 Views
  • 4 replies
  • 3 kudos

Resolved! code execution from Databrick folder

We are able to run a notebook that references Python code using import statements from a Databricks repo with the source code checked out. However, we encounter a ModuleNotFoundError when executing the same code from a folder. Error: ModuleNotFoundErro...

Latest Reply
Kartikb
New Contributor II
  • 3 kudos

Below worked, as you suggested:

import os, sys
project_path = os.path.abspath("/Workspace/<folder-name-1>/<folder-name-2>/<top-level-code-folder>")
if project_path not in sys.path:
    sys.path.append(project_path)

3 More Replies
adrjuju
by New Contributor II
  • 854 Views
  • 3 replies
  • 0 kudos

S3 Data access through unity

Hey all, I have the following issue: I've connected an S3 bucket through Unity Catalog as an external source. I can see the files of my S3 bucket perfectly when I scroll through the catalog using the user interface. However, when I try to connect through a...

Latest Reply
adrjuju
New Contributor II
  • 0 kudos

Hey Chandra, thank you for your answer. The path is a Volume path indeed: /Volumes/my_path_in_volume

2 More Replies
swzzzsw
by New Contributor III
  • 5339 Views
  • 6 replies
  • 0 kudos

Resolved! SQLServerException: deadlock

I'm using Databricks to connect to a SQL managed instance via JDBC. SQL operations I need to perform include DELETE, UPDATE, and simple read and write. Since Spark syntax only handles simple read and write, I had to open a SQL connection using Scala an...

image.png
Latest Reply
Panda
Valued Contributor
  • 0 kudos

@swzzzsw Since you are performing database operations, to reduce the chance of deadlocks, make sure to wrap your SQL operations inside transactions using commit and rollback. Other approaches to consider are adding retry logic or using Isolation Leve...
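The retry advice above can be sketched in plain Python. This is illustrative only: the JDBC call is replaced by a stand-in function, and 1205 is SQL Server's deadlock error number.

```python
import time

def with_retries(op, max_attempts=5, base_delay=0.1, retryable=(Exception,)):
    """Run `op`, retrying with exponential backoff when a retryable
    error (e.g. a SQL Server deadlock, error 1205) is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op()
        except retryable:
            if attempt == max_attempts:
                raise
            # Back off before retrying so the competing transaction can finish.
            time.sleep(base_delay * 2 ** (attempt - 1))

# Demo with a stand-in operation that "deadlocks" twice, then succeeds.
calls = {"n": 0}
def flaky_update():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("Transaction was deadlocked (1205)")
    return "committed"

result = with_retries(flaky_update, base_delay=0.01)
print(result)  # committed
```

In a real job the stand-in would be the JDBC statement execution, and `retryable` would be narrowed to the driver's deadlock exception rather than `Exception`.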

5 More Replies
keeplearning
by New Contributor II
  • 24498 Views
  • 4 replies
  • 3 kudos

Resolved! How can I send custom email notification

I am using the edit notification in Databricks to send email notifications in case of workflow failure or success. How can I add additional information to this report? For example, if I want to notify about the number of rows processed or added, how can ...

Latest Reply
Panda
Valued Contributor
  • 3 kudos

@keeplearning There are three approaches I can think of for this:
Approach 1: Creating an email template and sending emails programmatically from a Databricks notebook.
Approach 2: Invoke a Logic App via an Azure REST API from Databricks after the code executes...
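Approach 1 can be sketched like this (job name, row count, and subject format are hypothetical; actually sending requires an SMTP server reachable from your workspace):

```python
from email.mime.text import MIMEText

def build_run_report(job_name, rows_processed, status="SUCCESS"):
    """Build a custom notification email carrying run metrics."""
    body = (
        f"Job: {job_name}\n"
        f"Status: {status}\n"
        f"Rows processed: {rows_processed}\n"
    )
    msg = MIMEText(body)
    msg["Subject"] = f"[{status}] {job_name}: {rows_processed} rows processed"
    return msg

msg = build_run_report("nightly_ingest", 12345)
print(msg["Subject"])  # [SUCCESS] nightly_ingest: 12345 rows processed

# Sending would use smtplib.SMTP(...).send_message(msg) against your own
# SMTP server; host, port, and credentials are deployment-specific.
```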

3 More Replies
Oliver_Angelil
by Valued Contributor II
  • 10664 Views
  • 9 replies
  • 1 kudos

How to use the git CLI in databricks?

After making some changes in my feature branch, I have committed and pushed (to Azure Devops) some work (note I have not yet raised a PR or merge to any other branch). Many of the files I committed are data files and so I would like to reverse the co...

Latest Reply
AntonDBUser
New Contributor III
  • 1 kudos

Any updates on this? We still can't manage to run Git CLI commands from Databricks. Appreciate any input on this!

8 More Replies
Kayla
by Valued Contributor II
  • 1887 Views
  • 1 reply
  • 0 kudos

External Table From BigQuery

I'm working on implementing Unity Catalog, and part of that is determining how to handle our BigQuery tables. We need to utilize them to connect to another application, or else we'd stay within regular Delta tables on Databricks. The page https://docs...

Latest Reply
lorenz_singer
New Contributor II
  • 0 kudos

Hi Kayla, I know your question is already a year old, but it's possible to create BigQuery tables in Unity Catalog: https://docs.gcp.databricks.com/en/query-federation/bigquery.html
Best regards
Lorenz

emilyawalker
by New Contributor
  • 405 Views
  • 1 reply
  • 0 kudos

How can I effectively integrate my AI laptop's local resources with Databricks for AI model training

Hi everyone, I'm currently working on AI projects and using an AI laptop for local development, while leveraging Databricks for larger model training and experimentation. I'm looking for advice on how to effectively integrate my AI laptop's local resou...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Those are a lot of questions. For each of your questions, one could come up with an answer that runs locally on your computer. But... you will probably regret this, as it will be very hard to maintain and deploy to Databricks. It won't be an exact copy...

a_t_h_i
by New Contributor
  • 3341 Views
  • 2 replies
  • 1 kudos

Move managed DLT table from one schema to another schema in Databricks

I have a DLT table in schema A which is being loaded by a DLT pipeline. I want to move the table from schema A to schema B and repoint my existing DLT pipeline to the table in schema B. I also need to avoid a full reload in the DLT pipeline on the table in Schema B....

Data Engineering
delta-live-table
deltalivetable
deltatable
dlt
Latest Reply
ThuanVanNguyen
New Contributor II
  • 1 kudos

Any updates on this? We have the same requirement.

1 More Replies
PJ11
by New Contributor
  • 577 Views
  • 1 reply
  • 0 kudos

Upset Plot in Databricks

I am trying to create an UpSet plot using the following code, but my output is not as expected. See Image 1 (the output I am getting) vs Image 2 (the output expected), where the total count of each overlap is displayed at the top of each bar and bar size is proportion...

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @PJ11, as per the documentation: UpSetPlot internally works with data based on pandas data structures: a Series when all you care about is counts, or a DataFrame when you're interested in visualising additional properties of the data, such as with the...

adihc
by New Contributor II
  • 1926 Views
  • 9 replies
  • 1 kudos

Resolved! Options to access files in the community edition

As of now, the DBFS option is disabled in Databricks Community Edition. What are the other ways to use files in Databricks notebooks for learning? When I go to Catalog, it shows the default option only with AWS S3. Is it the only option to access the...

Latest Reply
gchandra
Databricks Employee
  • 1 kudos

It's fixed. You can continue to use Upload.

8 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group