Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

qwerty1
by Contributor
  • 5163 Views
  • 7 replies
  • 19 kudos

Resolved! When will a Databricks Runtime be released for Scala 2.13?

I see that Spark fully supports Scala 2.13, so I wonder why there is no Databricks Runtime with Scala 2.13 yet. Are there any plans to make this available? It would be super useful.

Latest Reply
guersam
New Contributor II
  • 19 kudos

I agree with @777. As Scala 3 is maturing and there are more real use cases for Scala 3 on Spark now, support for Scala 2.13 would be valuable to users, including us. I think the recent upgrade of the Databricks Runtime from JDK 8 to 17 was one of a ...

6 More Replies
Gary_Irick
by New Contributor III
  • 9292 Views
  • 9 replies
  • 10 kudos

Delta table partition directories when column mapping is enabled

I recently created a table on a cluster in Azure running Databricks Runtime 11.1. The table is partitioned by a "date" column. I enabled column mapping, like this: ALTER TABLE {schema}.{table_name} SET TBLPROPERTIES('delta.columnMapping.mode' = 'nam...

Latest Reply
talenik
New Contributor III
  • 10 kudos

Hi @Retired_mod, I have a few queries on directory names with column mapping. I have this Delta table on ADLS and I am trying to read it, but I am getting the error below. How can we read Delta tables with column mapping enabled using PySpark? Can you pleas...

8 More Replies
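A minimal sketch for the PySpark read question above, assuming the cluster runs a Delta reader at protocol version 2 or above (the level column mapping requires); the ADLS path is a placeholder:

```python
# Reading a column-mapping-enabled Delta table with PySpark.
# Requires Delta reader protocol version 2+; the path below is hypothetical.
df = (
    spark.read.format("delta")
    .load("abfss://container@account.dfs.core.windows.net/path/to/table")
)
df.printSchema()  # shows the logical (renamed) column names, not physical ones
df.show(5)
```

On an older reader the load fails with a protocol/version error rather than returning data, which may be the symptom the reply is hitting.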
amartinez
by New Contributor III
  • 4486 Views
  • 6 replies
  • 5 kudos

Workaround for GraphFrames not working on Delta Live Tables?

According to this page, the GraphFrames package has been included in the Databricks Runtime since at least 11.0. However, trying to run a connected-components algorithm inside a Delta Live Tables notebook yields the error java.lang.ClassNotFoundException: or...

Latest Reply
lprevost
Contributor
  • 5 kudos

I'm also trying to use GraphFrames inside a DLT pipeline. I get an error that GraphFrames is not installed on the cluster. I'm using it successfully in test notebooks on the ML version of the cluster. Is there a way to use this inside a DLT job?

5 More Replies
Mohit_m
by Valued Contributor II
  • 24607 Views
  • 3 replies
  • 4 kudos

Resolved! How to get the Job ID and Run ID and save into a database

We have a Databricks Job running a main class from a JAR file. Our JAR code base is in Scala. When our job starts running, we need to log the Job ID and Run ID into a database for future reference. How can we achieve this?

Latest Reply
Bruno-Castro
New Contributor II
  • 4 kudos

That article is for members only. Could we also describe how to do it here, for those who are not Medium members? Thanks!

2 More Replies
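Since the linked article is members-only, here is one common approach sketched in Python (the original job is a Scala JAR, but the idea carries over): configure the task's parameters with the built-in {{job_id}} and {{run_id}} parameter variables so the jobs service substitutes the real values at launch, then read them from the program arguments and persist them however you log runs.

```python
# Sketch: the job's parameters are set to ["{{job_id}}", "{{run_id}}"], which
# Databricks replaces with the actual IDs when the run starts.
import sys

def main() -> None:
    job_id, run_id = sys.argv[1], sys.argv[2]
    # Hypothetical sink: replace with a JDBC write or API call to your database.
    print(f"job_id={job_id} run_id={run_id}")

if __name__ == "__main__":
    main()
```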
Ajay-Pandey
by Esteemed Contributor III
  • 1823 Views
  • 2 replies
  • 7 kudos

docs.databricks.com

Rename and drop columns with Delta Lake column mapping. Hi all, Databricks now supports column rename and drop. Column mapping requires the following Delta protocol versions: reader version 2 or above, writer version 5 or above. Blog URL##Available in D...

Latest Reply
Poovarasan
New Contributor III
  • 7 kudos

The above-mentioned feature is not working in the DLT pipeline if the script has more than 4 columns.

1 More Replies
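A short sketch of the feature the post describes, issued through spark.sql (table and column names are hypothetical); per the post, column mapping needs reader version 2 or above and writer version 5 or above:

```python
# Enable column mapping, then rename and drop columns on the Delta table.
spark.sql("""
    ALTER TABLE my_schema.my_table SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion'   = '2',
        'delta.minWriterVersion'   = '5'
    )
""")
spark.sql("ALTER TABLE my_schema.my_table RENAME COLUMN old_name TO new_name")
spark.sql("ALTER TABLE my_schema.my_table DROP COLUMN obsolete_col")
```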
ranged_coop
by Valued Contributor II
  • 18554 Views
  • 22 replies
  • 28 kudos

How to install the Chromium browser and Chrome driver on Databricks Runtime 10.4 and above?

Hi Team, we are wondering if there is a recommended way to install the Chromium browser and Chrome driver on Databricks Runtime 10.4 and above? I have been through the site and have come across several links to this effect, but they all seem to be ins...

Latest Reply
Kaizen
Valued Contributor
  • 28 kudos

Look into Playwright instead of Selenium. I went through the same process y'all went through here (I ended up writing an init script to install the drivers, etc.). This is all done for you in Playwright. Refer to this post, I hope it helps! https://communit...

21 More Replies
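A minimal Playwright sketch along the lines of the reply above, assuming the packages were set up first (e.g. `%pip install playwright` followed by `playwright install chromium`, which downloads a managed Chromium build, so no separate driver install is needed):

```python
# Headless Chromium via Playwright; no chromedriver management required.
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()
```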
azera
by New Contributor II
  • 1866 Views
  • 2 replies
  • 2 kudos

Stream-stream window join after time window aggregation not working in 13.1

Hey, I'm trying to perform time-window aggregation in two different streams followed by the stream-stream window join described here. I'm running Databricks Runtime 13.1, exactly as advised. However, when I'm reproducing the following code: clicksWindow = c...

Latest Reply
Happyfield7
New Contributor II
  • 2 kudos

Hey, I'm currently facing the same problem, so I would like to know if you've made any progress in resolving this issue.

1 More Replies
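For reference, roughly the pattern the post is reproducing (time-window aggregation on two streams, then a stream-stream join on the window column); clicks and impressions stand in for the post's streaming DataFrames:

```python
# Aggregate each stream into 1-hour windows, then join the windowed results.
from pyspark.sql.functions import window

clicks_window = (
    clicks.withWatermark("clickTime", "1 hour")
    .groupBy(window("clickTime", "1 hour"))
    .count()
)
impressions_window = (
    impressions.withWatermark("impressionTime", "1 hour")
    .groupBy(window("impressionTime", "1 hour"))
    .count()
)
joined = clicks_window.join(impressions_window, "window", "inner")
```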
sevvalmehder
by New Contributor II
  • 2330 Views
  • 3 replies
  • 3 kudos

Databricks Runtime 12.2 LTS drop function problem

I am getting an error with PySpark's `drop` function on a cluster using 12.2 LTS. When I check the error, I see that Spark fixed that bug; see SPARK-42444. Also, when I check the maintenance updates page, I see this fix included in the Databricks R...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Sevval Mehder, elevate our community by acknowledging exceptional contributions. Your participation in marking the best answers is a testament to our collective pursuit of knowledge.

2 More Replies
darkraisisi
by New Contributor
  • 908 Views
  • 0 replies
  • 0 kudos

Is there a way to manually update the required CUDA files in the DB runtime? There are some rather annoying bugs still in TF 2.11 that have been fixed ...

Is there a way to manually update the required CUDA files in the DB runtime? There are some rather annoying bugs still in TF 2.11 that have been fixed in TF 2.12. Sadly, the latest DB Runtime 13.1 (beta) only supports the older TF 2.11, even though 2.12 was ...

grazie
by Contributor
  • 2390 Views
  • 2 replies
  • 2 kudos

How to get dbutils in Runtime 13

We're using the following method (generated by dbx) to access dbutils, e.g. to retrieve parameters from secret scopes: @staticmethod def _get_dbutils(spark: SparkSession) -> "dbutils": try: from pyspark.dbutils import...

Latest Reply
colt
New Contributor III
  • 2 kudos

We have something similar in our code. It worked on Runtime 13 until last week. The Machine Learning DBR doesn't work either.

1 More Replies
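The accessor the post quotes is commonly written like the following sketch (assumptions: pyspark.dbutils is present on Databricks clusters, and notebooks already have dbutils injected into the IPython user namespace):

```python
# Return a usable dbutils handle whether code runs as a job or in a notebook.
from pyspark.sql import SparkSession

def get_dbutils(spark: SparkSession):
    try:
        from pyspark.dbutils import DBUtils  # available on Databricks clusters
        return DBUtils(spark)
    except ImportError:
        import IPython  # notebook path: dbutils already exists in the session
        return IPython.get_ipython().user_ns["dbutils"]
```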
Thanapat_S
by Contributor
  • 2916 Views
  • 2 replies
  • 0 kudos

Resolved! Table access control is deprecated in Databricks Runtime for Machine Learning

After reviewing these deprecations, I discovered that Table Access Control is not supported in the Databricks Runtime for Machine Learning. I want to understand why table access control is not designed for the ML runtime. Is there a reason behind this?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Thanapat Sontayasara, Table Access Control (TAC) is a feature in Databricks that allows you to restrict access to specific tables in your workspace based on user or group identity. According to the Databricks documentation, TAC is not supported in th...

1 More Replies
AyushModi038
by New Contributor III
  • 16485 Views
  • 2 replies
  • 1 kudos

Resolved! Upgrade Python version in cluster

Currently I am using the following cluster. It is using the default Python version of 3.9.5, and I would like to update it to 3.10.1.0. How can I achieve this?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ayush Modi, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

1 More Replies
Data_Engineer3
by Contributor III
  • 11107 Views
  • 4 replies
  • 5 kudos

How can I use the same Spark session from one notebook in another notebook in Databricks?

I want to use the same Spark session that was created in one notebook in another notebook in the same environment. For example, if an object (variable) is initialized in the first notebook, I need to use the same object in t...

Latest Reply
Manoj12421
Valued Contributor II
  • 5 kudos

You can use %run followed by the location of the notebook: %run "/folder/notebookname"

3 More Replies
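A sketch contrasting the two sharing mechanisms relevant to this thread (the notebook path is the reply's placeholder): %run executes the target notebook in the same interpreter, so its variables and objects land in the caller's scope, whereas dbutils.notebook.run() starts a separate run and can only hand back a string.

```python
# %run /folder/notebookname   <- must sit alone in its own cell; afterwards,
#                                objects defined in that notebook are in scope.

# dbutils.notebook.run(), by contrast, launches an independent run and returns
# only the string passed to dbutils.notebook.exit() in the child notebook:
result = dbutils.notebook.run("/folder/notebookname", 600)  # 600 s timeout
print(result)
```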
Rahul2025
by New Contributor III
  • 3652 Views
  • 4 replies
  • 4 kudos

Make environment variables defined in init script available to Spark JVM job?

Hi, we're using Databricks Runtime 11.3 LTS and executing a Spark Java job on a job cluster. To automate the execution of this job, we need to define (source in from bash config files) some environment variables through an init script (clust...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Rahul K, hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
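One commonly documented route, sketched under the assumption that the variables are set in the cluster's Advanced options > Spark > Environment variables (spark_env_vars) rather than only exported inside the init script: values defined there are exported to the driver and worker processes, so the Spark JVM can read them with System.getenv and Python with os.environ. MY_APP_CONFIG is a hypothetical name.

```python
# Check that a cluster-level environment variable is visible at runtime.
import os

print(os.environ.get("MY_APP_CONFIG", "<unset>"))
```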
thushar
by Contributor
  • 8711 Views
  • 5 replies
  • 0 kudos

Optimize & Compaction

Hi, which Databricks Runtime versions support OPTIMIZE and compaction?

Latest Reply
Joe_Suarez
New Contributor III
  • 0 kudos

OPTIMIZE and compaction are operations commonly used with Apache Spark to improve the performance of data storage and processing. Databricks, a cloud-based platform for Apache Spark, provides support for these operations on v...

4 More Replies
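A minimal sketch of compaction on a Delta table via OPTIMIZE (the table and column names are hypothetical; check the release notes for your runtime for exact availability):

```python
# Compact small files; optionally co-locate related data with ZORDER BY.
spark.sql("OPTIMIZE my_schema.my_table")
spark.sql("OPTIMIZE my_schema.my_table ZORDER BY (event_date)")
```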