Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

hari-prasad
by Valued Contributor II
  • 780 Views
  • 0 replies
  • 2 kudos

Databricks UniForm - Bridging Delta Lake and Iceberg

Databricks UniForm enables seamless integration between Delta Lake and Iceberg formats. Key features of Databricks UniForm include: Interoperability: read Delta tables with Iceberg clients without rewriting data. Automatic Metadata Generation: Asynchrono...

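The feature list above can be made concrete. As a hedged sketch (the three-level table name is a made-up placeholder, and the exact property set should be verified against the UniForm documentation), exposing an existing Delta table to Iceberg clients is done through table properties:

```sql
-- Illustrative only: also expose a Delta table in Iceberg format.
ALTER TABLE main.sales.orders SET TBLPROPERTIES (
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

Iceberg metadata is then generated asynchronously alongside the Delta commits, so Iceberg clients read the same underlying Parquet files without any data rewrite.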
martindlarsson
by New Contributor III
  • 2411 Views
  • 2 replies
  • 0 kudos

Jobs indefinitely pending with libraries install

I think I found a bug where you get Pending indefinitely on jobs that have a library requirement when the user of the job does not have Manage permission on the cluster. In my case I was trying to start a dbt job with dbt-databricks=1.8.5 as a library. Th...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Thanks for your feedback! Just checking: is this still an issue for you? Could you share more details, for example so that I could reproduce it?

1 More Replies
ls
by New Contributor III
  • 4432 Views
  • 10 replies
  • 0 kudos

Resolved! Py4JJavaError: An error occurred while calling o552.count()

Hey! I'm new to the forums but not to Databricks, and I'm trying to get some help with this question. The error is also fickle, since it appears at what seems to be random: running the same code works, then on the next run with a new set of dat...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@ls Agree, it doesn't seem to be fixed. Maybe memory management is better optimized on DBR 16; hence I'd like to suggest going through the methods mentioned earlier in this post. Memory Profiling: try freezing the dataset that reproduces the problem...

9 More Replies
Alby091
by New Contributor
  • 704 Views
  • 1 replies
  • 0 kudos

Multiple schedules in workflow with different parameters

I have a notebook that takes a file from the landing zone, processes it, and saves a Delta table. This notebook contains a parameter (time_prm) that selects among the different versions of files that arrive every day. Specifically, for eac...

Data Engineering
parameters
Workflows
Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Right now jobs support only one schedule per job. As you mentioned, you will need to create a different job for each schedule you require; you can use the clone capability to facilitate the process.

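The one-job-per-schedule approach above can be scripted rather than cloned by hand. A minimal sketch, assuming the Jobs API 2.1 payload shape; the job names, notebook path, and cron expressions are hypothetical, not taken from the thread:

```python
# Sketch: one job per schedule, each passing a different time_prm value.
# Notebook path and cron expressions are illustrative placeholders.

def job_payload(name: str, cron: str, time_prm: str) -> dict:
    """Build a Jobs API 2.1-style create payload with its own schedule."""
    return {
        "name": name,
        "schedule": {
            "quartz_cron_expression": cron,  # Quartz syntax, as the Jobs UI uses
            "timezone_id": "UTC",
        },
        "tasks": [
            {
                "task_key": "process_file",
                "notebook_task": {
                    "notebook_path": "/Repos/etl/process_landing_file",  # hypothetical
                    "base_parameters": {"time_prm": time_prm},
                },
            }
        ],
    }

jobs = [
    job_payload("ingest-morning", "0 0 6 * * ?", "morning"),
    job_payload("ingest-evening", "0 0 18 * * ?", "evening"),
]
```

Each payload would then be POSTed to the workspace's `/api/2.1/jobs/create` endpoint, yielding one job per schedule, each with its own time_prm value.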
Maatari
by New Contributor III
  • 1079 Views
  • 1 replies
  • 0 kudos

How does Dedicated Access mode work?

I have a question about the dedicated access mode: https://docs.databricks.com/en/compute/group-access.html It is stated that: "Dedicated access mode is the latest version of single user access mode. With dedicated access, a compute resource can be assig...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The dedicated access mode allows multiple users within the assigned group to access and use the compute resource simultaneously. This is different from the traditional single-user access mode, as it enables secure sharing of the resource among group ...

ashraf1395
by Honored Contributor
  • 1376 Views
  • 1 replies
  • 1 kudos

Resolved! referencing external locations in python notebooks

How can I reference external locations in a Python notebook? I found the docs for referencing them: https://docs.databricks.com/en/sql/language-manual/sql-ref-external-locations.html. But how to do it in Python? I am not able to understand. Do we ...

Latest Reply
fmadeiro
Contributor II
  • 1 kudos

@ashraf1395 ,Referencing external locations in a Databricks Python notebook, particularly for environments like Azure DevOps with different paths for development (dev) and production (prod), can be effectively managed using parameterized variables. H...

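The parameterized-variables idea from the reply can be sketched in plain Python. The storage accounts, container, and paths below are hypothetical placeholders, not the poster's real locations:

```python
# Sketch: resolve external-location paths per environment (dev vs prod),
# so the same notebook code runs against either storage account.

EXTERNAL_LOCATIONS = {
    "dev":  "abfss://landing@devstorageacct.dfs.core.windows.net/raw",
    "prod": "abfss://landing@prodstorageacct.dfs.core.windows.net/raw",
}

def external_path(env: str, relative: str) -> str:
    """Join a relative file path onto the environment's external location."""
    base = EXTERNAL_LOCATIONS[env]
    return f"{base}/{relative.lstrip('/')}"

path = external_path("dev", "sales/2024/01/orders.parquet")
# In a notebook you would then read it with Spark, e.g.:
# df = spark.read.parquet(path)
```

The env value itself can come from a job parameter or widget, so dev and prod runs differ only in configuration, not code.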
Avinash_Narala
by Valued Contributor II
  • 398 Views
  • 1 replies
  • 1 kudos

Resolved! which type of cluster to use

Hi, recently I had some logic to collect the dataframe and process it row by row. I am using a 128GB driver node, but it is taking significantly more time than expected (like 2 hours for just 700 rows of data). May I know which type of cluster I should use, and the driver...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @Avinash_Narala , Good Day!  For right-sizing the cluster, the recommended approach is a hybrid approach for node provisioning in the cluster along with autoscaling. This involves defining the number of on-demand instances and spot instances for t...

Michael_Galli
by Contributor III
  • 2691 Views
  • 5 replies
  • 1 kudos

Resolved! Importing data into Excel from Databricks over ODBC OAuth / Simba Spark Driver

Hi all, I am referring to this article: Connect to Azure Databricks from Microsoft Excel - Azure Databricks | Microsoft Learn. I use the latest SimbaSparkODBC-2.8.2.1013-Windows-64bit driver and configured it as in that documentation. In Databricks I use ...

Latest Reply
Aydin
New Contributor II
  • 1 kudos

Hi @Michael_Galli, we're currently experiencing the same issue. I've just asked our internal support team to raise a ticket with Microsoft but thought it would be worth reaching out to you. Have you had any luck resolving this issue?

4 More Replies
sgannavaram
by New Contributor III
  • 3511 Views
  • 3 replies
  • 1 kudos

How to connect to IBM MQ from Databricks notebook?

We are trying to connect to IBM MQ and post a message to MQ, which is eventually consumed by a mainframe application. Which IBM MQ client .jars/libraries need to be installed on the cluster? If you have any sample code for connectivity, that would be helpful.

Latest Reply
none_ranjeet
New Contributor III
  • 1 kudos

Were you able to make this connection other than via the REST API, which has problems reading binary messages? Please suggest.

2 More Replies
aliacovella
by Contributor
  • 839 Views
  • 2 replies
  • 2 kudos

Resolved! DLT Vs Notebook runs

I have this behavior that I'm not understanding. I have a notebook that defines a DLT from a Kinesis stream and a view from that DLT. This works when I run it from within a workflow configured using the DLT pipeline. If, however, I create a workflow a...

Latest Reply
hari-prasad
Valued Contributor II
  • 2 kudos

Hi @aliacovella, DLT notebooks/code only work with DLT pipelines, and regular Spark or SQL notebooks work with workflows.

1 More Replies
johnnwanosike
by New Contributor III
  • 457 Views
  • 2 replies
  • 0 kudos

Unable to connect internal hive metastore

I am unable to find the correct password for the internal Hive metastore I created. The protocol used was JDBC. What is the best way to connect to it? Additionally, I want to connect to an external Hive metastore as well.

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @johnnwanosike, how have you created the metastore? Have you followed any documentation? About the external Hive metastore, you can refer to: https://docs.databricks.com/ja/archive/external-metastores/external-hive-metastore.html

1 More Replies
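For the external Hive metastore half of the question, the linked doc drives the connection through cluster Spark config. A hedged sketch only: the host, database name, driver class, metastore version, and secret reference below are placeholders, and the exact keys and supported versions should be verified against the linked page:

```ini
# Illustrative cluster Spark config for an external (MySQL-backed) Hive metastore
spark.sql.hive.metastore.version 2.3.9
spark.sql.hive.metastore.jars builtin
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://metastore-host:3306/metastore_db
spark.hadoop.javax.jdo.option.ConnectionDriverName com.mysql.cj.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName metastore_user
spark.hadoop.javax.jdo.option.ConnectionPassword {{secrets/my-scope/metastore-password}}
```

Keeping the password in a secret scope (referenced with the `{{secrets/...}}` syntax) also sidesteps the lost-password problem for the internal metastore case.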
sahil_s_jain
by New Contributor III
  • 1408 Views
  • 6 replies
  • 1 kudos

Issue: NoSuchMethodError in Spark Job While Upgrading to Databricks 15.5 LTS

Problem Description: I am attempting to upgrade my application from Databricks runtime version 12.2 LTS to 15.5 LTS. During this upgrade, my Spark job fails with the following error: java.lang.NoSuchMethodError: org.apache.spark.scheduler.SparkListenerA...

Latest Reply
DBonomo
New Contributor II
  • 1 kudos

No, I am currently downgrading to an older DBR (13.3) and running these jobs specifically on that version. That brings its own suite of problems though.

5 More Replies
PabloCSD
by Valued Contributor II
  • 2335 Views
  • 4 replies
  • 1 kudos

Resolved! How to connect via JDBC to SAP-HANA in a Databricks Notebook?

I have a set of connection credentials for SAP-HANA; how can I retrieve data from that location using JDBC? I have already installed ngdbc.jar (for the driver) on my cluster, but this simple query has already taken more than 5 minutes and I don't ...

Latest Reply
PabloCSD
Valued Contributor II
  • 1 kudos

It worked after changing the port to 30041, the port for the next tenant (reference: https://community.sap.com/t5/technology-q-a/hana-connectivity-and-ports/qaq-p/12193927).
jdbcQuery = '(SELECT * FROM DUMMY)'
df_sap_hana_dummy_table = (spark.read
    .form...

3 More Replies
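A fuller version of the truncated snippet above, as a hedged sketch: the host and credentials are invented placeholders, and only the 30041 tenant port comes from the thread. It assumes ngdbc.jar is installed on the cluster so the com.sap.db.jdbc.Driver class is available:

```python
# Sketch of a Spark JDBC read against SAP HANA using the ngdbc driver.
# Host, user, and password are placeholders; 30041 is the tenant SQL port
# that resolved the thread's connectivity issue.

jdbc_url = "jdbc:sap://hana-host.example.com:30041"  # hypothetical host
jdbc_query = "(SELECT * FROM DUMMY) AS t"            # pushed down as a subquery

options = {
    "url": jdbc_url,
    "driver": "com.sap.db.jdbc.Driver",  # class provided by ngdbc.jar
    "dbtable": jdbc_query,
    "user": "HANA_USER",                 # placeholder credential
    "password": "HANA_PASSWORD",         # placeholder credential
}

# In a Databricks notebook with the driver jar on the cluster:
# df = spark.read.format("jdbc").options(**options).load()
# df.display()
```

Wrapping the query in parentheses with an alias lets Spark treat it as a pushed-down subquery via the `dbtable` option.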
jeremy98
by Honored Contributor
  • 267 Views
  • 1 replies
  • 0 kudos

Dynamic scheduling again and again

Hi Community, is it possible to dynamically schedule a Databricks job definition, as you can with Airflow DAGs? If not, what could be a way to handle it?

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, Databricks does not natively support dynamic scheduling of job definitions in the same way that Apache Airflow does with its Directed Acyclic Graphs (DAGs). However, there are ways to achieve similar functionality using Databricks Jobs:...

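One workaround in the spirit of the reply above is to update a job's schedule from code. A minimal sketch, assuming the Jobs API 2.1 `/jobs/update` payload shape; the job ID and cron expression are invented for illustration:

```python
# Hypothetical helper: build the body for POST /api/2.1/jobs/update
# that swaps in a new cron schedule (Quartz syntax).

def update_schedule_payload(job_id: int, cron: str, timezone: str = "UTC") -> dict:
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,
                "timezone_id": timezone,
            }
        },
    }

payload = update_schedule_payload(1234, "0 30 7 * * ?")
# A caller (another job, or an external scheduler) would then POST it:
# requests.post(f"{workspace_url}/api/2.1/jobs/update",
#               headers={"Authorization": f"Bearer {token}"},
#               json=payload)
```

Run from a small "scheduler" job, this approximates Airflow-style dynamic scheduling by recomputing the cron expression before each update.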
KristiLogos
by Contributor
  • 984 Views
  • 6 replies
  • 0 kudos

Resolved! Connection from BigQuery to Databricks populating dictionary keys as "v"

I was able to connect our BigQuery account to our Databricks catalog. However, all the keys in the nested dictionary columns populate as 'v'. For example: {"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":nu...

Latest Reply
KristiLogos
Contributor
  • 0 kudos

@szymon_dybczak I couldn't run select TO_JSON_STRING(event_params) as event_params FROM ... I don't think that's built into Databricks. Is there another way you've had success? Error: [UNRESOLVED_ROUTINE] Cannot resolve routine `TO_JSON_STRING` on search...

5 More Replies
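The "v" keys in the excerpt are BigQuery's internal row encoding, where every value is wrapped as {"v": ...} and every struct's fields as {"f": [...]}. Once such a column is pulled into Python, the wrappers can be stripped with a small recursive helper; the sample value below is a simplified, made-up fragment, not the poster's actual data:

```python
# Sketch: flatten BigQuery's internal 'v'/'f' row encoding into plain values.
# Field names are lost in this encoding, so structs come back as positional lists.

def unwrap(node):
    """Recursively strip the 'v'/'f' wrappers from a BigQuery-encoded value."""
    if isinstance(node, dict):
        if "v" in node:
            return unwrap(node["v"])
        if "f" in node:
            return [unwrap(field) for field in node["f"]]
    if isinstance(node, list):
        return [unwrap(item) for item in node]
    return node

sample = {"v": [{"v": {"f": [{"v": "engagement_time_msec"}, {"v": "2"}]}}]}
print(unwrap(sample))  # → [['engagement_time_msec', '2']]
```

Because the encoding is positional, recovering the original field names requires joining the result back against the table schema.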
