Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Avinash_Narala
by Contributor III
  • 101 Views
  • 1 reply
  • 1 kudos

Resolved! which type of cluster to use

Hi, recently I had some logic that collects the dataframe and processes it row by row. I am using a 128 GB driver node, but it is taking significantly longer than expected (around 2 hours for just 700 rows of data). May I know which type of cluster I should use, and the driver...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @Avinash_Narala, good day! For right-sizing the cluster, the recommended approach is hybrid node provisioning combined with autoscaling. This involves defining the number of on-demand instances and spot instances for t...

  • 1 kudos
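The hybrid provisioning described in the reply can be sketched as a cluster spec for the Clusters API. This is a minimal sketch assuming Azure; the node type, worker counts, and spot settings are illustrative placeholders, not recommendations. (Separately, collecting a DataFrame to the driver and looping row by row is usually the real bottleneck here, regardless of cluster size.)

```python
# Minimal sketch of a hybrid on-demand/spot cluster spec with autoscaling,
# in the shape the Clusters API accepts. All values are illustrative assumptions.
cluster_spec = {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "Standard_DS4_v2",  # placeholder; size to the workload
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "azure_attributes": {
        "first_on_demand": 2,  # driver + first worker stay on-demand
        "availability": "SPOT_WITH_FALLBACK_AZURE",  # spot, falling back to on-demand
    },
}
```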
Michael_Galli
by Contributor III
  • 1416 Views
  • 5 replies
  • 1 kudos

Resolved! Importing data into Excel from Databricks over ODBC OAuth / Simba Spark Driver

Hi all, I am referring to this article: Connect to Azure Databricks from Microsoft Excel - Azure Databricks | Microsoft Learn. I use the latest SimbaSparkODBC-2.8.2.1013-Windows-64bit driver and configured it as in that documentation. In Databricks I use ...

Latest Reply
Aydin
New Contributor II
  • 1 kudos

Hi @Michael_Galli, we're currently experiencing the same issue. I've just asked our internal support team to raise a ticket with Microsoft but thought it would be worth reaching out to you. Have you had any luck resolving this issue?

  • 1 kudos
4 More Replies
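For reference, an OAuth-based DSN for the Simba Spark ODBC driver typically looks like the sketch below (odbc.ini style); `AuthMech=11` selects OAuth 2.0 and `Auth_Flow=0` is token pass-through. The host and HTTP path are placeholders.

```ini
[Databricks]
Driver=Simba Spark ODBC Driver
Host=adb-0000000000000000.0.azuredatabricks.net   ; placeholder workspace host
Port=443
HTTPPath=/sql/1.0/warehouses/<warehouse-id>       ; placeholder
SSL=1
ThriftTransport=2
AuthMech=11          ; OAuth 2.0
Auth_Flow=0          ; token pass-through
Auth_AccessToken=<Entra ID access token>
```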
sgannavaram
by New Contributor III
  • 2800 Views
  • 3 replies
  • 1 kudos

How to connect to IBM MQ from Databricks notebook?

We are trying to connect to IBM MQ and post a message to MQ, which is eventually consumed by a mainframe application. Which IBM MQ client .jars / libraries need to be installed on the cluster? If you have any sample code for connectivity, that would be helpful.

Latest Reply
none_ranjeet
New Contributor III
  • 1 kudos

Were you able to make this connection by any means other than the REST API, which has problems reading binary messages? Please suggest.

  • 1 kudos
2 More Replies
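As a starting point, a common route is the pymqi client together with the IBM MQ client libraries installed on the cluster. The sketch below only builds the connection string pymqi expects; the queue-manager, channel, and queue names in the commented usage are hypothetical placeholders.

```python
def build_conn_info(host: str, port: int) -> str:
    """IBM MQ connection info in the HOST(PORT) form the pymqi client expects."""
    return "%s(%d)" % (host, port)

# Hypothetical usage on a cluster with pymqi and the MQ client installed:
# import pymqi
# qmgr = pymqi.connect("QM1", "DEV.APP.SVRCONN",
#                      build_conn_info("mq.example.com", 1414))
# queue = pymqi.Queue(qmgr, "DEV.QUEUE.1")
# queue.put(b"hello from Databricks")  # message later consumed by the mainframe
# queue.close()
# qmgr.disconnect()
```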
aliacovella
by New Contributor III
  • 416 Views
  • 2 replies
  • 2 kudos

Resolved! DLT Vs Notebook runs

I have this behavior that I'm not understanding. I have a notebook that defines a DLT from a Kinesis stream and a view from that DLT. This works when I run it from within a workflow configured using the DLT pipeline. If, however, I create a workflow a...

Latest Reply
hari-prasad
Valued Contributor
  • 2 kudos

Hi @aliacovella, DLT notebooks/code only work with DLT pipelines, and regular Spark or SQL notebooks work with workflows.

  • 2 kudos
1 More Replies
johnnwanosike
by New Contributor II
  • 140 Views
  • 2 replies
  • 0 kudos

Unable to connect internal hive metastore

I am unable to find the correct password for the internal Hive metastore I created. The protocol used was JDBC. What is the best way to connect to it? Additionally, I want to connect to an external Hive metastore as well.

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @johnnwanosike, how did you create the metastore? Did you follow any documentation? For an external Hive metastore, you can refer to: https://docs.databricks.com/ja/archive/external-metastores/external-hive-metastore.html

  • 0 kudos
1 More Replies
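Per the linked docs, an external Hive metastore is wired up through cluster Spark configuration along the lines of the sketch below; the metastore version, JDBC URL, driver, and credentials are placeholders for your own metastore database.

```
spark.sql.hive.metastore.version 2.3.9
spark.sql.hive.metastore.jars builtin
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<metastore-host>:3306/metastore
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionUserName <user>
spark.hadoop.javax.jdo.option.ConnectionPassword <password>
```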
shubhamM
by New Contributor II
  • 289 Views
  • 3 replies
  • 2 kudos

Resolved! Databricks File Trigger Limit

For Databricks File Trigger below limitation is mentioned.A storage location configured for a file arrival trigger can contain only up to 10,000 files. Locations with more files cannot be monitored for new file arrivals. If the configured storage loc...

Latest Reply
Walter_C
Databricks Employee
  • 2 kudos

Your approach to managing the number of BLOBs in your Azure BLOB storage by moving older files to an archive directory is reasonable and can help ensure you do not exceed the 10,000 file limit in the monitored directories. This method will help keep ...

  • 2 kudos
2 More Replies
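The archiving approach can be sketched as below: pick the oldest files to move out whenever the monitored location approaches the 10,000-file limit. `dbutils.fs.mv` in the commented usage would perform the move on Databricks; the paths are placeholders.

```python
def select_for_archive(files, limit=10_000):
    """files: list of (path, modification_time) tuples for the monitored dir.
    Returns the oldest paths to move out so at most `limit` files remain."""
    if len(files) <= limit:
        return []
    oldest_first = sorted(files, key=lambda f: f[1])
    return [path for path, _ in oldest_first[: len(files) - limit]]

# Hypothetical usage in a scheduled notebook:
# listing = [(f.path, f.modificationTime) for f in dbutils.fs.ls("/mnt/landing/")]
# for path in select_for_archive(listing):
#     dbutils.fs.mv(path, path.replace("/landing/", "/archive/"))
```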
sahil_s_jain
by New Contributor III
  • 360 Views
  • 6 replies
  • 1 kudos

Issue: NoSuchMethodError in Spark Job While Upgrading to Databricks 15.5 LTS

Problem DescriptionI am attempting to upgrade my application from Databricks runtime version 12.2 LTS to 15.5 LTS. During this upgrade, my Spark job fails with the following error:java.lang.NoSuchMethodError: org.apache.spark.scheduler.SparkListenerA...

Latest Reply
DBonomo
New Contributor II
  • 1 kudos

No, I am currently downgrading to an older DBR (13.3) and running these jobs specifically on that version. That brings its own set of problems, though.

  • 1 kudos
5 More Replies
PabloCSD
by Valued Contributor
  • 279 Views
  • 4 replies
  • 1 kudos

Resolved! How to connect via JDBC to SAP-HANA in a Databricks Notebook?

I have a set of connection credentials for SAP HANA; how can I retrieve data from that location using JDBC? I have already installed the ngdbc.jar driver on my cluster, but this simple query has already taken more than 5 minutes and I don't ...

Latest Reply
PabloCSD
Valued Contributor
  • 1 kudos

It worked after changing the port to 30041, the port for the next tenant (reference: https://community.sap.com/t5/technology-q-a/hana-connectivity-and-ports/qaq-p/12193927 ). jdbcQuery = '(SELECT * FROM DUMMY)' df_sap_hana_dummy_table = (spark.read .form...

  • 1 kudos
3 More Replies
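The resolved snippet above is truncated, so here is a hedged sketch of the JDBC read it describes. The host, the tenant port (30041, per the fix), the credentials, and the subquery alias are illustrative assumptions.

```python
def hana_jdbc_options(host, port, user, password, query):
    """Build Spark JDBC options for SAP HANA via the ngdbc driver."""
    return {
        "url": f"jdbc:sap://{host}:{port}/",
        "driver": "com.sap.db.jdbc.Driver",
        "user": user,
        "password": password,
        "dbtable": f"({query}) AS q",  # push the query down as a subquery
    }

# Hypothetical usage (requires ngdbc.jar installed on the cluster):
# df = (spark.read.format("jdbc")
#       .options(**hana_jdbc_options("hana.example.com", 30041,
#                                    "USER", "PASSWORD", "SELECT * FROM DUMMY"))
#       .load())
```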
jeremy98
by Contributor
  • 78 Views
  • 1 reply
  • 0 kudos

Dynamic scheduling again and again

Hi Community, is it possible to dynamically schedule a Databricks job definition, as is possible with Airflow DAGs? If not, what would be a way to handle it?

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, Databricks does not natively support dynamic scheduling of job definitions in the same way that Apache Airflow does with its Directed Acyclic Graphs (DAGs). However, there are ways to achieve similar functionality using Databricks Jobs:...

  • 0 kudos
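One way to approximate Airflow-style dynamic scheduling is to rewrite a job's cron schedule through the Jobs API 2.1 (`jobs/update`). The sketch below only builds the request payload; the job ID, cron expression, host, and token in the commented call are placeholders.

```python
def build_schedule_update(job_id: int, cron: str, tz: str = "UTC") -> dict:
    """Payload for POST /api/2.1/jobs/update that swaps a job's schedule."""
    return {
        "job_id": job_id,
        "new_settings": {
            "schedule": {
                "quartz_cron_expression": cron,  # Quartz syntax, e.g. "0 0 6 * * ?"
                "timezone_id": tz,
                "pause_status": "UNPAUSED",
            }
        },
    }

# Hypothetical usage:
# import requests
# requests.post(f"https://{host}/api/2.1/jobs/update",
#               headers={"Authorization": f"Bearer {token}"},
#               json=build_schedule_update(123, "0 0 6 * * ?"))
```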
KristiLogos
by New Contributor III
  • 299 Views
  • 6 replies
  • 0 kudos

Resolved! Connection from BigQuery to Databricks populating dictionary keys as "v"

I was able to connect our BigQuery account to our Databricks catalog. However, all the keys in the nested dictionary columns populate as 'v'. For example: {"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":nu...

Latest Reply
KristiLogos
New Contributor III
  • 0 kudos

@szymon_dybczak I couldn't run select TO_JSON_STRING(event_params) as event_params FROM ... I don't think that's built into Databricks. Is there another way you've had success? Error: [UNRESOLVED_ROUTINE] Cannot resolve routine `TO_JSON_STRING` on search...

  • 0 kudos
5 More Replies
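For reference, the Databricks SQL counterpart of BigQuery's TO_JSON_STRING is `to_json`. A hedged sketch, with a placeholder table name:

```sql
-- to_json serializes a STRUCT/MAP/ARRAY column to a JSON string.
SELECT to_json(event_params) AS event_params_json
FROM my_catalog.my_schema.events;  -- placeholder table
```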
LGABI
by New Contributor
  • 123 Views
  • 2 replies
  • 0 kudos

How to connect to Tableau Server FROM within Databricks Notebooks and publish data to Tableau Serv?

My company is having trouble connecting Databricks to Tableau Server. We need to be able to publish Hyper Files that are developed using Python on Databricks Notebooks to our Tableau Server, but it seems impossible to get a connection established des...

Latest Reply
pgo
New Contributor II
  • 0 kudos

Please use the netcat command to test the connection.

  • 0 kudos
1 More Replies
sahil_s_jain
by New Contributor III
  • 176 Views
  • 2 replies
  • 0 kudos

How to Exclude or Overwrite Specific JARs in Databricks Jars

Spark Version in Databricks 15.5 LTS: The runtime includes Apache Spark 3.5.x, which defines the SparkListenerApplicationEnd constructor as:public SparkListenerApplicationEnd(long time)This constructor takes a single long parameter.Conflicting Spark ...

Latest Reply
sahil_s_jain
New Contributor III
  • 0 kudos

I am creating an uber jar of my application with Spark 3.5.0 dependencies and using jar submit on the cluster for execution. As the Spark libraries from the above jar "----ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar" are getting loaded...

  • 0 kudos
1 More Replies
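One common fix is to stop bundling Spark in the uber jar at all, so the runtime's own (patched) classes are the only ones on the classpath. A Maven sketch, assuming the application declares Spark directly:

```xml
<!-- Mark Spark as provided so the shade/assembly step excludes it from the
     uber jar; DBR 15.5 LTS already supplies its own Spark 3.5 classes. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.12</artifactId>
  <version>3.5.0</version>
  <scope>provided</scope>
</dependency>
```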
jeremy98
by Contributor
  • 112 Views
  • 1 reply
  • 0 kudos

Resolved! Is there a INTERVAL data type?

Hi community, I was using a column in PostgreSQL that is a DATETIME/TIMEDELTA; is it possible to have the same data type in Databricks?

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, You can use the TIMESTAMP and TIMESTAMP_NTZ data types to handle date and time values, similar to the DATETIME type in PostgreSQL. However, Databricks does not have a direct equivalent to PostgreSQL's TIMEDELTA type https://docs.databri...

  • 0 kudos
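Worth noting: recent Databricks runtimes (on Spark 3.2 and later) do expose ANSI INTERVAL types, which behave much like a timedelta. A hedged sketch:

```sql
-- Day-time interval literal, and timestamp arithmetic producing an interval.
SELECT INTERVAL '1 02:30:00' DAY TO SECOND AS td;
SELECT TIMESTAMP'2024-01-02 00:00:00' - TIMESTAMP'2024-01-01 21:30:00' AS diff;
```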
pavan_yndpl
by New Contributor
  • 105 Views
  • 1 reply
  • 0 kudos

How to resolve SSL_connect error when VPN is enabled

I am trying to connect to Databricks using the ODBC protocol with a Simba Driver DSN. I can successfully connect and access the data when our corporate VPN is turned OFF, but when it is turned ON I get the following error: "[Simba][ThriftExte...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The error you are encountering, " (14) Unexpected response from server during a HTTP connection: SSL_connect:", when trying to connect to Databricks using the ODBC protocol with the Simba Driver DSN while the corporate VPN is turned on, is likely rel...

  • 0 kudos
raghu2
by New Contributor III
  • 149 Views
  • 2 replies
  • 1 kudos

Liquid Cluster enabled table - concurrent writes

I am trying to insert rows into a liquid-clustering-enabled Delta table using multiple threads. This link states that liquid clustering is used for tables with concurrent write requirements. I get this error: [DELTA_CONCURRENT_APPEND] ConcurrentAppend...

Latest Reply
TejeshS
New Contributor II
  • 1 kudos

We encountered a similar issue as well, and the workaround we tried was partitioning those columns, as Liquid clustering can sometimes trigger this error.

  • 1 kudos
1 More Replies
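Besides partitioning, a common mitigation is to retry the append when the conflict is detected. A minimal sketch, assuming the actual Delta write is wrapped in a callable; the string-based exception matching is illustrative.

```python
import random
import time

def append_with_retry(write_batch, max_attempts=5, base_delay=1.0):
    """Retry a Delta append on ConcurrentAppendException with backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return write_batch()
        except Exception as exc:
            # Re-raise anything that is not an append conflict, or the final failure.
            if "ConcurrentAppend" not in str(exc) or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)

# Hypothetical usage from each writer thread:
# append_with_retry(lambda: df.write.format("delta").mode("append").saveAsTable("t"))
```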

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group