Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ashraf1395
by Honored Contributor
  • 2176 Views
  • 1 replies
  • 1 kudos

Resolved! Referencing external locations in Python notebooks

How can I reference external locations in a Python notebook? I found the docs for referencing them in SQL: https://docs.databricks.com/en/sql/language-manual/sql-ref-external-locations.html. But how do I do it in Python? I am not able to understand. Do we ...

Latest Reply
fmadeiro
Contributor II
  • 1 kudos

@ashraf1395, referencing external locations in a Databricks Python notebook, particularly for environments like Azure DevOps with different paths for development (dev) and production (prod), can be managed effectively using parameterized variables. H...
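
For illustration, a minimal sketch of that parameterized approach (the widget name, environments, and storage paths below are placeholders, not values from the thread):

```python
# Hypothetical sketch: resolve a path inside a Unity Catalog external location
# per environment, then use it with any Spark reader/writer.
dbutils.widgets.text("env", "dev")
env = dbutils.widgets.get("env")

external_paths = {
    "dev":  "abfss://dev-container@mystorageaccount.dfs.core.windows.net/landing",
    "prod": "abfss://prod-container@mystorageaccount.dfs.core.windows.net/landing",
}
base_path = external_paths[env]

df = spark.read.format("parquet").load(f"{base_path}/events")
df.write.format("delta").mode("append").save(f"{base_path}/bronze/events")
```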

Avinash_Narala
by Valued Contributor II
  • 623 Views
  • 1 replies
  • 1 kudos

Resolved! Which type of cluster to use

Hi, recently I had some logic that collects the DataFrame and processes it row by row. I am using a 128 GB driver node, but it is taking significantly more time than expected (around 2 hours for just 700 rows of data). May I know which type of cluster I should use and the driver...

Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @Avinash_Narala, good day! For right-sizing the cluster, the recommended approach is hybrid node provisioning along with autoscaling. This involves defining the number of on-demand instances and spot instances for t...
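
The full recommendation is truncated above; as a rough illustration of hybrid on-demand/spot provisioning with autoscaling, a cluster spec might look like the sketch below (node type, worker counts, and AWS attributes are assumptions, not values from the thread):

```python
# Hypothetical cluster spec: autoscaling plus a mix of on-demand and spot nodes.
cluster_spec = {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "aws_attributes": {
        "first_on_demand": 2,                  # first N nodes are on-demand
        "availability": "SPOT_WITH_FALLBACK",  # remaining nodes use spot, falling back to on-demand
    },
}
```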

Michael_Galli
by Contributor III
  • 3319 Views
  • 5 replies
  • 1 kudos

Resolved! Importing data into Excel from Databricks over ODBC OAuth / Simba Spark Driver

Hi all, I am referring to this article: Connect to Azure Databricks from Microsoft Excel - Azure Databricks | Microsoft Learn. I use the latest SimbaSparkODBC-2.8.2.1013-Windows-64bit driver and configured it as described in that documentation. In Databricks I use ...

Latest Reply
Aydin
New Contributor II
  • 1 kudos

Hi @Michael_Galli, we're currently experiencing the same issue. I've just asked our internal support team to raise a ticket with Microsoft but thought it would be worth reaching out to you. Have you had any luck resolving this issue?

4 More Replies
sgannavaram
by New Contributor III
  • 4004 Views
  • 3 replies
  • 1 kudos

How to connect to IBM MQ from Databricks notebook?

We are trying to connect to IBM MQ and post messages to MQ, which are eventually consumed by a mainframe application. Which IBM MQ client .jars / libraries need to be installed on the cluster? If you have any sample code for the connectivity, that would be helpful.

Latest Reply
none_ranjeet
New Contributor III
  • 1 kudos

Were you able to make this connection by some means other than the REST API, which has problems reading binary messages? Please suggest.
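
The thread does not include working code, but a minimal sketch using the pymqi Python client (an assumption on my part; it also requires the IBM MQ client libraries to be installed on the cluster) could look like this, with all connection details as placeholders:

```python
import pymqi

# Placeholder connection details for the queue manager.
queue_manager = "QM1"
channel = "DEV.APP.SVRCONN"
conn_info = "mq.example.com(1414)"   # host(port)
queue_name = "DEV.QUEUE.1"

qmgr = pymqi.connect(queue_manager, channel, conn_info,
                     user="app_user", password="app_password")
queue = pymqi.Queue(qmgr, queue_name)
queue.put(b"hello from Databricks")  # message later consumed by the mainframe application
queue.close()
qmgr.disconnect()
```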

2 More Replies
aliacovella
by Contributor
  • 1191 Views
  • 2 replies
  • 2 kudos

Resolved! DLT vs. notebook runs

I have some behavior that I'm not understanding. I have a notebook that defines a DLT from a Kinesis stream and a view on that DLT. This works when I run it from within a workflow configured using the DLT pipeline. If, however, I create a workflow a...

Latest Reply
hari-prasad
Valued Contributor II
  • 2 kudos

Hi @aliacovella, DLT notebooks/code only work with DLT pipelines, and regular Spark or SQL notebooks work with workflows.
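
A minimal sketch of the setup described, to make the distinction concrete (stream name, region, and table names are placeholders); this code only runs when the notebook is attached to a DLT pipeline, not as a plain notebook task:

```python
import dlt

@dlt.table(name="kinesis_events_raw")
def kinesis_events_raw():
    # Streaming table fed from a Kinesis stream.
    return (
        spark.readStream.format("kinesis")
        .option("streamName", "my-event-stream")
        .option("region", "us-east-1")
        .option("initialPosition", "latest")
        .load()
    )

@dlt.view(name="kinesis_events")
def kinesis_events():
    # View defined on top of the streaming table.
    return dlt.read_stream("kinesis_events_raw").selectExpr("CAST(data AS STRING) AS payload")
```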

1 More Replies
johnnwanosike
by New Contributor III
  • 694 Views
  • 2 replies
  • 0 kudos

Unable to connect to the internal Hive metastore

I am unable to find the correct password for the internal Hive metastore I created. The protocol used was JDBC. What is the best way to connect to it? Additionally, I want to connect to an external Hive metastore as well.

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @johnnwanosike, how did you create the metastore? Did you follow any documentation? For an external Hive metastore you can refer to: https://docs.databricks.com/ja/archive/external-metastores/external-hive-metastore.html
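
For reference, the linked page configures an external Hive metastore through cluster Spark config; a rough sketch of those settings (JDBC URL, driver, credentials, and metastore version are placeholders to adapt):

```python
# Hypothetical cluster Spark config for an external Hive metastore.
spark_conf = {
    "spark.sql.hive.metastore.version": "2.3.9",
    "spark.sql.hive.metastore.jars": "builtin",
    "spark.hadoop.javax.jdo.option.ConnectionURL": "jdbc:mysql://<host>:3306/metastore_db",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
    "spark.hadoop.javax.jdo.option.ConnectionUserName": "<user>",
    "spark.hadoop.javax.jdo.option.ConnectionPassword": "<password>",
}
```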

1 More Replies
PabloCSD
by Valued Contributor II
  • 3947 Views
  • 4 replies
  • 1 kudos

Resolved! How to connect via JDBC to SAP-HANA in a Databricks Notebook?

I have a set of connection credentials for SAP HANA; how can I retrieve data from that source using JDBC? I have already installed ngdbc.jar (the driver) on my cluster, but this simple query has already taken more than 5 minutes and I don't ...

Latest Reply
PabloCSD
Valued Contributor II
  • 1 kudos

It worked after changing the port to 30041, the port for the next tenant (reference: https://community.sap.com/t5/technology-q-a/hana-connectivity-and-ports/qaq-p/12193927). jdbcQuery = '(SELECT * FROM DUMMY)' df_sap_hana_dummy_table = (spark.read .form...
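
The snippet is truncated above; one possible completion of that JDBC read (host, secret scope, and key names are placeholders):

```python
jdbc_query = "(SELECT * FROM DUMMY)"

df_sap_hana_dummy_table = (
    spark.read.format("jdbc")
    .option("driver", "com.sap.db.jdbc.Driver")      # class shipped in ngdbc.jar
    .option("url", "jdbc:sap://<hana-host>:30041/")  # tenant port that resolved the issue
    .option("dbtable", jdbc_query)
    .option("user", dbutils.secrets.get("sap", "hana_user"))
    .option("password", dbutils.secrets.get("sap", "hana_password"))
    .load()
)
display(df_sap_hana_dummy_table)
```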

3 More Replies
jeremy98
by Honored Contributor
  • 400 Views
  • 1 replies
  • 0 kudos

Dynamic scheduling again and again

Hi Community, is it possible to dynamically schedule a Databricks job definition, as is possible with Airflow DAGs? If not, what would be a way to handle it?

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, Databricks does not natively support dynamic scheduling of job definitions in the same way that Apache Airflow does with its Directed Acyclic Graphs (DAGs). However, there are ways to achieve similar functionality using Databricks Jobs:...
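
One way to approximate dynamic scheduling (an assumption, not a step from the truncated reply) is to rewrite the job's cron schedule programmatically with the Databricks SDK, for example from a controller job or an external process; the job ID and cron expression below are placeholders:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Update only the schedule of an existing job (ID is hypothetical).
w.jobs.update(
    job_id=123456789,
    new_settings=jobs.JobSettings(
        schedule=jobs.CronSchedule(
            quartz_cron_expression="0 0 6 * * ?",  # new trigger: daily at 06:00
            timezone_id="UTC",
        )
    ),
)
```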

KristiLogos
by Contributor
  • 1447 Views
  • 6 replies
  • 0 kudos

Resolved! Connection from BigQuery to Databricks populating dictionary keys as "v"

I was able to connect our BigQuery account to our Databricks catalog. However, all the keys in the nested dictionary columns populate as 'v'. For example: {"v":[{"v":{"f":[{"v":"engagement_time_msec"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":nu...

Latest Reply
KristiLogos
Contributor
  • 0 kudos

@szymon_dybczak I couldn't run SELECT TO_JSON_STRING(event_params) AS event_params FROM ... I don't think that's a Databricks built-in. Is there another way you've had success? Error: [UNRESOLVED_ROUTINE] Cannot resolve routine `TO_JSON_STRING` on search...
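
For reference, TO_JSON_STRING is BigQuery-specific; the comparable Databricks/Spark SQL built-in is to_json, which serializes a struct or array column to a JSON string. A minimal sketch, assuming event_params arrives as a struct/array column (the table name is hypothetical):

```python
# to_json works on struct/array/map columns; it does not apply if the column
# is already a plain string.
df = spark.sql("""
    SELECT to_json(event_params) AS event_params_json
    FROM main.analytics.ga4_events
""")
display(df)
```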

5 More Replies
LGABI
by New Contributor
  • 899 Views
  • 2 replies
  • 0 kudos

How to connect to Tableau Server from within Databricks notebooks and publish data to Tableau Server?

My company is having trouble connecting Databricks to Tableau Server. We need to be able to publish Hyper Files that are developed using Python on Databricks Notebooks to our Tableau Server, but it seems impossible to get a connection established des...

Latest Reply
pgo
New Contributor III
  • 0 kudos

Please use the netcat command to test the connection.
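
As a quick way to run that check from a notebook cell, the sketch below (host and port are placeholders) tests raw TCP reachability from the cluster to the Tableau Server:

```python
import socket

# Placeholder Tableau Server host/port; raises an exception if unreachable.
host, port = "tableau.example.com", 443
with socket.create_connection((host, port), timeout=5):
    print(f"Cluster can reach {host}:{port}")
```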

1 More Replies
jeremy98
by Honored Contributor
  • 739 Views
  • 1 replies
  • 0 kudos

Resolved! Is there an INTERVAL data type?

Hi community, I was using a column in PostgreSQL that is a datetime.timedelta; is it possible to have the same data type in Databricks as well?

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @jeremy98, you can use the TIMESTAMP and TIMESTAMP_NTZ data types to handle date and time values, similar to the DATETIME type in PostgreSQL. However, Databricks does not have a direct equivalent to PostgreSQL's TIMEDELTA type: https://docs.databri...
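
Worth noting for readers: Databricks SQL also provides ANSI INTERVAL types (year-month and day-time), which are the closest analogue to PostgreSQL's interval / Python's timedelta. A minimal sketch:

```python
# Interval literals, interval arithmetic, and a day-time interval cast.
spark.sql("""
    SELECT
      INTERVAL '3' DAY                                       AS three_days,
      TIMESTAMP'2024-01-01 00:00:00' + INTERVAL '90' MINUTE  AS shifted_timestamp,
      CAST('1 12:30:00' AS INTERVAL DAY TO SECOND)           AS day_time_interval
""").show(truncate=False)
```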

pavan_yndpl
by New Contributor
  • 1685 Views
  • 1 replies
  • 0 kudos

How to resolve SSL_connect error when VPN is enabled

I am trying to connect to Databricks using the ODBC protocol with a Simba driver DSN. I am able to successfully connect and access the data when our corporate VPN is turned off, but when it's turned on, I am getting the following error: "[Simba][ThriftExte...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

The error you are encountering, " (14) Unexpected response from server during a HTTP connection: SSL_connect:", when trying to connect to Databricks using the ODBC protocol with the Simba Driver DSN while the corporate VPN is turned on, is likely rel...

raghu2
by New Contributor III
  • 1357 Views
  • 2 replies
  • 1 kudos

Liquid clustering enabled table - concurrent writes

I am trying to insert rows into a liquid clustering enabled Delta table using multiple threads. This link states that liquid clustering is used for: tables with concurrent write requirements. I get this error: [DELTA_CONCURRENT_APPEND] ConcurrentAppend...

Latest Reply
TejeshS
Contributor
  • 1 kudos

We encountered a similar issue, and the workaround we tried was partitioning on those columns, as liquid clustering can sometimes trigger this error.
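
A minimal sketch of that workaround (catalog, schema, table, and column names are hypothetical): define the table with Hive-style partitioning on the column the concurrent writers touch, so appends that target different partitions do not conflict.

```python
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.sales.events_partitioned (
      event_id   BIGINT,
      event_date DATE,
      payload    STRING
    )
    USING DELTA
    PARTITIONED BY (event_date)
""")
```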

1 More Replies
shanisolomon
by New Contributor II
  • 989 Views
  • 2 replies
  • 0 kudos

Databricks inconsistent count and select

Hi, I have a table with 2 versions:
1. Add txn: path = "a.parquet", numRecords = 10, deletionVector = null
2. Add txn: path = "a.parquet", numRecords = 10, deletionVector = (..., cardinality = 2)
Please note both transactions point to the same physical path...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Hello, the behavior observed does indeed seem inconsistent with the expected behavior in Delta. Do you have a support contract, so you can open a support ticket and this can be analyzed further?

1 More Replies
Tej_04
by New Contributor
  • 2298 Views
  • 1 replies
  • 0 kudos

Avoid scientific notation

I am trying to insert data into catalog tables on Databricks, but the values are being displayed in scientific notation, which I am trying to avoid. How do I view the data in standard format? For example, 0.0000000 is being displayed as 0E-7.

Labels: catalogtables, scientificnotation
Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Tej_04, can you try the format_number function? SELECT format_number(column_name, decimal_places) AS column_name FROM table_name; https://docs.databricks.com/en/sql/language-manual/functions/format_number.html
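
A minimal sketch of that suggestion (table and column names are hypothetical); format_number renders the value as a plain-decimal string with a fixed number of decimal places, which avoids the 0E-7 display:

```python
df = spark.sql("""
    SELECT format_number(amount, 7) AS amount_formatted
    FROM main.finance.transactions
""")
display(df)
```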

