Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shiv4050
by New Contributor
  • 3927 Views
  • 4 replies
  • 0 kudos

Execute a Databricks notebook from Python source code.

Hello, I'm trying to execute a Databricks notebook from Python source code but am getting an error. Source code below: from databricks_api import DatabricksAPI   # Create a Databricks API client api = DatabricksAPI(host='databrick_host', tok...

Latest Reply
sewl
New Contributor II
  • 0 kudos

The error you are encountering indicates that there is an issue with establishing a connection to the Databricks host specified in your code. Specifically, the error message "getaddrinfo failed" suggests that the hostname or IP address you provided f...

3 More Replies
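
For reference, a minimal sketch of the pattern this thread describes, using the databricks-api package's Jobs service to submit a one-time notebook run. The workspace URL, token, cluster ID, and notebook path below are placeholders; the key point behind the "getaddrinfo failed" error is that host must be a real, resolvable workspace URL.

from databricks_api import DatabricksAPI

# host must be the actual workspace URL, not a placeholder string like
# 'databrick_host' (an unresolvable host is what raises "getaddrinfo failed").
api = DatabricksAPI(
    host="https://adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    token="dapiXXXXXXXXXXXXXXXX",  # placeholder personal access token
)

# Submit a one-time notebook run via the Jobs API (Runs Submit).
run = api.jobs.submit_run(
    run_name="notebook-run-from-python",
    existing_cluster_id="0123-456789-abcdefgh",  # placeholder cluster ID
    notebook_task={"notebook_path": "/Users/me@example.com/my_notebook"},
)
print(run["run_id"])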
dataslicer
by Contributor
  • 9388 Views
  • 4 replies
  • 1 kudos

Successfully installed Maven coordinates com.crealytics:spark-excel_2.12:3.2.0_0.16.0 on the Azure DBX 9.1 LTS runtime, but getting an error for a missing dependency: org.apache.commons.io.IOUtils.byteArray(I)

I am using Azure DBX 9.1 LTS and successfully installed the following library on the cluster using Maven coordinates: com.crealytics:spark-excel_2.12:3.2.0_0.16.0. When I executed the following line: excelSDF = spark.read.format("excel").option("dataAdd...

Latest Reply
RamRaju
New Contributor II
  • 1 kudos

Hi @dataslicer, were you able to solve this issue? I am using the 9.1 LTS Databricks runtime with Spark 3.1.2 and Scala 2.12. I have installed com.crealytics:spark-excel-2.12.17-3.1.2_2.12:3.1.2_0.18.1. It was working fine but now I am facing the same exception a...

3 More Replies
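
For readers hitting the same NoSuchMethodError: org.apache.commons.io.IOUtils.byteArray(int) only exists in commons-io 2.5 and later, and a commonly reported fix is to also install a newer commons-io (e.g. Maven coordinate commons-io:commons-io:2.11.0) on the cluster alongside spark-excel. A hedged sketch of the read itself, with the path and options as placeholders:

# Untested sketch: assumes a newer commons-io has also been installed
# on the cluster via Maven coordinates.
excelSDF = (
    spark.read.format("excel")
    .option("dataAddress", "'Sheet1'!A1")  # sheet and cell range to read
    .option("header", "true")
    .load("/mnt/raw/workbook.xlsx")  # placeholder path
)
excelSDF.show(5)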
SKC01
by New Contributor II
  • 2971 Views
  • 1 reply
  • 0 kudos

Delta table - version number change on merge

I am running a merge with PySpark on a Delta table in which nothing is getting updated in the target table. Still, the target table version is incremented when I check the table history. Is that expected behavior?

Data Engineering
Delta table
deltatable
history
MERGE
version
Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Yes, this is the expected behavior. In Delta Lake, every operation, including MERGE, is atomic. This means that each operation is a transaction that can either succeed completely or fail; it cannot have partial success. Even if the MERGE operation do...

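
A minimal sketch of how to verify this from the table history (the table name is a placeholder): even a no-op MERGE commits a transaction and therefore bumps the version, which the operation metrics make visible.

history = spark.sql("DESCRIBE HISTORY main.sales.target_table LIMIT 5")
history.select("version", "operation", "operationMetrics").show(truncate=False)
# For a MERGE that changed nothing, expect metrics like
# numTargetRowsInserted=0, numTargetRowsUpdated=0, numTargetRowsDeleted=0,
# while the version number still increments because the commit succeeds.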
LiamS
by New Contributor
  • 4569 Views
  • 1 reply
  • 0 kudos

Resolved! Optimize table for joins using identity column

Hi there, I'm new to the Delta table format, so please bear with me if I've missed something obvious! I've migrated data from on-prem SQL to Fabric and stored two related tables as Delta tables. When I query data from these tables and join them based...

Latest Reply
Sidhant07
Databricks Employee
  • 0 kudos

Hi, You mentioned that you have tried Z-ordering but it didn't impact the performance. Z-ordering is a technique that co-locates related information in the same set of files. It works best when the data is filtered by the column specified in the Z-or...

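
For context, a hedged sketch of the two levers discussed here: Z-ordering both tables on the join key (which mainly helps file skipping on selective reads) and broadcasting the smaller side of the join. Table and column names are placeholders.

# Co-locate rows by the join key in both tables (placeholder names).
spark.sql("OPTIMIZE main.sales.fact_orders ZORDER BY (customer_id)")
spark.sql("OPTIMIZE main.sales.dim_customer ZORDER BY (customer_id)")

# For the join itself, broadcasting the smaller side often matters more
# than Z-ordering, which primarily speeds up filtered scans.
fact = spark.table("main.sales.fact_orders")
dim = spark.table("main.sales.dim_customer")
joined = fact.join(dim.hint("broadcast"), "customer_id")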
prawan128
by New Contributor II
  • 956 Views
  • 1 reply
  • 0 kudos

Triggering a job run on a Databricks compute cluster

Hi community, how do I set jar_params for the Databricks Jobs API when the jar_params value is greater than 10,000 bytes?

Latest Reply
prawan128
New Contributor II
  • 0 kudos

@Retired_mod I was asking about jar_params as mentioned in https://docs.databricks.com/en/workflows/jobs/jobs-2.0-api.html#request-structure, since for my use case it can be more than 10,000 bytes.

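
The thread does not include a resolution; one hedged workaround sketch, given the 10,000-byte cap on jar_params in the Jobs 2.0 API, is to upload the oversized payload to storage and pass only a small pointer. The workspace URL, token, job ID, and path below are placeholders.

import requests

# Upload the large JSON payload to a location the jar can read
# (e.g. DBFS or cloud storage), then pass only its path in jar_params.
payload_path = "dbfs:/tmp/job-params/run-42.json"  # placeholder

resp = requests.post(
    "https://<workspace-url>/api/2.0/jobs/run-now",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"job_id": 123, "jar_params": [payload_path]},  # pointer, not payload
)
resp.raise_for_status()
print(resp.json()["run_id"])

The jar itself would then read and parse the file at that path.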
Mikes
by New Contributor
  • 1077 Views
  • 0 replies
  • 0 kudos

Databricks Unity Catalog: notebook lineage not showing up in table/view lineage or lineage graph

Notebook lineage is not showing up in table/view lineage or the lineage graph. I created two tables and one view from a notebook by following the doc Capture and explore lineage. All lineage works fine except the notebook lineage. Lineage graph: Here is m...

Data Engineering
azure
databricks unity catalog
lineage
Notebook
inpefess
by New Contributor II
  • 2520 Views
  • 4 replies
  • 3 kudos

Does Databricks need GCP VMs for a workspace with no clusters in it?

Hi! I'm using GCP. Does a Databricks workspace always need two e2-highmem-2 instances running as soon as I create a workspace? I see them in my VM list in the GCP console no matter what (I can stop or remove a cluster, but these two machines are always th...

Latest Reply
abagshaw
New Contributor III
  • 3 kudos

To clarify, Databricks on GCP will automatically delete the underlying GKE cluster after 5 days of inactivity (no cluster launches or non-empty instance pools) in the workspace. You can contact Databricks support if you want to shorten the idle TTL for th...

3 More Replies
MichaelO
by New Contributor III
  • 1530 Views
  • 0 replies
  • 0 kudos

gateway.create_route for open source models

Am I able to use gateway.create_route in MLflow for open source LLM models? I'm aware of the syntax for proprietary models, like for OpenAI: from mlflow import gateway gateway.create_route( name=OpenAI_embeddings_route_name...

Data Engineering
llm
mlflow
Faisal
by Contributor
  • 1313 Views
  • 1 reply
  • 0 kudos

DLT bronze tables

I am trying to ingest incremental Parquet file data into a bronze streaming table. How much history data should ideally be retained in the bronze layer as a general best practice, considering I will only be using bronze to ingest source data and move it to s...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 0 kudos

The amount of history data that should be retained in the bronze layer depends on your specific use case and requirements. As a general best practice, you should retain enough history data to support your downstream analytics and machine learning wor...

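
To make the advice concrete, a hedged sketch of a DLT bronze streaming table whose Delta retention properties bound how much history is kept; the landing path and the 30-day/7-day values are placeholders to tune to your own replay and compliance needs.

import dlt
from pyspark.sql.functions import current_timestamp

@dlt.table(
    name="bronze_events",
    comment="Raw incremental Parquet ingest; history kept for downstream replay.",
    table_properties={
        "delta.logRetentionDuration": "interval 30 days",  # time-travel window
        "delta.deletedFileRetentionDuration": "interval 7 days",  # VACUUM window
    },
)
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")  # Auto Loader
        .option("cloudFiles.format", "parquet")
        .load("/mnt/landing/events/")  # placeholder landing path
        .withColumn("ingest_ts", current_timestamp())
    )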
CaptainJack
by New Contributor III
  • 6043 Views
  • 1 reply
  • 0 kudos

Workspace API

Hello friends. I am having a problem with the Workspace API. I have many folders inside my /Workspace (200+) into which I would like to copy my Program folder, the whole Program folder, which includes 20 Spark scripts as Databricks notebooks. I tried the Workspace API and I ...

Latest Reply
CaptainJack
New Contributor III
  • 0 kudos

I am using this API: /api/2.0/workspace/import

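
For reference, a minimal sketch of the endpoint the poster mentions; the workspace URL, token, and paths are placeholders. To fan a Program folder out to 200+ destinations, the same call can be looped over target paths (creating parent folders first with /api/2.0/workspace/mkdirs).

import base64
import requests

# Base64-encode the local source file, as the import endpoint requires.
with open("my_notebook.py", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "https://<workspace-url>/api/2.0/workspace/import",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "path": "/Users/me@example.com/Program/my_notebook",  # placeholder
        "format": "SOURCE",    # import source code as a notebook
        "language": "PYTHON",
        "content": content,    # base64-encoded file body
        "overwrite": True,
    },
)
resp.raise_for_status()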
Immassive
by New Contributor II
  • 2034 Views
  • 1 reply
  • 0 kudos

Reading information_schema tables through JDBC connection

Hi, I am using Unity Catalog as storage for data. I have an external system that establishes a connection to Unity Catalog via JDBC using the Databricks driver: Configure the Databricks ODBC and JDBC drivers - Azure Databricks | Microsoft L...

Latest Reply
Immassive
New Contributor II
  • 0 kudos

Note: I can see the tables of system.information_schema in the Databricks UI and read them there.

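
For context, a hedged sketch of the same query issued programmatically, using the databricks-sql-connector as a stand-in for a JDBC client (hostname, HTTP path, and token are placeholders); the key detail is the fully qualified system.information_schema reference.

from databricks import sql

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="/sql/1.0/warehouses/abcdef1234567890",  # placeholder
    access_token="dapiXXXXXXXXXXXXXXXX",  # placeholder
) as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT table_catalog, table_schema, table_name "
            "FROM system.information_schema.tables LIMIT 10"
        )
        for row in cur.fetchall():
            print(row)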
JonLaRose
by New Contributor III
  • 5698 Views
  • 2 replies
  • 0 kudos

Resolved! Maximum number of tables

Hi! What is the maximum number of tables that it is possible to create in a Unity Catalog? Is there any difference between managed and external tables? If so, what is the limit for external tables? Thanks, Jonathan.

Latest Reply
JonLaRose
New Contributor III
  • 0 kudos

The answer is here: https://docs.databricks.com/en/data-governance/unity-catalog/index.html#resource-quotas

1 More Reply
