Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Mikes
by New Contributor
  • 1598 Views
  • 0 replies
  • 0 kudos

DatabricksUnityCatalog: notebooks lineage not showing up in table/view lineage or lineage graph

Notebook lineage is not showing up in the table and view lineage or in the lineage graph. I created two tables and one view from a notebook by following the doc: Capture and explore lineage. All lineages work fine, except the notebook lineage: Lineage graph: Here is m...

Data Engineering
azure
databricks unity catalog
lineage
Notebook
inpefess
by New Contributor II
  • 3567 Views
  • 4 replies
  • 3 kudos

Does Databricks need GCP VMs for a workspace with no clusters in it?

Hi! I'm using GCP. Does a Databricks workspace always need two e2-highmem-2 instances running as soon as I create a workspace? I see them in my VM list in the GCP console no matter what (I can stop or remove a cluster, but these two machines are always th...

Latest Reply
abagshaw
Databricks Employee
  • 3 kudos

To clarify, Databricks on GCP will automatically delete the underlying GKE cluster after 5 days of inactivity (no cluster launches or non-empty instance pools) in the workspace. You can contact Databricks support if you want to shorten the idle TTL for th...

3 More Replies
MichaelO
by New Contributor III
  • 1970 Views
  • 0 replies
  • 0 kudos

gateway.create_route for open source models

Am I able to use gateway.create_route in mlflow for open source LLM models? I'm aware of the syntax for proprietary models like for OpenAI: from mlflow import gateway gateway.create_route( name=OpenAI_embeddings_route_name...

Data Engineering
llm
mlflow
Faisal
by Contributor
  • 1851 Views
  • 1 replies
  • 0 kudos

DLT bronze tables

I am trying to ingest incremental Parquet file data into a bronze streaming table. How much historical data should ideally be retained in the bronze layer as a general best practice, considering I will only be using bronze to ingest source data and move it to s...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 0 kudos

The amount of historical data that should be retained in the bronze layer depends on your specific use case and requirements. As a general best practice, you should retain enough historical data to support your downstream analytics and machine learning wor...

CaptainJack
by New Contributor III
  • 6766 Views
  • 1 replies
  • 0 kudos

Workspace API

Hello friends. I am having a problem with the Workspace API. I have many folders inside my /Workspace (200+) into which I would like to copy my Program, the whole Program folder, which includes 20 Spark scripts that are Databricks notebooks. I tried the Workspace API and I ...

Latest Reply
CaptainJack
New Contributor III
  • 0 kudos

I am using this as api = /api/2.0/workspace/import
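
For readers following along, here is a minimal sketch of how request bodies for that endpoint might be built when the same notebooks have to land in many folders. It only prepares the JSON payloads and sends nothing. The folder names, notebook names, and the helper functions `build_import_payload` and `payloads_for_folders` are hypothetical, and it assumes Python notebooks imported in SOURCE format, one request per notebook.

```python
import base64
import json


def build_import_payload(target_path: str, source_code: str, overwrite: bool = True) -> dict:
    """Build a JSON body for POST /api/2.0/workspace/import (SOURCE format assumed)."""
    return {
        "path": target_path,
        "format": "SOURCE",
        "language": "PYTHON",  # assumption: the notebooks are Python
        # The API expects the notebook content base64-encoded.
        "content": base64.b64encode(source_code.encode("utf-8")).decode("ascii"),
        "overwrite": overwrite,
    }


def payloads_for_folders(folders, notebooks, program_dir="Program"):
    """Return one payload per (folder, notebook) pair.

    folders: iterable of workspace folder paths (hypothetical names).
    notebooks: dict mapping notebook name -> source code.
    """
    payloads = []
    for folder in folders:
        for name, source in notebooks.items():
            path = f"{folder.rstrip('/')}/{program_dir}/{name}"
            payloads.append(build_import_payload(path, source))
    return payloads


if __name__ == "__main__":
    # Hypothetical folders and notebook; replace with real values.
    demo = payloads_for_folders(
        ["/Workspace/team_a", "/Workspace/team_b"],
        {"job1": "print('hi')"},
    )
    print(json.dumps(demo[0], indent=2))
```

Each payload would then be sent with an authenticated POST to the workspace URL. The target `Program` folder may need to exist beforehand, for example by calling /api/2.0/workspace/mkdirs first.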

Immassive
by New Contributor II
  • 3051 Views
  • 1 replies
  • 0 kudos

Reading information_schema tables through JDBC connection

Hi, I am using Unity Catalog as storage for data. I have an external system that establishes a connection to Unity Catalog via JDBC using the Databricks driver: Configure the Databricks ODBC and JDBC drivers - Azure Databricks | Microsoft L...

Latest Reply
Immassive
New Contributor II
  • 0 kudos

Note: I can see the tables of the system.information_schema schema in the Databricks UI and read them there.

JonLaRose
by New Contributor III
  • 7629 Views
  • 2 replies
  • 0 kudos

Resolved! Max amount of tables

Hi! What is the maximum number of tables that can be created in a Unity Catalog? Is there any difference between managed and external tables? If so, what is the limit for external tables? Thanks, Jonathan.

Latest Reply
JonLaRose
New Contributor III
  • 0 kudos

The answer is here: https://docs.databricks.com/en/data-governance/unity-catalog/index.html#resource-quotas

1 More Replies
coltonflowers
by New Contributor III
  • 3699 Views
  • 0 replies
  • 0 kudos

MLflow Spark UDF Error

After trying to run spark_udf = mlflow.pyfunc.spark_udf(spark, model_uri=logged_model, env_manager="virtualenv") we get the following error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 145.0 failed 4 times, most re...

alj_a
by New Contributor III
  • 1701 Views
  • 1 replies
  • 0 kudos

source db and target db in DLT

Hi, thanks in advance. I am new to DLT. The scenario is that I need to read the data from cloud storage (ADLS) and load it into my bronze table, then read it from the bronze table, do some DQ checks, and load the cleaned data into my silver table, and finally populat...

marianopenn
by New Contributor III
  • 3901 Views
  • 2 replies
  • 1 kudos

Databricks VSCode Extension Sync Timeout

I am using the Databricks VSCode extension to sync my local repository to Databricks Workspaces. I have everything configured such that smaller syncs work fine, but a full sync of my repository leads to the following error: Sync Error: Post "https://<...

Data Engineering
dbx sync
Repos
VSCode
Workspaces
Latest Reply
kimongrigorakis
New Contributor II
  • 1 kudos

Same issue here..... Can someone please help??

1 More Replies
Phani1
by Databricks MVP
  • 1197 Views
  • 0 replies
  • 0 kudos

Unity catalog accounts

Hi Team, we have a requirement to keep the metadata (Unity Catalog) in one AWS account and the data storage (Delta tables under data) in another account. Is it possible to do that? Would we face any technical or security issues?

278875
by New Contributor
  • 22824 Views
  • 4 replies
  • 1 kudos

How do I figure out the cost breakdown for Databricks

I'm trying to figure out the cost breakdown for the Databricks usage for my team. When I go into the Databricks administration console, click Usage, and select to show the usage By SKU, it just displays the type of cluster but not the name of it. ...

Latest Reply
MuthuLakshmi
Databricks Employee
  • 1 kudos

Please check the docs below for usage-related information. The Billable Usage Logs: https://docs.databricks.com/en/administration-guide/account-settings/usage.html You can filter them using tags for more precise information which you are looking for...

3 More Replies
Rafal9
by New Contributor III
  • 9275 Views
  • 0 replies
  • 0 kudos

Issue during testing SparkSession.sql() with pytest.

Dear Community, I am testing PySpark code via pytest using VS Code and Databricks Connect. The SparkSession is initiated from Databricks Connect: from databricks.connect import DatabricksSession; spark = DatabricksSession.builder.getOrCreate() I am receiving...
