Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

jx1226
by New Contributor II
  • 2517 Views
  • 0 replies
  • 0 kudos

Connect to storage with private endpoint from workspace EnableNoPublicIP=No and VnetInjection=No

We know that Databricks with VNet injection (our own VNet) allows us to connect to Blob Storage / ADLS Gen2 over private endpoints and peering. This is what we typically do. We have a client who created Databricks with EnableNoPublicIP=No (secure clust...

  • 2517 Views
  • 0 replies
  • 0 kudos
grazie
by Contributor
  • 2653 Views
  • 2 replies
  • 0 kudos

Azure Databricks, migrating delta table data with CDF on.

We are on Azure Databricks over ADLS Gen2 and have a set of tables and workflows that process data from and between those tables, using change data feeds. (We are not yet using Unity Catalog, and also not Hive metastore, just accessing delta tables f...

  • 2653 Views
  • 2 replies
  • 0 kudos
Latest Reply
grazie
Contributor
  • 0 kudos

As it turns out, due to a misunderstanding, the responses from Azure support were answering a slightly different question (about Azure Table Storage instead of Delta Tables on Blob/ADLS Gen2), so we'll try there again. However, still interested in id...

  • 0 kudos
1 More Reply
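For readers following the CDF migration question above, here is a minimal sketch of reading a change data feed from a source Delta table before copying its data elsewhere. It assumes a Databricks notebook (where spark is predefined) and that delta.enableChangeDataFeed is set on the source table; the ADLS path and starting version are illustrative placeholders.

    # Read the change feed of a CDF-enabled Delta table by path.
    changes = (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", 0)   # or .option("startingTimestamp", "2024-01-01")
        .load("abfss://container@account.dfs.core.windows.net/tables/source_table")
    )

    # Delta adds these metadata columns to every change-feed row.
    changes.select("_change_type", "_commit_version", "_commit_timestamp").show(5)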
hafeez
by New Contributor III
  • 2109 Views
  • 1 reply
  • 1 kudos

Resolved! Hive metastore table access control End of Support

Hello, We are using Databricks with the Hive metastore and not Unity Catalog. We would like to know if there is any End of Support date for Table Access Control with Hive, as this link states that it is legacy. https://docs.databricks.com/en/data-governance/tab...

  • 2109 Views
  • 1 reply
  • 1 kudos
deng_dev
by New Contributor III
  • 9521 Views
  • 0 replies
  • 0 kudos

py4j.protocol.Py4JJavaError: An error occurred while calling o359.sql. : java.util.NoSuchElementExce

Hi! We are creating a table in a streaming job on every micro-batch using a spark.sql('create or replace table ... using delta as ...') command. This query combines data from multiple tables. Sometimes our job fails with the error: py4j.Py4JException: An e...

  • 9521 Views
  • 0 replies
  • 0 kudos
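A minimal sketch of the pattern described above (CREATE OR REPLACE TABLE inside a streaming micro-batch), written with foreachBatch. The table and column names are hypothetical, and spark is the notebook's built-in SparkSession; this is only an illustration of the pattern, not the poster's actual job.

    def rebuild_table(micro_batch_df, batch_id):
        # Expose the micro-batch as a temp view so it can be joined in SQL.
        micro_batch_df.createOrReplaceTempView("events_batch")
        # Recreate the target table on every micro-batch (names are illustrative).
        micro_batch_df.sparkSession.sql("""
            CREATE OR REPLACE TABLE reporting.latest_summary
            USING DELTA AS
            SELECT e.*, d.attribute
            FROM events_batch e
            JOIN reference.dimensions d ON e.key = d.key
        """)

    (
        spark.readStream.table("raw.events")
        .writeStream
        .foreachBatch(rebuild_table)
        .option("checkpointLocation", "/tmp/checkpoints/latest_summary")
        .start()
    )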
Michael_Galli
by Contributor III
  • 991 Views
  • 0 replies
  • 0 kudos

Many dbutils.notebook.run iterations in a workflow -> Failed to checkout GitHub repository error

Hi all, I have a workflow that runs one single notebook with dbutils.notebook.run() and different parameters in one long loop. At some point, I get random Git errors in the notebook run: com.databricks.WorkflowException: com.databricks.NotebookExecut...

  • 991 Views
  • 0 replies
  • 0 kudos
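A sketch of the looping pattern described above with a simple retry around dbutils.notebook.run, which sometimes helps when a Git checkout fails transiently. The notebook path, parameters, and retry/backoff values are assumptions, and dbutils is only available inside a Databricks notebook.

    import time

    NOTEBOOK_PATH = "./process_partition"   # illustrative relative path

    def run_with_retry(params, max_attempts=3, timeout_seconds=3600):
        for attempt in range(1, max_attempts + 1):
            try:
                return dbutils.notebook.run(NOTEBOOK_PATH, timeout_seconds, params)
            except Exception:
                # Retry transient checkout/workflow failures with a growing backoff.
                if attempt == max_attempts:
                    raise
                time.sleep(30 * attempt)

    for partition in ["2024-01", "2024-02", "2024-03"]:
        result = run_with_retry({"partition": partition})
        print(partition, result)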
amruth
by New Contributor
  • 2369 Views
  • 4 replies
  • 0 kudos

How do I retrieve timestamp data from history in Databricks SQL without using a DELTA table? The data is coming from SAP

I am not using Delta tables; my data is from SAP. How do I retrieve the timestamp (history) dynamically from a SAP table using Databricks SQL?

  • 2369 Views
  • 4 replies
  • 0 kudos
Latest Reply
Dribka
New Contributor III
  • 0 kudos

@amruth If you're working with data from SAP in Databricks and want to retrieve timestamps dynamically from a SAP table, you can utilize Databricks SQL to achieve this. You'll need to identify the specific SAP table that contains the timestamp or his...

  • 0 kudos
3 More Replies
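A heavily hedged sketch of one way to surface SAP timestamps in Databricks SQL, assuming the SAP system is reachable over JDBC from the workspace. The JDBC URL, driver class, secret scope, schema, table, and column names are hypothetical placeholders, not the poster's actual setup.

    jdbc_url = "jdbc:sap://sap-host:30015/?databaseName=HXE"   # placeholder

    changes = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("driver", "com.sap.db.jdbc.Driver")
        .option("dbtable", "(SELECT DOC_ID, CHANGED_AT FROM SAPSR3.MY_TABLE) src")
        .option("user", dbutils.secrets.get("sap", "user"))
        .option("password", dbutils.secrets.get("sap", "password"))
        .load()
    )

    # Latest change timestamp per document, queryable from Databricks SQL as a view.
    changes.createOrReplaceTempView("sap_changes")
    spark.sql("""
        SELECT DOC_ID, MAX(CHANGED_AT) AS last_change
        FROM sap_changes
        GROUP BY DOC_ID
    """).show()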
Kayla
by Valued Contributor
  • 1399 Views
  • 0 replies
  • 0 kudos

External Table From BigQuery

I'm working on implementing Unity Catalog, and part of that is determining how to handle our BigQuery tables. We need to utilize them to connect to another application, or else we'd stay within regular Delta tables on Databricks. The page https://docs...

  • 1399 Views
  • 0 replies
  • 0 kudos
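As background for the question above, a minimal sketch of reading a BigQuery table with the Spark BigQuery connector that ships with Databricks and snapshotting it to Delta. The project, dataset, and table names are placeholders, and the cluster is assumed to already be configured with Google service-account credentials.

    bq_df = (
        spark.read.format("bigquery")
        .option("table", "my-project.analytics.orders")   # placeholder identifiers
        .load()
    )

    # Persist a Delta snapshot so downstream tools can query it without hitting BigQuery.
    bq_df.write.mode("overwrite").saveAsTable("bronze.bq_orders_snapshot")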
IonFreeman_Pace
by New Contributor III
  • 3764 Views
  • 4 replies
  • 1 kudos

Resolved! First notebook in ML course fails with wrong runtime

Help! I'm trying to run this first notebook in the Scalable MachIne LEarning (SMILE) course.https://github.com/databricks-academy/scalable-machine-learning-with-apache-spark-english/blob/published/ML%2000a%20-%20Spark%20Review.pyIt fails on the first...

  • 3764 Views
  • 4 replies
  • 1 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

It means your cluster type has to be an ML runtime. When you create a cluster in Databricks, you can choose between different runtimes. These have different versions (Spark version), but also different types. For your case you need to select the ML menu o...

  • 1 kudos
3 More Replies
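For readers who prefer automation over the UI menu mentioned in the reply above, a sketch of creating a cluster with an ML runtime through the Clusters REST API. The workspace URL, secret scope, node type, and runtime key are illustrative; valid runtime keys (the ML ones end in "-ml") can be listed with GET /api/2.0/clusters/spark-versions.

    import requests

    HOST = "https://adb-1234567890123456.7.azuredatabricks.net"   # placeholder
    TOKEN = dbutils.secrets.get("ops", "pat")                     # placeholder scope/key

    resp = requests.post(
        f"{HOST}/api/2.0/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={
            "cluster_name": "smile-course",
            "spark_version": "14.3.x-cpu-ml-scala2.12",  # an ML runtime, not the standard one
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 1,
        },
    )
    resp.raise_for_status()
    print(resp.json()["cluster_id"])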
Hoping
by New Contributor
  • 2495 Views
  • 0 replies
  • 0 kudos

Size of each partitioned file (partitioned by default)

When I try a DESCRIBE DETAIL, I get the number of files the Delta table is partitioned into. How can I check the size of each of the files that make up my entire table? Will I be able to query each partitioned file to understand how they have b...

  • 2495 Views
  • 0 replies
  • 0 kudos
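A minimal sketch of going from the table-level numbers that DESCRIBE DETAIL reports to per-file sizes, by listing the data files in the current snapshot. The table path is a placeholder, and spark/dbutils are the notebook built-ins.

    table_path = "/mnt/datalake/silver/my_table"   # placeholder

    # Table-level totals only: numFiles and sizeInBytes.
    spark.sql(f"DESCRIBE DETAIL delta.`{table_path}`").select("numFiles", "sizeInBytes").show()

    # Per-file sizes for the files that belong to the current snapshot.
    active_files = spark.read.format("delta").load(table_path).inputFiles()
    sizes = [(f, dbutils.fs.ls(f)[0].size) for f in active_files]

    (
        spark.createDataFrame(sizes, "path string, bytes long")
        .orderBy("bytes", ascending=False)
        .show(truncate=False)
    )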
eric-cordeiro
by New Contributor II
  • 1578 Views
  • 0 replies
  • 0 kudos

Insufficient Permission when writing to AWS Redshift

I'm trying to write a table in AWS Redshift using the following code:try:    (df_source.write        .format("redshift")        .option("dbtable", f"{redshift_schema}.{table_name}")        .option("tempdir", tempdir)        .option("url", url)       ...

  • 1578 Views
  • 0 replies
  • 0 kudos
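For context on the truncated snippet above, a sketch of what a complete write typically looks like with the Databricks Redshift connector. The tempdir bucket, JDBC URL, and IAM role ARN are placeholders; one common cause of permission errors is the IAM role (or forwarded credentials) lacking access to the S3 tempdir or to the Redshift COPY.

    (
        df_source.write
        .format("redshift")
        .option("dbtable", f"{redshift_schema}.{table_name}")
        .option("tempdir", "s3a://my-bucket/redshift-temp/")                     # placeholder
        .option("url", "jdbc:redshift://my-cluster.example.us-east-1.redshift.amazonaws.com:5439/dev")  # placeholder
        .option("aws_iam_role", "arn:aws:iam::123456789012:role/redshift-copy-role")  # placeholder
        .mode("append")
        .save()
    )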
pgruetter
by Contributor
  • 1438 Views
  • 1 reply
  • 0 kudos

Streaming problems after VACUUM

Hi all, To read from a large Delta table, I'm using readStream but with trigger(availableNow=True) as I only want to run it daily. This worked well for an initial load and then incremental loads after that. At some point though, I received an error fro...

  • 1438 Views
  • 1 reply
  • 0 kudos
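A sketch of the daily availableNow pattern from the post above, plus the retention table property that helps keep VACUUM from deleting files a not-yet-run stream still needs. Table names, paths, and the 14-day interval are illustrative.

    # Keep removed files around longer than the gap between daily stream runs.
    spark.sql("""
        ALTER TABLE silver.big_table
        SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 14 days')
    """)

    (
        spark.readStream.table("silver.big_table")
        .writeStream
        .option("checkpointLocation", "/mnt/checkpoints/big_table_daily")
        .trigger(availableNow=True)   # process everything available, then stop
        .toTable("gold.big_table_daily")
    )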
param_sen
by New Contributor II
  • 1726 Views
  • 1 reply
  • 1 kudos

Should we maintain camelCase column names in the bronze layer, or is it advisable to rename columns?

Hi dear community, I am utilizing the Databricks Auto Loader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data...

Data Engineering
dataengineering
delta_table
  • 1726 Views
  • 1 reply
  • 1 kudos
Latest Reply
Dribka
New Contributor III
  • 1 kudos

Hey @param_sen, Navigating the nuances of naming conventions, especially when dealing with different layers in a lakehouse architecture, can be a bit of a puzzle. Your considerations are on point. If consistency across layers is a priority and downst...

  • 1 kudos
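A small sketch of the compromise the reply above hints at: keep the original camelCase names in bronze and rename to snake_case only when promoting to silver. The table names are placeholders, and the regex is a deliberately simplistic camelCase-to-snake_case conversion.

    import re

    def to_snake(name: str) -> str:
        # Insert an underscore before an uppercase letter that follows a lowercase/digit.
        return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()

    bronze_df = spark.read.table("bronze.gcs_events")
    silver_df = bronze_df.toDF(*[to_snake(c) for c in bronze_df.columns])
    silver_df.write.mode("overwrite").saveAsTable("silver.gcs_events")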
eimis_pacheco
by Contributor
  • 5370 Views
  • 3 replies
  • 1 kudos

Resolved! What are the best practices in bronze layer regarding the column data types?

Hi dear community, When I used to work in the Hadoop ecosystem with HDFS, the landing zone was our raw layer, and we used to use the AVRO format for the serialization of this raw data (for the schema evolution feature), only assigning names to columns but n...

  • 5370 Views
  • 3 replies
  • 1 kudos
Latest Reply
param_sen
New Contributor II
  • 1 kudos

Hi dear community, I am utilizing the Databricks Auto Loader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data wi...

  • 1 kudos
2 More Replies
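Related to the bronze-layer data-type discussion above, a sketch of landing every column as a string with Auto Loader (its behaviour when type inference is off) and deferring casts to silver. The GCS path, schema/checkpoint locations, and table name are placeholders.

    bronze_stream = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/mnt/schemas/events")
        .option("cloudFiles.inferColumnTypes", "false")   # keep raw values as strings
        .load("gs://my-bucket/landing/events/")
    )

    (
        bronze_stream.writeStream
        .option("checkpointLocation", "/mnt/checkpoints/bronze_events")
        .trigger(availableNow=True)
        .toTable("bronze.events")
    )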
Karo
by New Contributor
  • 655 Views
  • 0 replies
  • 0 kudos

Function in Jupyter notebook 12x faster than in Python script

Hello dear community, I wrote some ETL functions, e.g. to count the sessions until a conversion (see below). For that I load the data and then execute several small functions for the feature generation. When I run the function feat_session_unitl_conver...

  • 655 Views
  • 0 replies
  • 0 kudos
Bharathi_23
by New Contributor II
  • 2067 Views
  • 1 reply
  • 0 kudos

I completed the course 'Databricks Lakehouse Platform' but have not received the badge

Hi Team, I have completed the 'Fundamentals of the Databricks Lakehouse Platform Accreditation (V2)' course successfully, but I have not received the badge yet. Can you please check the attachment and help with this? Thanks, Bharathi.

Data Engineering
Badge not received
  • 2067 Views
  • 1 reply
  • 0 kudos

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group