Data Engineering

Forum Posts

Kayla
by Contributor
  • 350 Views
  • 1 reply
  • 0 kudos

External Table From BigQuery

I'm working on implementing Unity Catalog, and part of that is determining how to handle our BigQuery tables. We need to utilize them to connect to another application, or else we'd stay within regular Delta tables on Databricks. The page https://docs...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Kayla, certainly! Let’s discuss how Unity Catalog can help you manage your data and analytics assets, including BigQuery tables. What is Unity Catalog? Unity Catalog is Databricks’ unified data, analytics, and AI governance solution on the lake...
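
For reference, a minimal sketch of how BigQuery tables can be surfaced in Unity Catalog through Lakehouse Federation. The connection name, catalog name, and credentials below are placeholders, not values from the thread:

    # Register a BigQuery connection in Unity Catalog (placeholder credentials).
    spark.sql("""
        CREATE CONNECTION IF NOT EXISTS bq_conn TYPE bigquery
        OPTIONS (GoogleServiceAccountKeyJson '<service-account-key-json>')
    """)

    # Mirror a BigQuery project as a read-only foreign catalog.
    spark.sql("""
        CREATE FOREIGN CATALOG IF NOT EXISTS bq_catalog
        USING CONNECTION bq_conn
        OPTIONS (database '<gcp-project-id>')
    """)

    # BigQuery tables are then addressable via the three-level namespace.
    spark.sql("SELECT * FROM bq_catalog.some_dataset.some_table LIMIT 10").show()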

amruth
by New Contributor
  • 926 Views
  • 4 replies
  • 0 kudos

How do I retrieve timestamp data from history in Databricks SQL without using a DELTA table? The data is coming from SAP

I am not using Delta tables; my data is from SAP. How do I retrieve the timestamp (history) dynamically from an SAP table using Databricks SQL?

Latest Reply
Dribka
New Contributor III
  • 0 kudos

@amruth If you're working with data from SAP in Databricks and want to retrieve timestamps dynamically from a SAP table, you can utilize Databricks SQL to achieve this. You'll need to identify the specific SAP table that contains the timestamp or his...
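
A minimal sketch of the approach described above, assuming the SAP data has been landed in a Databricks table with a change-timestamp column; all table and column names here are hypothetical:

    # Without Delta time travel, "history" must come from a timestamp column
    # in the SAP data itself (e.g. a change-document or audit field).
    last_watermark = "2024-01-01 00:00:00"  # e.g. read from your own control table

    incremental = spark.sql(f"""
        SELECT *
        FROM sap_landing.sap_orders          -- hypothetical SAP-sourced table
        WHERE changed_at > TIMESTAMP '{last_watermark}'
        ORDER BY changed_at
    """)

    # Advance the watermark for the next run.
    new_watermark = incremental.agg({"changed_at": "max"}).first()[0]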

3 More Replies
808727
by New Contributor III
  • 1274 Views
  • 4 replies
  • 1 kudos

Resolved! First notebook in ML course fails with wrong runtime

Help! I'm trying to run the first notebook in the Scalable MachIne LEarning (SMILE) course: https://github.com/databricks-academy/scalable-machine-learning-with-apache-spark-english/blob/published/ML%2000a%20-%20Spark%20Review.py It fails on the first...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

It means your cluster type has to be an ML runtime. When you create a cluster in Databricks, you can choose between different runtimes. These have different versions (Spark version), but also different types: for your case you need to select the ML menu o...
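
The same choice can be made programmatically: an ML runtime is simply a spark_version whose key contains "-ml-". A sketch against the Clusters REST API, with host, token, and the exact version string as placeholders (valid keys can be listed via the /api/2.0/clusters/spark-versions endpoint):

    import requests

    resp = requests.post(
        "https://<workspace-url>/api/2.0/clusters/create",
        headers={"Authorization": "Bearer <personal-access-token>"},
        json={
            "cluster_name": "smile-course-cluster",
            "spark_version": "13.3.x-cpu-ml-scala2.12",  # note the "-ml-" marker
            "node_type_id": "i3.xlarge",
            "num_workers": 1,
        },
    )
    resp.raise_for_status()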

3 More Replies
opl12
by New Contributor II
  • 228 Views
  • 0 replies
  • 1 kudos

SQL Subquery Not Working

Hello everyone, I hope you're all well! Could you please help, or offer some guidance? I'm getting an error in a "CASE WHEN" statement. The logic is as follows: if the `valor` field IS NULL, THEN I run a subquery using the filters: origem, desti...
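
A minimal sketch of the logic being described, using a correlated scalar subquery as the NULL fallback; table names and the remaining filter columns are hypothetical, since the post is truncated:

    df = spark.sql("""
        SELECT t.*,
               CASE
                 WHEN t.valor IS NULL THEN
                   (SELECT MAX(r.valor)           -- scalar subquery: must return one value
                    FROM tarifas r                -- hypothetical lookup table
                    WHERE r.origem = t.origem)    -- plus the other filters from the post
                 ELSE t.valor
               END AS valor_final
        FROM fretes t                             -- hypothetical source table
    """)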

pgruetter
by Contributor
  • 370 Views
  • 2 replies
  • 0 kudos

Streaming problems after Vacuum

Hi all, to read from a large Delta table, I'm using readStream but with a trigger(availableNow=True) as I only want to run it daily. This worked well for an initial load and then incremental loads after that. At some point, though, I received an error fro...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @pgruetter, certainly! Let’s delve into the behavior of readStream in the context of Delta tables and address your questions. Delta Table Streaming with readStream: When you use readStream to read from a Delta table, it operates in an increment...
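
A sketch of the daily incremental pattern from the post. Once VACUUM has removed files the stream still references, the usual options are a fresh checkpoint (full reload) or, accepting possible data loss, the Delta source's failOnDataLoss option; names and paths below are placeholders:

    (spark.readStream
        .option("failOnDataLoss", "false")   # tolerate files removed by VACUUM
        .table("my_catalog.my_schema.large_delta_table")
        .writeStream
        .option("checkpointLocation", "/chk/large_delta_table")
        .trigger(availableNow=True)          # drain available data, then stop
        .toTable("my_catalog.my_schema.target_table"))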

1 More Reply
param_sen
by New Contributor II
  • 590 Views
  • 1 reply
  • 1 kudos

Maintain camelCase column names in the bronze layer, or is it advisable to rename columns?

I am utilizing the Databricks autoloader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data. Hi dear community, I...

Data Engineering
dataengineering
delta_table
Latest Reply
Dribka
New Contributor III
  • 1 kudos

Hey @param_sen, navigating the nuances of naming conventions, especially when dealing with different layers in a lakehouse architecture, can be a bit of a puzzle. Your considerations are on point. If consistency across layers is a priority and downst...
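
One possible shape of that rename step, keeping bronze faithful to the source and normalizing to snake_case on promotion to silver; the helper and table names are illustrative:

    import re

    def to_snake_case(name: str) -> str:
        """customerId -> customer_id"""
        return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()

    bronze = spark.read.table("bronze.gcs_events")
    silver = bronze.toDF(*[to_snake_case(c) for c in bronze.columns])
    silver.write.mode("overwrite").saveAsTable("silver.gcs_events")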

Bharathi_23
by New Contributor II
  • 512 Views
  • 3 replies
  • 0 kudos

Resolved! I Completed the course 'Databricks Lakehouse Platform' but badge not received

Hi Team, I have completed the 'Fundamentals of the Databricks Lakehouse Platform Accreditation (V2)' course successfully, but have not received the badge yet. Can you please check the attachment and help with this? Thanks, Bharathi.

Data Engineering
Badge not received
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Sure! Please raise a request through the ticketing portal.

2 More Replies
eimis_pacheco
by Contributor
  • 2026 Views
  • 3 replies
  • 0 kudos

Resolved! What are the best practices in bronze layer regarding the column data types?

Hi dear community, when I used to work in the Hadoop ecosystem with HDFS, the landing zone was our raw layer, and we used to use the AVRO format for the serialization of this raw data (for the schema evolution feature), only assigning names to columns but n...

Latest Reply
param_sen
New Contributor II
  • 0 kudos

Hi dear community, I am utilizing the Databricks autoloader to ingest files from Google Cloud Storage (GCS) into Delta tables in the bronze layer of a Medallion architecture. According to lakehouse principles, the bronze layer should store raw data wi...
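
One common bronze-layer convention with Auto Loader is to land every column as a string (no type inference) and keep a rescued-data column, deferring typing to silver. A sketch with placeholder paths and table names:

    (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/schemas/bronze_events")
        .option("cloudFiles.inferColumnTypes", "false")  # keep everything as string
        .option("rescuedDataColumn", "_rescued_data")
        .load("gs://my-bucket/raw/events/")
        .writeStream
        .option("checkpointLocation", "/chk/bronze_events")
        .toTable("bronze.events"))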

2 More Replies
Karo
by New Contributor
  • 221 Views
  • 0 replies
  • 0 kudos

Function in Jupyter notebook 12x faster than in Python script

Hello dear community, I wrote some ETL functions, e.g. to count the sessions until a conversion (see below). For that, I load the data and then execute several small functions for feature generation. When I run the function feat_session_unitl_conver...

adriennn
by New Contributor III
  • 669 Views
  • 4 replies
  • 3 kudos

Resolved! SQL Warehouse - Table does not support overwrite by expression:

I'm copying data from a foreign catalog using a replace where logic in the target table; this works fine for two other tables. But for a specific one, I keep getting this error: Table does not support overwrite by expression: DeltaTableV2(org.apache.sp...

Latest Reply
adriennn
New Contributor III
  • 3 kudos

Thank you for the checklist @Kaniz. Regarding "Review and validate the replace where expression": I was using dateadd() with a pipeline parameter; dateadd() returns a timestamp, which was being compared against a date column, which threw the error.
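
A sketch of the fix described: cast the dateadd() result to DATE before comparing it with a DATE column in replaceWhere. Table, column, and parameter names are illustrative:

    days_back = 7  # e.g. a pipeline parameter

    predicate = f"sale_date >= cast(dateadd(DAY, -{days_back}, current_timestamp()) AS DATE)"

    (spark.read.table("foreign_cat.src_schema.sales")
        .where(predicate)
        .write
        .format("delta")
        .mode("overwrite")
        .option("replaceWhere", predicate)   # now compares DATE to DATE
        .saveAsTable("target_cat.tgt_schema.sales"))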

3 More Replies
Faisal
by Contributor
  • 564 Views
  • 3 replies
  • 1 kudos

Resolved! Error while creating delta table with partitions

Hi all, I am unable to create a Delta table with the partitioning option. Can someone please point out what I am missing and help me with an updated query? CREATE OR REPLACE TABLE invoice USING DELTA PARTITION BY (year(shp_dt), month(shp_dt)) LOCATION '/ta...
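
For reference, Delta's partition clause takes column names rather than expressions, so year()/month() have to be materialized, typically as generated columns. A sketch assuming shp_dt is a DATE (the added columns and the simplified schema are illustrative):

    spark.sql("""
        CREATE OR REPLACE TABLE invoice (
            inv_id   STRING,
            shp_dt   DATE,
            shp_year INT GENERATED ALWAYS AS (year(shp_dt)),
            shp_mon  INT GENERATED ALWAYS AS (month(shp_dt))
        )
        USING DELTA
        PARTITIONED BY (shp_year, shp_mon)
    """)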

Latest Reply
Emil_Kaminski
Contributor
  • 1 kudos

@Kaniz Hi. Is that not exactly what I suggested before? Sorry for the stupid question, but I am learning the rules of earning kudos and getting solutions approved, so suggestions from your end would be appreciated. Thank you.

2 More Replies
BWong
by New Contributor III
  • 2402 Views
  • 8 replies
  • 6 kudos

Resolved! Cannot spin up a cluster

Hi, when I try to spin up a cluster, it gives me a bootstrap timeout error: { "reason": { "code": "BOOTSTRAP_TIMEOUT", "parameters": { "databricks_error_message": "[id: InstanceId(i-00b2b7acdd82e5fde), status: INSTANCE_INITIALIZING, workerEnv...

Latest Reply
BWong
New Contributor III
  • 6 kudos

Thanks guys. It was indeed a network issue on the AWS side. It's resolved now.

7 More Replies
robertkoss
by New Contributor II
  • 612 Views
  • 2 replies
  • 1 kudos

Databricks Autoloader Schema Evolution throws StateSchemaNotCompatible exception

I am trying to use Databricks Autoloader for a very simple use case: reading JSONs from S3 and loading them into a Delta table, with schema inference and evolution. This is my code: self.spark \ .readStream \ .format("cloudFiles") \ .o...

Data Engineering
autoloader
spark
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @robertkoss, let’s dive into the intricacies of Databricks Autoloader and tackle the StateSchemaNotCompatible exception you’re encountering. Schema Evolution and Autoloader: Autoloader is designed to handle schema evolution by updating the schem...
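
A sketch of the Auto Loader pattern under discussion, with schema inference plus evolution; in addNewColumns mode the stream fails once when a new column appears and picks it up on restart. Paths and table names are placeholders:

    (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "s3://my-bucket/_schemas/events")
        .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
        .load("s3://my-bucket/raw/events/")
        .writeStream
        .option("checkpointLocation", "s3://my-bucket/_chk/events")
        .option("mergeSchema", "true")       # let the Delta sink accept new columns
        .toTable("bronze.events"))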

1 More Reply
Daniel3
by New Contributor II
  • 1215 Views
  • 3 replies
  • 0 kudos

Resolved! How to use a variable having a set of values in spark.sql?

Hi, I have a set of values to search for in a table. I was trying to assign them to a variable first and then use the variable in spark.sql, but I'm unable to fetch the records. Please see the image attached and correct my code...

Latest Reply
brockb
New Contributor III
  • 0 kudos

Hi, one way to address the example provided in your screenshot is by using a combination of a Python f-string and a Common Table Expression, as shown below. This assumes that in reality the two tables are different, unlike in the provided screens...
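
A minimal sketch of that f-string + CTE approach, with illustrative table and column names:

    values = ["A123", "B456", "C789"]
    in_list = ", ".join(f"'{v}'" for v in values)   # -> 'A123', 'B456', 'C789'

    df = spark.sql(f"""
        WITH wanted AS (
            SELECT * FROM my_schema.lookup_table
            WHERE item_code IN ({in_list})
        )
        SELECT t.*
        FROM my_schema.main_table t
        JOIN wanted w ON t.item_code = w.item_code
    """)
    df.show()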

2 More Replies
erigaud
by Valued Contributor III
  • 524 Views
  • 4 replies
  • 1 kudos

Incorrect dropped rows count in DLT Event log

Hello, I'm using a DLT pipeline with expectations: expect_or_drop(...). To test it, I added files that contain records that should be dropped, and indeed when running the pipeline I can see some rows were dropped. However, when looking at the DLT Event lo...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @erigaud, thank you for reaching out! Let’s dive into the behavior of Delta Live Tables (DLT) expectations and clarify the observed behavior. Expectations in DLT: DLT allows you to define expectations on your data pipelines using functions like...
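
For context, a minimal sketch of the expectation setup being discussed; dataset and rule names are illustrative. Per-expectation drop counts surface in the event log's flow_progress data-quality metrics:

    import dlt
    from pyspark.sql.functions import col

    @dlt.table
    @dlt.expect_or_drop("valid_id", "id IS NOT NULL")
    def clean_events():
        return (
            spark.readStream.table("bronze.events")
            .select(col("id"), col("payload"))
        )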

3 More Replies