Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Michael_Appiah
by Databricks Partner
  • 16561 Views
  • 17 replies
  • 11 kudos

Parameterized spark.sql() not working

Spark 3.4 introduced parameterized SQL queries and Databricks also discussed this new functionality in a recent blog post (https://www.databricks.com/blog/parameterized-queries-pyspark). Problem: I cannot run any of the examples provided in the PySpark...

Latest Reply
adriennn
Valued Contributor

@Malthe the question did indeed not pertain to DLT / Lakeflow, but another user asked about it and I mistakenly mentioned OP instead of @alex0sp. Cheers!

16 More Replies
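Since Spark 3.4, spark.sql() accepts an args mapping bound to named :param markers, which is the feature the post is about. Below is a minimal sketch of that named-parameter pattern using stdlib sqlite3 so it runs without a Spark cluster; the equivalent spark.sql() call is shown in a comment, and the table and column names are made up for illustration.

```python
# Named-parameter queries, as introduced for spark.sql() in Spark 3.4:
#   spark.sql("SELECT * FROM range(10) WHERE id < :limit", args={"limit": 3})
# The same binding pattern sketched with stdlib sqlite3 so it runs anywhere:
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "click"), (2, "view"), (3, "click")])

# Values are bound by the driver, not spliced into the SQL string,
# which avoids quoting bugs and SQL injection.
rows = conn.execute(
    "SELECT id FROM events WHERE kind = :kind ORDER BY id",
    {"kind": "click"},
).fetchall()
print(rows)  # [(1,), (3,)]
```

The key point in both APIs is that the parameter travels separately from the SQL text rather than being interpolated into it.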
shan-databricks
by Databricks Partner
  • 903 Views
  • 2 replies
  • 0 kudos

Resolved! What are the prerequisites for connecting Confluent Kafka with Databricks?

Please provide the prerequisites for connecting Confluent Kafka with Databricks, the different connection options, their respective advantages and disadvantages, and the best option for the deliverable. Thanks, Shanmugam

Latest Reply
lingareddy_Alva
Esteemed Contributor

Hi @shan-databricks Connecting Confluent Kafka with Databricks creates a powerful "data in motion" to "data at rest" architecture. Below are the prerequisites, connection methods, and strategic recommendations for your deliverable. 1. Prerequisites: Befo...

1 More Replies
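As a rough sketch of the connection prerequisites, the snippet below builds the option map Spark's Kafka source typically needs for Confluent Cloud (SASL_SSL with PLAIN auth). The bootstrap server, topic, and API key/secret are placeholders, and the shaded `kafkashaded.` class-name prefix is an assumption based on the Kafka client bundled with Databricks Runtime; verify against your runtime before use.

```python
# Sketch: option map for Spark's Kafka source against Confluent Cloud.
# SASL_SSL + PLAIN is Confluent Cloud's standard auth; all values below
# are placeholders.
def confluent_kafka_options(bootstrap: str, topic: str,
                            api_key: str, api_secret: str) -> dict:
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="{api_key}" password="{api_secret}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap,
        "subscribe": topic,
        "startingOffsets": "earliest",
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
    }

opts = confluent_kafka_options("pkc-xxxxx.confluent.cloud:9092",
                               "orders", "API_KEY", "API_SECRET")
# On Databricks you would then do (requires a Spark session):
# df = spark.readStream.format("kafka").options(**opts).load()
print(opts["kafka.security.protocol"])  # SASL_SSL
```

Keeping the options in one function makes it easy to swap credentials in from a secret scope rather than hard-coding them.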
prajwalpoojary
by New Contributor
  • 459 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Apps Hosting Backend and Frontend

Hello, I want to host a webapp whose frontend will be on Streamlit and backend running on FastAPI. Currently the Databricks app listens on host 0.0.0.0 and port 8000 and my backend is running on host '127.0.0.1' and port 8080 (if it's available). I want t...

Latest Reply
stbjelcevic
Databricks Employee

Hi @prajwalpoojary , Given you already have Streamlit on 0.0.0.0:8000 and FastAPI on 127.0.0.1:8080, you can keep that split and do server-side calls from Streamlit to http://127.0.0.1:8080/. It’s efficient and avoids cross-origin/auth issues. If you...

dave_d
by New Contributor II
  • 9631 Views
  • 3 replies
  • 0 kudos

What is the "Columnar To Row" node in this simple Databricks SQL query profile?

I am running a relatively simple SQL query that writes back to a table on a Databricks serverless SQL warehouse, and I'm trying to understand why there is a "Columnar To Row" node in the query profile that is consuming the vast majority of the time s...

Latest Reply
Annapurna_Hiriy
Databricks Employee

@dave_d We do not have a document with a list of operations that would bring up the ColumnarToRow node. This node provides a common executor to translate an RDD of ColumnarBatch into an RDD of InternalRow. It is inserted whenever such a transition is de...

2 More Replies
francisix
by New Contributor III
  • 7671 Views
  • 6 replies
  • 9 kudos

Resolved! I haven't received badge for completion

Hi, today I completed the test for Lakehouse Fundamentals with a score of 85%, but I still haven't received the badge through my email francis@intellectyx.com. Kindly let me know please! -Francis

Latest Reply
sureshrocks1984
New Contributor II

Hi, I completed the test for Databricks Certified Data Engineer Associate on 17 December 2024, but I still haven't received the badge through my email sureshrocks.1984@hotmail.com. Kindly let me know please! SURESHK

5 More Replies
Danish11052000
by Contributor
  • 707 Views
  • 5 replies
  • 9 kudos

Resolved! How to get read/write bytes per table using Databricks system tables?

I’m working on a data usage use case and want to understand the right way to get read bytes and written bytes per table in Databricks, especially for Unity Catalog tables. What I want: for each table, something like: Date, Table name (catalog.schema.table)...

Latest Reply
pradeep_singh
Contributor

system.access.audit focuses on governance and admin/security events. It doesn’t capture per-table I/O metrics such as read_bytes or written_bytes. Use system.query.history for per-statement I/O metrics (read_bytes, written_bytes, read_rows, written_ro...

4 More Replies
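Building on the reply, here is a minimal sketch of the kind of rollup it suggests over system.query.history. The column names (start_time, statement_text, read_bytes, written_bytes) follow the reply and may differ by release, and the 7-day window is arbitrary; note the table is keyed by statement, so cleanly attributing bytes to one table relies on statements that touch a single table.

```sql
-- Sketch: daily read/write bytes rolled up from system.query.history.
SELECT
  DATE(start_time)    AS query_date,
  statement_text,
  SUM(read_bytes)     AS total_read_bytes,
  SUM(written_bytes)  AS total_written_bytes
FROM system.query.history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY DATE(start_time), statement_text
ORDER BY total_read_bytes DESC;
```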
danny_frontgrad
by New Contributor III
  • 847 Views
  • 11 replies
  • 3 kudos

Resolved! Question on Ingestion Pipelines

Is there a better way to select source tables than having to manually select them one by one? I have 96 tables and it's a pain. The GUI keeps going back to the schema and I have to search through all the tables again. Is there a way to import the tables using ...

Latest Reply
pradeep_singh
Contributor

So you don't see the option to edit the pipeline? Or once you click on edit pipeline, you don't see the option to switch to the code version (YAML)? Or after you switch to the code version (YAML), can you only view that YAML and not edit it?

10 More Replies
Ericsson
by New Contributor II
  • 6135 Views
  • 3 replies
  • 1 kudos

SQL week format issue: it's not showing the result as 01 (ww)

Hi folks, I have a requirement to show the week number in ww format. Please see the code below: select weekofyear(date_add(to_date(current_date, 'yyyyMMdd'), +35)). Also please refer to the screenshot for the result.

2 More Replies
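The gap the post runs into is that Spark's weekofyear() returns an INT (e.g. 1), so the zero-padded "01" has to be produced on output. A small stdlib sketch of the same idea (the Spark SQL equivalent, as an assumption to verify, would be lpad(weekofyear(...), 2, '0')):

```python
# weekofyear-style week number, zero-padded to the "ww" form the post wants.
from datetime import date, timedelta

# Same arithmetic as the post: a date plus 35 days.
d = date(2021, 12, 1) + timedelta(days=35)   # 2022-01-05
week = d.isocalendar()[1]                    # ISO week number as an int
print(f"{week:02d}")                         # "01"
```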
kenny_hero
by New Contributor III
  • 990 Views
  • 7 replies
  • 1 kudos

Resolved! How do I import a python module when deploying with DAB?

Below is the folder structure of my project:
resources/
  etl_event/
    etl_event.job.yml
src/
  pipeline/
    etl_event/
      transformers/
        transformer_1.py
      utils/
        logger.py
databricks.ym...

Latest Reply
pradeep_singh
Contributor

You don't need to use wheel files. Use glob as the key instead of file: https://docs.databricks.com/aws/en/dev-tools/bundles/resources#pipelinelibraries

6 More Replies
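A minimal sketch of the glob form the reply points to, assuming the folder layout from the post; the pipeline resource name and include path are illustrative:

```yaml
# resources/etl_event/etl_event.job.yml (sketch; `glob` per the linked
# bundle docs, paths relative to this file and illustrative only)
resources:
  pipelines:
    etl_event_pipeline:
      name: etl_event
      libraries:
        - glob:
            include: ../../src/pipeline/etl_event/**
```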
Danish11052000
by Contributor
  • 739 Views
  • 5 replies
  • 5 kudos

Resolved! How to incrementally backup system.information_schema.table_privileges (no streaming, no unique keys)

I'm trying to incrementally back up system.information_schema.table_privileges but facing challenges: no streaming support (Is streaming supported: False); no unique columns for MERGE (all columns contain common values, no natural key combination); no timest...

Latest Reply
MoJaMa
Databricks Employee

information_schema is not backed by Delta tables, which is why you can't stream from it. They are basically views on top of the information coming straight from the control plane database. Also, your query is actually going to be quite slow/expensive (you prob...

4 More Replies
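One workaround for the "no natural key" constraint the post describes is to hash the entire row as a synthetic key and diff full snapshots between runs. A plain-Python sketch of that idea, using made-up stand-in rows for table_privileges:

```python
# Snapshot diffing without a natural key: hash every column of each row
# into a synthetic key, then compare the previous and current snapshots.
import hashlib

def row_key(row: tuple) -> str:
    return hashlib.sha256("||".join(map(str, row)).encode()).hexdigest()

# Stand-in rows (grantee, privilege_type, table); real rows have more columns.
previous = {("alice", "SELECT", "cat.sch.t1"), ("bob", "MODIFY", "cat.sch.t1")}
current  = {("alice", "SELECT", "cat.sch.t1"), ("carol", "SELECT", "cat.sch.t2")}

prev_keys = {row_key(r) for r in previous}
curr_keys = {row_key(r) for r in current}

added   = [r for r in current  if row_key(r) not in prev_keys]  # new grants
removed = [r for r in previous if row_key(r) not in curr_keys]  # revoked grants
print(added, removed)
```

On Databricks the same hash can be computed in SQL (e.g. over all columns) and used as the MERGE key against the backup table.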
petergriffin1
by New Contributor II
  • 2375 Views
  • 4 replies
  • 1 kudos

Resolved! Are you able to create an Iceberg table natively in Databricks?

I've been trying to create an Iceberg table natively in Databricks with a 16.4 cluster. I also have the Iceberg JAR file for Spark 3.5.2. Using a simple command such as: %sql CREATE OR REPLACE TABLE catalog1.default.iceberg( a INT ) USING iceberg...

Latest Reply
Louis_Frolio
Databricks Employee

Databricks supports creating and working with Apache Iceberg tables natively under specific conditions. Managed Iceberg tables in Unity Catalog can be created directly using Databricks Runtime 16.4 LTS or newer. The necessary setup requires enabling ...

3 More Replies
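A minimal sketch of the managed-Iceberg path the reply describes (Unity Catalog plus DBR 16.4 LTS or newer, no extra Iceberg JAR); the catalog, schema, and table names are placeholders:

```sql
-- Sketch: managed Iceberg table in Unity Catalog on DBR 16.4 LTS+.
CREATE OR REPLACE TABLE catalog1.default.iceberg_demo (
  a INT
) USING ICEBERG;
```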
souravroy1990
by New Contributor II
  • 172 Views
  • 2 replies
  • 2 kudos

Error in Column level tags creation in views via SQL

Hi, I'm trying to run this query using SQL on a DBR 17.3 cluster, but I get a syntax error: ALTER VIEW catalog.schema.view ALTER COLUMN column_name SET TAGS (`METADATA` = `xyz`); But the query below works: SET TAG ON COLUMN catalog.schema.view.column_n...

Latest Reply
souravroy1990
New Contributor II

Thanks for the clarification @szymon_dybczak. I have a follow-up question: if I have attached a tag to a view column and the same view is associated with a SHARE, will the recipient see the tag in the view, i.e. whether view column tags associated to shares a...

1 More Replies
dpc
by Contributor III
  • 1583 Views
  • 8 replies
  • 8 kudos

Resolved! Case insensitive data

For all its positives, one of the first general issues we had with Databricks was case sensitivity. We have a lot of data-specific filters in our code. Problem is, we land and view data from lots of different case-insensitive source systems e.g. SQL Se...

Latest Reply
dpc
Contributor III

It works, but there's a scenario that causes an issue. If I create a schema with default collation UTF8_LCASE and then create a table, it marks all the string columns as UTF8_LCASE, which is fine and works. If I create the table in the newly created UTF8_LCA...

7 More Replies
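For reference, a sketch of the schema-level default collation approach discussed in this thread; names are placeholders, and how it interacts with pre-existing tables is exactly the wrinkle the last reply raises:

```sql
-- Sketch: case-insensitive comparisons via a schema default collation.
-- String columns in tables created afterwards inherit UTF8_LCASE.
CREATE SCHEMA my_catalog.ci_schema DEFAULT COLLATION UTF8_LCASE;

CREATE TABLE my_catalog.ci_schema.customers (name STRING);
INSERT INTO my_catalog.ci_schema.customers VALUES ('Alice');

-- Matches despite different casing:
SELECT * FROM my_catalog.ci_schema.customers WHERE name = 'ALICE';
```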
maddan80
by New Contributor II
  • 3041 Views
  • 6 replies
  • 3 kudos

Oracle Essbase connectivity

Team, I wanted to understand the best way of connecting to Oracle Essbase to ingest data into Delta Lake.

Latest Reply
hyaqoob
New Contributor II

I am currently working with Essbase 21c and I need to pull data from Databricks through a SQL query. I was able to successfully set up a JDBC connection to Databricks, but when I try to create a data source using a SQL query, it gives me an error: "[Data...

5 More Replies
Adig
by New Contributor III
  • 9255 Views
  • 6 replies
  • 17 kudos

Generate a group id for similar duplicate values of a dataframe column

Input DataFrame:
KeyName | KeyCompare | Source
PapasMrtemis | PapasMrtemis | S1
PapasMrtemis | Pappas, Mrtemis | S1
Pappas, Mrtemis | PapasMrtemis | S2
Pappas, Mrtemis | Pappas, Mrtemis | S2
Mich...

Latest Reply
rafaelpoyiadzi
New Contributor II

Hey. We’ve run into similar deduplication problems before. If the name differences are pretty minor (punctuation, spacing, small typos), fuzzy string matching can usually get you most of the way there. That kind of similarity-based clustering works f...

5 More Replies
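The fuzzy-matching approach the reply describes can be sketched with stdlib difflib: normalize each value, compare it against one representative per existing cluster, and assign a shared group id. The 0.8 threshold and the alphanumeric normalization are assumptions to tune for real data:

```python
# Group ids for near-duplicate strings via similarity clustering.
from difflib import SequenceMatcher

def norm(s: str) -> str:
    # Normalize: lowercase and drop punctuation/whitespace.
    return "".join(c for c in s.lower() if c.isalnum())

def assign_group_ids(values, threshold=0.8):
    groups = []   # list of (representative value, group id)
    ids = {}
    for v in values:
        for rep, gid in groups:
            if SequenceMatcher(None, norm(v), norm(rep)).ratio() >= threshold:
                ids[v] = gid   # close enough: join the existing group
                break
        else:
            gid = len(groups) + 1   # no match: start a new group
            groups.append((v, gid))
            ids[v] = gid
    return ids

ids = assign_group_ids(["PapasMrtemis", "Pappas, Mrtemis", "Michel", "Michael"])
print(ids)
```

At dataframe scale the same logic is usually applied per blocking key (e.g. first letter) to avoid the quadratic comparison cost.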