Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

prajwalpoojary
by New Contributor
  • 548 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks Apps Hosting Backend and Frontend

Hello, I want to host a web app whose frontend will be on Streamlit and backend running on FastAPI. Currently the Databricks app listens on host 0.0.0.0 and port 8000, and my backend is running on host '127.0.0.1' and port 8080 (if it's available). I want t...

Latest Reply
stbjelcevic
Databricks Employee
  • 1 kudos

Hi @prajwalpoojary , Given you already have Streamlit on 0.0.0.0:8000 and FastAPI on 127.0.0.1:8080, you can keep that split and do server-side calls from Streamlit to http://127.0.0.1:8080/. It’s efficient and avoids cross-origin/auth issues. If you...

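The server-side split the reply describes can be sketched as follows. This is a minimal sketch, assuming the FastAPI process is already listening on 127.0.0.1:8080; `BACKEND`, `backend_url`, and `fetch_json` are illustrative names, not from the thread.

```python
# Hypothetical sketch: the Streamlit process (bound to 0.0.0.0:8000) calls
# the local FastAPI backend (127.0.0.1:8080) server-side, so the browser
# never talks to the backend and no CORS setup is needed.
import json
import urllib.request

BACKEND = "http://127.0.0.1:8080"

def backend_url(path: str) -> str:
    """Build a URL against the loopback backend."""
    return f"{BACKEND}{path}"

def fetch_json(path: str) -> dict:
    """Server-side request from the Streamlit process to FastAPI."""
    with urllib.request.urlopen(backend_url(path)) as resp:
        return json.load(resp)

# Inside the Streamlit app, something like:
#   data = fetch_json("/items")
#   st.write(data)
```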
dave_d
by New Contributor II
  • 9761 Views
  • 3 replies
  • 0 kudos

What is the "Columnar To Row" node in this simple Databricks SQL query profile?

I am running a relatively simple SQL query that writes back to a table on a Databricks serverless SQL warehouse, and I'm trying to understand why there is a "Columnar To Row" node in the query profile that is consuming the vast majority of the time s...

Latest Reply
Annapurna_Hiriy
Databricks Employee
  • 0 kudos

@dave_d We do not have a document with a list of operations that would bring up the ColumnarToRow node. This node provides a common executor to translate an RDD of ColumnarBatch into an RDD of InternalRow. This is inserted whenever such a transition is de...

2 More Replies
francisix
by New Contributor III
  • 7856 Views
  • 6 replies
  • 9 kudos

Resolved! I haven't received badge for completion

Hi, today I completed the test for Lakehouse Fundamentals with a score of 85%, but I still haven't received the badge through my email francis@intellectyx.com. Kindly let me know please! -Francis

Latest Reply
sureshrocks1984
New Contributor II
  • 9 kudos

Hi, I completed the test for Databricks Certified Data Engineer Associate on 17 December 2024, but I still haven't received the badge through my email sureshrocks.1984@hotmail.com. Kindly let me know please! SURESHK

5 More Replies
Danish11052000
by Contributor
  • 831 Views
  • 5 replies
  • 9 kudos

Resolved! How to get read/write bytes per table using Databricks system tables?

I’m working on a data usage use case and want to understand the right way to get read bytes and written bytes per table in Databricks, especially for Unity Catalog tables. What I want: for each table, something like: Date, Table name (catalog.schema.table)...

Latest Reply
pradeep_singh
Contributor III
  • 9 kudos

system.access.audit focuses on governance and admin/security events. It doesn’t capture per-table I/O metrics such as read_bytes or written_bytes. Use system.query.history for per-statement I/O metrics (read_bytes, written_bytes, read_rows, written_ro...

4 More Replies
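The reply's suggestion can be sketched as a per-statement aggregation over system.query.history. This is an assumption-laden sketch: the read_bytes/written_bytes column names come from the reply, and the date/statement grouping is illustrative; verify the schema in your workspace before relying on it.

```python
# Hypothetical sketch: per-day, per-statement I/O from system.query.history,
# as the reply suggests. Mapping statements back to individual tables (e.g.
# by parsing statement_text) is left out and would need additional work.
PER_STATEMENT_IO_SQL = """
SELECT
  DATE(start_time)     AS query_date,
  statement_text,
  SUM(read_bytes)      AS total_read_bytes,
  SUM(written_bytes)   AS total_written_bytes
FROM system.query.history
GROUP BY DATE(start_time), statement_text
"""

# In a notebook: display(spark.sql(PER_STATEMENT_IO_SQL))
```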
danny_frontgrad
by New Contributor III
  • 932 Views
  • 11 replies
  • 3 kudos

Resolved! Question on Ingestion Pipelines

Is there a better way to select source tables than having to manually select them one by one? I have 96 tables and it's a pain. The GUI keeps going back to the schema and I have to search through all the tables again. Is there a way to import the tables using ...

Latest Reply
pradeep_singh
Contributor III
  • 3 kudos

So you don't see the option to edit the pipeline? Or, once you click on edit pipeline, you don't see the option to switch to the code version (YAML)? Or, after you switch to the code version (YAML), can you only view that YAML and not edit it?

10 More Replies
Ericsson
by New Contributor II
  • 6244 Views
  • 3 replies
  • 1 kudos

SQL week format issue: it's not showing the result as 01 (ww)

Hi folks, I have a requirement to show the week number in ww format. Please see the code below: select weekofyear(date_add(to_date(current_date, 'yyyyMMdd'), +35)). Also, please refer to the screenshot for the result.

Latest Reply
Fowlkes
New Contributor II
  • 1 kudos

What Is Papa’s Freezeria?Papa’s Freezeria is part of the famous Papa Louie game series, where players take on the role of restaurant employees running one of Papa Louie’s many eateries. http://papasfreezeria.online/

2 More Replies
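The original question wants the week number rendered as two digits ("01" rather than "1"). One hedged option in Spark SQL is zero-padding with lpad; the same padding in plain Python, shown below, uses ISO week numbers (which is what weekofyear returns). The function name is illustrative.

```python
# The asker wants week-of-year as "01", not "1".
# In Spark SQL, one option (verify on your warehouse) is:
#   SELECT lpad(weekofyear(date_add(current_date, 35)), 2, '0') AS week_ww
# The equivalent zero-padding in plain Python:
from datetime import date

def week_ww(d: date) -> str:
    """ISO week-of-year, zero-padded to two characters."""
    return f"{d.isocalendar()[1]:02d}"

print(week_ww(date(2024, 1, 4)))  # -> 01
```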
kenny_hero
by New Contributor III
  • 1115 Views
  • 7 replies
  • 1 kudos

Resolved! How do I import a python module when deploying with DAB?

Below is how the folder structure of my project looks:

resources/
|- etl_event/
   |- etl_event.job.yml
src/
|- pipeline/
   |- etl_event/
      |- transformers/
         |- transformer_1.py
      |- utils/
         |- logger.py
databricks.ym...

Latest Reply
pradeep_singh
Contributor III
  • 1 kudos

You don't need to use wheel files. Use glob as the key instead of file: https://docs.databricks.com/aws/en/dev-tools/bundles/resources#pipelinelibraries. Here is the screenshot.

6 More Replies
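The glob-based approach the reply links to looks roughly like this in the bundle's pipeline YAML. This is a sketch under assumptions: the resource name and include path are illustrative, so check them against the bundle docs linked above.

```yaml
# Illustrative sketch of a pipeline definition in a Databricks Asset Bundle,
# using `glob` instead of `file` so whole source folders are included.
resources:
  pipelines:
    etl_event:
      name: etl_event
      libraries:
        - glob:
            include: ../src/pipeline/etl_event/**
```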
Danish11052000
by Contributor
  • 867 Views
  • 5 replies
  • 5 kudos

Resolved! How to incrementally backup system.information_schema.table_privileges (no streaming, no unique keys)

I'm trying to incrementally backup system.information_schema.table_privileges but facing challenges:

  • No streaming support: Is streaming supported: False
  • No unique columns for MERGE: all columns contain common values, no natural key combination
  • No timest...

Latest Reply
MoJaMa
Databricks Employee
  • 5 kudos

information_schema objects are not Delta tables, which is why you can't stream from them. They are basically views on top of information coming straight from the control-plane database. Also, your query is actually going to be quite slow/expensive (you prob...

4 More Replies
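Given the reply's point that these are views (no streaming, no natural keys), a common fallback is a dated full snapshot rather than an incremental merge. A minimal sketch, assuming an illustrative target table name that is not from the thread:

```python
# Hypothetical sketch: append a full daily snapshot of the view, then use
# snapshot_date to diff runs or expire old copies.
# backup.table_privileges_snapshot is an illustrative target table.
SNAPSHOT_SQL = """
INSERT INTO backup.table_privileges_snapshot
SELECT current_date() AS snapshot_date, *
FROM system.information_schema.table_privileges
"""

# In a notebook: spark.sql(SNAPSHOT_SQL)
```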
petergriffin1
by New Contributor II
  • 2455 Views
  • 4 replies
  • 1 kudos

Resolved! Are you able to create an Iceberg table natively in Databricks?

Been trying to create an Iceberg table natively in Databricks with the cluster being 16.4. I also have the Iceberg JAR file for Spark 3.5.2. Using a simple command such as: %sql CREATE OR REPLACE TABLE catalog1.default.iceberg( a INT ) USING iceberg...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Databricks supports creating and working with Apache Iceberg tables natively under specific conditions. Managed Iceberg tables in Unity Catalog can be created directly using Databricks Runtime 16.4 LTS or newer. The necessary setup requires enabling ...

3 More Replies
souravroy1990
by New Contributor II
  • 218 Views
  • 2 replies
  • 2 kudos

Error in Column level tags creation in views via SQL

Hi, I'm trying to run this query using SQL on a DBR 17.3 cluster, but I get a syntax error: ALTER VIEW catalog.schema.view ALTER COLUMN column_name SET TAGS (`METADATA` = `xyz`); But the query below works: SET TAG ON COLUMN catalog.schema.view.column_n...

Latest Reply
souravroy1990
New Contributor II
  • 2 kudos

Thanks for the clarification @szymon_dybczak. I have a follow-up question: if I have attached a tag to a view column and the same view is associated with a SHARE, will the recipient see the tag in the view, i.e. whether view column tags associated to shares a...

1 More Replies
dpc
by Contributor III
  • 1963 Views
  • 8 replies
  • 8 kudos

Resolved! Case insensitive data

For all its positives, one of the first general issues we had with Databricks was case sensitivity. We have a lot of data-specific filters in our code. The problem is, we land and view data from lots of different case-insensitive source systems, e.g. SQL Se...

Latest Reply
dpc
Contributor III
  • 8 kudos

It works, but there's a scenario that causes an issue. If I create a schema with default collation UTF8_LCASE and then create a table, it marks all the string columns as UTF8_LCASE, which is fine and works. If I create the table in the newly created UTF8_LCA...

7 More Replies
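The pattern discussed in this thread can be sketched as DDL: set UTF8_LCASE as the schema default so new string columns inherit it, or pin the collation per column. The schema/table/column names below are illustrative, not from the thread.

```python
# Hypothetical sketch of the case-insensitive collation setup discussed:
# a schema-level default plus an explicit per-column collation.
CASE_INSENSITIVE_DDL = """
CREATE SCHEMA IF NOT EXISTS demo DEFAULT COLLATION UTF8_LCASE;

CREATE TABLE demo.customers (
  id   BIGINT,
  name STRING COLLATE UTF8_LCASE
);
"""

# In a notebook, run each statement, e.g.:
#   for stmt in CASE_INSENSITIVE_DDL.split(";"):
#       if stmt.strip():
#           spark.sql(stmt)
```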
maddan80
by New Contributor II
  • 3158 Views
  • 6 replies
  • 3 kudos

Oracle Essbase connectivity

Team, I wanted to understand the best way of connecting to Oracle Essbase to ingest data into the delta lake

Latest Reply
hyaqoob
New Contributor II
  • 3 kudos

I am currently working with Essbase 21c and need to pull data from Databricks through a SQL query. I was able to successfully set up a JDBC connection to Databricks, but when I try to create a data source using a SQL query, it gives me an error: "[Data...

5 More Replies
Adig
by New Contributor III
  • 9374 Views
  • 6 replies
  • 17 kudos

Generate a group id for similar duplicate values of a DataFrame column.

Input DataFrame:

KeyName          KeyCompare       Source
PapasMrtemis     PapasMrtemis     S1
PapasMrtemis     Pappas, Mrtemis  S1
Pappas, Mrtemis  PapasMrtemis     S2
Pappas, Mrtemis  Pappas, Mrtemis  S2
Mich...

Latest Reply
rafaelpoyiadzi
New Contributor II
  • 17 kudos

Hey. We’ve run into similar deduplication problems before. If the name differences are pretty minor (punctuation, spacing, small typos), fuzzy string matching can usually get you most of the way there. That kind of similarity-based clustering works f...

5 More Replies
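The similarity-based clustering the reply describes can be sketched in plain Python with difflib. The 0.85 threshold and function names are assumptions, not from the thread, and a real dataset would usually need blocking plus a tuned threshold.

```python
# Hypothetical sketch of fuzzy grouping: normalize away punctuation, spacing,
# and case, then give names a shared group id when their normalized forms are
# similar enough to an earlier name's.
import re
from difflib import SequenceMatcher

def normalize(name: str) -> str:
    """Strip punctuation/whitespace and lowercase so near-duplicates align."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def assign_group_ids(names, threshold=0.85):
    """Assign each name a group id, reusing the id of the first
    sufficiently similar name seen so far."""
    reps = []  # (normalized representative, group_id)
    out = []
    for name in names:
        key = normalize(name)
        gid = next((g for rep, g in reps
                    if SequenceMatcher(None, key, rep).ratio() >= threshold),
                   None)
        if gid is None:
            gid = len(reps) + 1
            reps.append((key, gid))
        out.append(gid)
    return out

print(assign_group_ids(["PapasMrtemis", "Pappas, Mrtemis", "Michael"]))  # -> [1, 1, 2]
```

On the thread's sample data, "PapasMrtemis" and "Pappas, Mrtemis" normalize to strings one edit apart, so they land in the same group.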
NathanE
by New Contributor II
  • 4141 Views
  • 2 replies
  • 1 kudos

Time travel on views

Hello, at my company we design an application to analyze data, and we can do so on top of external databases such as Databricks. Our application caches some data in-memory, and to avoid synchronization issues with the data on Databricks we rely heavil...

Latest Reply
robert1213
New Contributor II
  • 1 kudos

Hi there, your use case for time travel on views is really interesting. I can see why being able to track historical versions of both views and their underlying tables would be crucial for an application that relies on caching and granular queries. Ri...

1 More Replies
luketl2
by Contributor
  • 676 Views
  • 6 replies
  • 1 kudos

Resolved! DELTA_FEATURES_REQUIRE_MANUAL_ENABLEMENT DLT Streaming Table as Variant

I am attempting to ingest csv files from an S3 bucket with Autoloader. Since the schema of the data is inconsistent (each csv may have different headers), I was hoping to ingest the data as Variant following this: https://docs.databricks.com/aws/en/i...

Latest Reply
luketl2
Contributor
  • 1 kudos

I think I found the issue... I put the table_properties in the wrong place. It goes in the decorator args, not the query_function args. My bad.

5 More Replies