Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Ericsson
by New Contributor II
  • 5694 Views
  • 3 replies
  • 1 kudos

SQL week format issue: it's not showing the result as 01 (ww)

Hi Folks, I have a requirement to show the week number in 'ww' format. Please see the code below: select weekofyear(date_add(to_date(current_date, 'yyyyMMdd'), +35)). Also, please refer to the screenshot for the result.

[screenshot: result]
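Spark SQL's weekofyear returns an integer, so the leading zero is dropped; left-padding the value, e.g. lpad(weekofyear(...), 2, '0'), is one way to get the two-digit 'ww' form (that Spark expression is a suggestion, not from the thread). A minimal plain-Python sketch of the same idea, with an illustrative function name:

```python
from datetime import date

def week_label(d: date) -> str:
    """Return the ISO week number zero-padded to two digits (e.g. '01')."""
    return f"{d.isocalendar()[1]:02d}"

# Early-January dates fall in ISO week 1, which should render as "01".
print(week_label(date(2024, 1, 4)))   # -> "01"
print(week_label(date(2024, 12, 4)))  # -> "49"
```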
2 More Replies
kenny_hero
by New Contributor II
  • 387 Views
  • 7 replies
  • 1 kudos

How do I import a python module when deploying with DAB?

Below is how the folder structure of my project looks:
resources/
|- etl_event/
   |- etl_event.job.yml
src/
|- pipeline/
   |- etl_event/
      |- transformers/
         |- transformer_1.py
      |- utils/
         |- logger.py
databricks.ym...

Latest Reply
pradeep_singh
New Contributor II
  • 1 kudos

You don't need to use wheel files. Use `glob` as the key instead of `file` - https://docs.databricks.com/aws/en/dev-tools/bundles/resources#pipelinelibraries. Here is the screenshot.
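For reference, a minimal sketch of the glob form the reply describes (the resource name and include path are assumptions, not from the thread):

```yaml
# databricks.yml resource fragment: ship source files with the bundle
# instead of building a wheel. Paths are placeholders.
resources:
  pipelines:
    etl_event:
      libraries:
        - glob:
            include: src/**
```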

6 More Replies
Danish11052000
by New Contributor III
  • 51 Views
  • 2 replies
  • 0 kudos

How to get read/write bytes per table using Databricks system tables?

I’m working on a data usage use case and want to understand the right way to get read bytes and written bytes per table in Databricks, especially for Unity Catalog tables.
What I want - for each table, something like:
Date
Table name (catalog.schema.table)...

Latest Reply
balajij8
New Contributor
  • 0 kudos

System audit tables are for account activity tracking and security. For I/O details:
You can use query history and lineage; add usage attribution code to get details at the table level.
You can use table logs to get the write info along with other details.

1 More Replies
Danish11052000
by New Contributor III
  • 71 Views
  • 5 replies
  • 1 kudos

How to incrementally backup system.information_schema.table_privileges (no streaming, no unique keys)

I'm trying to incrementally back up system.information_schema.table_privileges but facing challenges:
No streaming support: Is streaming supported: False
No unique columns for MERGE: all columns contain common values, no natural key combination
No timest...

Latest Reply
MoJaMa
Databricks Employee
  • 1 kudos

information_schema is not a Delta Table, which is why you can't stream from it. They are basically views on top of the information coming straight from the control plane database. Also your query is actually going to be quite slow/expensive (you prob...
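One common workaround when a table has no natural key is to derive a surrogate key by hashing every column and MERGE on that hash; in Spark SQL this would look roughly like sha2(concat_ws('||', *cols), 256) (an assumption, not confirmed by the thread). A plain-Python sketch of the idea:

```python
import hashlib

def row_key(row: dict) -> str:
    """Derive a deterministic surrogate key by hashing all column values.

    Columns are sorted so the hash does not depend on dict ordering.
    """
    joined = "||".join(str(row[c]) for c in sorted(row))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

a = {"grantee": "alice", "privilege_type": "SELECT", "table_name": "t1"}
b = {"grantee": "alice", "privilege_type": "SELECT", "table_name": "t1"}
c = {"grantee": "bob", "privilege_type": "SELECT", "table_name": "t1"}
print(row_key(a) == row_key(b))  # identical rows collide on purpose -> True
print(row_key(a) == row_key(c))  # -> False
```

Note the trade-off: fully duplicated rows hash to the same key, which is usually what you want for a privileges snapshot but means exact duplicates collapse to one row.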

4 More Replies
aranjan99
by Contributor
  • 73 Views
  • 3 replies
  • 1 kudos

System table missing primary keys?

This simple query takes 50 seconds for me on an X-Small warehouse: select * from SYSTEM.access.workspaces_latest where workspace_id = '442224551661121'. Can the team comment on why querying system tables takes so long? I also don't see any primary keys ...

Latest Reply
iyashk-DB
Databricks Employee
  • 1 kudos

System tables are a Databricks‑hosted, read‑only analytical store shared to your workspace via Delta Sharing; they aren’t modifiable (no indexes you can add), and the first read can have extra overhead on a very small warehouse. This can make “simple...

2 More Replies
echol
by New Contributor
  • 94 Views
  • 3 replies
  • 1 kudos

Redeploy Databricks Asset Bundle created by others

Hi everyone,Our team is using Databricks Asset Bundles (DAB) with a customized template to develop data pipelines. We have a core team that maintains the shared infrastructure and templates, and multiple product teams that use this template to develo...

Latest Reply
pradeep_singh
New Contributor II
  • 1 kudos

Development mode has a purpose; it's not a limitation. It's meant to make sure developers can test their changes individually. If you plan to have this deployed by multiple users, you will have to deploy in production mode.
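For context, the mode switch lives in the bundle's target definitions; a minimal sketch (target names are assumptions):

```yaml
# databricks.yml fragment: per-target deployment modes.
targets:
  dev:
    mode: development   # resources are prefixed per user, for individual testing
    default: true
  prod:
    mode: production    # one shared deployment that authorized users can update
```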

2 More Replies
dpc
by Contributor II
  • 169 Views
  • 6 replies
  • 2 kudos

Using AD groups for object ownership

Databricks has a general issue with object ownership in that only the creator can delete an object. So, if I create a catalog, table, view, schema, etc., I am the only person who can delete it. No good if it's a general table or view and some other developer ...

Latest Reply
dpc
Contributor II
  • 2 kudos

Hi. So, I've just tested this. If I create a schema and somebody else creates a table in that schema, I can drop their table. If they create a schema along with a table in that schema, then grant me all privileges on the table, I cannot drop it as it says...
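One hedged workaround for the ownership problem discussed above is to transfer ownership to a group right after creation, so any member can manage the object; catalog, schema, and group names below are placeholders, not from the thread:

```sql
-- Hand ownership to an AD-synced group (all names are assumptions).
ALTER SCHEMA my_catalog.my_schema OWNER TO `data-eng-developers`;
ALTER TABLE my_catalog.my_schema.my_table OWNER TO `data-eng-developers`;
```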

5 More Replies
Fox19
by New Contributor III
  • 90 Views
  • 4 replies
  • 1 kudos

CSV Ingestion using Autoloader with single variant column

I've been working on ingesting csv files with varying schemas using Autoloader. Goal is to take the csvs and ingest them into a bronze table that writes each record as a key-value mapping with only the relevant fields for that record. I also want to ...

Latest Reply
pradeep_singh
New Contributor II
  • 1 kudos

If I understand the problem correctly, you are getting extra keys for records from files where the keys don't actually exist. I was not able to reproduce this issue; I am getting different key-value pairs and no extra keys with null. Can you share ...

3 More Replies
petergriffin1
by New Contributor II
  • 1994 Views
  • 4 replies
  • 1 kudos

Resolved! Are you able to create an Iceberg table natively in Databricks?

I've been trying to create an Iceberg table natively in Databricks with the cluster being 16.4. I also have the Iceberg JAR file for Spark 3.5.2. Using a simple command such as: %sql CREATE OR REPLACE TABLE catalog1.default.iceberg( a INT ) USING iceberg...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Databricks supports creating and working with Apache Iceberg tables natively under specific conditions. Managed Iceberg tables in Unity Catalog can be created directly using Databricks Runtime 16.4 LTS or newer. The necessary setup requires enabling ...
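Assuming the conditions the reply describes are met (Unity Catalog managed table, DBR 16.4 LTS or newer), the statement from the question should work without loading any external Iceberg JAR; a sketch with assumed names:

```sql
-- Managed Iceberg table in Unity Catalog; no external Iceberg JAR needed.
CREATE OR REPLACE TABLE catalog1.default.iceberg_demo (
  a INT
)
USING ICEBERG;
```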

3 More Replies
souravroy1990
by New Contributor II
  • 55 Views
  • 2 replies
  • 2 kudos

Error in Column level tags creation in views via SQL

Hi, I'm trying to run this query using SQL on a DBR 17.3 cluster, but I get a syntax error:
ALTER VIEW catalog.schema.view ALTER COLUMN column_name SET TAGS (`METADATA` = `xyz`);
But the query below works:
SET TAG ON COLUMN catalog.schema.view.column_n...

Latest Reply
souravroy1990
New Contributor II
  • 2 kudos

Thanks for the clarification @szymon_dybczak. I have a follow-up question: if I have attached a tag to a view column and the same view is associated with a SHARE, will the recipient see the tag in the view, i.e. whether view column tags associated to shares a...

1 More Replies
liquibricks
by Contributor
  • 257 Views
  • 7 replies
  • 4 kudos

Declarative Pipeline error: Name 'kdf' is not defined. Did you mean: 'sdf'

We have a Lakeflow Spark Declarative Pipeline using the new PySpark Pipelines API. This was working fine until about 7am (Central European) this morning when the pipeline started failing with a PYTHON.NAME_ERROR: name 'kdf' is not defined. Did you me...

Latest Reply
zkaliszamisza
New Contributor
  • 4 kudos

For us it happened in westeurope around the same time.

6 More Replies
dpc
by Contributor II
  • 351 Views
  • 8 replies
  • 8 kudos

Case insensitive data

For all its positives, one of the first general issues we had with Databricks was case sensitivity. We have a lot of data-specific filters in our code. The problem is, we land and view data from lots of different case-insensitive source systems, e.g. SQL Se...

Latest Reply
dpc
Contributor II
  • 8 kudos

It works, but there's a scenario that causes an issue. If I create a schema with default collation UTF8_LCASE, then create a table, it marks all the string columns as UTF8_LCASE, which is fine and works. If I create the table in the newly created UTF8_LCA...
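For readers following along, the collation setup being discussed looks roughly like this (catalog, schema, and table names are assumptions, not from the thread):

```sql
-- Schema whose tables default to case-insensitive string comparisons.
CREATE SCHEMA my_catalog.ci_schema DEFAULT COLLATION UTF8_LCASE;

-- Explicit per-column collation for a table defined outside that schema.
CREATE TABLE my_catalog.other_schema.customers (
  name STRING COLLATE UTF8_LCASE
);
```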

7 More Replies
maddan80
by New Contributor II
  • 2690 Views
  • 6 replies
  • 3 kudos

Oracle Essbase connectivity

Team, I wanted to understand the best way of connecting to Oracle Essbase to ingest data into the delta lake

Latest Reply
hyaqoob
New Contributor II
  • 3 kudos

I am currently working with Essbase 21c and I need to pull data from Databricks through a SQL query. I was able to successfully setup JDBC connection to Databricks but when I try to create a data source using a SQL query, it gives me an error: "[Data...

5 More Replies
RIDBX
by Contributor
  • 49 Views
  • 2 replies
  • 1 kudos

Robust/complex scheduling with dependency within Databricks?

Thanks for reviewing my threads. I'd like to explore robust/complex scheduling with dependencies within Databricks. We know traditional scheduling frameworks allow ...

Latest Reply
pradeep_singh
New Contributor II
  • 1 kudos

Further reading:
SQL Alert task - https://docs.databricks.com/aws/en/jobs/sql
If/else task - https://docs.databricks.com/aws/en/jobs/if-else
For each task - https://docs.databricks.com/aws/en/jobs/for-each
Run job task - https://docs.databricks.com/a...
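Put together, these task types compose into a single job definition with explicit dependencies and branching; a hedged sketch in bundle YAML (job, task, and notebook names are all assumptions):

```yaml
# Job with a dependency chain and a conditional branch (names are placeholders).
resources:
  jobs:
    nightly_etl:
      name: nightly_etl
      tasks:
        - task_key: extract
          notebook_task:
            notebook_path: ./notebooks/extract
        - task_key: check_rows
          depends_on:
            - task_key: extract
          condition_task:
            op: GREATER_THAN
            left: "{{tasks.extract.values.row_count}}"
            right: "0"
        - task_key: load
          depends_on:
            - task_key: check_rows
              outcome: "true"   # run only when the condition task is true
          notebook_task:
            notebook_path: ./notebooks/load
```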

1 More Replies
Adig
by New Contributor III
  • 8944 Views
  • 6 replies
  • 17 kudos

Generate Group Id for similar deduplicate values of a dataframe column.

Input DataFrame:
KeyName          KeyCompare       Source
PapasMrtemis     PapasMrtemis     S1
PapasMrtemis     Pappas, Mrtemis  S1
Pappas, Mrtemis  PapasMrtemis     S2
Pappas, Mrtemis  Pappas, Mrtemis  S2
Mich...

Latest Reply
rafaelpoyiadzi
New Contributor
  • 17 kudos

Hey. We’ve run into similar deduplication problems before. If the name differences are pretty minor (punctuation, spacing, small typos), fuzzy string matching can usually get you most of the way there. That kind of similarity-based clustering works f...
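A minimal sketch of that similarity-based clustering using only the standard library; the threshold, normalization, and greedy first-match strategy are assumptions to tune against real data:

```python
from difflib import SequenceMatcher
import re

def normalize(name: str) -> str:
    """Strip punctuation, spacing, and case before comparing."""
    return re.sub(r"[^a-z]", "", name.lower())

def group_ids(names, threshold=0.85):
    """Assign the same group id to names whose normalized similarity
    meets the threshold (greedy: each name joins the first close group)."""
    reps, ids = [], []
    for n in names:
        key = normalize(n)
        for gid, rep in enumerate(reps):
            if SequenceMatcher(None, key, rep).ratio() >= threshold:
                ids.append(gid)
                break
        else:
            reps.append(key)
            ids.append(len(reps) - 1)
    return ids

print(group_ids(["PapasMrtemis", "Pappas, Mrtemis", "Michael"]))  # -> [0, 0, 1]
```

In Spark this would typically run as a pandas UDF or after collecting the distinct names, since the pairwise comparison does not parallelize naturally row by row.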

5 More Replies