Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Loinguyen318
by New Contributor II
  • 683 Views
  • 2 replies
  • 2 kudos

Solution for executing Bigquery DDL in the notebook cell

Hi all, I would like to find a solution to execute BigQuery DDL directly in a cell of a Databricks notebook. Does anyone have a solution for my problem? This is an example of the DDL: CREATE EXTERNAL TABLE `PROJECT_ID.DATASET.DELTALAKE_TABLE_NAME` WITH CONNECTION `PR...

Latest Reply
SmithPoll
New Contributor III
  • 2 kudos

You can use the google.cloud.bigquery client within a Databricks notebook by installing the google-cloud-bigquery library and using Python to run the DDL. Example: from google.cloud import bigquery client = bigquery.Client() query = """CREATE EXTE...

1 More Replies
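For context, the DDL being discussed registers a Delta Lake table stored in cloud storage as a BigQuery external table. A minimal sketch of what that statement looks like in full (project, dataset, connection, and bucket names are placeholders, not from the original thread):

```sql
-- Hypothetical BigQuery DDL sketch: register a Delta Lake table in GCS
-- as a BigQuery external table via a cloud-resource connection.
CREATE EXTERNAL TABLE `my_project.my_dataset.my_delta_table`
WITH CONNECTION `my_project.us.my_connection`
OPTIONS (
  format = 'DELTA_LAKE',
  uris = ['gs://my-bucket/path/to/delta-table']
);
```

As the accepted reply notes, a statement like this can be submitted from a notebook cell through the google-cloud-bigquery Python client, since Databricks SQL itself cannot execute BigQuery DDL.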
Shivap
by New Contributor III
  • 937 Views
  • 1 reply
  • 1 kudos

Resolved! Data processing and validation with similar set of schema using databricks

We need to process hundreds of txt files, with maybe 15 different formats, based on file arrival. We need to do basic file validation (header/trailer) before loading them into maybe 15 landing tables. What's the best way to process them into land...

Latest Reply
intuz
Contributor II
  • 1 kudos

Hey there, Processing many .txt files with different formats and validations is something Databricks handles well. Here’s a simple approach: Recommended Approach: Use DLT (Delta Live Tables) or LakeFlow to build a pipeline per format (if each format ...

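The per-format pipeline the reply recommends can be sketched in Databricks SQL with Auto Loader. This is a minimal illustration only; the table name, volume path, and format options are placeholders, and real header/trailer validation would be layered on top:

```sql
-- Hypothetical DLT / Lakeflow sketch: one streaming landing table per
-- file format, ingesting incrementally with Auto Loader (read_files).
CREATE OR REFRESH STREAMING TABLE landing_format_a
AS SELECT *
FROM STREAM read_files(
  '/Volumes/main/raw/format_a/',   -- placeholder landing path
  format => 'csv',
  header => true
);
```

Repeating this pattern per format yields the "15 landing tables" layout the asker describes, with each table refreshing as new files arrive.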
mjedy78
by New Contributor II
  • 2646 Views
  • 1 replies
  • 0 kudos

Impact of VACUUM and retention settings on Delta Lake

I have a table that needs to support time travel for up to 6 months. To preserve the necessary metadata and data files, I've already configured the table with the following properties: ALTER TABLE table_x SET TBLPROPERTIES ( 'delta.logRetentionDuratio...

Latest Reply
intuz
Contributor II
  • 0 kudos

Yes, the VACUUM table_x RETAIN 720 HOURS;  command will indeed override your table-level retention properties and potentially compromise your 6-month time travel capability. When you explicitly specify a retention period in the VACUUM command, it tak...

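The interplay the reply describes can be sketched as follows. The table name comes from the thread; the property values are illustrative for a 6-month (~180-day) time-travel window:

```sql
-- Sketch: configure ~180 days of time travel via table properties.
ALTER TABLE table_x SET TBLPROPERTIES (
  'delta.logRetentionDuration'         = 'interval 180 days',
  'delta.deletedFileRetentionDuration' = 'interval 180 days'
);

-- Run VACUUM without an explicit RETAIN clause so the table-level
-- retention property applies. An explicit clause such as
-- VACUUM table_x RETAIN 720 HOURS would override it down to 30 days
-- and break the 6-month time-travel guarantee.
VACUUM table_x;
```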
Abhijeet2492
by New Contributor II
  • 586 Views
  • 1 reply
  • 0 kudos

Can a single-node Shared cluster write to a Delta table with a row filter applied?

Can a single-node Shared cluster write to a Delta table with a row filter applied? Or do we need multi-node Shared Access Mode clusters?

Latest Reply
Khaja_Zaffer
Esteemed Contributor
  • 0 kudos

While recent updates to Databricks Runtime have introduced limited write capabilities for single-node clusters in Dedicated Access Mode, the more robust, flexible, and officially recommended approach for writing to Delta tables with row filters is to...

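For readers unfamiliar with the feature being discussed, a row filter is a Unity Catalog SQL function attached to a table so that queries only see qualifying rows. A minimal sketch (function, table, and group names are hypothetical):

```sql
-- Sketch: a row-filter function that lets members of an 'admins'
-- account group see all rows, and everyone else only US rows.
CREATE OR REPLACE FUNCTION region_filter(region STRING)
RETURN IS_ACCOUNT_GROUP_MEMBER('admins') OR region = 'US';

-- Attach the filter to a (hypothetical) sales table on its region column.
ALTER TABLE sales SET ROW FILTER region_filter ON (region);
```

Writing to such a table is what runs into the cluster access-mode restrictions described in the reply.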
GANAPATI_HEGDE
by New Contributor III
  • 1228 Views
  • 3 replies
  • 0 kudos

SparkException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission SE

I am trying to build a DLT pipeline which fetches data from Fabric OneLake tables and stores it in Databricks. To do this, I am using Azure service principals and Databricks Asset Bundles. I have set up authentication configs as per the official document...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 0 kudos

In the workflow, check the Run As setting. If the workflow is running as an SPN, check whether the SPN has appropriate permissions to read and select files.

2 More Replies
Databricks143
by New Contributor III
  • 33320 Views
  • 15 replies
  • 5 kudos

Recursive CTE in Databricks SQL

Hi Team, how do I write a recursive CTE in Databricks SQL? Please let me know if anyone has a solution for this.

Latest Reply
StephanieAlba
Databricks Employee
  • 5 kudos

Check this out! https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-qry-select-cte#recursive-query-examples

14 More Replies
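The docs page linked in the reply covers the syntax in detail; a minimal sketch of the shape, assuming a runtime recent enough to support `WITH RECURSIVE` (see the linked docs for version requirements):

```sql
-- Sketch: a recursive CTE generating the numbers 1 through 5.
WITH RECURSIVE nums(n) AS (
  SELECT 1            -- anchor member
  UNION ALL
  SELECT n + 1        -- recursive member, re-evaluated until the
  FROM nums           -- WHERE condition stops producing rows
  WHERE n < 5
)
SELECT * FROM nums;
```

The same anchor/recursive-member pattern extends to hierarchy traversals (org charts, bill-of-materials), which is the usual motivation for the question.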
anilsampson
by New Contributor III
  • 1262 Views
  • 1 reply
  • 1 kudos

Resolved! databricks dashboard deployment question

Hello, I am trying to run a Databricks dashboard via a workflow. When I deploy the dashboard .json file in the prod workspace via the import dashboard option, the dashboard_id is changed. Is there a way I can deploy this without having to re-deploy my workflow wi...

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @anilsampson! From what I understand, when you import a dashboard into another workspace, a new dashboard_id is always generated. Deploying with a Databricks Asset Bundle does not keep the dashboard ID the same across different workspaces. Each ...

dpc
by Contributor III
  • 1779 Views
  • 4 replies
  • 3 kudos

Can I see whether a table column is read in databricks?

Hello. Historically, we had a number of tables that were extracted from source and loaded into Databricks using 'select *'. As a result, some columns that were loaded never get used. I'd like to tidy this up and remove redundant columns. Is there a w...

Latest Reply
dpc
Contributor III
  • 3 kudos

Thanks. Would I have to do that one table at a time, though? Lineage is useful, but it only shows which tables used the table. It doesn't actually show the columns used unless you go into the notebook. Unless I am missing something here?

3 More Replies
Phani1
by Databricks MVP
  • 776 Views
  • 1 reply
  • 0 kudos

Unity Catalog + Excel data access

Hi All, is there (or could there be) a connector built to have Excel connect to Databricks Unity Catalog semantic models, helping users connect to and browse the data stored in Databricks? Regards, Phani

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Phani1, unfortunately there is no such native connector at the moment. Of course, you can still connect Excel to Databricks and browse data from a specific catalog using ODBC, as they do in the following documentation entry: Connect to Azure Databri...

Phani1
by Databricks MVP
  • 2000 Views
  • 1 reply
  • 2 kudos

Delta Sharing Approach for Secure Data Access in Development Environment

Hi Team, we have a scenario. Problem Statement: The customer currently has data in both production and stage environments, with the stage environment being used primarily for development and bug fixing activities. They now want to separate these enviro...

Latest Reply
loui_wentzel
Databricks Partner
  • 2 kudos

Hey Phani! Cool setup you have there. Some comments and ideas: generally, it sounds like you have a good approach. Setting up a dedicated dev environment apart from staging and prod is the way. However, restricting access to tables in dev is generally ...

darioschiraldi9
by New Contributor II
  • 1526 Views
  • 1 reply
  • 1 kudos

Resolved! Dario Schiraldi: How do I build a data pipeline in Databricks?

Hey everyone, I am Dario Schiraldi, working on building a data pipeline in Databricks, and I would love to get some feedback and suggestions from the community. I want to build a scalable and efficient pipeline that can handle large datasets and possibly...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @darioschiraldi9, happy to hear that you are exploring Databricks for your work. Here you can find a very detailed and good example of how to build a scalable data pipeline using DLT, with the flexibility of Spark Streaming and a sop...

shoumitra
by New Contributor
  • 2676 Views
  • 1 reply
  • 0 kudos

Resolved! Pathway advice on how to become a Data Engineer Associate

Hi everyone, I am new to this community, and I am a BI/Data Engineer by trade in a Microsoft Azure/on-prem context. I want some advice on how to become a certified Data Engineer Associate in Databricks. The training, lessons, or courses to be eligible for tak...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @shoumitra, you can register at Databricks Academy. There are plenty of free learning paths depending on what you're interested in: https://customer-academy.databricks.com/ For example, below you can find the free Data Engineer Learning Plan that will pr...

jar
by Contributor
  • 2806 Views
  • 1 reply
  • 0 kudos

Disable Photon for serverless SQL DW

Hello. Is it possible to disable Photon for a serverless SQL DW? If yes, how? Best, Johan.

Latest Reply
CURIOUS_DE
Valued Contributor
  • 0 kudos

No, it is not possible to disable Photon for Databricks Serverless SQL Warehouses. Why Photon cannot be disabled: Photon is always enabled on Serverless SQL Warehouses as part of Databricks' architecture. Serverless SQL is built on Photon to ensure high...

seefoods
by Valued Contributor
  • 1598 Views
  • 3 replies
  • 2 kudos

Resolved! batch process autoloader

My job continues running after it has finished successfully. In my case, I enabled useNotification: if self.autoloader_config.use_autoloader: logger_file_ingestion.info("debut d'ecriture en mode streaming") if self.write_mode.value.lower() == "...

Latest Reply
MariuszK
Valued Contributor III
  • 2 kudos

Hi @seefoods ,If it works, you can mark my answer as a solution so that if someone has the same problem, it will be easier to find an answer.

2 More Replies
rpshgupta
by New Contributor III
  • 3646 Views
  • 11 replies
  • 5 kudos

How to find the source code for the data engineering learning path?

Hi Everyone, I am taking the data engineering learning path on customer-academy.databricks.com. I am not able to find any source code attached to the course. Can you please help me find it so that I can try it hands-on as well? Thanks, Rupesh

Latest Reply
sselvaganapathy
New Contributor II
  • 5 kudos

Please refer to the link below; Databricks no longer provides the demo code. https://community.databricks.com/t5/databricks-academy-learners/how-to-download-demo-notebooks-for-data-engineer-learning-plan/td-p/105362

10 More Replies