Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

dpc
by New Contributor III
  • 1115 Views
  • 4 replies
  • 3 kudos

Can I see whether a table column is read in Databricks?

Hello. Historically, we had a number of tables that were extracted from source and loaded to Databricks using 'select *'. As a result, some columns that were loaded never get used. I'd like to tidy this up and remove redundant columns. Is there a w...

Latest Reply
dpc
New Contributor III
  • 3 kudos

Thanks. Would I have to do that one table at a time, though? Lineage is useful, but it only shows which tables used the table. It doesn't actually show the columns used unless you go into the notebook. Unless I am missing something here?
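
For anyone landing here: a minimal sketch of checking column-level reads via Unity Catalog system tables, assuming system tables are enabled in the workspace and you can read system.access.column_lineage (the catalog, schema, and table names below are illustrative):

  # Count read events per column of one table.
  usage = spark.sql("""
      SELECT source_column_name, entity_type, COUNT(*) AS read_events
      FROM system.access.column_lineage
      WHERE source_table_full_name = 'main.my_schema.my_table'
      GROUP BY source_column_name, entity_type
      ORDER BY read_events DESC
  """)
  display(usage)
  # Drop the WHERE clause to profile every table in one pass instead of
  # going table by table.

Columns that never show up in the output over a long enough window are candidates for removal.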

3 More Replies
Phani1
by Valued Contributor II
  • 487 Views
  • 1 reply
  • 0 kudos

Unity catalog + excel data access

Hi All, Is there (or could there be) a connector that lets Excel connect to Databricks Unity Catalog semantic models, helping users connect and browse the data stored in Databricks? Regards, Phani

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Phani1, Unfortunately, there is no such native connector at the moment. Of course, you can still connect Excel to Databricks and browse data from a specific catalog using ODBC, as shown in the following documentation entry: Connect to Azure Databri...
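
For reference, a hedged sketch of the ODBC DSN settings such a connection typically uses (Simba Spark ODBC driver; every angle-bracketed value is a placeholder you supply):

  Driver=Simba Spark ODBC Driver
  Host=adb-<workspace-id>.azuredatabricks.net
  Port=443
  HTTPPath=/sql/1.0/warehouses/<warehouse-id>
  SSL=1
  ThriftTransport=2
  AuthMech=3
  UID=token
  PWD=<personal-access-token>

In Excel, the DSN then typically appears under Get Data > From ODBC, from which you can browse the catalogs and schemas you have permissions on.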

Phani1
by Valued Contributor II
  • 1558 Views
  • 1 reply
  • 2 kudos

Delta Sharing Approach for Secure Data Access in Development Environment

Hi Team, We have a scenario. Problem Statement: The customer currently has data in both production and stage environments, with the stage environment used primarily for development and bug-fixing activities. They now want to separate these enviro...

Latest Reply
loui_wentzel
Contributor
  • 2 kudos

Hey Phani! Cool setup you have there - some comments and ideas: Generally, it sounds like you have a good approach - setting up a dedicated dev environment apart from staging and prod is the way to go. However, restricting access to tables in dev is generally ...
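
To make the access-restriction point concrete, a minimal sketch using Unity Catalog grants (the catalog, schema, and group names are hypothetical):

  # Read-only access on a dev catalog for a developer group.
  spark.sql("GRANT USE CATALOG ON CATALOG dev TO `dev-engineers`")
  spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA dev.silver TO `dev-engineers`")
  # No grant is issued on dev.sensitive, so its tables stay inaccessible
  # to the group.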

darioschiraldi9
by New Contributor II
  • 729 Views
  • 1 reply
  • 1 kudos

Resolved! Dario Schiraldi: How do I build a data pipeline in Databricks?

Hey everyone, I am Dario Schiraldi, working on building a data pipeline in Databricks, and I would love to get some feedback and suggestions from the community. I want to build a scalable and efficient pipeline that can handle large datasets and possibly...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @darioschiraldi9, Happy to hear that you are exploring Databricks for your work. Here you may find a detailed, good example of how you can build a scalable data pipeline using DLT, with the flexibility of Spark Streaming and a sop...
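
As a starting point, a minimal DLT pipeline sketch, assuming a cloud storage landing path with JSON files (the paths, table names, and the event_ts column are illustrative):

  import dlt
  from pyspark.sql import functions as F

  @dlt.table(comment="Raw events ingested incrementally with Auto Loader")
  def raw_events():
      return (spark.readStream.format("cloudFiles")
              .option("cloudFiles.format", "json")
              .load("/Volumes/main/landing/events/"))

  @dlt.table(comment="Cleaned events with an ingest date column")
  def clean_events():
      return (dlt.read_stream("raw_events")
              .withColumn("ingest_date", F.to_date("event_ts")))

DLT then handles orchestration, retries, and incremental processing between the two tables.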

shoumitra
by New Contributor
  • 734 Views
  • 1 reply
  • 0 kudos

Resolved! Pathway advice on how to become a Data Engineer Associate

Hi everyone, I am new to this community, and I am a BI/data engineer by trade in a Microsoft Azure/on-prem context. I want some advice on how to become a certified Data Engineer Associate in Databricks. The training, lessons, or courses to be eligible for tak...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @shoumitra, You can register at Databricks Academy. There are plenty of free learning paths, depending on what you're interested in: https://customer-academy.databricks.com/ For example, below you can find the free Data Engineer Learning Plan that will pr...

jar
by Contributor
  • 1245 Views
  • 1 reply
  • 0 kudos

Disable Photon for serverless SQL DW

Hello. Is it possible to disable Photon for a serverless SQL DW? If yes, how? Best, Johan.

Latest Reply
CURIOUS_DE
Contributor III
  • 0 kudos

No, it is not possible to disable Photon for Databricks Serverless SQL Warehouses. Why Photon cannot be disabled: Photon is always enabled on Serverless SQL Warehouses as part of Databricks' architecture. Serverless SQL is built on Photon to ensure high...
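
If Photon really must be off, one option (an assumption on my part, not from the thread) is to use a classic pro warehouse instead of serverless; a hedged sketch via the SQL Warehouses REST API, with the host, token, and sizing as placeholders:

  import requests

  resp = requests.post(
      "https://<workspace-host>/api/2.0/sql/warehouses",
      headers={"Authorization": "Bearer <personal-access-token>"},
      json={
          "name": "no-photon-wh",            # illustrative name
          "cluster_size": "Small",
          "warehouse_type": "PRO",
          "enable_serverless_compute": False,
          "enable_photon": False,            # Photon off (classic/pro only)
      },
  )
  resp.raise_for_status()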

seefoods
by Valued Contributor
  • 1205 Views
  • 3 replies
  • 2 kudos

Resolved! Batch processing with Auto Loader

My job continues running after it has finished successfully. This is my case; I enable useNotification: if self.autoloader_config.use_autoloader: logger_file_ingestion.info("debut d'ecriture en mode streaming") if self.write_mode.value.lower() == "...
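
The usual fix for a run-and-stop Auto Loader job (a hedged sketch, not the poster's exact code; the paths and table name are illustrative) is an availableNow trigger, so the stream drains the backlog and then terminates instead of running indefinitely:

  (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.useNotifications", "true")
      .load("/mnt/landing/events/")
      .writeStream
      .option("checkpointLocation", "/mnt/checkpoints/events")
      .trigger(availableNow=True)   # process pending files, then stop
      .toTable("main.bronze.events"))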

Latest Reply
MariuszK
Valued Contributor III
  • 2 kudos

Hi @seefoods, If it works, you can mark my answer as a solution so that if someone has the same problem, it will be easier to find an answer.

2 More Replies
rpshgupta
by New Contributor III
  • 2543 Views
  • 11 replies
  • 5 kudos

How to find the source code for the data engineering learning path?

Hi Everyone, I am taking the data engineering learning path on customer-academy.databricks.com. I am not able to find any source code attached to the course. Can you please help me find it so that I can try it hands-on as well? Thanks, Rupesh

Latest Reply
sselvaganapathy
New Contributor II
  • 5 kudos

Please refer to the link below; demo code is no longer provided by Databricks. https://community.databricks.com/t5/databricks-academy-learners/how-to-download-demo-notebooks-for-data-engineer-learning-plan/td-p/105362

10 More Replies
Bob-
by New Contributor II
  • 2544 Views
  • 3 replies
  • 4 kudos

Resolved! Upload Screenshot

I am new to the Databricks Free Edition. I am trying to upload a screenshot to be able to put it in a table and run some AI functions against it. It is not letting me upload a .png file. After several attempts, I am being told that the root cause is p...

Latest Reply
Sharanya13
Contributor III
  • 4 kudos

@Bob- Can you explain your use case? I'm not sure I understand "I am trying to upload a screenshot to be able to put it in a table and run some AI functions against it." Are you trying to perform OCR?

2 More Replies
yzhang
by New Contributor III
  • 1921 Views
  • 5 replies
  • 0 kudos

Iceberg with partitionedBy option

I am able to create a Unity Catalog Iceberg-format table: df.writeTo(full_table_name).using("iceberg").create() However, if I add the partitionedBy option, I get an error: df.writeTo(full_table_name).using("iceberg").partitionedBy("ingest_dat...

Latest Reply
yzhang
New Contributor III
  • 0 kudos

I am not trying to alter the table with the partitionedBy option. To clarify, I wanted to create a (new) table with the partitionedBy option and Iceberg format, but it failed due to a Databricks error. I had to create the table without partitionedBy with iceb...
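
One thing worth checking (an assumption on my part, not confirmed in the thread): PySpark's DataFrameWriterV2.partitionedBy expects Column expressions rather than bare strings, so a sketch of the create call would be:

  from pyspark.sql import functions as F

  (df.writeTo(full_table_name)
     .using("iceberg")
     .partitionedBy(F.col("ingest_date"))   # or a transform, e.g. F.days("ingest_date")
     .create())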

4 More Replies
Phani1
by Valued Contributor II
  • 1669 Views
  • 4 replies
  • 2 kudos

Potential Challenges of Using Iceberg Format (Databricks + Iceberg)

Hi Team, What are the potential challenges of using Iceberg format instead of Delta for saving data in Databricks? Regards, Phani

Latest Reply
sridharplv
Valued Contributor II
  • 2 kudos

Hi @Phani1, Please find the link below, which details maintaining Iceberg metadata alongside Delta metadata: https://community.databricks.com/t5/technical-blog/read-delta-tables-with-snowflake-via-unity-catalog/ba-p/115877
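
The approach described there appears to be Delta UniForm, which keeps a Delta table readable as Iceberg. A minimal sketch of enabling it (the table name is illustrative; the property names follow the Delta UniForm documentation):

  spark.sql("""
      ALTER TABLE main.my_schema.my_table SET TBLPROPERTIES (
          'delta.enableIcebergCompatV2' = 'true',
          'delta.universalFormat.enabledFormats' = 'iceberg'
      )
  """)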

3 More Replies
stevewb
by New Contributor III
  • 850 Views
  • 1 reply
  • 0 kudos

Setting shuffle partitions in Databricks SQL Warehouse

I think it used to be possible to set shuffle partitions in a Databricks SQL warehouse through, e.g., SET spark.sql.shuffle.partitions=20000. However, when I run this now, I get the error: [CONFIG_NOT_AVAILABLE] Configuration spark.sql.shuffle.partitions...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @stevewb, It's not available anymore. According to the documentation: "Databricks SQL allows admins to configure Spark properties for data access in the workspace settings menu. See Data access configurations." Other than data access configurations, Da...
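
For completeness, the setting still works on all-purpose and job clusters (as opposed to SQL warehouses); a minimal sketch:

  # In a notebook attached to an all-purpose or job cluster:
  spark.conf.set("spark.sql.shuffle.partitions", 2000)
  # Recent runtimes default to adaptive query execution, which tunes
  # shuffle partition counts automatically:
  spark.conf.set("spark.sql.adaptive.enabled", "true")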

Somia
by New Contributor III
  • 2456 Views
  • 7 replies
  • 2 kudos

Resolved! SQL query is not returning _sqldf

Notebooks in my workspace are not returning _sqldf when a SQL query is run. If I run this code, it gives an error in the second cell that _sqldf is not defined. First cell: %sql select * from some_table limit 10 Second cell: %sql select * from _sqldf Howev...

Latest Reply
Somia
New Contributor III
  • 2 kudos

Changing the notebook default language to Python and using all-purpose compute fixed the issue. I am able to access _sqldf in subsequent SQL or Python cells.
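
For reference, the working pattern once the notebook default language is Python (a sketch; the table name is illustrative):

  Cell 1 (%sql):
      SELECT * FROM some_table LIMIT 10

  Cell 2 (Python):
      # _sqldf holds the most recent SQL cell's result as a PySpark DataFrame
      display(_sqldf.limit(5))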

6 More Replies
anilsampson
by New Contributor III
  • 1214 Views
  • 2 replies
  • 3 kudos

Resolved! How to get the previous version of a table in Databricks SQL dynamically

Hello, I'm trying to get the previous version of a Delta table using a timestamp, but Databricks SQL does not allow variables. The only thing I can do is use TIMESTAMP AS OF CURRENT_DATE() - 1 if I have refreshed the table today. Please let me know i...
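
One way to do this dynamically (a hedged sketch in PySpark rather than pure SQL; the table name is illustrative) is to read the version number out of the table history:

  # Second-most-recent version; assumes the table has at least two versions.
  history = spark.sql("DESCRIBE HISTORY main.my_schema.my_table")
  prev_version = history.orderBy("version", ascending=False).collect()[1]["version"]
  df = spark.read.option("versionAsOf", prev_version).table("main.my_schema.my_table")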

Latest Reply
anilsampson
New Contributor III
  • 3 kudos

Thank you @Vidhi_Khaitan. Is there an upgrade or use case in the works where we can pass parameters via a workflow while triggering a Databricks dashboard?

1 More Replies
Divya_Bhadauria
by New Contributor III
  • 1150 Views
  • 1 reply
  • 0 kudos

Update databricks job parameter with CLI

Use Case: Updating a Databricks job with multiple tasks can be time-consuming and error-prone when changes (such as adding new parameters) need to be applied to each task manually. Possible Solutions: 1. Using the Databricks CLI - jobs reset command. You can ...

Latest Reply
anilsampson
New Contributor III
  • 0 kudos

Hello Divya, Could you also try YAML, updating your task accordingly and deploying it as part of asset bundles? Let me know if you feel both are the same. Regards, Anil.
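
For context, a hedged sketch of what that looks like with Databricks Asset Bundles (the bundle, job, parameter, and path names are illustrative); a job-level parameter is declared once and flows to every task:

  # databricks.yml
  bundle:
    name: my_bundle

  resources:
    jobs:
      nightly_load:
        name: nightly_load
        parameters:
          - name: run_date
            default: "2024-01-01"
        tasks:
          - task_key: ingest
            notebook_task:
              notebook_path: ./notebooks/ingest.py

Running databricks bundle deploy then resets the job definition from this file, covering the same ground as the jobs reset command.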

