Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

anilsampson
by New Contributor III
  • 1527 Views
  • 1 replies
  • 1 kudos

Resolved! databricks dashboard deployment question

Hello, I am trying to run a Databricks dashboard via a workflow. When I deploy the dashboard .json file in the prod workspace via the import dashboard option, the dashboard_id changes. Is there a way I can deploy this without having to re-deploy my workflow wi...

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @anilsampson! From what I understand, when you import a dashboard into another workspace, a new dashboard_id is always generated. Deploying with a Databricks Asset Bundle does not keep the dashboard ID the same across different workspaces. Each ...

dpc
by Contributor III
  • 2237 Views
  • 4 replies
  • 3 kudos

Can I see whether a table column is read in databricks?

Hello. Historically, we had a number of tables that were extracted from source and loaded to Databricks using 'select *'. As a result, some columns that have been loaded never get used. I'd like to tidy this up and remove redundant columns. Is there a w...

Latest Reply
dpc
Contributor III
  • 3 kudos

Thanks. Would I have to do that one table at a time, though? Lineage is useful, but it only shows which tables used the table. It doesn't actually show the columns used unless you go into the notebook. Unless I am missing something here?

3 More Replies
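
On the "one table at a time" concern in this thread: a minimal sketch of checking many tables in one pass, assuming you can export a list of (table, column) read records from some source such as column-level lineage or query history. The input shape and the helper name `unused_columns` are hypothetical, not a Databricks API. A column missing from the records only means no read was observed in that export, not that the column is safe to drop.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def unused_columns(
    schema: Dict[str, List[str]],
    reads: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Return, per table, the columns that never appear in the read records."""
    # Group the observed reads by table, comparing column names case-insensitively.
    read_by_table = defaultdict(set)
    for table, column in reads:
        read_by_table[table].add(column.lower())

    # Keep the schema's column order so the report is easy to compare by eye.
    return {
        table: [c for c in columns if c.lower() not in read_by_table[table]]
        for table, columns in schema.items()
    }


# Hypothetical inputs: table schemas and observed column reads.
schema = {
    "sales.orders": ["id", "amount", "legacy_flag"],
    "sales.customers": ["id", "name"],
}
reads = [
    ("sales.orders", "id"),
    ("sales.orders", "AMOUNT"),
    ("sales.customers", "id"),
]

print(unused_columns(schema, reads))
```

Because the function takes every table's schema at once, the result is a single report of candidate columns across all tables rather than a per-table check.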
Phani1
by Databricks MVP
  • 980 Views
  • 1 replies
  • 0 kudos

Unity Catalog + Excel data access

Hi All, Is there, or can there be, a connector built to connect Excel to the Databricks Unity Catalog semantic models and help users connect to and browse the data that is stored in Databricks? Regards, Phani

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Phani1, Unfortunately, there is no such native connector at the moment. Of course, you can still connect Excel to Databricks and browse data from a specific catalog using ODBC, as shown in the following documentation entry: Connect to Azure Databri...

Phani1
by Databricks MVP
  • 2548 Views
  • 1 replies
  • 2 kudos

Delta Sharing Approach for Secure Data Access in Development Environment

Hi Team, We have a scenario. Problem Statement: The customer currently has data in both production and stage environments, with the stage environment being used primarily for development and bug fixing activities. They now want to separate these enviro...

Latest Reply
loui_wentzel
Databricks Partner
  • 2 kudos

Hey Phani! Cool setup you have there. Some comments and ideas: Generally it sounds like you have a good approach. Setting up a dedicated dev environment apart from staging and prod is the way. However, restricting access to tables in dev is generally ...

darioschiraldi9
by New Contributor II
  • 1824 Views
  • 1 replies
  • 1 kudos

Resolved! Dario Schiraldi : How do I build a data pipeline in Databricks?

Hey everyone, I am Dario Schiraldi, working on building a data pipeline in Databricks and would love to get some feedback and suggestions from the community. I want to build a scalable and efficient pipeline that can handle large datasets and possibly...

Latest Reply
ilir_nuredini
Honored Contributor
  • 1 kudos

Hello @darioschiraldi9, Happy to hear that you are exploring Databricks for your work. Here you may find a very detailed and good example of how you can build a scalable data pipeline using DLT and with the flexibility of Spark Streaming and a sop...

jar
by Contributor
  • 3216 Views
  • 1 replies
  • 0 kudos

Disable Photon for serverless SQL DW

Hello. Is it possible to disable Photon for a serverless SQL DW? If yes, how? Best, Johan.

Latest Reply
CURIOUS_DE
Valued Contributor
  • 0 kudos

No, it is not possible to disable Photon for Databricks Serverless SQL Warehouses. Why Photon Cannot Be Disabled: Photon is always enabled on Serverless SQL Warehouses as part of Databricks’ architecture. Serverless SQL is built on Photon to ensure high...

seefoods
by Valued Contributor
  • 1888 Views
  • 3 replies
  • 2 kudos

Resolved! batch process autoloader

My job continues running after it has finished successfully. This is my case; I enabled useNotification: if self.autoloader_config.use_autoloader: logger_file_ingestion.info("debut d'ecriture en mode streaming") if self.write_mode.value.lower() == "...

Latest Reply
MariuszK
Valued Contributor III
  • 2 kudos

Hi @seefoods, If it works, you can mark my answer as a solution so that if someone has the same problem, it will be easier to find an answer.

2 More Replies
rpshgupta
by New Contributor III
  • 4763 Views
  • 11 replies
  • 5 kudos

How to find the source code for the data engineering learning path?

Hi Everyone, I am taking the data engineering learning path on customer-academy.databricks.com. I am not able to find any source code attached to the course. Can you please help me find it so that I can try it hands-on as well? Thanks, Rupesh

Latest Reply
sselvaganapathy
New Contributor II
  • 5 kudos

Please refer to the link below; Databricks no longer provides demo code. https://community.databricks.com/t5/databricks-academy-learners/how-to-download-demo-notebooks-for-data-engineer-learning-plan/td-p/105362

10 More Replies
Bob-
by New Contributor II
  • 3481 Views
  • 3 replies
  • 4 kudos

Resolved! Upload Screenshot

I am new to the Databricks Free Edition. I am trying to upload a screenshot to be able to put it in a table and run some AI functions against it. It is not letting me upload a .png file. After several attempts I am being told that the root cause is p...

Latest Reply
Sharanya13
Contributor III
  • 4 kudos

@Bob- Can you explain your use case? I'm not sure I understand "I am trying to upload a screenshot to be able to put it in a table and run some AI functions against it." Are you trying to perform OCR?

2 More Replies
Phani1
by Databricks MVP
  • 4488 Views
  • 4 replies
  • 2 kudos

Potential Challenges of Using Iceberg Format (Databricks + Iceberg)

Hi Team, What are the potential challenges of using the Iceberg format instead of Delta for saving data in Databricks? Regards, Phani

Latest Reply
sridharplv
Valued Contributor II
  • 2 kudos

Hi @Phani1, Please find below a link that details maintaining Iceberg metadata along with Delta metadata. https://community.databricks.com/t5/technical-blog/read-delta-tables-with-snowflake-via-unity-catalog/ba-p/115877

3 More Replies
stevewb
by New Contributor III
  • 2127 Views
  • 1 replies
  • 0 kudos

Setting shuffle partitions in Databricks SQL Warehouse

I think it used to be possible to set shuffle partitions in a Databricks SQL warehouse through e.g.: SET spark.sql.shuffle.partitions=20000. However, when I run this now, I get the error: [CONFIG_NOT_AVAILABLE] Configuration spark.sql.shuffle.partitions...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @stevewb, It's not available anymore. According to the documentation: "Databricks SQL allows admins to configure Spark properties for data access in the workspace settings menu. See Data access configurations. Other than data access configurations, Da...

Somia
by New Contributor III
  • 4798 Views
  • 7 replies
  • 2 kudos

Resolved! sql query is not returning _sqldf.

Notebooks in my workspace are not returning _sqldf when a SQL query is run. If I run this code, it gives an error in the second cell that _sqldf is not defined. First Cell: %sql select * from some_table limit 10 Second Cell: %sql select * from _sqldf Howev...

Latest Reply
Somia
New Contributor III
  • 2 kudos

Changing the notebook's default language to Python and switching to all-purpose compute fixed the issue. I am able to access _sqldf in subsequent SQL or Python cells.

6 More Replies
anilsampson
by New Contributor III
  • 4209 Views
  • 2 replies
  • 3 kudos

Resolved! How to get previous version of the table in databricks sql dynamically

Hello, I'm trying to get the previous version of a Delta table using a timestamp, but Databricks SQL does not allow the use of variables. The only thing I can do is use TIMESTAMP AS OF CURRENT_DATE() -1 if I have refreshed the table today. Please let me know i...

Latest Reply
anilsampson
New Contributor III
  • 3 kudos

Thank you @Vidhi_Khaitan. Is there an upgrade or use case in the works where we can pass parameters via a workflow while triggering a Databricks dashboard?

1 More Replies
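
One way to avoid hard-coding a timestamp, sketched under assumptions: read the table history (for example the rows returned by DESCRIBE HISTORY, which include a `version` column), pick the second-highest version, and build a `VERSION AS OF` query from it. The helper names `previous_version` and `previous_version_query` and the table name are hypothetical, and the history rows are modelled here as plain dicts so the sketch runs without a cluster. This is not necessarily the approach the thread's accepted answer describes.

```python
from typing import Iterable, Mapping


def previous_version(history: Iterable[Mapping[str, int]]) -> int:
    """Return the second-highest version number found in the history rows."""
    versions = sorted({row["version"] for row in history}, reverse=True)
    if len(versions) < 2:
        raise ValueError("table has no previous version")
    return versions[1]


def previous_version_query(table: str, history: Iterable[Mapping[str, int]]) -> str:
    """Build a time-travel query that targets the previous table version."""
    return f"SELECT * FROM {table} VERSION AS OF {previous_version(history)}"


# Hypothetical history rows, shaped like the `version` column of DESCRIBE HISTORY.
history = [{"version": 0}, {"version": 1}, {"version": 2}]
print(previous_version_query("main.sales.orders", history))
```

Selecting by version rather than by date avoids the problem described in the post, where `CURRENT_DATE() - 1` is only correct if the table was refreshed today.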
Divya_Bhadauria
by New Contributor III
  • 2224 Views
  • 1 replies
  • 0 kudos

Update databricks job parameter with CLI

Use Case: Updating a Databricks job with multiple tasks can be time-consuming and error-prone when changes (such as adding new parameters) need to be applied to each task manually. Possible Solutions: 1. Using Databricks CLI – jobs reset command: You can ...

Latest Reply
anilsampson
New Contributor III
  • 0 kudos

Hello Divya, Could you also try YAML, update your task accordingly, and deploy it as part of asset bundles? Let me know if you feel both are the same. Regards, Anil.
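
For the jobs reset approach raised in the post, a minimal sketch of the "apply one parameter to every task" step, assuming the job settings have already been exported as JSON and that tasks are notebook tasks carrying a `base_parameters` map. The job ID, task names, paths, and helper names below are hypothetical; the payload shape (`job_id` plus `new_settings`) follows the Jobs API reset request, and how you pass it to the CLI depends on your CLI version.

```python
import copy
import json


def add_parameter_to_all_tasks(settings: dict, key: str, value: str) -> dict:
    """Return a copy of the job settings with key=value added to every notebook task."""
    updated = copy.deepcopy(settings)  # leave the exported settings untouched
    for task in updated.get("tasks", []):
        notebook = task.get("notebook_task")
        if notebook is not None:
            # Create the parameter map if the task has none yet.
            notebook.setdefault("base_parameters", {})[key] = value
    return updated


def build_reset_payload(job_id: int, settings: dict) -> str:
    """Serialize a reset request body: the job ID plus the full new settings."""
    return json.dumps({"job_id": job_id, "new_settings": settings}, indent=2)


# Hypothetical exported settings for a two-task job.
settings = {
    "name": "nightly_load",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {
                "notebook_path": "/Jobs/ingest",
                "base_parameters": {"env": "prod"},
            },
        },
        {
            "task_key": "transform",
            "notebook_task": {"notebook_path": "/Jobs/transform"},
        },
    ],
}

updated = add_parameter_to_all_tasks(settings, "run_date", "2025-07-05")
print(build_reset_payload(123, updated))
```

Because a reset replaces the whole job definition, the sketch edits a full copy of the settings instead of sending only the changed fields.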

zach
by New Contributor III
  • 1341 Views
  • 1 replies
  • 0 kudos

Get the total amount of S3 storage used per user

In Databricks, is it possible to get the total amount of Delta Lake storage being used in the Parquet format per user? Also, what are the best practices for making sure that users saving Delta files are not taking up storage unnecessarily, for ...

Latest Reply
Sharanya13
Contributor III
  • 0 kudos

Hi @zach, can you expand on why you need to know the total storage per user? Best practices: if you use Databricks managed tables, optimization is taken care of. https://docs.databricks.com/aws/en/optimizations/predictive-optimization
