Yes, too many small Parquet files in a Delta table can degrade write performance, since every commit has to track more file-level metadata. Regularly running OPTIMIZE compacts those small files into larger ones, which reduces that overhead and can improve streaming latency.
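If it helps, here's a minimal sketch of what that looks like from a notebook, assuming a Delta table named events whose queries often filter on event_date (both names are placeholders; spark is the session Databricks provides):

# Compact many small files into fewer large ones; ZORDER BY is optional
# and only worth it if queries commonly filter on the listed column(s)
spark.sql("OPTIMIZE events ZORDER BY (event_date)")

# After compaction, clean up the old, now-unreferenced files
# (the default retention window still applies)
spark.sql("VACUUM events")

Running this on a schedule keeps the file count in check between pipeline runs.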
You're right to be concerned: this sounds like a classic case of memory or resource leakage over time, which can affect long-running jobs even if metrics look okay on the surface. In triggered DLT (now Lakeflow) pipelines, tasks and state can accumul...
Hi Alex,

You're definitely not alone; this is a common issue when using the Spotify API in environments like Databricks or Jupyter Notebooks, where user input through the console (like typing in a prompt) isn't supported.

What's happening is that the ...
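In case it's useful, here's a minimal sketch of a prompt-free setup, assuming you're using the spotipy library and only need endpoints that don't require a user login (the client ID and secret below are placeholders):

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# The Client Credentials flow never opens a browser or asks for console
# input, so it works in a Databricks or Jupyter notebook. It covers
# non-user endpoints only (search, public playlists, track metadata, etc.).
auth = SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",          # placeholder
    client_secret="YOUR_CLIENT_SECRET",  # placeholder
)
sp = spotipy.Spotify(auth_manager=auth)

results = sp.search(q="artist:Radiohead", type="track", limit=5)
for item in results["tracks"]["items"]:
    print(item["name"])

If you do need user-scoped endpoints, one common workaround is to complete the OAuth redirect once on your local machine and reuse the cached token from the notebook.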
Hi MGraham,

To import an Excel spreadsheet into a database table in Databricks, you can first upload the Excel file to DBFS (Databricks File System), then use libraries like pandas to read the Excel file (e.g., pd.read_excel("/dbfs/path/to/file.xlsx"))...
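As a rough sketch, assuming the file was uploaded to /dbfs/path/to/file.xlsx and you want a managed table (the table name is a placeholder, reading .xlsx with pandas requires the openpyxl package, and spark is the session Databricks provides):

import pandas as pd

# Read the uploaded Excel file through the /dbfs local mount
pdf = pd.read_excel("/dbfs/path/to/file.xlsx")

# Convert to a Spark DataFrame and save it as a table you can query with SQL
df = spark.createDataFrame(pdf)
df.write.mode("overwrite").saveAsTable("my_schema.excel_import")

For very large spreadsheets you may prefer a distributed reader like the spark-excel library, but for typical file sizes the pandas route above is the simplest.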
Hi Avinash,

You could give ChatGPT a shot: just ask it to help convert it for you. You might honestly be surprised by how well it handles it. There are more advanced ways to do it too, though those can get a bit tricky if you're not used to them.