Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

thiagoawstest
by Contributor
  • 1689 Views
  • 1 reply
  • 0 kudos

Resolved! Unity Catalog mount S3

Hi, I still have some questions. I have Databricks on AWS and I need to mount S3 buckets. According to the documentation, it is recommended to do this through Unity Catalog, but how would I go about reading data from a notebook that would be mount...

Latest Reply
thiagoawstest
Contributor
  • 0 kudos

Coming back to this: I've understood it now, so I'm marking it as resolved.

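The original poster marked this resolved without sharing the details, but the Unity Catalog pattern the docs point to can be sketched as follows. This is a minimal sketch, assuming an external location (backed by a storage credential) already covers the bucket; the table name `main.sales.orders`, the bucket, and the prefix are hypothetical placeholders, not from the thread:

```python
# Sketch: accessing S3 data governed by Unity Catalog instead of a legacy
# dbutils.fs.mount(). Assumes an external location already covers the bucket;
# all names below are hypothetical.

def external_path(bucket: str, prefix: str) -> str:
    """Build the s3:// URI that the external location governs."""
    return f"s3://{bucket}/{prefix.strip('/')}"

def read_orders(spark):
    # Preferred: query a table registered in Unity Catalog.
    registered = spark.table("main.sales.orders")
    # Also possible: read files directly from a governed path -- no mount
    # step is needed; access is checked against your UC grants.
    raw = spark.read.parquet(external_path("my-bucket", "raw/orders"))
    return registered, raw
```

In a Databricks notebook, `spark` is already defined, so `spark.table("main.sales.orders")` works directly, with no mount step.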
mmendez1012
by New Contributor
  • 472 Views
  • 0 replies
  • 0 kudos

Workflows

Can someone give me some advice about Parquet file sizes when moving data?

ashwinhabbu
by New Contributor
  • 292 Views
  • 0 replies
  • 0 kudos

Summit 24 experience

Great to see the collaboration between Nvidia and Databricks! Excited about everything serverless.

Jorge3
by New Contributor III
  • 8161 Views
  • 3 replies
  • 1 kudos

Dynamic partition overwrite with Streaming Data

Hi, I'm working on a job that propagates updates from a Delta table to Parquet files (a requirement of the consumer). The data is partitioned by day (year > month > day) and the daily data is updated every hour. I'm using table read streaming w...

Latest Reply
JacintoArias
New Contributor III
  • 1 kudos

We had a similar situation, @Hubert-Dudek. We are using Delta, but we are having some problems when propagating updates via merge, as you cannot read the resulting table as a streaming source anymore... so using complete overwrite over parquet partition...

2 More Replies
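One workaround for this kind of streaming-to-Parquet job can be sketched with `foreachBatch` plus dynamic partition overwrite, so each micro-batch replaces only the day partitions it actually touches. A sketch under assumptions: the output path, checkpoint path, and partition columns below are hypothetical, not from the thread:

```python
def partition_dir(year: int, month: int, day: int) -> str:
    """Hive-style layout produced by partitionBy("year", "month", "day")."""
    return f"year={year}/month={month:02d}/day={day:02d}"

def overwrite_changed_partitions(batch_df, batch_id):
    # With partitionOverwriteMode=dynamic, only partitions present in this
    # micro-batch are replaced; untouched day partitions are left alone.
    (batch_df.write
        .mode("overwrite")
        .option("partitionOverwriteMode", "dynamic")
        .partitionBy("year", "month", "day")
        .parquet("/output/orders"))

# Hypothetical streaming usage in a notebook:
# (spark.readStream.table("source_delta")
#      .writeStream
#      .foreachBatch(overwrite_changed_partitions)
#      .option("checkpointLocation", "/chk/orders")
#      .start())
```

`foreachBatch` is the usual escape hatch here because the batch writer supports dynamic partition overwrite while the streaming Parquet sink does not.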
dbengineer516
by New Contributor III
  • 7089 Views
  • 1 reply
  • 0 kudos

Resolved! IOStream.flush Timed Out

Hello, I'm encountering an issue with a Python script/notebook that I developed and use in a daily job run in Databricks. It worked perfectly fine for months, but now fails constantly. After digging a little deeper, when running ...

Latest Reply
raphaelblg
Databricks Employee
  • 0 kudos

Hello @dbengineer516, from my research it looks to be an IPython cache error. Maybe your Python REPL is getting throttled due to too many requests. Please check: https://github.com/ipython/ipykernel/issues/334 This comment seems to be a possible solu...

Philospher1425
by New Contributor II
  • 7724 Views
  • 4 replies
  • 2 kudos

Renaming a file in Databricks is so hard. How to make it simpler?

Hi Community, my requirement is actually simple: I need to drop files into Azure Data Lake Gen2 storage from Databricks. But when I use df.coalesce(1).write.csv("url to gen 2/stage/"), it creates a part-*.csv file, but I need to rename it to a cust...

Latest Reply
raphaelblg
Databricks Employee
  • 2 kudos

Hi @Philospher1425, allow me to clarify that dbutils.fs serves as an interface to submit commands to your cloud provider's storage. As such, the speed of copy operations is determined by the cloud provider and is beyond Databricks' control. That be...

3 More Replies
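The usual workaround for the part-file naming problem is to write to a staging directory, locate the single part file, and move it to the desired name. A minimal sketch, assuming a single-partition write; the container/account names and the final `report.csv` name are hypothetical placeholders:

```python
def find_part_file(names):
    """Pick the single part-* CSV out of a Spark output directory listing."""
    parts = [n for n in names if n.startswith("part-") and n.endswith(".csv")]
    if len(parts) != 1:
        raise ValueError(f"expected exactly one part file, found: {parts}")
    return parts[0]

# Hypothetical Databricks usage (dbutils only exists inside a notebook):
# stage = "abfss://container@account.dfs.core.windows.net/stage/"
# df.coalesce(1).write.mode("overwrite").csv(stage, header=True)
# part = find_part_file([f.name for f in dbutils.fs.ls(stage)])
# dbutils.fs.mv(stage + part, stage + "report.csv")
```

The move is a storage-side copy-and-delete, so as the reply notes, its speed is governed by the cloud provider, not by Databricks.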
Lizhi_Dong
by New Contributor II
  • 3225 Views
  • 6 replies
  • 1 kudos

Tables disappear when I re-start a new cluster on Community Edition

What would be the best plan for an independent course creator? Hi folks! I want to use Databricks Community Edition as the platform to teach online courses. As you may know, for Community Edition, you need to create a new cluster when the old one terminat...

Latest Reply
Shivanshu_
Contributor
  • 1 kudos

I believe only the metadata gets removed from HMS, not the Delta files from DBFS. Instead of loading the data again and again, try using CTAS with that DBFS location.

5 More Replies
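The reply's suggestion boils down to re-registering table metadata over the Delta files that survive on DBFS. A sketch under assumptions: the table and path names below are hypothetical, and besides the CTAS mentioned in the reply, an external-table registration (`USING DELTA LOCATION`) avoids copying the data at all:

```python
def register_over_existing(table: str, path: str) -> str:
    """SQL that re-creates a table's metadata over Delta files already on DBFS."""
    return f"CREATE TABLE IF NOT EXISTS {table} USING DELTA LOCATION '{path}'"

def ctas_over_existing(table: str, path: str) -> str:
    """CTAS variant from the reply: copies the data into a new managed table."""
    return f"CREATE TABLE {table} AS SELECT * FROM delta.`{path}`"

# Hypothetical notebook usage after a cluster restart on Community Edition:
# spark.sql(register_over_existing("course.sales",
#                                  "dbfs:/user/hive/warehouse/sales"))
```

Running this in a startup cell means students get their tables back without re-uploading anything.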
JonM
by New Contributor
  • 874 Views
  • 0 replies
  • 0 kudos

Information_schema appears empty

Hi, we've encountered a problem with the information schema for one of our catalogs. For context: we're using dbt to implement our logic. We noticed this issue because dbt queries the information_schema.tables view to check which tables should be drop...

Pratibha
by New Contributor II
  • 1293 Views
  • 2 replies
  • 0 kudos

Which cluster/worker/driver type is best for analytics work?

Which cluster/worker/driver type is best for analytics work?

Latest Reply
jacovangelder
Honored Contributor
  • 0 kudos

Analytics work as in querying and analyzing data, preferably using Databricks SQL? If so, then a SQL Warehouse is your best friend.

1 More Replies
