Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

alejandrofm
by Valued Contributor
  • 1396 Views
  • 2 replies
  • 5 kudos

Resolved! How to set a global checkpoint for all notebooks?

I have several users doing data analysis in Databricks Spark notebooks, and everything runs smoothly. Now I want to make sure the checkpoint directory is configured at cluster start, so that each user doesn't have to set it in the notebook (ending up in a lot o...

Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Alejandro Martinez, just a friendly follow-up. Do you still need help, or did @Hubert Dudek (Customer)'s response help you find the solution? Please let us know.
1 More Replies
Cameron_Afzal
by New Contributor II
  • 747 Views
  • 2 replies
  • 0 kudos

I'm unable to create an account for Databricks Community Edition. I've tried multiple email addresses and browsers across multiple attempts. I...

I'm unable to create an account for Databricks Community Edition. I've tried multiple email addresses and browsers across multiple attempts. I fill out and submit the sign-up form but never receive the email and thus can't log in. Any advice? Are the...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Cameron Afzal and @tipu sultan, thank you for reaching out! Our sincere apologies for the delayed response; it won't happen again. We had a temporary bug that affected a small number of users; our regrets if you were impacted. We have fixed th...
1 More Replies
prasadvaze
by Valued Contributor II
  • 5386 Views
  • 5 replies
  • 5 kudos

Resolved! Limit on number of result rows displayed on databricks SQL UI

The Databricks SQL UI currently limits query result display to 64,000 rows. When will this limit go away? With SSMS I get 40 million result rows in the UI, and my users won't switch to Databricks SQL for this reason.

Latest Reply
User16765136105
New Contributor III
  • 5 kudos

Hi @prasad vaze - We do have a feature in the works that will increase this limit. If you reach out to your Databricks contact, they can give you more details regarding dates and the preview.
4 More Replies
mani238
by New Contributor III
  • 3314 Views
  • 7 replies
  • 4 kudos
Latest Reply
Kaniz
Community Manager
  • 4 kudos

Hi @manivannan p, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.
6 More Replies
ChristianWuerdi
by New Contributor III
  • 7560 Views
  • 5 replies
  • 5 kudos

Resolved! How can I backup my Databricks instance?

We have a Databricks instance on Azure that has grown somewhat organically, with dozens of users and hundreds of notebooks. How do I conveniently back up this environment so that, if disaster strikes, the notebooks aren't lost? The data itself is backed by Azure ...

Latest Reply
ChristianWuerdi
New Contributor III
  • 5 kudos

@Kaniz Fatma All good, thanks. A combination of the CLI and gradually migrating everything to Git is a viable solution.
4 More Replies
philip
by New Contributor
  • 4973 Views
  • 3 replies
  • 2 kudos

Resolved! current date as default in a widget while scheduling the notebook

I have scheduled a notebook. Can I keep the current date as the default in a widget whenever the notebook runs, while keeping the flexibility to change the widget value to any other date for ad hoc runs?
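
One possible approach, as a minimal sketch rather than a confirmed answer from this thread: leave the widget empty by default, treat an empty value as "use today", and parse anything else as an explicit date. The helper name `resolve_run_date` and the `YYYY-MM-DD` format are assumptions made for illustration; in a notebook the raw string would typically be read with `dbutils.widgets.get(...)`.

```python
from datetime import date, datetime
from typing import Optional


def resolve_run_date(widget_value: str, today: Optional[date] = None) -> date:
    """Hypothetical helper: pick the date a run should process.

    An empty (or whitespace-only) widget value means "use the current date",
    which suits scheduled runs. Any other value is parsed as YYYY-MM-DD,
    which suits ad hoc runs where the date is typed in by hand.
    """
    value = (widget_value or "").strip()
    if not value:
        # `today` is injectable so the behaviour can be checked deterministically.
        return today or date.today()
    return datetime.strptime(value, "%Y-%m-%d").date()


# Scheduled run: the widget was left at its empty default.
print(resolve_run_date("", today=date(2022, 3, 1)))

# Ad hoc run: the user typed a date into the widget.
print(resolve_run_date("2022-01-15"))
```

Keeping the default empty, instead of baking a date into the widget definition, avoids a stale default when the same notebook is scheduled on later days.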

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @philip george, just a friendly follow-up. Do you still need help, or did @Hubert Dudek (Customer)'s and @Werner Stinckens's responses help you find the solution? Please let us know.
2 More Replies
Andyfcx
by New Contributor
  • 1647 Views
  • 3 replies
  • 2 kudos

Resolved! Is it possible to clone a private repository and use it in databricks Repos?

As the title says, I need to clone code from my private Git repo and use it in my notebook. I do something like:
def cmd(command, cwd=None):
    process = subprocess.Popen(command.split(), stdout=subprocess.PIPE, cwd=cwd)
    output, error = process.communicate(...
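
For readers following along, here is a self-contained sketch of a helper in the same spirit. It is not the poster's full code (the snippet above is truncated), and it makes two assumptions of its own: it uses `subprocess.run` instead of `Popen`, and it captures stderr as well as stdout so that a failing `git clone` would be visible.

```python
import shlex
import subprocess
import sys
from typing import Optional, Tuple


def cmd(command: str, cwd: Optional[str] = None) -> Tuple[int, str, str]:
    """Run a command and return (exit code, stdout, stderr) as text.

    shlex.split keeps quoted arguments together, unlike str.split.
    """
    args = shlex.split(command)
    result = subprocess.run(
        args, cwd=cwd, capture_output=True, text=True, check=False
    )
    return result.returncode, result.stdout, result.stderr


# Usage: run the current Python interpreter as a stand-in for a git command.
code, out, err = cmd(shlex.quote(sys.executable) + ' -c "print(42)"')
print(code, out.strip(), err.strip())
```

Returning the exit code and stderr makes authentication failures against a private repository easier to diagnose than inspecting stdout alone.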

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Andy Huang, just a friendly follow-up. Do you still need help, or did @Prabakar Ammeappin's response help you find the solution? Please let us know.
2 More Replies
Personal1
by New Contributor
  • 2583 Views
  • 3 replies
  • 2 kudos

Resolved! Understanding Partitions in Spark Local Mode

I have a few fundamental questions about Spark 3 while running a simple Spark app on my local Mac (6 cores in total). Please help. local[*] runs my Spark application in local mode with all the cores present on my Mac, correct? It also means tha...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Abhishek Pradhan, just a friendly follow-up. Do you still need help, or did @Werner Stinckens's response help you find the solution? Please let us know.
2 More Replies
Frankooo
by New Contributor III
  • 4291 Views
  • 9 replies
  • 7 kudos

How to optimize exporting dataframe to delta file?

Scenario: I have a dataframe that has 5 billion records/rows and 100+ columns. Is there a way to write this in Delta format efficiently? I tried to export it but cancelled after 2 hours (the write didn't finish), as this processing time is not ...

Latest Reply
Kaniz
Community Manager
  • 7 kudos

Hi @Franco Sia, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.
8 More Replies
Sam
by New Contributor III
  • 1056 Views
  • 2 replies
  • 0 kudos

Can Admins enable Table Download on Sample but not on Full Dataset?

Is it possible to allow table download on a sampled dataset but not the full dataset? In the configuration settings it seems like you have to allow both? Notwithstanding the fact that people could loop through the sample download, it seems like a prud...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Sam H, just a friendly follow-up. Do you still need help, or did @Arjun Kaimaparambil Rajan's response help you find the solution? Please let us know.
1 More Replies
yitao
by New Contributor III
  • 2032 Views
  • 6 replies
  • 11 kudos

Resolved! How to make sparklyr extension work with Databricks runtime?

Hello. I'm the current maintainer of sparklyr (an R interface for Apache Spark) and a few sparklyr extensions such as sparklyr.flint. Sparklyr was fortunate to receive some contributions from Databricks folks, which enabled R users to run `spark_connect...

Latest Reply
Kaniz
Community Manager
  • 11 kudos

Hi @yitao, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.
5 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 1792 Views
  • 5 replies
  • 18 kudos

Resolved! Azure: Permanently purge cluster logs

Is there any way to purge logs via the API instead of clicking this option every day?

Latest Reply
Kaniz
Community Manager
  • 18 kudos

Hi @Hubert Dudek, just a friendly follow-up. Do you still need help, or did @Prabakar Ammeappin's response help you find the solution? Please let us know.
4 More Replies
BorislavBlagoev
by Valued Contributor III
  • 2607 Views
  • 3 replies
  • 5 kudos

Resolved! Get package from Nexus repo.

I want to fetch a package from a Nexus repo in both a notebook and a job. If anyone has experience with this, please reply here!

Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Borislav Blagoev, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.
2 More Replies
soundari
by New Contributor
  • 1403 Views
  • 3 replies
  • 1 kudos

Resolved! Identify the partitionValues written yesterday from delta

We have streaming data written into Delta. We do not write to all the partitions every day, so I am thinking of running a compaction Spark job only on the partitions that were modified yesterday. Is it possible to query the partitionValues wr...
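
As a rough illustration of the idea, and not the answer given in this thread: the JSON commit files in a Delta table's `_delta_log` directory contain `add` actions, each carrying `partitionValues` and a `modificationTime` in epoch milliseconds. The sketch below assumes those commit lines have already been read as strings, uses `modificationTime` interpreted in UTC as the selection criterion (an assumption), and ignores Parquet checkpoint files. The function name `partitions_modified_on` and the sample data are hypothetical.

```python
import json
from datetime import date, datetime, timezone
from typing import Dict, Iterable, List


def partitions_modified_on(log_lines: Iterable[str], day: date) -> List[Dict[str, str]]:
    """Return distinct partitionValues of `add` actions written on `day` (UTC)."""
    seen = set()
    result = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        add = json.loads(line).get("add")
        if add is None:
            continue  # skip commitInfo, remove, metaData and other actions
        written = datetime.fromtimestamp(add["modificationTime"] / 1000, tz=timezone.utc)
        if written.date() != day:
            continue
        key = tuple(sorted(add.get("partitionValues", {}).items()))
        if key not in seen:
            seen.add(key)
            result.append(dict(key))
    return result


def _ms(moment: datetime) -> int:
    """Convert a datetime to epoch milliseconds for the sample data."""
    return int(moment.timestamp() * 1000)


def _add(region: str, moment: datetime) -> str:
    """Build one hypothetical `add` action line."""
    return json.dumps({"add": {
        "path": f"region={region}/part-0.parquet",
        "partitionValues": {"region": region},
        "modificationTime": _ms(moment),
    }})


sample_log = [
    json.dumps({"commitInfo": {"operation": "STREAMING UPDATE"}}),
    _add("eu", datetime(2022, 3, 2, 10, tzinfo=timezone.utc)),
    _add("us", datetime(2022, 3, 2, 11, tzinfo=timezone.utc)),
    _add("eu", datetime(2022, 3, 2, 12, tzinfo=timezone.utc)),
    _add("ap", datetime(2022, 3, 3, 1, tzinfo=timezone.utc)),
]

# For "yesterday", pass date.today() - timedelta(days=1) instead of a fixed date.
modified = partitions_modified_on(sample_log, date(2022, 3, 2))
print(modified)
```

The resulting list could then drive a compaction job restricted to those partitions.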

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Gnanasoundari Soundarajan, just a friendly follow-up. Do you still need help, or did @Deepak Bhutada's response help you find the solution? Please let us know.
2 More Replies
narek_margaryan
by New Contributor II
  • 1867 Views
  • 3 replies
  • 3 kudos

Resolved! Do Spark nodes read data from storage in a sequence?

I'm new to Spark and trying to understand how some of its components work. I understand that once the data is loaded into the memory of separate nodes, they process partitions in parallel, within their own memory (RAM). But I'm wondering whether the in...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Narek Margaryan, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.
2 More Replies