Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ClaudeR
by New Contributor III
  • 3765 Views
  • 5 replies
  • 1 kudos

Resolved! Can someone help me understand how compute pricing works?

I'm looking at using Databricks internally for some data science projects. However, I am very confused about how the pricing works and would like to avoid high spending right now. Internal documentation and within Databricks All-Purpose Compute...

Latest Reply
GuillermoM
New Contributor II
  • 1 kudos

Hello, I was able to get a very precise cost for Azure Databricks clusters and compute jobs using the Microsoft API and the Databricks API. Then I wrote a simple tool to extract and manipulate the API results and generate detailed cost reports that can be...

4 More Replies
David_K93
by Contributor
  • 2239 Views
  • 1 reply
  • 2 kudos

Resolved! Building a Document Store on Databricks

Hello, I am somewhat new to Databricks and am trying to build a Q&A application based on a collection of documents. I need to move .pdf and .docx files from my local machine to storage in Databricks and eventually a document store. My questions are: Wh...

Latest Reply
David_K93
Contributor
  • 2 kudos

Hi all, I took an initial stab at task one with some success using the Databricks CLI. Here are the steps: (1) Open a Command/Anaconda prompt and enter: pip install databricks-cli (2) Go to your Databricks console and under settings find "User Settings" and...
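
The reply is cut off here, so the author's remaining steps are not known. As a rough, separate illustration of the upload task from the question, the sketch below builds one `databricks fs cp` command per .pdf or .docx file in a local folder. It assumes the legacy `databricks-cli` package is installed and already configured; the function name `build_upload_commands` and the folder names are hypothetical, not taken from the thread.

```python
from pathlib import Path

# File types named in the original question.
DOC_SUFFIXES = {".pdf", ".docx"}


def build_upload_commands(local_dir, dbfs_dir):
    """Return one `databricks fs cp` argument list per .pdf/.docx file.

    Hypothetical helper: it only builds the commands and does not run them.
    """
    commands = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in DOC_SUFFIXES:
            # Keep the local file name under the target DBFS folder.
            target = f"{dbfs_dir.rstrip('/')}/{path.name}"
            commands.append(["databricks", "fs", "cp", str(path), target])
    return commands


# Possible usage (assumed local folder and DBFS path):
#   import subprocess
#   for cmd in build_upload_commands("my_docs", "dbfs:/FileStore/documents"):
#       subprocess.run(cmd, check=True)
```

Building the argument lists first makes it easy to print and review the commands before anything is copied.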

Jason_923248
by New Contributor III
  • 2053 Views
  • 2 replies
  • 3 kudos

Resolved! In Data Explorer, how do you Refresh a table definition?

In Data Science & Engineering -> Data -> Data Explorer, if I expand the hive_metastore, then expand a schema and choose a table, and then view the "Sample Data", I receive this error: [DEFAULT_FILE_NOT_FOUND] It is possible the underlying files have b...

Latest Reply
padmajaa
New Contributor III
  • 3 kudos

Try refreshing all cached entries associated with the table; that might help: REFRESH TABLE [db_name.]table_name
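
A minimal sketch of issuing that statement from a Python notebook cell, assuming a Databricks notebook where a SparkSession named `spark` is predefined. The helper `refresh_table_sql` and the names `my_schema` and `my_table` are hypothetical placeholders.

```python
import re

# Plain (unquoted) SQL identifier: letters, digits, underscores.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def refresh_table_sql(table_name, db_name=None):
    """Build a REFRESH TABLE statement for [db_name.]table_name."""
    parts = [db_name, table_name] if db_name else [table_name]
    for part in parts:
        # Reject anything that is not a simple identifier.
        if not _IDENTIFIER.match(part):
            raise ValueError(f"not a plain identifier: {part!r}")
    return "REFRESH TABLE " + ".".join(parts)


# In a notebook the statement could then be run with (assumed names):
#   spark.sql(refresh_table_sql("my_table", db_name="my_schema"))
print(refresh_table_sql("my_table", db_name="my_schema"))
# prints: REFRESH TABLE my_schema.my_table
```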

1 More Replies
rlink
by New Contributor II
  • 2169 Views
  • 3 replies
  • 2 kudos

Resolved! Data Science & Engineering Dashboard Refresh Issue Using Databricks

Hi everyone, I created a Data Science & Engineering notebook in Databricks to display some visualizations and also set up a schedule for the notebook to run every hour. I can see that the scheduled run is successful every hour, but the dashboard I crea...

Latest Reply
luis_herrera
New Contributor III
  • 2 kudos

To schedule a dashboard to refresh at a specified interval, schedule the notebook that generates the dashboard graphs. PS: Check the #DAIS2023 talks.

2 More Replies
Anonymous
by Not applicable
  • 1628 Views
  • 3 replies
  • 2 kudos

www.dbdemos.ai

Getting started with Databricks is now very easy. Presenting dbdemos. If you're looking to get started with Databricks, there's good news: dbdemos makes it easier than ever. This platform offers a range of demos that you can install directl...

Latest Reply
FJ
Contributor III
  • 2 kudos

That's a great share, Suteja. Is that supposed to work with a Databricks Community Edition account? I had a strange error while trying. Any help is appreciated! Thanks, F

2 More Replies
MarcJustice
by New Contributor
  • 1198 Views
  • 3 replies
  • 3 kudos

Is the promise of a data lake simply about data science, data analytics and data quality, or can it also be an integral part of core transaction processing?

Upfront, I want to let you know that I'm not a veteran data jockey, so I apologize if this topic has been covered already or is simply too basic or narrow for this community. That said, I do need help so please feel free to point me in another direc...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Marc Barnett, Just a friendly follow-up. Do you still need help, or did @Aashita Ramteke's response help you find the solution? Please let us know.

2 More Replies
cconnell
by Contributor II
  • 4568 Views
  • 11 replies
  • 7 kudos

Resolved! What is the proper way to import the new pyspark.pandas library?

I am moving an existing, working pandas program into Databricks. I want to use the new pyspark.pandas library, and change my code as little as possible. It appears that I should do the following: 1) Add from pyspark import pandas as ps at the top 2) Ch...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Make sure to use the 10.0 Runtime, which includes Spark 3.2.
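
Since pyspark.pandas ships with Spark 3.2 and later, a quick guard before the import can make a version mismatch obvious. This is a minimal sketch; the helper name `supports_pandas_api` is hypothetical, and it assumes the version string starts with `major.minor`, as `spark.version` does in a notebook.

```python
def supports_pandas_api(spark_version):
    """Return True if the Spark version string is 3.2 or newer."""
    # Only the first two components (major, minor) are compared.
    major, minor = (int(p) for p in spark_version.split(".")[:2])
    return (major, minor) >= (3, 2)


# In a Databricks notebook, where `spark` is predefined, one might write:
#   if supports_pandas_api(spark.version):
#       import pyspark.pandas as ps
print(supports_pandas_api("3.1.2"))  # prints: False
print(supports_pandas_api("3.2.0"))  # prints: True
```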

10 More Replies
User16788317454
by New Contributor III
  • 936 Views
  • 1 reply
  • 0 kudos
Latest Reply
j_weaver
New Contributor III
  • 0 kudos

If you are talking about distributed training of a single XGBoost model, there is no built-in capability in SparkML. SparkML supports gradient boosted trees, but not XGBoost specifically. However, there are 3rd party packages, such as XGBoost4J that ...
