Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Is there a way for me to get Type 2 SCD changes without using streaming tables? I'm worried streaming tables may have limitations that interfere with adoption.
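One common approach without streaming tables is a MERGE into a regular Delta table. A minimal sketch, assuming a Databricks notebook (where spark is predefined) and hypothetical names (dim_customer, staging_customers, customer_id, address, is_current, valid_from, valid_to):

from delta.tables import DeltaTable
from pyspark.sql import functions as F

updates = spark.table("staging_customers")
target = DeltaTable.forName(spark, "dim_customer")

# Step 1: expire the current row for every key whose tracked attributes changed.
(target.alias("t")
 .merge(updates.alias("s"), "t.customer_id = s.customer_id AND t.is_current = true")
 .whenMatchedUpdate(
     condition="t.address <> s.address",
     set={"is_current": "false", "valid_to": "current_timestamp()"})
 .execute())

# Step 2: append incoming rows as the new current versions. In practice you would
# first filter `updates` down to genuinely new or changed keys.
(updates
 .withColumn("is_current", F.lit(True))
 .withColumn("valid_from", F.current_timestamp())
 .withColumn("valid_to", F.lit(None).cast("timestamp"))
 .write.format("delta").mode("append").saveAsTable("dim_customer"))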
I am trying to work through Tutorial: Query data from a Notebook. Access errors are defeating my attempts. Steps to reproduce: sign up for a free trial through the Databricks website. The path skipped the subscription-selection step and defaulted the trial t...
I am currently using a personal compute cluster [13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12)] on GCP attached to a notebook. After running a few commands without an issue, I end up getting this error: Internal error. Attach your notebook t...
Hello, at some point I tested Databricks for a potential customer and, after the test, I cancelled the subscription. I read that it is not possible to resubscribe with the same e-mail address. Therefore, my idea would be to delete the account I created...
I have a similar issue. I subscribed to Databricks using my AWS account email. I cancelled it later. Now I want to start using Databricks on AWS again with the same email ID and a pay-as-you-go plan. But there is no way to re-subscribe. If this can...
Hi, I am just getting started with Databricks and would appreciate some help here. I have 10 TB of TPC-DS data in S3 in a Hive partition structure. My goal is to benchmark a Databricks cluster on this data. After setting all IAM credentials according to this https://doc...
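For the benchmark setup, one option is to register the partitioned S3 files as an external table and let Databricks discover the Hive-style partitions. A minimal sketch; the bucket path, schema name, and Parquet format below are hypothetical assumptions:

spark.sql("CREATE DATABASE IF NOT EXISTS tpcds")
spark.sql("""
    CREATE TABLE IF NOT EXISTS tpcds.store_sales
    USING PARQUET
    LOCATION 's3://my-tpcds-bucket/store_sales/'
""")
spark.sql("MSCK REPAIR TABLE tpcds.store_sales")  # discover Hive-style partitions

# Quick sanity check before running the benchmark queries.
spark.table("tpcds.store_sales").limit(5).show()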
Hi experts, how can we set up multiple notebooks to run in sequential order in a flow? For example, one pipeline runs Notebook1 as sequence 1 and Notebook2 as sequence 2 (within that one pipeline only).
Not sure how to approach your challenge, but something you can do is use the Databricks job scheduler (a job with sequential tasks), or, if you want an external solution in Azure, you can call several notebooks from Data Factory.
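Another lightweight option is a driver notebook that calls the others in order with dbutils.notebook.run. A minimal sketch; the notebook paths and the 3600-second timeout are hypothetical:

# Each call blocks until the child notebook finishes, so Notebook2 only starts
# after Notebook1 succeeds; a failure raises an exception and stops the sequence.
result1 = dbutils.notebook.run("/Workspace/pipeline/Notebook1", 3600)
result2 = dbutils.notebook.run("/Workspace/pipeline/Notebook2", 3600)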
Hello. After upgrading my cluster from DBR 12 to 14.1 I got a MISSING_ATTRIBUTES.RESOLVED_ATTRIBUTE_APPEAR_IN_OPERATION error on some of my joins:
df1.join(
    df2,
    [df1["name"] == df2["name"], df1["age"] == df2["age"]],
    'left_outer'
)
I resolved it by...
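One common workaround for this class of error is to alias both DataFrames so each column reference resolves unambiguously. A self-contained sketch with made-up data:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df1 = spark.createDataFrame([("alice", 30)], ["name", "age"])
df2 = spark.createDataFrame([("alice", 30, "NY")], ["name", "age", "city"])

# Qualifying columns through the aliases avoids ambiguous attribute resolution.
joined = df1.alias("a").join(
    df2.alias("b"),
    [F.col("a.name") == F.col("b.name"), F.col("a.age") == F.col("b.age")],
    "left_outer",
)
joined.show()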
Hello community! I'm currently working on Spark scripts for data processing and facing some performance challenges. Any tips or suggestions on optimizing code for better efficiency? Your expertise is highly appreciated! Thanks.
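Two common first steps, sketched with hypothetical table names (facts, small_dim):

from pyspark.sql import functions as F

facts = spark.table("facts")
small_dim = spark.table("small_dim")

# Inspect the physical plan for unexpected shuffles or full scans.
facts.join(small_dim, "key").explain()

# Hint a broadcast join when one side comfortably fits in executor memory.
facts.join(F.broadcast(small_dim), "key").explain()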
I was trying to push an .ipynb file from Azure Databricks to GitHub, and it appears that the original file is converted to source code as .py. Why does Databricks do this, and how can I control which files are converted or not? I need to keep some files as .ipynb. Thanks...
Hi, I'm trying to call the DLT API to kick off my Delta Live Tables flow with a Web API call block from Azure Data Factory. I have two environments: one DEV and one PROD. The DEV environment works fine; the response gives me the update_id, but the PR...
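For reference, the call an ADF Web activity makes here is POST /api/2.0/pipelines/{pipeline_id}/updates. A minimal sketch in Python; the host, token, and pipeline_id are hypothetical, and a failure in only one environment often points at that workspace's token or pipeline permissions:

import requests

host = "https://adb-1234567890123456.7.azuredatabricks.net"  # hypothetical workspace URL
pipeline_id = "00000000-0000-0000-0000-000000000000"          # hypothetical pipeline ID
token = "<personal-access-token>"

resp = requests.post(
    f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
print(resp.json()["update_id"])  # the same update_id the DEV environment returns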
I have a use case where I call a REST API and the response is a file encoded as base64. Is it possible to save the response directly to ADLS without converting it to a local file first?
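One sketch, assuming the ADLS container is mounted and reachable through the /dbfs FUSE path: decode the base64 payload in memory and write the bytes straight to the mounted path, with no intermediate local copy. The endpoint, JSON field, mount point, and file name are all hypothetical:

import base64
import requests

resp = requests.get("https://example.com/api/report")  # hypothetical endpoint
resp.raise_for_status()
payload = base64.b64decode(resp.json()["file_content"])  # hypothetical field

# /dbfs is the FUSE view of DBFS, so this lands directly in the mounted ADLS container.
with open("/dbfs/mnt/landing/report.pdf", "wb") as f:
    f.write(payload)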
Hi. Recently, I've found that Databricks is really slow when editing notebooks, such as adding cells or copying and pasting text or cells. It's just been the past few weeks, actually. I'm using Chrome version 118.0.5993.118. Everything else with Chrome ...
It seems related to notebook length (number of cells). The notebook that was really slow had about 40-50 cells, which I've done before without issue. Anyway, after starting a new notebook using Chrome, it seems usable again. So without a specific...