Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Hey folks! I want to know which features UCX does not provide for UC, especially for Hive to UC migration, that can be done manually but not with UCX. As UCX is still under active development there are quite a few drawbacks, so can someone share t...
We are loading a data source that contains XML. I am translating their queries to create views in Databricks. They use 'XMLNAMESPACES' to construct/parse XML. Below is an example. What is the best practice for translating 'XMLNAMESPACES' in Databricks?...
Hi @TinaN, To handle XMLNAMESPACES in Databricks, use the from_xml function for parsing XML data, where you can define namespaces within your parsing logic. Start by reading the XML data using spark.read.format("xml"), then apply the from_xml functio...
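The reply above stops short of an example, so here is a minimal sketch of that approach. It assumes a runtime with XML support (Databricks Runtime 14.3+ native reader or the spark-xml library), and the file path, row tag, and element names are placeholders rather than the poster's actual schema.

```python
# Minimal sketch, assuming XML support (DBR 14.3+ native reader or spark-xml);
# the path, row tag and element names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# "rowTag" marks the repeating XML element that becomes one DataFrame row.
# Depending on the reader version, namespace prefixes (e.g. <ns:Customer>) may
# appear in the inferred column names and need backticks or a rename.
df = (
    spark.read.format("xml")
    .option("rowTag", "Customer")       # hypothetical row element
    .load("/data/customers.xml")        # hypothetical path
)

# Recreate what the T-SQL XMLNAMESPACES query produced, as a view:
customers = df.select(
    col("CustomerID").alias("customer_id"),
    col("Address.City").alias("city"),  # nested elements become struct columns
)
customers.createOrReplaceTempView("v_customers")
```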
Hi, I have created this table which contains the data that I need for my source path and target table. source_path: /data/customer/sid={sid}/abc=1/attr_provider={attr_prov}/source_data_provider_code={src_prov}/ So basically, the values of each row are c...
Hi @zll_0091, To efficiently load only the necessary files without manually iterating through each row of your table, you can use Spark's DataFrame operations. First, read your table into a DataFrame and determine the maximum key value. Then, filter ...
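A minimal sketch of that filtering approach follows; the table name (config_table), column names (source_path, key), and the Parquet format are assumptions standing in for the poster's actual setup.

```python
# Minimal sketch of the approach above; config_table, source_path, key and the
# Parquet format are assumptions, not the poster's actual schema.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Read the control table that holds one source_path per row.
config = spark.table("config_table")

# Determine the maximum key value and keep only the matching rows.
max_key = config.agg(F.max("key")).first()[0]
paths = [
    r.source_path
    for r in config.filter(F.col("key") == max_key).select("source_path").collect()
]

# Load only those directories in a single read instead of looping over rows.
df = spark.read.format("parquet").load(paths)
```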
Hello Team, @Cert-Team, @Cert-TeamOPS I had a very bad experience while attempting my 1st Databricks certification. I was asked to exit the exam multiple times by the support team citing technical issues. My test got rescheduled multiple times with...
Hi @ozbieG, I'm sorry to hear your exam was suspended. Thank you for filing a ticket with our support team. Please allow the support team 24-48 hours to resolve the issue. In the meantime, you can review the following documentation:
Room requirements
Behaviour...
Hi Databricks! I am a relatively new developer looking for a solid API testing tool. I am interested in hearing from other developers, new or experienced, about their experiences with API testing tools, whether good or bad. I've...
Hi @bytetogo, In my daily work I use Postman. It has a user-friendly interface, supports automated testing, and has support for popular patterns and libraries. It is also compatible with Linux, macOS, and Windows.
Hi all, I am very new to Databricks. I am looking for good book recommendations that can help me get started. I know there is a vast amount of resources available online, but I feel a book will give me a structured approach to getting started. Any book recommendati...
Hi @uniqueusername, I would start with books that teach you Spark:
Learning Spark, 2nd Edition by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee
Data Analysis with Python and PySpark by Jonathan Rioux
After you learn the Spark foundation, o...
To complete a tutorial requires a workspace. The directions for the quickstart are outdated and do not match AWS. AWS has its own guide, but CloudFormation requires email ...
Hi All, Can you please share the best practices for job cluster configurations for production workloads, and which is better for production in terms of cost and performance: serverless or job clusters? Regards, Phani
Hi @Phani1, For configuring job clusters for production workloads in Databricks, follow these best practices: match cluster size to workload needs, enable autoscaling for dynamic adjustment of worker nodes, use spot instances with a fallback to on-de...
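As a rough illustration of those points, here is a hedged sketch of a Jobs API "new_cluster" spec with autoscaling and spot-with-fallback capacity; the runtime version, node type, and worker counts are placeholders to tune to the workload, not a recommendation.

```python
# Sketch of a Jobs API "new_cluster" spec reflecting the practices above.
# Runtime version, node type and worker counts are placeholders.
job_cluster_spec = {
    "spark_version": "15.4.x-scala2.12",              # pick a current LTS runtime
    "node_type_id": "i3.xlarge",                       # size to the workload
    "autoscale": {"min_workers": 2, "max_workers": 8}, # let autoscaling adjust workers
    "aws_attributes": {
        "first_on_demand": 1,                          # keep the driver on-demand
        "availability": "SPOT_WITH_FALLBACK",          # spot workers, on-demand fallback
    },
}
```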
I have been using the built-in visualizations for a lot of different use cases, and they have been working for me instead of using 3rd-party libraries. Recently I had a need to customize the data labels, but I haven't seen anything in the documentation about how to do that. If...
Hi, I'm trying to get the value of my variable in the for loop, but it's returning the syntax instead of the value. Also, is it possible to convert this value to an integer? Thanks
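Hard to say without the snippet, but a common cause when the loop is in Python is building the string without the f-prefix, so the literal placeholder syntax comes back instead of the value; int() then handles the conversion. A minimal sketch under that assumption, with hypothetical data:

```python
# Minimal sketch, assuming a Python for loop where the variable's value should
# be substituted into a string and then converted to an integer.
rows = [{"sid": "101"}, {"sid": "202"}]        # hypothetical data

for row in rows:
    sid = row["sid"]
    path = f"/data/customer/sid={sid}"         # f-string substitutes the value
    # "/data/customer/sid={sid}" without the f keeps the literal "{sid}" text
    sid_as_int = int(sid)                      # convert the string value to an integer
    print(path, sid_as_int)
```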
While adding a repo in the Databricks workspace I am getting the error 'Error creating repo: The Azure container does not exist'. Please see the attached screenshot. Can anyone please suggest a fix?
There are three possible causes.
The Azure container might not have been properly created when the workspace was provisioned.
The Azure container might have been deleted or moved after it was created.
There might be a problem with the permissions or r...
I am trying to create a Databricks Community account, but after providing all the information and completing the puzzle it shows me "an error occurred". I also recorded the network request for the error: Header: Request URL: https://www.databricks.c...
Hi, I have registered for Community Edition and can access it with no problems through: https://community.cloud.databricks.com/login.html Now, I'm interested in completing the free "lakehouse fundamentals" training here and taking the quiz to get the ba...
Hi @slechtd, @qiuqiu You can't log in because this is the login page for Databricks customers. You should use the Community Edition login, like on the bottom left side of the screen below. Furthermore, to get the Databricks fundamentals accreditation you need...
I use the following option to write from multiple tasks to the same table with overwrite (in PySpark): .option("partitionOverwriteMode", "dynamic") The table was created with PARTITIONED BY, so it works as expected. I read about liquid clustering and it's b...
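For reference, a minimal sketch of the dynamic partition overwrite pattern described above; the table and partition column names are placeholders, and whether the same behaviour carries over to a liquid-clustered table is exactly the open question here.

```python
# Minimal sketch of dynamic partition overwrite; events_by_date and event_date
# are placeholders for an existing table created with PARTITIONED BY (event_date).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "2024-06-01")], ["id", "event_date"])

(
    df.write
    .mode("overwrite")
    .option("partitionOverwriteMode", "dynamic")   # only partitions present in df are replaced
    .saveAsTable("events_by_date")
)
```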