Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
I am trying to install UCX concurrently into multiple workspaces with a bash script. Is it possible to install concurrently into multiple workspaces? If so, please explain how. I'm getting the error below. Error: lib: cleanup: remove all: open C:\Users\offer...
A therapy for varicose veins that uses botanical substances. Venicold G@l is effective. The strength of the G@l helps repair torn blood vessels. Venicold G@l reduces edema and improves blood circulation. Use the solution daily...
Thanks to Venicold G@l, blood vessels in the legs are repaired immediately. This non-invasive, fast-acting remedy dissolves blood clots, which stops bleeding but leads to impaired blood flow, thinner vessel walls, higher ...
This video has a good overview of Databricks features, both existing and incoming: https://youtu.be/N9f_6Aoxeqg?si=LBkEJKvCY1IFCYZm Good for both those new and experienced on the platform, as new changes have added features. If anyone has a Keyboard s...
Thanks @szymon_dybczak - got those shortcuts now and trying to memorize some of them. Maybe I'll try to get these on a sheet of paper so I can keep it by my desk.
Hi there, I have a simple question. Not sure if Databricks supports this, but I'm wondering if there's a way to store the results of a SQL cell in a Spark DataFrame? Or vice versa, is there a way to take a SQL query in Python (saved as a string variab...
Hi @ChristianRRL, the results of a SQL cell are automatically made available as a Python DataFrame through the _sqldf variable. You can read more about it here. For the second part, I'm not sure why you would need it when you can simply run the query like: spark.s...
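To illustrate the reply above, here is a minimal sketch of both directions. It assumes a Databricks notebook where `spark` is predefined; the helper name `sql_to_dataframe` and the sample query are hypothetical and only serve the illustration.

```python
def sql_to_dataframe(spark, query: str):
    """Run a SQL string on the given SparkSession and return the resulting DataFrame."""
    return spark.sql(query)


# In a Databricks notebook `spark` already exists, so usage would look like:
#   query = "SELECT 1 AS id"              # hypothetical query string
#   df = sql_to_dataframe(spark, query)
#
# In the other direction, after running a %sql cell, the reply above says
# its result is available in Python as:
#   df_from_cell = _sqldf
```

Passing the session in as an argument keeps the helper usable outside a notebook, for example in a unit test with a stand-in session object.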
Hey Everyone, Hope you’re all doing great! A friend of mine recently went through a data engineering interview at Walmart and shared his experience with me. I thought it would be really useful to pass along what he encountered. The interview had some ...
Hi Team, Can you share the best practices for designing Auto Loader data processing? We have data from 30 countries coming in various files. Currently, we are thinking of using a root folder, i.e. country, with subfolders for the individual ...
Hi @Phani1, The folder structure you are planning makes sense to me. Since you've mentioned that there will be thousands of files, the best practice is to use Auto Loader with file notification mode. Also, you can read about Databricks r...
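A minimal sketch of the suggestion above: an Auto Loader stream in file notification mode reading from a country root folder. The function names, the root path, and the schema location are hypothetical placeholders; the `cloudFiles.*` option keys are the documented Auto Loader options, and `spark` is assumed to be the notebook's session.

```python
def autoloader_options(file_format: str, schema_location: str) -> dict:
    """Build Auto Loader options with file notification mode switched on."""
    return {
        "cloudFiles.format": file_format,
        # File notification mode instead of the default directory listing.
        "cloudFiles.useNotifications": "true",
        # Where Auto Loader keeps the inferred schema (hypothetical path supplied by the caller).
        "cloudFiles.schemaLocation": schema_location,
    }


def read_country_stream(spark, root_path: str, file_format: str, schema_location: str):
    """Return a streaming DataFrame over every file under root_path (country subfolders included)."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in autoloader_options(file_format, schema_location).items():
        reader = reader.option(key, value)
    return reader.load(root_path)
```

File notification mode also needs cloud-side permissions to set up the notification and queue resources, which this sketch does not cover.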
I am using the Unity Catalog Cluster. I have a requirement to read the files placed by the source team in a specific location (landing) in S3. I am already using a metastore pointing to a different bucket. Do I need to use an external location pointi...
Did anyone find a solution to this? I am also facing challenges reading a file from S3 using boto3 on a Unity Catalog-enabled cluster. I created the S3 external location and granted sufficient access. Any help on this? same path and data acce...
I'm getting an error when I add a shell script as an init script for a job cluster to copy a file from DBFS to local storage. It does not work on GCP Databricks, but the same thing works on Azure Databricks. I verified that the file is present at the DBFS location. shell script i...
Hey @schunduri, not entirely sure because our SRE did the change, but the machine the pipeline runs on must be within the same vnet as your DBKS workspace. If you need more guidance, I could try and check what we did but our SRE left the company sinc...
I'm trying to use Structured Streaming in Scala to stream from a Delta table that is a dump of a Kafka topic, where each record/message is an update of attributes for the key and no messages from Kafka are dropped from the dump, but the value is flatt...
I am confused about this recommendation. I thought the use of the append output mode in combination with aggregate queries is restricted to queries where the aggregation is expressed on event time and defines a watermark. Could you clarify?
Hi all, I am very new to Databricks. I am looking for good book recommendations to help me get started. I know there are vast resources available online, but I feel a book will give me a structured approach to getting started. Any book recommendati...
Hi @uniqueusername, I would start with books that teach you Spark: Learning Spark, 2nd Edition by Jules S. Damji, Brooke Wenig, Tathagata Das, Denny Lee; Data Analysis with Python and PySpark by Jonathan Rioux. After you learn Spark foundation, o...
To complete a tutorial requires a workspace. The directions for the quickstart are outdated and do not match AWS. AWS has their own guide but CloudFormation requires email ...
Hi All, Can you please share the best practices for job cluster configurations for production workloads? And which is better in production, serverless or job clusters, in terms of cost and performance? Regards, Phani
Hi @Phani1, For configuring job clusters for production workloads in Databricks, follow these best practices: match cluster size to workload needs, enable autoscaling for dynamic adjustment of worker nodes, use spot instances with a fallback to on-de...
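As a rough illustration of the autoscaling and spot-with-fallback points above, a job cluster definition might be sketched like this. It assumes an AWS workspace (the thread does not say which cloud is in use); the function name, the worker counts, and the `<...>` values are placeholders to replace, while `autoscale`, `aws_attributes`, `availability`, and `first_on_demand` are field names from the Databricks Clusters API.

```python
def job_cluster_spec(min_workers: int, max_workers: int) -> dict:
    """Build a job cluster spec with autoscaling and spot instances that fall back to on-demand."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "spark_version": "<runtime-version>",   # placeholder, pick a supported runtime
        "node_type_id": "<node-type>",          # placeholder, size to the workload
        # Autoscaling lets the worker count follow the load within these bounds.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        "aws_attributes": {
            # Use spot capacity, falling back to on-demand when spot is unavailable.
            "availability": "SPOT_WITH_FALLBACK",
            # Keep the first node (the driver) on an on-demand instance.
            "first_on_demand": 1,
        },
    }
```

For example, `job_cluster_spec(2, 8)` would be passed as the `new_cluster` section of a job definition, with the bounds tuned to the workload.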