Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

valjas
by New Contributor III
  • 4266 Views
  • 3 replies
  • 1 kudos

How do I create spark.sql.session.SparkSession?

When I create a session in Databricks it defaults to spark.sql.connect.session.SparkSession. How can I connect to Spark without Spark Connect?

Latest Reply
miguel_ortiz
New Contributor II
  • 1 kudos

Is there any solution to this? Pandera, Evidently, and Ydata Profiling break because they don't accept a sql.connect session object. They expect a spark.sql.session.SparkSession; it's very frustrating not being able to use any of these libraries with the new...

2 More Replies
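One way to see which session class a library will receive is to inspect the module of `type(spark)`: the classic session class lives in `pyspark.sql.session`, while the Spark Connect one lives in `pyspark.sql.connect.session`. A minimal sketch of that check (pure Python, no cluster needed; `is_connect_session` is a hypothetical helper name, not a Databricks API):

```python
def is_connect_session(session_cls_module: str) -> bool:
    """Return True when the class's module path points at Spark Connect."""
    return ".connect." in session_cls_module

# Pass in type(spark).__module__ from a live session; the module path
# distinguishes the two session implementations.
print(is_connect_session("pyspark.sql.connect.session"))  # True
print(is_connect_session("pyspark.sql.session"))          # False
```

On classic (non-Connect) compute, `SparkSession.builder.getOrCreate()` returns the classic class; on serverless or Connect-enabled compute it returns the Connect class, which is what trips up libraries that type-check the session.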
liv1
by New Contributor II
  • 3198 Views
  • 2 replies
  • 1 kudos

Structured Streaming from a Delta table that is a dump of Kafka, getting the latest record per key

I'm trying to use Structured Streaming in Scala to stream from a Delta table that is a dump of a Kafka topic, where each record/message is an update of attributes for the key and no messages from Kafka are dropped from the dump, but the value is flatt...

Latest Reply
Maatari
New Contributor III
  • 1 kudos

I am confused about this recommendation. I thought the use of the append output mode in combination with aggregate queries is restricted to queries where the aggregation is expressed using event time and the query defines a watermark. Could you clarify?

1 More Replies
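The batch form of "latest record per key" is easy to state, and helps frame the streaming question. A pure-Python sketch with invented sample data (in Spark the same idea is usually expressed with `max_by` or `row_number()` over a key/timestamp window, rather than an event-time aggregation):

```python
# Keep the row with the highest timestamp for each key, as one would when
# compacting a Kafka dump down to current state per key.
records = [
    {"key": "a", "ts": 1, "value": 10},
    {"key": "b", "ts": 1, "value": 20},
    {"key": "a", "ts": 3, "value": 30},
    {"key": "a", "ts": 2, "value": 15},
]

latest = {}
for r in records:
    if r["key"] not in latest or r["ts"] > latest[r["key"]]["ts"]:
        latest[r["key"]] = r

print(sorted((k, v["value"]) for k, v in latest.items()))
# [('a', 30), ('b', 20)]
```

In a streaming context this per-key "current state" shape is why the discussion turns to output modes: the result updates in place per key, which append mode alone cannot express without event-time watermarking.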
itsmejoeyong
by New Contributor II
  • 3374 Views
  • 3 replies
  • 1 kudos

Resolved! Best Approach for Handling ETL Processes in Databricks

I am currently managing nearly 300 tables from a production database and am considering moving the entire ETL process away from Azure Data Factory to Databricks. This process, which involves extraction, transformation, testing, and loading, is executed d...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi, instead of 300 individual files or one massive script, try grouping similar tables together. For example, you could have 10 scripts, each handling 30 tables. This way you get the best of both approaches: the freedom of easy deb...

2 More Replies
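The grouping suggested in the reply is just a batching problem. A small sketch with placeholder table names (300 tables in batches of 30, so each batch can become one script or one job task):

```python
# Placeholder names standing in for the ~300 production tables.
tables = [f"table_{i:03d}" for i in range(300)]

group_size = 30
groups = [tables[i:i + group_size] for i in range(0, len(tables), group_size)]

print(len(groups))     # 10 batches
print(len(groups[0]))  # 30 tables per batch
```

Batching by similarity (shared source, schedule, or transformation pattern) rather than strictly by count keeps each script debuggable while avoiding 300 separate files.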
Brahmareddy
by Esteemed Contributor
  • 2053 Views
  • 4 replies
  • 3 kudos

Understanding Flight Cancellations and Rescheduling in Airlines Using Databricks and PySpark

In the airline industry, it’s important to manage flights efficiently. Knowing why flights get canceled or rescheduled helps improve customer satisfaction and operational performance. In this article, I’ll show you how to use Databricks and PySpark t...

Get Started Discussions
airlines
artificial intelligence
feature engineering
machine learning
Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 3 kudos

@Brahmareddy Interesting one, thanks for sharing.

3 More Replies
AyushPandey
by New Contributor II
  • 8625 Views
  • 8 replies
  • 0 kudos

Unable to reactivate an inactive user

Hi all, I am facing an issue with reactivating an inactive user. I tried the following JSON with the Databricks CLI: run_update = {  "schemas": [ "urn:ietf:params:scim:api:messages:2.0:PatchOp" ],  "Operations": [    {      "op": "replace",      "path": "ac...

Latest Reply
bencsik
New Contributor III
  • 0 kudos

@FunkybunchOO Thank you for your response! I will look into other connections, but we are not currently using SCIM. There must be something similar blocking the activation.

7 More Replies
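For reference, the truncated snippet in the question follows the SCIM 2.0 PatchOp shape. A complete body for flipping the `active` flag would look like the sketch below (whether it takes effect can still depend on external factors, e.g. a provisioning connector re-deactivating the user, as the replies discuss):

```json
{
  "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
  "Operations": [
    { "op": "replace", "path": "active", "value": true }
  ]
}
```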
priyansh
by New Contributor III
  • 606 Views
  • 0 replies
  • 0 kudos

UCX

Hey folks! I want to know which features UCX does not provide for UC, especially in Hive-to-UC migration, that can be done manually but not with UCX. As UCX is currently in development there are many drawbacks; can someone share t...

TinaN
by New Contributor III
  • 1267 Views
  • 2 replies
  • 0 kudos

Resolved! Translating XMLNAMESPACE in SQL Databricks

We are loading a data source that contains XML. I am translating their queries to create views in Databricks. They use 'XMLNAMESPACES' to construct/parse XML.  Below is an example.  What is best practice for translating 'XMLNAMESPACES' in Databricks?...

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @TinaN, To handle XMLNAMESPACES in Databricks, use the from_xml function for parsing XML data, where you can define namespaces within your parsing logic. Start by reading the XML data using spark.read.format("xml"), then apply the from_xml functio...

1 More Replies
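The core of the translation is that XMLNAMESPACES declares prefix-to-URI mappings SQL-side, while in a reader-based approach the mapping is supplied to the parsing step. Python's standard library illustrates the idea with an invented sample document and URI (in Databricks, the spark-xml reader or `from_xml` mentioned in the reply plays the equivalent role):

```python
import xml.etree.ElementTree as ET

# Invented sample; the ns dict below replaces what XMLNAMESPACES
# would declare in T-SQL.
xml_doc = """<root xmlns:inv="http://example.com/inventory">
  <inv:item><inv:name>widget</inv:name></inv:item>
  <inv:item><inv:name>gadget</inv:name></inv:item>
</root>"""

ns = {"inv": "http://example.com/inventory"}
root = ET.fromstring(xml_doc)
names = [e.text for e in root.findall(".//inv:name", ns)]
print(names)  # ['widget', 'gadget']
```

The prefix itself is arbitrary; only the URI has to match the document, which is also why views translated from XMLNAMESPACES can rename prefixes freely.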
zll_0091
by New Contributor III
  • 660 Views
  • 1 replies
  • 0 kudos

Can I load the files based on the data in my table as variable without iterating through each row?

Hi, I have created this table, which contains the data that I need for my source path and target table. source_path: /data/customer/sid={sid}/abc=1/attr_provider={attr_prov}/source_data_provider_code={src_prov}/ So basically, the values in each row are c...

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @zll_0091, To efficiently load only the necessary files without manually iterating through each row of your table, you can use Spark's DataFrame operations. First, read your table into a DataFrame and determine the maximum key value. Then, filter ...

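The point in the reply is that Spark's `load()` accepts a whole list of paths, so the per-row work reduces to string formatting rather than per-row read calls. A pure-Python sketch of the path construction, using the template from the question with invented row values:

```python
# Template taken from the question; rows are invented stand-ins for the
# control table's contents.
template = ("/data/customer/sid={sid}/abc=1/attr_provider={attr_prov}/"
            "source_data_provider_code={src_prov}/")

rows = [
    {"sid": "s1", "attr_prov": "p1", "src_prov": "x"},
    {"sid": "s2", "attr_prov": "p2", "src_prov": "y"},
]

paths = [template.format(**r) for r in rows]
print(paths[0])
# In Spark, the whole list can then go to a single reader call, e.g.
# spark.read.format("delta").load(paths), instead of one read per row.
```

Collecting the small control table to the driver (e.g. with `collect()`) to build this list is fine; the iteration to avoid is issuing a separate Spark read per row.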
ozbieG
by New Contributor II
  • 1293 Views
  • 2 replies
  • 0 kudos

Databricks Certification exam got Suspended - Need Support

Hello Team, @Cert-Team , @Cert-TeamOPS I had a very bad experience while attempting my first Databricks certification. I was asked to exit the exam multiple times by the support team, citing technical issues. My test got rescheduled multiple times with...

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @ozbieG, I'm sorry to hear your exam was suspended. Thank you for filing a ticket with our support team. Please allow the support team 24-48 hours to resolve. In the meantime, you can review the following documentation: Room requirements Behaviour...

1 More Replies
bytetogo
by New Contributor
  • 1116 Views
  • 1 replies
  • 0 kudos

What API Testing Tool Do You Use?

Hi Databricks! I am a relatively new developer looking for a solid API testing tool. I am interested in hearing from other developers, new or experienced, about their experiences with API testing tools, good or bad. I've...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @bytetogo, in my daily work I use Postman. It has a user-friendly interface, supports automated testing, and has support for popular patterns and libraries. It is also compatible with Linux, macOS, and Windows.

uniqueusername
by New Contributor
  • 4527 Views
  • 1 replies
  • 0 kudos

Databricks book recommendations

Hi all, I am very new to Databricks. I am looking for any good book recommendations that can help me get started. I know there is a vast resource available online, but I feel a book will give me a structured approach to get started. Any book recommendati...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @uniqueusername, I would start with books that teach you Spark: Learning Spark, 2nd Edition by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee, and Data Analysis with Python and PySpark by Jonathan Rioux. After you learn the Spark foundations, o...

Niko1
by New Contributor III
  • 1266 Views
  • 3 replies
  • 0 kudos

Unable to create a workspace

To complete a tutorial requires a workspace. The directions for the quickstart are outdated and do not match AWS. AWS has its own guide, but CloudFormation requires email ...

Latest Reply
Niko1
New Contributor III
  • 0 kudos

Now I get: Redirecting to: https://accounts.cloud.databricks.com/login/password?next_url=%2Fapi%2F2.

2 More Replies
Phani1
by Valued Contributor II
  • 1065 Views
  • 1 replies
  • 0 kudos

Job Cluster best practices for production workloads

Hi All, can you please share best practices for job cluster configurations for production workloads, and which is better in production, serverless or job clusters, in terms of cost and performance? Regards, Phani

Latest Reply
Retired_mod
Esteemed Contributor III
  • 0 kudos

Hi @Phani1, For configuring job clusters for production workloads in Databricks, follow these best practices: match cluster size to workload needs, enable autoscaling for dynamic adjustment of worker nodes, use spot instances with a fallback to on-de...

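Those practices map onto the `new_cluster` block of a Databricks job definition roughly as sketched below. The Spark version, node type, and worker counts are placeholders to adapt, and the `aws_attributes` section is AWS-specific (spot with on-demand fallback, driver kept on-demand):

```json
{
  "new_cluster": {
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": { "min_workers": 2, "max_workers": 8 },
    "aws_attributes": {
      "availability": "SPOT_WITH_FALLBACK",
      "first_on_demand": 1
    }
  }
}
```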
haseeb4Ambition
by New Contributor
  • 1053 Views
  • 0 replies
  • 0 kudos

Databricks visualization data labels

I have been using visualizations for a lot of different use cases, and it has been working well instead of using 3rd-party libraries. Recently I had a need to customize the data labels, but I haven't seen anything in the documentation about how to do that. If...

