
Welcome to the Databricks Community

Discover the latest insights, collaborate with peers, get help from experts and make meaningful connections

Announcing new training in Japanese and Brazilian Portuguese

Exciting news! Databricks is expanding its on-demand and instructor-led training offerings for Japan and Brazil. Two new introductory courses are now available: Databricks Fundamentals [Japanese | Brazilian Portuguese] A conceptual introduction...

  • 268 Views
  • 0 replies
  • 1 kudos
Friday
Registration now open! Databricks Data + AI Summit 2024

Join tens of thousands of data leaders, engineers, scientists and architects from around the world at Moscone Center in San Francisco, June 10–13.  Explore the latest advances in Apache Spark™, Delta Lake, MLflow, LangChain, PyTorch, dbt, Prest...

  • 838 Views
  • 0 replies
  • 2 kudos
a week ago
Calling all innovators and visionaries! The 2024 Data Team Awards are open for nominations

Each year, we celebrate the amazing customers that rely on Databricks to innovate and transform their organizations — and the world — with the power of data and AI. The nomination form is now open. Nominations will close on Marc...

  • 1036 Views
  • 0 replies
  • 0 kudos
a week ago
Drive Value With Generative AI

Join us this March to see how generative AI can drive value. Build your own LLM trained on your data with Databricks’ new MPT model: Mosaic AI. Imagine if you could train and deploy your own LLM — customized with your own data that gives accurate res...

  • 954 Views
  • 1 reply
  • 1 kudos
2 weeks ago

Community Activity

Krubug
by New Contributor
  • 0 Views
  • 0 replies
  • 0 kudos

Improve Query Performance

Hello, I have a query in one of my notebooks that took around 3.5 hours on a D12_V2 cluster with between 5 and 25 workers. Is there a way to write the query differently to improve performance and cost: select /*+ BROADCAST(b) */ MD5(CONCAT(N...

  • 0 Views
  • 0 replies
  • 0 kudos
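For the broadcast-hint question above, the full query is truncated, so this is only a general sketch of expressing the same /*+ BROADCAST(b) */ idea through the PySpark API; the table and column names are placeholders, not from the original post:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast, md5, concat_ws

spark = SparkSession.builder.getOrCreate()

# Hypothetical tables standing in for the truncated query above.
fact_df = spark.table("fact_table")   # large table
dim_df = spark.table("dim_table")     # small table worth broadcasting

# Broadcasting the small side avoids shuffling the large table,
# which is usually the biggest lever for a long-running join like this.
result = (
    fact_df.join(broadcast(dim_df), on="join_key", how="left")
           .withColumn("row_hash", md5(concat_ws("||", "col_a", "col_b")))
)
result.write.mode("overwrite").saveAsTable("output_table")
```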
Jeyapaul
by New Contributor II
  • 1113 Views
  • 3 replies
  • 1 kudos

User Agent Verification

Hello Folks, We are currently working on integrating Databricks into our products, and one of the suggested best practices is to send user-agent information with any REST API or JDBC connection we make from the product to Databricks. We have made all the ...

  • 1113 Views
  • 3 replies
  • 1 kudos
Latest Reply
Bharathii
Visitor
  • 1 kudos

Hi @bdraitcak, Here are the steps to follow to get the logs: Log in to your Azure portal. Go to / search for "Log Analytics workspace." Create a new Log Analytics workspace by specifying your Resource group and Instance details. [subscription + r...

  • 1 kudos
2 More Replies
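As background for the user-agent question above, a minimal sketch of attaching a product-specific user agent to a Databricks REST API call; the product name, host, and token handling are placeholder assumptions:

```python
import os
import requests

# Placeholders: substitute your workspace URL and a valid token.
host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

headers = {
    "Authorization": f"Bearer {token}",
    # Product/version user agent so calls from your integration are identifiable.
    "User-Agent": "my-product/1.2.3",
}

resp = requests.get(f"{host}/api/2.0/clusters/list", headers=headers, timeout=30)
resp.raise_for_status()
print(resp.json())
```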
dollyb
by Visitor
  • 3 Views
  • 0 replies
  • 0 kudos

Databricks Connect Scala -

Hi, I'm using Databricks Connect to run Scala code from IntelliJ on a Databricks single-node cluster. Even with the simplest code, I'm experiencing this error: org.apache.spark.SparkException: grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: org.ap...

  • 3 Views
  • 0 replies
  • 0 kudos
Sandhya1825
by Visitor
  • 50 Views
  • 2 replies
  • 0 kudos

Databricks Certification Exam Suspended. Kindly help to reschedule

Hi Team, I had a terrible experience today while taking my Databricks certification exam for Data Analyst Associate. Initially, it took a long time to launch the exam. After the exam had started, I did not take my eyes off the screen bu...

  • 50 Views
  • 2 replies
  • 0 kudos
Latest Reply
Sandhya1825
  • 0 kudos

Hi Team, Thank you for responding to my concern. My exam was rescheduled for 22nd February at 11 am as per my request. I logged into the Webassessor before the scheduled time of 11 am, i.e., 5 to 10 minutes earlier. But the website is not loading the exam scre...

  • 0 kudos
1 More Replies
ashish577
by New Contributor III
  • 12 Views
  • 0 replies
  • 0 kudos

Databricks Asset Bundles: passing parameters via bundle run that are not declared

Hi, We recently decided to move to Databricks Asset Bundles. One scenario we are dealing with is that we pass different parameters to the same job, which are handled in the notebook. With bundles, when I try to pass parameters at runtime (which ar...

  • 12 Views
  • 0 replies
  • 0 kudos
vigneshp
by New Contributor
  • 143 Views
  • 1 reply
  • 0 kudos

bitmap_count() function's output is different in Databricks compared to Snowflake

I have found that the output of the bitmap_count() function differs significantly between Databricks and Snowflake. E.g., Snowflake returns a value of '1' for this code: "select bitmap_count(X'0001056c000000000000')" while Databricks returns a...

  • 143 Views
  • 1 reply
  • 0 kudos
Latest Reply
Ayushi_Suthar
Valued Contributor II
  • 0 kudos

Hi @vigneshp, Good Day! In Databricks, the bitmap_count function returns the number of bits set in a BINARY string representing a bitmap. This function is typically used to count distinct values in combination with the bitmap_bucket_number() and the bi...

  • 0 kudos
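To illustrate the reply above, a rough sketch of the Databricks bitmap functions it mentions, run from a notebook on a recent runtime; the table and column names are made up:

```python
# bitmap_count() in Databricks counts the set bits in a BINARY bitmap, so the
# same literal can yield a different number than Snowflake's interpretation.
spark.sql("SELECT bitmap_count(X'0001056c000000000000') AS set_bits").show()

# Typical exact-distinct-count pattern combining the bitmap functions from the reply.
spark.sql("""
    SELECT SUM(bitmap_count(bucket_bitmap)) AS distinct_ids
    FROM (
        SELECT bitmap_construct_agg(bitmap_bit_position(id)) AS bucket_bitmap
        FROM some_table
        GROUP BY bitmap_bucket_number(id)
    ) AS per_bucket
""").show()
```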
s_park
by Community Manager
  • 1479 Views
  • 10 replies
  • 5 kudos

FAQ for Databricks Learning Festival (Virtual): 29 February 2024 - 13 March 2024

General Q: Where can I ask questions if I have questions while going through the self-paced material? A: The Databricks Academy Learners Group page is the forum within Databricks Community where you can ask questions and get them answered by your fello...

  • 1479 Views
  • 10 replies
  • 5 kudos
Latest Reply
JeanParra
New Contributor II
  • 5 kudos

What is the difference between Customers & Prospects and Partners? I am new to Databricks. Which option do I have to choose? Thanks

  • 5 kudos
9 More Replies
rfreitas
by New Contributor II
  • 51 Views
  • 1 reply
  • 1 kudos

Notebook and folder owner

Hi all, We can use this API https://docs.databricks.com/api/workspace/dbsqlpermissions/transferownership to transfer the ownership of a query. Is there anything similar for notebooks and folders?

  • 51 Views
  • 1 reply
  • 1 kudos
Latest Reply
feiyun0112
New Contributor II
  • 1 kudos

Workspace object permissions — Manage which users can read, run, edit, or manage directories, files, and notebooks. https://docs.databricks.com/api/workspace/workspace/setpermissions

  • 1 kudos
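For reference on the reply above, a hedged sketch of calling the linked workspace object permissions endpoint from Python; the notebook ID, user, and permission level are placeholders, and note this manages permissions rather than performing a true ownership transfer:

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

# Placeholder notebook object ID (from the Workspace get-status API) and principal.
notebook_id = "1234567890123456"
payload = {
    "access_control_list": [
        {"user_name": "new.owner@example.com", "permission_level": "CAN_MANAGE"}
    ]
}

# PUT replaces the object's permissions; use /directories/{id} for folders.
resp = requests.put(
    f"{host}/api/2.0/permissions/notebooks/{notebook_id}",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```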
RabahO
by New Contributor II
  • 38 Views
  • 1 reply
  • 0 kudos

Unit tests in notebook not working

Hello, I'm trying to set up a notebook for tests or data quality checks; the name is not important. I basically read a table (the ETL output process - actual data). Then I read another table and do the calculation in the notebook (expected data). I'm stuc...

  • 38 Views
  • 1 reply
  • 0 kudos
Latest Reply
feiyun0112
New Contributor II
  • 0 kudos

You can use Nutter, a testing framework for Databricks notebooks: microsoft/nutter: Testing framework for Databricks notebooks (github.com)

  • 0 kudos
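To make the Nutter suggestion above concrete, a small sketch of a fixture in a test notebook; the table names and assertion are illustrative, and the details should be checked against the microsoft/nutter README:

```python
# Requires the nutter package installed on the cluster (pip install nutter).
from runtime.nutterfixture import NutterFixture

class DataQualityFixture(NutterFixture):
    def run_row_counts_match(self):
        # "Actual" data from the ETL output and "expected" data computed here.
        self.actual_count = spark.table("etl_output_table").count()
        self.expected_count = spark.table("source_table").count()

    def assertion_row_counts_match(self):
        assert self.actual_count == self.expected_count

result = DataQualityFixture().execute_tests()
print(result.to_string())
```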
vijaykumar99535
by New Contributor III
  • 172 Views
  • 1 reply
  • 0 kudos

How to create a job cluster using the REST API

I am creating a cluster using a REST API call, but every time it creates an all-purpose cluster. Is there a way to create a job cluster and run a notebook using Python code?

  • 172 Views
  • 1 reply
  • 0 kudos
Latest Reply
feiyun0112
New Contributor II
  • 0 kudos

job_cluster_key: a string of 1 to 100 characters matching ^[\w\-\_]+$. If job_cluster_key is set, the task is executed reusing the cluster specified in job.settings.job_clusters. See: Create a new job | Jobs API | REST API reference | Databricks on AWS

  • 0 kudos
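Building on the reply above, a rough sketch of creating and running a job on an ephemeral job cluster via the Jobs 2.1 API from Python; the Spark version, node type, and notebook path are placeholders:

```python
import os
import requests

host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]
headers = {"Authorization": f"Bearer {token}"}

job_spec = {
    "name": "example-notebook-job",
    # Defining the cluster under job_clusters (instead of the Clusters API)
    # is what makes it an ephemeral job cluster rather than an all-purpose one.
    "job_clusters": [
        {
            "job_cluster_key": "main_cluster",
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 2,
            },
        }
    ],
    "tasks": [
        {
            "task_key": "run_notebook",
            "job_cluster_key": "main_cluster",
            "notebook_task": {"notebook_path": "/Workspace/Users/me@example.com/my_notebook"},
        }
    ],
}

create = requests.post(f"{host}/api/2.1/jobs/create", headers=headers, json=job_spec, timeout=30)
create.raise_for_status()
job_id = create.json()["job_id"]

run = requests.post(f"{host}/api/2.1/jobs/run-now", headers=headers, json={"job_id": job_id}, timeout=30)
run.raise_for_status()
print(run.json())
```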
William_Scardua
by Contributor III
  • 26 Views
  • 1 reply
  • 0 kudos

groupBy without aggregation (PySpark API)

Hi guys, do you have any idea how I can do a groupBy without aggregation (PySpark API), like: df.groupBy('field1', 'field2', 'field3')? My goal is to make a group, but in this case it is not necessary to count records or aggregate. Thank you

  • 26 Views
  • 1 reply
  • 0 kudos
Latest Reply
feiyun0112
New Contributor II
  • 0 kudos

Do you mean getting distinct rows for the selected columns? df.select("field1", "field2", "field3").distinct()

  • 0 kudos
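To spell out the reply above, a short sketch of two equivalent ways to keep one row per group without aggregating, using the column names from the question:

```python
# Keep one row per distinct (field1, field2, field3) combination, no aggregation.
deduped = df.select("field1", "field2", "field3").distinct()

# Equivalent when the other columns should be kept as well
# (an arbitrary row is retained per combination).
deduped_full = df.dropDuplicates(["field1", "field2", "field3"])
```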
100804
by New Contributor
  • 38 Views
  • 0 replies
  • 0 kudos

Instance Profile Access Controls

I manage instance profiles assigned to specific user groups. For example, instance profile A provides access solely to group A. Currently, any user within group A has the ability to update the permissions of a cluster using instance profile A, which ...

  • 38 Views
  • 0 replies
  • 0 kudos
Kaizen
by New Contributor III
  • 26 Views
  • 0 replies
  • 0 kudos

Init script error 13.3 and 14.3 LTS issues mesa

Hi - we had a few issues with some of our init scripts recently. Investigating, I found that mesa packages were throwing errors when trying to install. Posting this to help the community and to raise awareness so Databricks can fix it. I believe the image f...

  • 26 Views
  • 0 replies
  • 0 kudos
zero234
by New Contributor
  • 33 Views
  • 0 replies
  • 0 kudos

Data is not loaded when creating two different streaming tables from one Delta Live Tables pipeline

I am trying to create 2 streaming tables in one DLT pipeline; both read JSON data from different locations and both have different schemas. The pipeline executes, but no data is inserted into either table, whereas when I try to run each table indiv...

Data Engineering
dlt
spark
STREAMINGTABLE
  • 33 Views
  • 0 replies
  • 0 kudos
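Related to the DLT question above, a minimal sketch of defining two independent streaming tables with Auto Loader in one pipeline; the paths are placeholders, and each table must point at its own source location:

```python
import dlt

@dlt.table(name="events_a", comment="Streaming ingest of the first JSON source")
def events_a():
    # DLT manages checkpoints and schema tracking for streaming tables.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/raw/events_a/")
    )

@dlt.table(name="events_b", comment="Streaming ingest of the second JSON source")
def events_b():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/default/raw/events_b/")
    )
```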
Dhruv_Sinha
by Visitor
  • 31 Views
  • 0 replies
  • 0 kudos

Parallelizing processing of multiple spark dataframes

Hi all, I am trying to create a collection RDD that contains a list of Spark DataFrames. I want to parallelize the cleaning process for each of these DataFrames. Later on, I am sending each of these DataFrames to another method. However, when I parall...

  • 31 Views
  • 0 replies
  • 0 kudos
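On the parallelization question above, a common pattern is driver-side threads rather than putting DataFrames inside an RDD (DataFrames cannot be shipped to executors that way); a hedged sketch with a made-up cleaning step:

```python
from concurrent.futures import ThreadPoolExecutor

def clean_and_save(table_name: str) -> None:
    # Placeholder cleaning logic; replace with your real transformations.
    df = spark.table(table_name).dropna().dropDuplicates()
    df.write.mode("overwrite").saveAsTable(f"{table_name}_clean")

tables = ["table_a", "table_b", "table_c"]

# Spark is thread-safe on the driver, so each thread can trigger its own job
# and the cluster schedules them concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(clean_and_save, tables))
```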

Latest from our Blog

Introducing the DeepSpeed Distributor on Databricks

The GPU shortage is real, and being able to scale up and optimize the training of large language models will help accelerate the delivery of an AI project. DeepSpeed is a framework that can reduce GPU...

  • 823 Views
  • 2 kudos
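For readers curious what usage roughly looks like, a sketch based on the DeepspeedTorchDistributor added in PySpark 3.5; the import path, parameter names, and values here are assumptions to verify against the blog post and current docs, and the training function is a placeholder:

```python
from pyspark.ml.deepspeed.deepspeed_distributor import DeepspeedTorchDistributor

def train():
    # Placeholder: your PyTorch + DeepSpeed training loop goes here.
    ...

# Assumed topology: 2 nodes with 4 GPUs each and an optional DeepSpeed config file.
distributor = DeepspeedTorchDistributor(
    numGpus=4,
    nnodes=2,
    localMode=False,
    useGpu=True,
    deepspeedConfig="/dbfs/configs/ds_config.json",
)
distributor.run(train)
```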