Databricks Learning Festival (Virtual): 15 January - 31 January 2025

Join us for the return of the Databricks Learning Festival (Virtual)! Mark your calendars from 15 January - 31 January 2025! Upskill today across data engineering, data analysis, machine learning, and generative AI. Join the thousands who have el...

  • 104693 Views
  • 243 replies
  • 67 kudos
11-26-2024
Share Your Feedback in Our Community Survey

Your opinion matters! Take a few minutes to complete our Customer Experience Survey to help us improve the Databricks Community. Your input is crucial in shaping the future of our community and ensuring it meets your needs. Take the Survey Now Why p...

  • 885 Views
  • 0 replies
  • 0 kudos
2 weeks ago
Databricks Named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems

We’re thrilled to share that Databricks has once again been recognized as a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems. This acknowledgement underscores our commitment to innovation and our leadership in the dat...

  • 1757 Views
  • 0 replies
  • 3 kudos
3 weeks ago
Milestone: DatabricksTV Reaches 100 Videos!

We are thrilled to announce that DatabricksTV, our growing video hub, has hit a major milestone: 100 videos and counting! What is DatabricksTV? DatabricksTV is a community-driven video hub designed to help data practitioners maximize the Databricks e...

  • 1727 Views
  • 1 reply
  • 4 kudos
12-11-2024
Announcing the new Meta Llama 3.3 model on Databricks

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned text-only model is optimized for multilingual dialogue use cases and outperfo...

  • 2576 Views
  • 0 replies
  • 3 kudos
12-11-2024

Community Activity

iscpablogarcia
by Visitor
  • 9 Views
  • 0 replies
  • 0 kudos

How can I set the workflow status to Skipped?

I have a Python script workflow with 2 tasks: Task A and Task B. When Task A has data, it is shared to Task B via createOrReplaceGlobalTempView with no issues. The goal is: when A has no data, skip Task B and also set the workflow status to "Skip...

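One pattern that comes close (a sketch only; the task and key names here are illustrative, and the Databricks-only `dbutils` API is shown commented out rather than run) is to publish a task value from Task A and gate Task B behind an "If/else condition" task:

```python
# Task A (illustrative sketch): record whether any data was produced, so a
# downstream "If/else condition" task can route around Task B.
# The dbutils call is Databricks-only, so it is shown commented out:
#
#   has_data = source_df.count() > 0
#   dbutils.jobs.taskValues.set(key="has_data", value=has_data)

# The "If/else condition" task would then evaluate an expression like this
# (task and key names are hypothetical):
condition = "{{tasks.task_a.values.has_data}} == true"
print(condition)
```

With this layout Task B only runs when the condition holds; note that tasks on the untaken branch are reported as not run rather than literally "Skipped", so this approximates rather than exactly matches the status asked for.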
franc_bomb
by Visitor
  • 38 Views
  • 4 replies
  • 0 kudos

Cluster creation issue

Hello, I just started using the Databricks Community version for learning purposes. I have been trying to create a cluster, but the first time it failed, asking me to retry or contact support, and now it's just running forever. What could be the problem?

Latest Reply
franc_bomb
Visitor
  • 0 kudos

I've been trying again but I still face the same problem.

3 More Replies
NavyaSinghvi
by New Contributor III
  • 2125 Views
  • 6 replies
  • 2 kudos

Resolved! File_arrival trigger in Workflow

I am using "job.trigger.file_arrival.location" in job parameters to get the triggered file location, but I am getting the error "job.trigger.file_arrival.location is not allowed". How can I get the triggered file location in a workflow?

Latest Reply
raghu2
New Contributor III
  • 2 kudos

The parameters are passed as widgets to the job. After defining the parameters in the job definition, I was able to access the data associated with the parameter with the following code: widget_names = ["loc1", "loc2", "loc3"]  # Add all expected paramete...

5 More Replies
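Concretely, the widgets pattern described in that reply might look like this (a sketch; the parameter name `file_path` is an assumption, and the `dbutils` call is Databricks-only, so it is shown commented out):

```python
# 1. In the job definition, add a job parameter (name is illustrative)
#    whose default value references the file-arrival trigger:
file_path_param_default = "{{job.trigger.file_arrival.location}}"

# 2. Inside the notebook task, job parameters arrive as widgets, so the
#    value can be read back with (Databricks-only, not run here):
#
#   file_path = dbutils.widgets.get("file_path")
```

Referencing the trigger via a job parameter, rather than using the raw `job.trigger.file_arrival.location` string directly where it is not allowed, is what makes the value reachable from the task.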
garciargs
by New Contributor
  • 14 Views
  • 0 replies
  • 0 kudos

Incremental load from two tables

Hi, I am looking to build an ETL process for an incremental-load silver table. This silver table, let's say "contracts_silver", is built by joining two bronze tables, "contracts_raw" and "customer". contracts_silver: CONTRACT_ID | STATUS | CUSTOMER_NAME | 1 | SIGNED | Pet...

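One common shape for this kind of incremental load (a sketch only; the join key `CUSTOMER_ID` and the watermark column `_ingested_at` are assumptions not shown in the post) is a Delta `MERGE` fed by the join of the two bronze tables:

```python
# Spark SQL for an incremental upsert into the silver table (sketch).
# ":last_load_ts" stands in for wherever the previous high-water mark is kept.
merge_sql = """
MERGE INTO contracts_silver AS tgt
USING (
  SELECT c.CONTRACT_ID, c.STATUS, cu.CUSTOMER_NAME
  FROM contracts_raw AS c
  JOIN customer AS cu
    ON c.CUSTOMER_ID = cu.CUSTOMER_ID       -- assumed join key
  WHERE c._ingested_at > :last_load_ts      -- assumed watermark column
) AS src
ON tgt.CONTRACT_ID = src.CONTRACT_ID
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
"""
```

Running this per batch (e.g. via `spark.sql` with the watermark substituted) keeps contracts_silver in step with both bronze inputs without full reloads.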
adam_mich
by New Contributor II
  • 434 Views
  • 10 replies
  • 0 kudos

How to Pass Data to a Databricks App?

I am developing a Databricks application using the Streamlit package. I was able to get a "hello world" app deployed successfully, but now I am trying to pass data that exists in DBFS on the same instance. I try to read a CSV saved to DBFS bu...

Latest Reply
txti
New Contributor III
  • 0 kudos

I have the identical problem in Databricks Apps. I have tried: reading from a DBFS path using the mount form `/dbfs/myfolder/myfile` and the protocol form `dbfs:/myfolder/myfile`; reading from Unity Volumes `/Volumes/mycatalog/mydatabase/myfolder/myfile`. Also mad...

9 More Replies
TX-Aggie-00
by New Contributor III
  • 832 Views
  • 6 replies
  • 2 kudos

Installing linux packages on cluster

Hey everyone!  We have a need to utilize LibreOffice in one of our automated tasks via a notebook. I have tried to install it via an init script that I attach to the cluster, but sometimes the program gets installed and sometimes it doesn't. For obviou...

Latest Reply
TX-Aggie-00
New Contributor III
  • 2 kudos

Thanks Alberto!  There were 42 deb files, so I just changed my script to: sudo dpkg -i /dbfs/Volumes/your_catalog/your_schema/your_volume/*.deb The init script log shows that it unpacks everything and sets them up, and the process triggers, but the packa...

5 More Replies
Harish2122
by Contributor
  • 14967 Views
  • 10 replies
  • 13 kudos

Databricks SQL string_agg

Migrating some on-premise SQL views to Databricks and struggling to find conversions for some functions. The main one is the string_agg function: string_agg(field_name, ', '). Anyone know how to convert that to Databricks SQL? Thanks in advance.

Latest Reply
smueller
New Contributor II
  • 13 kudos

If not grouping by something else: SELECT array_join(collect_set(field_name), ',') AS field_list FROM table

9 More Replies
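Spelled out with a grouping column (table and column names here are illustrative), the Databricks SQL equivalent of string_agg is:

```python
# Databricks SQL equivalent of string_agg(field_name, ', ') (sketch;
# my_table / group_col / field_name are illustrative names).
# Note: collect_set drops duplicate values; use collect_list to keep them.
query = """
SELECT group_col,
       array_join(collect_set(field_name), ', ') AS field_list
FROM my_table
GROUP BY group_col
"""
```

This produces one comma-separated string per group, which is the usual string_agg behaviour, minus ordering guarantees.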
Abdul-Mannan
by New Contributor III
  • 24 Views
  • 1 reply
  • 0 kudos

Notifications have file information but dataframe is empty using autoloader file notification mode

Using DBR 13.3, I'm ingesting data from one ADLS storage account using Auto Loader with file notification mode enabled, and writing to a container in another ADLS storage account. This is older code which uses a foreachBatch sink to process the data ...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Here are some potential steps and considerations to troubleshoot and resolve the issue: Permissions and Configuration: Ensure that the necessary permissions are correctly set up for file notification mode. This includes having the appropriate roles ...

thecodecache
by New Contributor II
  • 1697 Views
  • 2 replies
  • 0 kudos

Transpile a SQL Script into PySpark DataFrame API equivalent code

Input SQL Script (assume any dialect): SELECT b.se10, b.se3, b.se_aggrtr_indctr, b.key_swipe_ind FROM (SELECT se10, se3, se_aggrtr_indctr, ROW_NUMBER() OVER (PARTITION BY SE10 ...

Latest Reply
MathieuDB
Databricks Employee
  • 0 kudos

Hello @thecodecache , Have a look at the SQLGlot project: https://github.com/tobymao/sqlglot?tab=readme-ov-file#faq It can easily transpile SQL to Spark SQL, like this: import sqlglot from pyspark.sql import SparkSession # Initialize Spark session spar...

1 More Replies
William_Scardua
by Valued Contributor
  • 6458 Views
  • 2 replies
  • 0 kudos

Pyspark or Scala ?

Hi guys, many people use PySpark to develop their pipelines. In your opinion, in which cases is it better to use one or the other? Or is it better to choose a single language? Thanks

Latest Reply
hari-prasad
Valued Contributor
  • 0 kudos

Hi @William_Scardua, it is advisable to consider using Python (PySpark) due to Spark's comprehensive API support for Python. Furthermore, Databricks currently supports Delta Live Tables (DLT) with Python, but does not support Scala at this time. Ad...

1 More Replies
sanjay
by Valued Contributor II
  • 2182 Views
  • 2 replies
  • 1 kudos

Error accessing file from dbfs inside mlflow serve endpoint

Hi, I have an MLflow model served using a serverless GPU which takes an audio file name as input; the file is then passed as a parameter to a Hugging Face model inside the predict method. But I am getting the following error: HFValidationError(\nhuggingface_hub.utils....

Latest Reply
txti
New Contributor III
  • 1 kudos

I have the same issue. I have a large file that I cannot access from an MLflow service. Things I have tried (none of these work): read-only from DBFS, where `dbfs:/myfolder/myfile.chroma` does not work and `/dbfs/myfolder/myfile.chroma` does not work; read-only from U...

1 More Replies
bryan1
by New Contributor
  • 513 Views
  • 7 replies
  • 0 kudos

Customizing the UI of Streamlit App Template

Hi there, I used the Streamlit Chatbot template for Databricks Apps. I'm looking to customize the UI and wanted to make changes to the app.py file to do so. Even before I made any changes, I ran the file and I'm getting an error with the Databricks SDK an...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @bryan1, thanks for the details. There might be something at the cluster-level configuration; can you please share your cluster settings, like DBR version / libraries / access mode? I even tried on an older version / serverless and it worked.

6 More Replies
Daithi
by New Contributor II
  • 17813 Views
  • 15 replies
  • 3 kudos

Unity Catalog - Error getting sample data in data explorer

I get an error message saying "Error getting sample data" when I try to view sample data from a table in a schema I created in a Unity Catalog. I dropped the schema and table and got a colleague to recreate them, and still the same message. We are both Uni...

Latest Reply
AlexRose1122
  • 3 kudos

It’s likely a permissions or session issue; try checking table-level ACLs, refreshing your session, or verifying role inheritance.

14 More Replies
JrV
by New Contributor
  • 23 Views
  • 1 reply
  • 0 kudos

Sparql and RDF data

Hello Databricks Community, does anyone have experience with running SPARQL (https://en.wikipedia.org/wiki/SPARQL) queries in Databricks? Make a connection to the Community SolidServer https://github.com/CommunitySolidServer/CommunitySolidServer and que...

Latest Reply
User16502773013
Databricks Employee
  • 0 kudos

Hello @JrV , for this use case Databricks currently supports the Bellman SPARQL engine, which can run on Databricks as a Scala library operating on a DataFrame of triples (S, P, O). Also, integration is available for Stardog through Databricks Partner Conne...

Gajju
by New Contributor
  • 16 Views
  • 1 reply
  • 0 kudos

[Deprecation Marker Required] : MERGE INTO Clause

Dear Friends: Considering that MERGE INTO may generate wrong results (The APPLY CHANGES APIs: Simplify change data capture with Delta Live Tables | Databricks on AWS), may I ask why its API is still present in the technical documentation without a "Deprec...

Latest Reply
User16502773013
Databricks Employee
  • 0 kudos

Hello @Gajju , MERGE INTO is not being deprecated. APPLY CHANGES should be seen as an enhanced merge process in Delta Live Tables that handles out-of-sequence records automatically, as shown in the example in the documentation shared. The notion of wr...

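For reference, the APPLY CHANGES path mentioned in that reply looks roughly like this in a DLT pipeline (a sketch; the `dlt` module only exists inside a Databricks pipeline, so its calls are shown commented out, and all table and column names are illustrative):

```python
# DLT sketch: APPLY CHANGES as an "enhanced merge" that orders
# out-of-sequence CDC records by a sequencing column.
apply_changes_args = dict(
    target="customers_silver",     # illustrative target table
    source="customers_cdc",        # illustrative CDC feed
    keys=["customer_id"],          # illustrative primary key
    sequence_by="sequence_num",    # column that orders late/out-of-order records
)

# Inside a real pipeline (Databricks-only, not run here):
#   import dlt
#   dlt.create_streaming_table("customers_silver")
#   dlt.apply_changes(**apply_changes_args)
```

The `sequence_by` column is what a hand-written MERGE INTO does not give you for free, which is the "enhanced merge" point the reply makes.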

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group
Featured Event

Join Us for an Exclusive Databricks Community Event in San Francisco!

Thursday, January 23, 2025

View Event
Top Kudoed Authors
Read Databricks Data Intelligence Platform reviews on G2

Latest from our Blog

Deep Dive - Streaming Deduplication

In this article we will cover streaming deduplication in depth: using watermarking with dropDuplicates and dropDuplicatesWithinWatermark, and how they differ. This blog expects you to have a g...

  • 425 Views
  • 1 kudo
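The distinction the article draws can be modelled in plain Python: dropDuplicates keeps state for every key ever seen, whereas dropDuplicatesWithinWatermark only has to remember keys newer than the watermark. A small stdlib sketch of the watermark-bounded variant (all names are illustrative, and this only mimics the state-cleanup idea, not Spark's execution model):

```python
def dedup_within_watermark(events, delay):
    """Emit the first occurrence of each id, forgetting ids that fall
    behind the watermark (max event time seen minus `delay`). This mimics
    the bounded state behind dropDuplicatesWithinWatermark."""
    seen = {}                      # id -> last event time kept in state
    max_time = float("-inf")
    out = []
    for event_id, event_time in events:
        max_time = max(max_time, event_time)
        watermark = max_time - delay
        # State cleanup: ids older than the watermark are forgotten.
        seen = {k: t for k, t in seen.items() if t >= watermark}
        if event_id not in seen:
            out.append((event_id, event_time))
        seen[event_id] = event_time
    return out

# A duplicate "a" arriving within the delay is dropped; a duplicate arriving
# after the watermark has passed its first occurrence is emitted again.
print(dedup_within_watermark([("a", 1), ("a", 3), ("b", 4)], delay=5))
```

That re-emission of very late duplicates is exactly the trade-off the bounded-state variant accepts in exchange for not growing state forever.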

Data Engineering SQL Holiday Specials

December is the most celebrated time of year in the Data Engineering calendar as we embrace the important holiday: change freeze season.  As we come back to the office to start our new projects, I wan...

  • 2424 Views
  • 3 kudos