Welcome to the Databricks Community

Discover the latest insights, collaborate with peers, get help from experts and make meaningful connections

99972 members
50081 posts
Registration now open! Databricks Data + AI Summit 2024

Join tens of thousands of data leaders, engineers, scientists and architects from around the world at Moscone Center in San Francisco, June 10–13.  Explore the latest advances in Apache Spark™, Delta Lake, MLflow, LangChain, PyTorch, dbt, Prest...

  • 1284 Views
  • 0 replies
  • 3 kudos
2 weeks ago
Calling all innovators and visionaries! The 2024 Data Team Awards are open for nominations

Each year, we celebrate the amazing customers that rely on Databricks to innovate and transform their organizations — and the world — with the power of data and AI. The nomination form is now open to submit nominations. Nominations will close on Marc...

  • 1573 Views
  • 0 replies
  • 0 kudos
2 weeks ago
Drive Value With Generative AI

Join us this March to see how generative AI can drive value. Build your own LLM trained on your data with Databricks’ new MPT model: Mosaic AI. Imagine if you could train and deploy your own LLM — customized with your own data that gives accurate res...

  • 1627 Views
  • 1 reply
  • 1 kudos
3 weeks ago
Databricks Learning Festival (Virtual): 29 February 2024 - 13 March 2024

Join the Databricks Learning Festival (Virtual)! Mark your calendars from 29 February 2024 to 13 March 2024! Upskill today across data engineering, data analysis, machine learning, and generative AI. Join the thousands who have elevated their car...

  • 23146 Views
  • 27 replies
  • 21 kudos
3 weeks ago

Community Activity

SJR
by New Contributor
  • 135 Views
  • 3 replies
  • 0 kudos

Problem when updating Databricks Repo through DevOps Pipeline

Hello all! I've been working on integrating a Databricks Repos update API call into a DevOps Pipeline so that the Databricks local repo stays up to date with the remote staging branch (the Pipeline executes whenever there's a new commit to the staging br...

Data Engineering
CICD
Data_Engineering
DevOps
pipelines
repo
Latest Reply
BookerE1
Visitor
  • 0 kudos

Hello, I can try to give you some possible solutions for your problem, based on the web search results that I found. Here are some suggestions: Make sure that your Databricks workspace and your Azure DevOps account are properly connected and authentica...

2 More Replies
Yulei
by New Contributor II
  • 131 Views
  • 2 replies
  • 0 kudos

Could not reach driver of cluster

Hi, Recently I have been seeing the error "Could not reach driver of cluster <some_id>" with my Structured Streaming job when migrating to Unity Catalog, and found this when checking the traceback: Traceback (most recent call last): File "/databricks/python_shell/...

Latest Reply
Latonya86Dodson
  • 0 kudos

@Yulei wrote: Hi, Recently I have been seeing the error "Could not reach driver of cluster <some_id>" with my Structured Streaming job when migrating to Unity Catalog, and found this when checking the traceback: Traceback (most recent call last): File "/databricks/...

1 More Replies
BhaveshPatel
by New Contributor
  • 11 Views
  • 0 replies
  • 0 kudos

Auto loader

Suppose I have thousands of historical .csv files, dating back to January 2022, in a folder of my Azure Blob Storage container. I want to use Auto Loader to read only the files from 1 October 2023 onward, ignoring all the files before this date, to build a pipel...

Brad
by New Contributor
  • 10 Views
  • 0 replies
  • 0 kudos

Colon sign operator for JSON

Hi, I have a streaming source loading data into a raw table, which has a string-type column (whose value is JSON) that holds all the data. I want to use the colon sign operator to get fields from the JSON string. Is this going to have some perf issues vs. I use a sch...

TestuserAva
by New Contributor II
  • 689 Views
  • 6 replies
  • 1 kudos

Getting HTML sign-in page as API response from Databricks API with status code 200

Response:<!doctype html><html><head>    <meta charset="utf-8" />    <meta http-equiv="Content-Language" content="en" />    <title>Databricks - Sign In</title>    <meta name="viewport" content="width=960" />    <link rel="icon" type="image/png" href="...
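
A response like the one above can be caught before it is parsed as JSON. The following is a minimal sketch, not part of any Databricks client: the function name `looks_like_html_page` is hypothetical, and it assumes the sign-in page is served either with a `text/html` content type or with a body that starts with an HTML doctype or `<html>` tag.

```python
def looks_like_html_page(content_type: str, body: str) -> bool:
    """Heuristic check: did an API call return an HTML page instead of JSON?

    Hypothetical helper for illustration only.
    """
    # A declared HTML content type is the strongest signal.
    if "text/html" in content_type.lower():
        return True
    # Otherwise, inspect the first characters of the body.
    head = body.lstrip()[:100].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")
```

A caller could run this check on a 200 response and treat a `True` result as an authentication or network-access problem rather than as a successful API call.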

Latest Reply
Abhishek10745
New Contributor II
  • 1 kudos

Hello @SJR, In the scenario I mentioned in the previous comment, my CI pipeline was using a pool or scale set which did not have access to this Azure Databricks service. Hence, when my service principal tried to create a PAT token using databricks...

5 More Replies
DatabricksGuide
by Visitor
  • 65 Views
  • 0 replies
  • 0 kudos

Getting Started with Serverless SQL on AWS

This is an AWS admin guide for existing Databricks SQL customers interested in Serverless SQL features. This guide covers the following topics: What is Serverless Architecture; Security on Serverless Architect...

Hal
by New Contributor II
  • 291 Views
  • 1 reply
  • 3 kudos

Connecting Power BI on Azure to Databricks on AWS?

Can someone share with me the proper way to connect Power BI running on Azure to Databricks running on AWS?

Latest Reply
bhanadi
New Contributor II
  • 3 kudos

I have the same question. Do we have to take care of any specific tasks to make it work? Has anyone implemented it?

NaeemS
by Visitor
  • 21 Views
  • 0 replies
  • 0 kudos

Handling Null Values in Feature Stores

Hi, I am using multiple feature stores in my workflow using feature lookups. In my logged pipeline, I have several stages, including Assembler, Standard Scaler, Indexer and then Model. However, I am facing an issue during inference using the `score b...

Machine Learning
Feature Store
costi9992
by New Contributor III
  • 623 Views
  • 2 replies
  • 0 kudos

Access Databricks API using IDP token

Hello, We have a Databricks account and workspace, provided by AWS, with SSO enabled. Is there any way to access the Databricks workspace API (jobs, clusters, etc.) using a token retrieved from the identity provider? We can access the Databricks workspace API with A...

Latest Reply
fpopa
Visitor
  • 0 kudos

Hey - Costin and Anonymous user, have you managed to get this working? Do you have examples by any chance? I'm also trying something similar but I haven't been able to make it work. > authenticate and access the Databricks REST API by setting the Autho...

1 More Replies
BillGuyTheScien
by New Contributor II
  • 17 Views
  • 0 replies
  • 0 kudos

combining accounts

I have an AWS-based Databricks account with a few workspaces and an Azure Databricks workspace. How do I combine them into one account? I am particularly interested in setting up a single billing drop with all my Databricks costs.

CBL
by Visitor
  • 33 Views
  • 0 replies
  • 0 kudos

Schema Evolution in Azure databricks

Hi All - In my scenario, I am loading data from hundreds of JSON files. The problem is that fields/columns are missing when a JSON file contains new fields. Full Load: while writing JSON to Delta, use the option ("mergeSchema", "true") so that we do not miss new columns. Inc...

nachog99
by New Contributor II
  • 16 Views
  • 0 replies
  • 0 kudos

Read VCF files using latest runtime version

Hello everyone! I was reading VCF files using the Glow library (Maven: io.projectglow:glow-spark3_2.12:1.2.1). The latest version of this library only works with Spark version 3.3.2, so if I need to use a newer runtime with a more recent Spark versi...

807326
by New Contributor II
  • 1328 Views
  • 2 replies
  • 1 kudos

Resolved! Enable automatic schema evolution for Delta Lake merge for an SQL warehouse

Hello! We tried to update our integration scripts and use SQL warehouses instead of general compute clusters to fetch and update data, but we faced a problem. We use automatic schema evolution when we merge tables, but with SQL warehouse, when we try...

Latest Reply
BerkerKozan
New Contributor II
  • 1 kudos

Is there any update on that one?

1 More Replies
srjchoubey2
by Visitor
  • 123 Views
  • 1 reply
  • 0 kudos

How to import Excel xls/xlsx files into a Databricks Python notebook?

Method 1: Using the "com.crealytics.spark.excel" package, how do I import the package? Method 2: Using pandas, I tried the possible paths, but it shows file not found, and uploading the xls/xlsx file does not show options for importing the dataframe either. Help ...

Data Engineering
excel
import
pyspark
python
Latest Reply
vishwanath_1
New Contributor III
  • 0 kudos

import pandas as pd
# Make sure you add /dbfs to FilePath so pandas reads through the local DBFS mount.
ExcelData = pd.read_excel("/dbfs" + FilePath, sheet_name=sheetName)
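
The advice about the /dbfs prefix can be sketched as a small path helper. This is a minimal sketch, assuming the file was uploaded to DBFS and that the cluster exposes DBFS through the local /dbfs mount; the function name `to_local_dbfs_path` and the example path in the comments are hypothetical.

```python
def to_local_dbfs_path(path: str) -> str:
    """Return the local-filesystem form of a DBFS path (hypothetical helper)."""
    if path.startswith("/dbfs/"):
        return path  # already points at the local mount
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]  # drop the URI scheme
    if not path.startswith("/"):
        path = "/" + path
    return "/dbfs" + path


# Usage in a notebook (the path and sheet name are placeholders):
# import pandas as pd
# excel_data = pd.read_excel(
#     to_local_dbfs_path("dbfs:/FileStore/tables/report.xlsx"),
#     sheet_name="Sheet1",
# )
```

Reading .xlsx files with pandas also needs the openpyxl package installed on the cluster.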

anushajalesh28
by Visitor
  • 69 Views
  • 2 replies
  • 1 kudos

Catalog issue

When I was trying to create a catalog, I got an error telling me to specify the Azure storage account and storage container in the following query: CREATE CATALOG IF NOT EXISTS Databricks_Anu_Jal_27022024 MANAGED LOCATION 'abfss://<databricks-workspace-stack-anu...

Get Started Discussions
Azure Databricks
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @anushajalesh28, To create a catalog in Azure Databricks, you need to specify the Azure storage account and storage container in the MANAGED LOCATION clause. Let’s break down the query: CREATE CATALOG IF NOT EXISTS Databricks_Anu_Jal_270220...
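
How the storage account and container fit into the MANAGED LOCATION clause can be sketched as follows. This is an illustration only: the helper names, the catalog name `demo_catalog`, and the container, account, and path values are all hypothetical, and it assumes the workspace already has access to that storage location.

```python
def abfss_uri(container: str, storage_account: str, path: str = "") -> str:
    """Build an abfss:// URI for Azure Data Lake Storage Gen2 (hypothetical helper)."""
    uri = f"abfss://{container}@{storage_account}.dfs.core.windows.net"
    path = path.strip("/")
    return f"{uri}/{path}" if path else uri


def create_catalog_sql(catalog: str, location: str) -> str:
    """Compose a CREATE CATALOG statement with a MANAGED LOCATION clause."""
    return f"CREATE CATALOG IF NOT EXISTS {catalog} MANAGED LOCATION '{location}'"


# Placeholder values, not taken from the original post.
location = abfss_uri("mycontainer", "mystorageaccount", "catalogs/demo")
statement = create_catalog_sql("demo_catalog", location)
print(statement)
```

The container comes before the `@` and the storage account after it, which is the part the error message in the question asks for.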

1 More Replies

Latest from our Blog

Lambda functions: Knights of the higher-order

It’s not often that a DBMS surprises me when it comes to SQL; I kind of think I have seen it all. However, there is this one feature in Spark SQL that made me go: “Huh! Now that’s cool!” when I first e...

914 Views 0 kudos

Databricks cost analysis and cross charge with Power BI

Authors: Liping Huang (@Liphuan) and Marius Panga (@mariuspc) Introduction Effective cost management is a critical consideration for any cloud data platform. Historically, achieving cost control and i...

1656 Views 1 kudos