
Welcome to the Databricks Community

Discover the latest insights, collaborate with peers, get help from experts and make meaningful connections

Introducing the Databricks AI Fund

Databricks Ventures Launches New Fund to Extend our Ecosystem Leadership
We launched Databricks Ventures in December 2021 as our strategic investment arm for funding innovative startups across the data, analytics and AI landscape — companies that sha...

  • 44 Views
  • 0 replies
  • 0 kudos
yesterday
Announcing General Availability of Liquid Clustering

Out-of-the-box, self-tuning data layout that scales with your data
We’re excited to announce the General Availability of Delta Lake Liquid Clustering in the Databricks Data Intelligence Platform. Liquid Clustering is an innovative data management tec...

  • 55 Views
  • 0 replies
  • 0 kudos
yesterday
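
For readers who want to try the feature described above, a minimal sketch of enabling Liquid Clustering on a new Delta table follows; the catalog, schema, table, and column names are invented for illustration, and spark is the session predefined in a Databricks notebook:

# Create a Delta table with Liquid Clustering on a chosen key (names are placeholders)
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.demo.events (
        event_id BIGINT,
        event_date DATE,
        payload STRING
    )
    CLUSTER BY (event_date)
""")

# OPTIMIZE incrementally clusters newly written data on liquid clustered tables
spark.sql("OPTIMIZE main.demo.events")
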
Accelerating the Scientific AI Revolution

TetraScience and Databricks Join Forces To Transform Scientific Research, Development, Manufacturing, and Quality Control in Life Sciences
BOSTON & SAN FRANCISCO, May 20th, 2024 - TetraScience and Databricks today announced a strategic partnership de...

  • 399 Views
  • 0 replies
  • 2 kudos
Monday
Supercharge Your Code Generation

We are excited to introduce Databricks Assistant Autocomplete now in Public Preview. This feature brings the AI-powered assistant to you in real-time, providing personalized code suggestions as you type. Directly integrated into the notebook and SQL ...

  • 172 Views
  • 1 reply
  • 2 kudos
Monday

Community Activity

Ravi_Bobbili
by New Contributor

Unable to connect to external metastore from databricks warehouse cluster

Hi, we are using Azure SQL as an external metastore. We are trying to access this external metastore from Databricks warehouse clusters but are getting an error: `data` property must be defined in SQL query response. But we are able to connect to the same usi...

  • 116 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @Ravi_Bobbili, I understand that you’re having trouble accessing an Azure SQL external metastore from Databricks warehouse clusters. This issue might be due to a few reasons: Ensure that your configuration settings are correct. The error mess...

  • 0 kudos
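
For context on the configuration the reply refers to, here is a minimal sketch of the Spark properties commonly used to point a cluster at an external Hive metastore backed by Azure SQL. The server, database, user, and secret scope values are placeholders; on SQL warehouses these properties are typically set by an admin in the warehouse data access configuration rather than in a notebook:

# Sketch only: typical properties for an external Hive metastore hosted in Azure SQL.
# All connection values below are placeholders; the password usually comes from a secret.
external_metastore_conf = {
    "spark.sql.hive.metastore.version": "3.1.0",
    "spark.sql.hive.metastore.jars": "maven",
    "spark.hadoop.javax.jdo.option.ConnectionURL":
        "jdbc:sqlserver://<server>.database.windows.net:1433;database=<metastore_db>",
    "spark.hadoop.javax.jdo.option.ConnectionDriverName":
        "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "spark.hadoop.javax.jdo.option.ConnectionUserName": "<user>",
    "spark.hadoop.javax.jdo.option.ConnectionPassword": "{{secrets/<scope>/<key>}}",
}

for key, value in external_metastore_conf.items():
    print(key, value)  # paste into the cluster or warehouse data access configuration
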
VJ3
by New Contributor III

Azure Databricks Secret Management

Hi, hope you are doing well. I came to know that Databricks also provides secret management, so I would like to compare it with other well-known secrets management solutions on the market, such as Azure Key Vault and CyberArk. Can someone provide ...

  • 1611 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @VJ3, I can provide some insights on Databricks Secrets Management and how it compares to other solutions like Azure Key Vault. Benefits of Databricks Secrets Management: Integration with Databricks workflows: Databricks Secrets Management is d...

  • 0 kudos
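
To make the comparison above a little more concrete, a short sketch of how a secret scope is consumed from a notebook; the scope, key, and JDBC details are placeholders, and the scope itself could be Databricks-backed or backed by Azure Key Vault:

# Sketch: reading a secret from a secret scope and using it for a JDBC read.
# dbutils and spark are predefined in Databricks notebooks; all names below are placeholders.
jdbc_password = dbutils.secrets.get(scope="my-scope", key="jdbc-password")

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.example_table")
    .option("user", "<user>")
    .option("password", jdbc_password)  # value is redacted if echoed in notebook output
    .load()
)
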
RobinK
by New Contributor III

Databricks Jobs do not run on job compute but on shared compute

Hello, since last night none of our ETL jobs in Databricks are running anymore, although we have not made any code changes. The identical jobs (deployed with Databricks Asset Bundles) run on an all-purpose cluster, but fail on a job cluster. We have no...

  • 131 Views
  • 6 replies
  • 5 kudos
Latest Reply
ha2983
Visitor

This notebook can be used to recreate the issue:

import pandas as pd
from databricks.connect import DatabricksSession
from pyspark.sql.functions import current_timestamp

spark = DatabricksSession.builder.getOrCreate()

# Create a pandas DataFrame da...

  • 5 kudos
5 More Replies
Kishor
by New Contributor

Issue with Creating and Running Databricks Jobs with new databricks cli v0.214.0

Hi Databricks Support, I'm encountering an issue with creating and running jobs on Databricks. Here are the details: Problem Description: When attempting to create and run a job using the old JSON (which was successfully used to create and run jobs usin...

Get Started Discussions
Databricks CLI
databricks jobs
  • 386 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @Kishor, I’m sorry to hear that you’re having trouble with Databricks job creation and retrieval of run output. Issue 1: “Error: No task is specified.” This error typically occurs when the JSON file used for job creation does not specify a t...

  • 0 kudos
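
As a rough illustration of the fix the reply points at, a sketch of a Jobs API 2.1-style payload with an explicit tasks array, which the newer CLI expects; the job name, notebook path, and cluster settings are placeholders:

# Sketch: build a job spec with a "tasks" array and write it to a JSON file for the CLI.
# All names, paths, and cluster settings below are placeholders.
import json

job_spec = {
    "name": "example-job",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Workspace/Users/someone@example.com/my_notebook"},
            "new_cluster": {
                "spark_version": "14.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",
                "num_workers": 1,
            },
        }
    ],
}

with open("job.json", "w") as f:
    json.dump(job_spec, f, indent=2)

# Then, for example: databricks jobs create --json @job.json
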
FilipezAR
by New Contributor

Failed to create new KafkaAdminClient

I want to create connections to Kafka with spark.readStream using the following parameters:

kafkaParams = {
    "kafka.sasl.jaas.config": f'org.apache.kafka.common.security.plain.PlainLoginModule required username="{kafkaUsername}" password="{kafkaPa...

  • 173 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @FilipezAR, The error message suggests that the class org.apache.kafka.common.security.plain.PlainLoginModule is not found. This class is part of the Kafka clients JAR, and it should be included in the classpath of your Spark applications. There...

  • 0 kudos
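
For reference, a minimal sketch of a SASL/PLAIN spark.readStream configuration of the kind discussed above; the brokers, topic, and credentials are placeholders, and on open-source Spark (unlike Databricks runtimes) the spark-sql-kafka and kafka-clients packages would also need to be added to the job:

# Sketch: SASL/PLAIN Kafka read stream; all connection values are placeholders.
kafka_options = {
    "kafka.bootstrap.servers": "<broker1>:9092",
    "subscribe": "<topic>",
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "PLAIN",
    "kafka.sasl.jaas.config": (
        'org.apache.kafka.common.security.plain.PlainLoginModule required '
        'username="<user>" password="<password>";'
    ),
    "startingOffsets": "latest",
}

df = spark.readStream.format("kafka").options(**kafka_options).load()
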
Harispap
by Visitor

Different result between manual and automated task run

I have a notebook where I retrieve metadata about a previous task run from the API ".... /jobs/runs/get". The response should be a dictionary that contains information such as task key, run id, run page URL, etc. When I run the notebook as part of ...

  • 27 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @Harispap,  Yes, it’s possible to have different results between a manual and an automated task run. This could be due to a few reasons: The state of the job or tasks might have been different when the notebook was executed automatically compar...

  • 0 kudos
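
A small sketch of fetching run metadata from the Jobs API inside a notebook; the host, the secret scope, and the run_id widget wiring (for example via the {{job.run_id}} dynamic value reference in the task parameters) are assumptions for illustration:

# Sketch: call /api/2.1/jobs/runs/get for a given run id; all names below are placeholders.
import requests

host = "https://<workspace-host>"
token = dbutils.secrets.get(scope="<scope>", key="<token-key>")  # placeholder scope/key
run_id = dbutils.widgets.get("run_id")  # e.g. wired to {{job.run_id}} when run as a job task

resp = requests.get(
    f"{host}/api/2.1/jobs/runs/get",
    headers={"Authorization": f"Bearer {token}"},
    params={"run_id": run_id},
    timeout=30,
)
resp.raise_for_status()
run_info = resp.json()
print(run_info.get("run_page_url"), [t.get("task_key") for t in run_info.get("tasks", [])])
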
aranjan99
by Visitor

system.billing.usage table missing data for jobs running in my databricks account

I have some jobs running on Databricks. I can obtain their jobId from the Jobs UI or the List Job Runs API. However, when trying to get DBU usage for the corresponding jobs from system.billing.usage, I do not see the same job_id in that table. It's been mor...

  • 35 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @aranjan99, I understand your concern. The system.billing.usage table in Databricks is updated every hour, and it only includes completed tasks. If your jobs are long-running and have not completed yet, they will not appear in this table. Add...

  • 0 kudos
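
A quick sketch of the kind of query that checks whether a job's DBUs have landed in the billing table yet; the job id is a placeholder:

# Sketch: look up DBU usage for one job in system.billing.usage (job id is a placeholder).
job_id = "<job_id>"

usage = spark.sql(f"""
    SELECT usage_date,
           sku_name,
           SUM(usage_quantity) AS dbus
    FROM system.billing.usage
    WHERE usage_metadata.job_id = '{job_id}'
    GROUP BY usage_date, sku_name
    ORDER BY usage_date
""")
usage.show()
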
Hertz
by New Contributor

System Tables / Audit Logs action_name createWarehouse/createEndpoint

I am creating a cost dashboard across multiple accounts. I am working on getting SQL warehouse names and warehouse ids so I can combine them with system.access.billing on warehouse_id. But the only action_names that include both the warehouse_id and warehouse_n...

Data Engineering
Audit Logs
cost monitor
createEndpoint
createWarehouse
  • 50 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @Hertz, You’re correct that the editEndpoint/editWarehouse and deleteEndpoint/deleteWarehouse actions include both the warehouse_id and warehouse_name. However, the createWarehouse/createEndpoint actions do not include the warehouse_id. To get...

  • 0 kudos
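
Building on the reply, one hedged sketch for deriving a warehouse_id-to-name mapping from the audit events that carry both fields; the request_params keys ('id', 'name') and the service_name filter are assumptions worth verifying against a few raw rows first:

# Sketch: map warehouse ids to names from edit/delete audit events (field names are assumptions).
warehouse_names = spark.sql("""
    SELECT DISTINCT
           request_params['id']   AS warehouse_id,
           request_params['name'] AS warehouse_name
    FROM system.access.audit
    WHERE service_name = 'databrickssql'
      AND action_name IN ('editWarehouse', 'editEndpoint', 'deleteWarehouse', 'deleteEndpoint')
""")
warehouse_names.show(truncate=False)
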
saichandu_25
by New Contributor

Not able to read the file content completely using head

Hi, we want to read the content of a file and encode the content into base64. For that we have used the code below:

file_path = "/path/to/your/file.csv"
file_content = dbutils.fs.head(file_path, 512000000)
encode_content = base64.b64encode(file_conten...

  • 372 Views
  • 9 replies
  • 0 kudos
Latest Reply
-werners-
Esteemed Contributor III

I am curious what the use case is for wanting to load large files into GitHub, which is a code repo. Depending on the file format, different parsing is necessary; you could foresee logic for that in your program.

  • 0 kudos
8 More Replies
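
Since dbutils.fs.head returns a truncated string preview rather than the full file, a common alternative is to read the raw bytes and encode those; a minimal sketch follows, assuming the file is reachable through the /dbfs FUSE path (the path is a placeholder):

# Sketch: read the whole file as bytes and base64-encode it (path is a placeholder).
import base64

file_path = "/dbfs/path/to/your/file.csv"

with open(file_path, "rb") as f:
    raw = f.read()

encoded = base64.b64encode(raw).decode("utf-8")
print(len(raw), "bytes read,", len(encoded), "base64 characters")
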
subham0611
by New Contributor II

Parallel kafka consumer in spark structured streaming

Hi, I have a Spark streaming job which reads from Kafka, processes the data, and writes to Delta Lake. Number of Kafka partitions: 100; number of executors: 2 (4 cores each). So we have 8 cores in total reading from 100 partitions of a topic. I wanted to un...

  • 40 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @subham0611, In Spark Streaming, the number of threads is not explicitly controlled by the user. Instead, the parallelism is determined by the number of partitions in the Kafka topic. Each partition is consumed by a single Spark task. When you ...

  • 0 kudos
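
To make the parallelism discussion concrete, a sketch of the Kafka source options that influence how offsets are split across the 8 available cores; the broker, topic, and numeric values are placeholders:

# Sketch: with 8 cores and 100 Kafka partitions, each core handles several partitions per
# micro-batch; minPartitions and maxOffsetsPerTrigger give some control over the split.
df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "<broker>:9092")
    .option("subscribe", "<topic>")
    .option("minPartitions", "200")           # ask Spark to split Kafka offsets into more tasks
    .option("maxOffsetsPerTrigger", "100000")  # bound the records pulled per micro-batch
    .load()
)
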
Frantz
by New Contributor III

Error Code: METASTORE_DOES_NOT_EXIST when using Databricks API

Hello, I'm attempting to use the Databricks API to list the catalogs in the metastore. When I send the GET request to `/api/2.1/unity-catalog/catalogs`, I get this error. I have checked multiple times and yes, we do have a metastore associated with t...

[screenshot attachment: Frantz_0-1716331980508.png]

  • 181 Views
  • 1 reply
  • 0 kudos
Latest Reply
Kaniz
Community Manager

Hi @Frantz, The error METASTORE_DOES_NOT_EXIST typically indicates that the Databricks API is unable to find a metastore associated with your workspace. Here are a few things you could check: Directory Path: Ensure that the directory path you pro...

  • 0 kudos
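
For comparison, a minimal sketch of the same call made with Python requests; the workspace host and token are placeholders, and the request must target a workspace that is actually attached to the metastore (hitting the accounts host or an unattached workspace is one way to get METASTORE_DOES_NOT_EXIST):

# Sketch: list Unity Catalog catalogs over REST (host and token are placeholders).
import requests

host = "https://<workspace-host>"
token = "<pat-or-oauth-token>"

resp = requests.get(
    f"{host}/api/2.1/unity-catalog/catalogs",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
print([c["name"] for c in resp.json().get("catalogs", [])])
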
DataEngineer
by New Contributor II

AWS Email sending challenge from Databricks with UNITY CATALOG and Multinode cluster

Hi, I have implemented Unity Catalog with a multi-node cluster in Databricks. The workspace instance profile with EC2 access is also created in IAM, but we are still having a challenge sending emails from Databricks using the SES service. The same is working ...

  • 162 Views
  • 2 replies
  • 0 kudos
Latest Reply
Babu_Krishnan
New Contributor III

Hi @DataEngineer, were you able to resolve the issue? We are having the same issue when we try to use a multi-node cluster for Unity Catalog. Email functionality was working fine with a single-node cluster. We are getting "ConnectionRefusedError: [Errno 111]...

  • 0 kudos
1 More Reply
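
One avenue worth trying, sketched below, is sending through the SES HTTPS API with boto3 instead of raw SMTP, since the connection-refused error suggests the SMTP path is blocked from the cluster; the region and addresses are placeholders, and credentials are assumed to come from the cluster's instance profile:

# Sketch: send mail via the SES API over HTTPS (region, addresses are placeholders).
import boto3

ses = boto3.client("ses", region_name="us-east-1")

ses.send_email(
    Source="sender@example.com",
    Destination={"ToAddresses": ["recipient@example.com"]},
    Message={
        "Subject": {"Data": "Test from Databricks"},
        "Body": {"Text": {"Data": "Hello from a Unity Catalog cluster."}},
    },
)
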
NaeemS
by New Contributor II

Handling Aggregations in Feature Function

Hi, is it possible to handle aggregations using Feature Functions somehow? As we know, the logic defined in a Feature Function is applied to a single row when a join is being performed. But do we have any mechanism to handle aggregations too someho...

Data Engineering
Feature Functions
Feature Store
  • 86 Views
  • 2 replies
  • 0 kudos
Latest Reply
NaeemS
New Contributor II

Hi @Kaniz, thanks for your reply. I'm familiar with both of these. But I was wondering if we can include that part while logging our pipeline using feature stores, to handle the grouping and filtering as well.

  • 0 kudos
1 More Reply
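
Because Feature Functions evaluate row by row at join time, aggregations are usually pre-computed into a feature table instead; a hedged sketch using the feature engineering client follows, with all table, column, and catalog names invented for illustration:

# Sketch: pre-aggregate features into a feature table rather than inside a Feature Function.
from pyspark.sql import functions as F
from databricks.feature_engineering import FeatureEngineeringClient

agg_df = (
    spark.table("main.demo.transactions")
    .groupBy("customer_id")
    .agg(
        F.count("*").alias("txn_count_90d"),
        F.sum("amount").alias("txn_amount_90d"),
    )
)

fe = FeatureEngineeringClient()
fe.create_table(
    name="main.demo.customer_txn_aggregates",
    primary_keys=["customer_id"],
    df=agg_df,
    description="Pre-aggregated transaction features (sketch)",
)
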
Eric_Kieft
by New Contributor III

Materialized Views GA and Azure Region Availability

Materialized views are currently in public preview (as of May 2024). Is there a planned date for GA? Also, the limitations section for Azure notes: Databricks SQL materialized views are not supported in the South Central US and West US 2 regions. Will thi...

  • 129 Views
  • 2 replies
  • 1 kudos
Latest Reply
Eric_Kieft
New Contributor III

Thanks, Kaniz. Will this feature be available for these regions in the future?

  • 1 kudos
1 More Reply
StephenDsouza
by Visitor

Error during build process for serving model caused by detectron2

Hi all. Introduction: I am trying to register my model on Databricks so that I can serve it as an endpoint. The packages that I need are "torch", "mlflow", "torchvision", "numpy" and "git+https://github.com/facebookresearch/detectron2.git". For this, ...

  • 25 Views
  • 0 replies
  • 0 kudos
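
For context, a sketch of how the git dependency is typically declared when logging a pyfunc model so the serving image installs it at build time; the wrapper class and registered model name are placeholders, and whether the serving image can actually compile detectron2 is a separate concern:

# Sketch: log a pyfunc model with pip_requirements including the git dependency.
import mlflow

class DetectronWrapper(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        # placeholder inference logic
        return model_input

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=DetectronWrapper(),
        pip_requirements=[
            "torch",
            "torchvision",
            "numpy",
            "mlflow",
            "git+https://github.com/facebookresearch/detectron2.git",
        ],
        registered_model_name="detectron2_demo",  # placeholder name
    )
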


Latest from our Blog

How to use System Tables with Overwatch

Welcome to our blog post on integrating system tables with Overwatch! In this article, we'll delve into the exciting world of leveraging system tables to enhanc...

  • 559 Views
  • 3 kudos