Databricks Platform Discussions


Activity in Databricks Platform Discussions

PrashantAghara
by Visitor
  • 5 Views
  • 1 reply
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to Cosmos DB using Python & Spark on Databricks. I am getting the error below: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

Latest Reply
PrashantAghara
  • 0 kudos

The cluster configs are: worker & driver type: Standard_D16ads_v5. RUs for Cosmos: 1.5L.
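
For readers hitting the same committer failure, a minimal write sketch, assuming the Azure Cosmos DB Spark 3 OLTP connector is installed on the cluster; endpoint, key, names, and the partition count are placeholders, not the poster's exact code:

    cfg = {
        "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
        "spark.cosmos.accountKey": "<key>",
        "spark.cosmos.database": "<database>",
        "spark.cosmos.container": "<container>",
    }

    (df.repartition(32)              # fewer concurrent writers can ease RU throttling
       .write.format("cosmos.oltp")
       .options(**cfg)
       .mode("append")
       .save())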

DC3
by New Contributor
  • 117 Views
  • 2 replies
  • 0 kudos

Unable to access unity catalog volume via /Volumes in notebook

I have set up a volume in Unity Catalog in the format catalog/schema/volume, and granted all permissions to all users on the catalog, schema and volume. From the notebook I can see the /Volumes directory in the root of the file system but am unable to...

Latest Reply
DC3
New Contributor
  • 0 kudos

Thanks for your comments. The problem turned out to be the compute resource not having unity catalog enabled.
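
For anyone else debugging this, a quick sanity check once the compute is Unity Catalog-enabled; catalog, schema, volume, and file names are placeholders:

    # List the volume to confirm the path resolves:
    display(dbutils.fs.ls("/Volumes/<catalog>/<schema>/<volume>"))

    # Files in the volume can then be read like any other path:
    df = spark.read.csv("/Volumes/<catalog>/<schema>/<volume>/data.csv", header=True)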

1 More Replies
Sagas
by Visitor
  • 44 Views
  • 2 replies
  • 1 kudos

SparkR or sparklyr not showing history

Hi, for some reason Azure Databricks doesn't show History if the data is saved with SparkR (2 in the figure below) or sparklyr (3), but it does show it with Data Ingestion (0) or with PySpark (1). Is this a known bug or am I doing something wrong? Is ...

Databricks_history.PNG SparkR.PNG Sparklyr.PNG
Data Engineering
sparklyr
SparkR
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Sagas, let's address your questions regarding Azure Databricks, SparkR, and sparklyr. History in Azure Databricks: each operation that modifies a Delta Lake table creates a new table version. You can use history information to audit operation...
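
A minimal sketch of checking that history from Python; the table name is a placeholder. Delta table history is recorded regardless of which API (SparkR, sparklyr, PySpark) wrote the table, so if versions are missing here the writes likely did not go through Delta:

    from delta.tables import DeltaTable

    dt = DeltaTable.forName(spark, "catalog.schema.my_table")
    display(dt.history())

    # Equivalent SQL:
    spark.sql("DESCRIBE HISTORY catalog.schema.my_table").show(truncate=False)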

1 More Replies
patrickw
by New Contributor
  • 125 Views
  • 2 replies
  • 0 kudos

connect timed out error - Connecting to SQL Server from Databricks

I am getting a connect timed out error when attempting to access a SQL Server. I can successfully ping the server from Databricks. I have used the JDBC connection and the included sqlserver driver, and both result in the same error. I have also attemp...

Latest Reply
Walter_C
Valued Contributor II
  • 0 kudos

Can you run the following command in a notebook using the same cluster you are using to connect: %sh nc -vz <hostname> <port> This test will confirm whether we can reach the SQL Server on the port you are defining for the connection. If...
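
If the nc test passes, a hedged sketch of the JDBC read the poster describes; host, port, database, table, and credentials are placeholders:

    jdbc_url = "jdbc:sqlserver://<host>:1433;databaseName=<db>;encrypt=true;trustServerCertificate=true"

    df = (spark.read.format("jdbc")
          .option("url", jdbc_url)
          .option("dbtable", "dbo.my_table")
          .option("user", "<user>")
          .option("password", "<password>")
          .load())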

1 More Replies
DLL
by New Contributor
  • 23 Views
  • 0 replies
  • 0 kudos

Some columns are being dropped when moving to pandas data set.

Some columns are being dropped when moving to a pandas data set. I see part of the dataset, but it does not show when displaying...
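
One likely explanation (an assumption, since the thread has no replies yet): pandas truncates wide DataFrames when printing, which looks like dropped columns. Raising the display limits and checking the column list directly would confirm it:

    import pandas as pd

    pd.set_option("display.max_columns", None)  # stop pandas truncating wide frames
    pd.set_option("display.width", None)

    pdf = df.toPandas()           # df is the Spark DataFrame
    print(pdf.columns.tolist())   # verify every column survived the conversion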

Joaquim
by New Contributor II
  • 264 Views
  • 2 replies
  • 0 kudos

New admin question: How do you enable R on an existing cluster?

Hello Community. I have a user trying to use R who receives the error message illustrated in the attachment. I can't seem to find the correct documentation on enabling R on an existing cluster. Would anyone be able to point me in the right direction? Than...

Latest Reply
Walter_C
Valued Contributor II
  • 0 kudos

Hello Joaquim, your issue might be related to the access mode of your cluster, which has probably been set to Shared access mode. Shared clusters only allow the Python, SQL and Scala languages; you might need to change the access mode to Single U...

1 More Replies
enkefalos-commu
by New Contributor
  • 79 Views
  • 1 reply
  • 0 kudos

Unable to create serving endpoint for the huggingface model phi-3-mini-128k-instruct

#20 69.92 ERROR: Could not find a version that satisfies the requirement transformers==4.41.0.dev0 (from versions: 0.1, 2.0.0, 2.1.0, 2.1.1, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.6.0, 2.7.0, 2.8.0, 2.9.0, 2.9.1, 2.10.0, 2.11.0, 3....

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @enkefalos-commu, I apologize for the inconvenience you're facing. Let's troubleshoot this issue together. Here are some steps you can take. Check your Python environment: ensure that you are using a compatible Python environment. Transformer...
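
A hedged workaround sketch: transformers==4.41.0.dev0 is a development version that is not published to PyPI, so either pin the latest released version or install from source. The version number below is illustrative:

    # Run one of these in its own notebook cell, then restart Python:
    %pip install transformers==4.40.0
    # or, to get the development version the model's requirements pin:
    # %pip install git+https://github.com/huggingface/transformers.git
    dbutils.library.restartPython()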

subha2
by New Contributor
  • 80 Views
  • 1 reply
  • 0 kudos

metadata driven DQ validation for multiple tables dynamically

There are multiple tables in the config/metadata table. These tables need to be validated for DQ rules: 1. Natural Key / Business Key / Primary Key cannot be null or blank. 2. Natural Key / Primary Key cannot be duplicated. 3. Join columns missing values. 4. Busine...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @subha2, to dynamically validate the data quality (DQ) rules for tables configured in a metadata-driven system using PySpark, you can follow these steps. Define metadata for tables: first, create a metadata configuration that describes the rules ...
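
A minimal sketch of that metadata-driven loop for the first two rules (null/blank keys and duplicate keys); the metadata table name and its columns (table_name, key_cols) are assumptions for illustration:

    from pyspark.sql import functions as F

    rules = spark.table("config.dq_metadata").collect()  # one row per table to validate

    results = []
    for r in rules:
        df = spark.table(r["table_name"])
        keys = r["key_cols"].split(",")

        # Rule 1: key columns cannot be null or blank
        null_cnt = df.filter(
            " OR ".join(f"({k} IS NULL OR {k} = '')" for k in keys)
        ).count()

        # Rule 2: key columns cannot be duplicated
        dup_cnt = df.groupBy(*keys).count().filter(F.col("count") > 1).count()

        results.append((r["table_name"], null_cnt, dup_cnt))

    display(spark.createDataFrame(
        results, "table_name string, null_keys long, dup_keys long"))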

Phani1
by Valued Contributor
  • 48 Views
  • 1 reply
  • 1 kudos

Job cluster configuration for 24/7

Hi Team, we intend to activate the job cluster around the clock. We consider the following parameters regarding cost: data volumes, client SLA for job completion, and starting with a small cluster configuration. Please advise on any other options we s...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Phani1, when configuring a job cluster for 24/7 operation, it's essential to consider cost, performance, and scalability. Here are some recommendations based on your specified parameters. Data volumes: analyze your data volumes carefully. If...
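
A hedged sketch of a job cluster spec (the new_cluster fields of the Jobs API) that starts small and autoscales with load; the node type, runtime version, and worker counts are placeholders to adapt to your SLA:

    new_cluster = {
        "spark_version": "14.3.x-scala2.12",
        "node_type_id": "Standard_D4ads_v5",
        "autoscale": {"min_workers": 2, "max_workers": 8},
        # On Azure, spot instances with on-demand fallback are a common cost lever:
        "azure_attributes": {
            "first_on_demand": 1,
            "availability": "SPOT_WITH_FALLBACK_AZURE",
        },
    }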

agarg
by Visitor
  • 80 Views
  • 1 reply
  • 0 kudos

Databricks REST API to fetch mount points

Is there a way to fetch workspace mount points (mount info) through the REST API or a SQL query? (similar to the Python API display(dbutils.fs.mounts())) I couldn't find any REST API for the mounts in the official Databricks API documentation (...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @agarg, As of now, the Databricks REST API does not directly provide a specific endpoint to fetch workspace mount points or mount information. However, you can achieve this by executing SQL queries on Databricks SQL Warehouse using the Databricks ...
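
One common workaround, sketched here under the assumption that a scheduled notebook or job is acceptable: run dbutils.fs.mounts() on a cluster and persist the result somewhere queryable (the output table name is a placeholder):

    mounts = [(m.mountPoint, m.source) for m in dbutils.fs.mounts()]

    (spark.createDataFrame(mounts, "mount_point string, source string")
          .write.mode("overwrite")
          .saveAsTable("ops.workspace_mounts"))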

Sikki
by New Contributor
  • 156 Views
  • 5 replies
  • 0 kudos

Databricks Asset Bundle Workflow Redeployment Issue

Hello All, in my Databricks workflows I have three tasks configured, with the final task set to run only if the condition "ALL_DONE" is met. During the first deployment, I observed that the dependency "ALL_DONE" was correctly assigned to the last tas...

Latest Reply
Yeshwanth
Valued Contributor
  • 0 kudos

Hi @Sikki, good day! There was an issue, and it was fixed recently. Could you please confirm whether you are still facing it? Best regards,

4 More Replies
madrhr
by New Contributor
  • 214 Views
  • 3 replies
  • 1 kudos

SparkContext lost when running %sh script.py

I need to execute a .py file in Databricks from a notebook (with arguments, which for simplicity I exclude here). For this I am using %sh script.py, where script.py contains: from pyspark import SparkContext; def main(): sc = SparkContext.getOrCreate(); print(sc...

Data Engineering
%sh
.py
bash shell
SparkContext
SparkShell
Latest Reply
madrhr
New Contributor
  • 1 kudos

I got it eventually working with a combination of:

    import sys
    from databricks.sdk.runtime import *  # provides spark/dbutils outside notebook cells
    spark.sparkContext.addPyFile("/path/to/your/file")
    sys.path.append("path/to/your")

2 More Replies
pfpmeijers
by Visitor
  • 44 Views
  • 1 reply
  • 0 kudos

Databricks on premise GDCE

Hello, any plans for supporting Databricks on GDCE or another private cloud-native stack/HW on premises? Regards, Patrick

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @pfpmeijers, As of now, Databricks primarily operates as a unified, open analytics platform for constructing, deploying, sharing, and maintaining enterprise-grade data, analytics, and AI solutions at scale. It seamlessly integrates with cloud stor...

NOOR_BASHASHAIK
by Contributor
  • 405 Views
  • 3 replies
  • 0 kudos

Machine Type for VACUUM operation

Dear all, I have a workflow with 2 tasks: one that does OPTIMIZE, followed by one that does VACUUM. I used a cluster with an F32s driver and F64s workers (8 workers, auto-scaling enabled). All 8 workers are launched by Databricks as soon as OPTIMIZE starts. As ...

NOOR_BASHASHAIK_0-1710268182562.png
Data Engineering
best practice
F series
optimize
vacuum
Latest Reply
ArturOA
Visitor
  • 0 kudos

Hi, were you able to get any useful help on this?
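
For context, a minimal sketch of the two-task pattern the thread describes; the table name and retention window are placeholders. OPTIMIZE's compaction is worker-heavy, while VACUUM's file deletes run on the driver by default, so VACUUM often benefits less from a large worker fleet:

    spark.sql("OPTIMIZE catalog.schema.big_table")
    spark.sql("VACUUM catalog.schema.big_table RETAIN 168 HOURS")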

2 More Replies
PrebenOlsen
by New Contributor III
  • 109 Views
  • 2 replies
  • 0 kudos

How to migrate Git repos with DLT configurations

Hi! I want to migrate all my Databricks-related code from one GitHub repo to another. I knew this wouldn't be straightforward. When I copy my code for one DLT, I get the error: org.apache.spark.sql.catalyst.ExtendedAnalysisException: Table 'vessel_batt...

Latest Reply
PrebenOlsen
New Contributor III
  • 0 kudos

Does cloning take considerably less time than recreating the tables? Can I resume append operations to a cloned table?
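
On the cloning question, a hedged sketch of a Delta deep clone; the table names are placeholders. A deep clone copies existing data files rather than recomputing the table, so it is usually much faster than a rebuild, and the result is an ordinary Delta table that appends can resume against:

    spark.sql("""
      CREATE TABLE IF NOT EXISTS target_catalog.schema.my_table
      DEEP CLONE source_catalog.schema.my_table
    """)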

1 More Replies