- 619 Views
- 1 reply
- 0 kudos
Delta Live Tables merging
Here at Data + AI Summit, day 3. Very excited to hear about Delta Live Tables and its functionality for applying changes, which provides a direct solution to the dilemma at our company when uploading data!
Hi @rli, We're thrilled to hear that you had a great experience at DAIS 2023! Your feedback is valuable to us, and we appreciate you taking the time to share it on the community platform. We wanted to let you know that the Databricks Community Team ...
- 917 Views
- 1 reply
- 0 kudos
How does coalesce work internally?
Hi Databricks team, I am trying to understand the internals of the Spark coalesce code (DefaultPartitionCoalescer) and am going through the Spark code for this. While I understood the coalesce function, I am not sure about the complete flow of the code, like where it gets call...
Hello @subham0611, The coalesce operation triggered from user code can be initiated from either an RDD or a Dataset, with each having distinct codepaths: RDD: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD...
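For illustration, a minimal sketch of the two entry points the reply mentions; it assumes a standard PySpark session, and the partition counts are only illustrative.

```python
# Minimal sketch of the two coalesce codepaths; assumes a standard PySpark
# session, and the partition counts are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# RDD codepath: RDD.coalesce builds a CoalescedRDD, whose partition grouping
# is computed by DefaultPartitionCoalescer.
rdd = spark.sparkContext.parallelize(range(100), 10)
print(rdd.coalesce(2).getNumPartitions())  # 2

# Dataset codepath: Dataset.coalesce adds a Repartition node (shuffle=False)
# to the logical plan, which Catalyst plans before any RDD is materialized.
df = spark.range(100).repartition(10)
print(df.coalesce(2).rdd.getNumPartitions())  # 2
```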
- 833 Views
- 1 reply
- 0 kudos
Issues with Runtime 15.1/15.2 Beta in shared access mode
We have been using Runtime 14.2 in shared access mode for our compute cluster in Databricks for quite some time. We are now trying to upgrade to Python 3.11 for some dependency management, thereby requiring us to use Runtime 15.1/15.2, as Runtime 14.2 only ...
Hi @Neeraj_Kumar, Ensure that the necessary libraries are available in the repository used for installation. Verify that the library versions specified are correct and available. Consider installing the library with a different version or from a diffe...
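A hypothetical notebook cell illustrating those two workarounds; the package name, version, and index URL below are placeholders, not taken from the thread.

```python
%pip install my-package==1.2.3 --index-url https://pypi.org/simple
# ^ package name, version, and index URL are placeholders; pinning a known-good
#   version or pointing pip at a different index are the workarounds suggested.
dbutils.library.restartPython()  # run in a separate cell after the install
```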
- 1982 Views
- 2 replies
- 0 kudos
Resolved! Why does saving a PySpark df always convert a string field to a number?
import pandas as pd
from pyspark.sql.types import StringType, IntegerType
from pyspark.sql.functions import col

save_path = os.path.join(base_path, stg_dir, "testCsvEncoding")
d = [{"code": "00034321"}, {"code": "55964445226"}]
df = pd.Data...
@georgeyjy Try opening the CSV in a text editor. I bet that Excel is automatically trying to detect the schema of the CSV, and thus it thinks that it's an integer.
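If the goal is to keep the leading zeros, a minimal sketch of forcing the column to stay a string; it assumes a session where `spark` is defined, and the output path is a placeholder.

```python
# Minimal sketch: declare the column as StringType and read back without
# schema inference, so "00034321" keeps its leading zeros.
from pyspark.sql.types import StructType, StructField, StringType

schema = StructType([StructField("code", StringType())])
df = spark.createDataFrame([("00034321",), ("55964445226",)], schema)
df.write.mode("overwrite").option("header", True).csv("/tmp/testCsvEncoding")

# inferSchema defaults to false for CSV, so the column comes back as a string.
spark.read.option("header", True).csv("/tmp/testCsvEncoding").show()
```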
- 1346 Views
- 2 replies
- 0 kudos
Resolved! Unable to access AWS S3 - Error: java.nio.file.AccessDeniedException
Reading a file like this: Data = spark.sql("SELECT * FROM edge.inv.rm"). Getting this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 441.0 failed 4 times, most recent failure: Lost task 10.3 in stage 441.0 (TID...
Hi @Madhawa, Ensure that the AWS credentials (access key and secret key) are correctly configured in your Spark application. You can set them using spark.conf.set("spark.hadoop.fs.s3a.access.key", "your_access_key") and spark.conf.set("spark.hadoop....
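A hedged sketch of that configuration; the key values are placeholders, and in production an instance profile or a Unity Catalog external location is generally preferable to hard-coded credentials.

```python
# Hedged sketch of the credential configuration described above; the key
# values are placeholders, not real credentials.
spark.conf.set("spark.hadoop.fs.s3a.access.key", "<your_access_key>")
spark.conf.set("spark.hadoop.fs.s3a.secret.key", "<your_secret_key>")

# If the table resolves to an s3a:// location, the query should now succeed.
spark.sql("SELECT * FROM edge.inv.rm").show()
```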
- 795 Views
- 1 reply
- 0 kudos
Unable to install a wheel file from my volume on a serverless cluster
I am trying to install a wheel file which is in my volume on a serverless cluster and am getting the below error. @ken @Kaniz_Fatma Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages. WARN...
Hi @Shravanshibu, Verify that the wheel file is actually present at the specified location. Double-check the path to ensure there are no typos or missing directories. Remember that Databricks mounts DBFS (Databricks File System) at /dbfs on cluster no...
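A hypothetical cell showing a volume-path install; the catalog, schema, volume, and wheel filename are all placeholders.

```python
%pip install /Volumes/my_catalog/my_schema/my_volume/my_pkg-0.1.0-py3-none-any.whl
# ^ the /Volumes path and wheel name above are placeholders
# In a separate cell, restart Python so the session picks up the new package:
dbutils.library.restartPython()
```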
- 649 Views
- 1 reply
- 0 kudos
DLT to push data instead of pull
I am relatively new to Databricks, and from my recent experience it appears that at every step in a DLT pipeline, we define each LIVE TABLE (be it streaming or not) to pull data from upstream. I have yet to see an implementation where data from upstream woul...
Hi @_databreaks, You’re absolutely right! While the typical approach in Databricks involves pulling data from upstream sources into downstream tables, there are scenarios where a push-based architecture could be beneficial. Pull-Based Architectu...
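A minimal sketch of the pull model being described, where each table declares the upstream it reads from; the table names and source path are assumptions for illustration.

```python
# Minimal sketch of the pull model: each DLT table declares what it reads
# from upstream. Table names and the source path are assumptions.
import dlt
from pyspark.sql.functions import col

@dlt.table
def bronze_orders():
    # Pulls raw JSON files from storage via Auto Loader.
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/main/raw/orders/"))

@dlt.table
def silver_orders():
    # Pulls from the upstream live table; nothing pushes into this table.
    return dlt.read_stream("bronze_orders").where(col("order_id").isNotNull())
```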
- 760 Views
- 1 reply
- 0 kudos
Databricks UC Data Lineage Official Limitations
Hi all. I have a huge data migration project using medallion architecture, UC, notebooks, and workflows. One of the relevant requirements we have is to capture all data dependencies (upstreams and downstreams) using data lineage. I've followed all re...
Hi @RobsonNLPT, Consider checking the documentation for any updates or upcoming features related to capturing CTEs as upstreams in your chosen solution.
- 723 Views
- 1 reply
- 0 kudos
How to read data from a Unity Catalog volume on Databricks worker nodes
I am currently working on a similarity search use case where we need to extract text from PDF files and create a vector index. We have stored our PDF files in a Unity Catalog volume, and I can successfully read these files from the driver node. Here's...
Hi @devendra_tomar, Unity Catalog volumes represent logical storage volumes in a cloud object storage location. They allow governance over non-tabular datasets, providing capabilities for accessing, storing, and organizing files. While tables govern ...
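One hedged way to move the read off the driver is Spark's binaryFile source, which loads each PDF's bytes as a column on the executors; the volume path below is a placeholder.

```python
# Hedged sketch: the binaryFile source reads each PDF on the executors, so
# text extraction can then run as a distributed transformation or UDF.
pdfs = (spark.read.format("binaryFile")
        .option("pathGlobFilter", "*.pdf")
        .load("/Volumes/main/default/pdf_files/"))  # placeholder volume path

# Each row carries path, modificationTime, length, and the raw bytes in `content`.
pdfs.select("path", "length").show(truncate=False)
```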
- 1894 Views
- 3 replies
- 0 kudos
Resolved! Unable to generate an account-level PAT for a service principal
I am trying to generate a PAT for a service principal. I am following the documentation as shown below: https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#create-token-in-account I have prepared the below curl command: I am getting the below error: Pl...
I was able to generate the workspace-level token using the Databricks CLI. I set the following details in the Databricks CLI profile (.databrickscfg) file:
host = https://myworkspace.azuredatabricks.net/
account_id = (my db account id)
client_id = ...
- 2204 Views
- 2 replies
- 1 kudos
[Delta Live Tables vs. Workflows]
Hi Community Members, I have been using Databricks for a while, but I have only used Workflows. I have a question about the differences between Delta Live Tables and Workflows: which one should we use in which scenario? Thanks,
Hi, Delta Live Tables focuses on the ingestion, transformation, and management of Delta tables using a declarative framework. Job Workflows are designed to orchestrate and schedule various data processing and analysis tasks, including SQL q...
- 2187 Views
- 2 replies
- 1 kudos
Resolved! Enable or disable Databricks Assistant in the Community Edition.
Hello, good afternoon, great people. I was following the step-by-step instructions to enable or disable Databricks Assistant in my Databricks Community Edition to enable the AI assistant. However, I couldn't find the option and was unable to enable it...
- 724 Views
- 1 reply
- 0 kudos
DAB template dbt-sql not working
Hi, we are trying to use the dbt-sql template provided for Databricks Asset Bundles but are getting an error that looks like it's regarding the default catalog configuration. Has anyone faced this previously, or can anyone help with the same?
Hi @paritosh_sharma, In Databricks, you can use the USE CATALOG command to switch between catalogs. If the default catalog is not set, you might encounter errors. You can try setting the default catalog by using the command USE CATALOG 'your_catalog...
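A minimal sketch of that check from a notebook; the catalog name is a placeholder.

```python
# Minimal sketch: set the session's default catalog and confirm it took effect.
spark.sql("USE CATALOG my_catalog")          # placeholder catalog name
spark.sql("SELECT current_catalog()").show()
```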
- 2438 Views
- 1 reply
- 2 kudos
How to collect a thread dump from the Databricks Spark UI
If you observe a hung job, thread dumps are crucial to determine the root cause. Hence, it would be a good idea to collect the thread dumps before cancelling the hung job. Here are the instructions to collect the Spark driver/executor thread dump: ...
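When the UI is out of reach (for example, when automating collection for a hung job), a hedged sketch using Spark's monitoring REST API; the localhost:4040 address is Spark's default UI port and is an assumption, since on Databricks the UI may be proxied elsewhere.

```python
# Hedged sketch: Spark's monitoring REST API exposes thread dumps at
# /applications/<app-id>/executors/<executor-id>/threads ("driver" addresses
# the driver JVM). The localhost:4040 port is an assumption.
import requests

app_id = spark.sparkContext.applicationId
url = f"http://localhost:4040/api/v1/applications/{app_id}/executors/driver/threads"
threads = requests.get(url).json()
print(f"{len(threads)} threads captured")
```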
- 999 Views
- 1 reply
- 0 kudos
Response code 400 received when using VSCode on Windows 10 but no issue while using Ubuntu
I use VSCode on Windows 10 for building and deploying a workflow from my system and always encounter response code 400 when trying to deploy it. I am able to deploy the workflows via Ubuntu, but not via Windows. Has anyone encountered this issue befo...
Hi @traillog, Windows uses a different path format compared to Unix-based systems like Ubuntu. Make sure that the paths in your script are in the correct format for Windows.
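A minimal sketch of the portability fix being suggested: build paths with pathlib instead of hard-coding separators. The directory names are placeholders.

```python
# Minimal sketch: pathlib picks the right separator on each OS, so the same
# script works on Windows and Ubuntu. Directory names are placeholders.
from pathlib import Path

bundle_root = Path.home() / "projects" / "my_bundle"
config = bundle_root / "databricks.yml"
print(config)  # e.g. C:\Users\me\projects\my_bundle\databricks.yml on Windows
```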