Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

subham0611
by New Contributor II
  • 1545 Views
  • 1 replies
  • 0 kudos

Resolved! How does coalesce work internally

Hi Databricks team, I am trying to understand the internals of Spark's coalesce code (DefaultPartitionCoalescer) and am going through the Spark source for this. I understand the coalesce function itself, but I am not sure about the complete flow of the code, such as where it gets call...

Latest Reply
raphaelblg
Databricks Employee
  • 0 kudos

  Hello @subham0611, the coalesce operation triggered from user code can be initiated from either an RDD or a Dataset, and each has a distinct code path: RDD: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD...

  • 0 kudos
georgeyjy
by New Contributor II
  • 3038 Views
  • 2 replies
  • 0 kudos

Resolved! Why does saving a PySpark DataFrame always convert a string field to a number?

  import pandas as pd
  from pyspark.sql.types import StringType, IntegerType
  from pyspark.sql.functions import col
  save_path = os.path.join(base_path, stg_dir, "testCsvEncoding")
  d = [{"code": "00034321"}, {"code": "55964445226"}]
  df = pd.Data...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@georgeyjy Try opening the CSV in a text editor. I bet Excel is automatically trying to detect the schema of the CSV, so it treats the value as an integer.
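The point about schema detection can be checked without Spark or Excel. The following is a minimal sketch, assuming the two sample codes from the question: it writes them to CSV text with the standard library, reads them back as plain strings, and then mimics what a type-guessing reader might do. The helper names (`write_codes_csv`, `read_as_text`, `read_with_type_guessing`) are made up for illustration, and the type guessing is only an approximation of a spreadsheet's behavior.

```python
import csv
import io

# Sample values taken from the question; both are strings, one with leading zeros.
codes = ["00034321", "55964445226"]


def write_codes_csv(values):
    """Write the codes to CSV text, one value per row under a 'code' header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code"])
    for value in values:
        writer.writerow([value])
    return buffer.getvalue()


def read_as_text(csv_text):
    """Read the column back without any type inference (what a text editor shows)."""
    return [row["code"] for row in csv.DictReader(io.StringIO(csv_text))]


def read_with_type_guessing(csv_text):
    """Approximate a reader that converts digit-only fields to integers."""
    return [int(v) if v.isdigit() else v for v in read_as_text(csv_text)]


csv_text = write_codes_csv(codes)
print(csv_text)                           # raw file content, leading zeros intact
print(read_as_text(csv_text))             # strings are unchanged
print(read_with_type_guessing(csv_text))  # leading zeros are lost after int conversion
```

If the raw file content still shows the leading zeros, the file was written correctly and the conversion happens in whatever tool opens it.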

  • 0 kudos
1 More Replies
Madhawa
by New Contributor II
  • 2078 Views
  • 1 replies
  • 0 kudos

Unable to access AWS S3 - Error: java.nio.file.AccessDeniedException

I am reading the table like this: Data = spark.sql("SELECT * FROM edge.inv.rm"). I am getting this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 441.0 failed 4 times, most recent failure: Lost task 10.3 in stage 441.0 (TID...

NarenderKumar
by New Contributor III
  • 2170 Views
  • 2 replies
  • 0 kudos

Resolved! Unable to generate account level PAT for service principal

I am trying to generate a PAT for a service principal. I am following the documentation below: https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#create-token-in-account I have prepared the curl command below: I am getting the error below: Pl...

[Attached screenshots: NarenderKumar_0-1715695724302.png, NarenderKumar_1-1715695859890.png, NarenderKumar_2-1715695895738.png]
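The curl command from the question is only available as a screenshot, so the following is a hedged sketch rather than a reproduction of it. It assumes the account-level OAuth M2M flow described on the linked documentation page: a POST to the `/oidc/accounts/<account-id>/v1/token` endpoint with the service principal's client ID and secret as HTTP basic credentials. Note that this flow yields an OAuth access token, not a PAT. The function name `build_token_request` and all argument values are placeholders, and the request is only constructed, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(accounts_host, account_id, client_id, client_secret):
    """Build (but do not send) an account-level OAuth M2M token request.

    All four arguments are placeholders supplied by the caller; nothing here
    is taken from the original poster's environment.
    """
    url = f"{accounts_host.rstrip('/')}/oidc/accounts/{account_id}/v1/token"
    # Form-encoded body requesting a client-credentials token for all APIs.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    # The service principal's client ID and secret go in a Basic auth header.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical values for illustration only.
request = build_token_request(
    "https://accounts.azuredatabricks.net",
    "00000000-0000-0000-0000-000000000000",
    "my-client-id",
    "my-client-secret",
)
print(request.get_method(), request.full_url)
print(request.data.decode())
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and reading the `access_token` field of the JSON response, assuming the credentials are valid.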
Latest Reply
NarenderKumar
New Contributor III
  • 0 kudos

I was able to generate the workspace-level token using the Databricks CLI. I set the following details in the Databricks CLI profile (.databrickscfg) file: host = https://myworksapce.azuredatabricks.net/ account_id = (my db account id) client_id = ...

  • 0 kudos
1 More Replies
NhanNguyen
by Contributor II
  • 3157 Views
  • 2 replies
  • 1 kudos

[Delta Live Tables vs Workflows]

Hi Community Members, I have been using Databricks for a while, but I have only used Workflows. I have a question about the differences between Delta Live Tables and Workflows. Which one should we use in which scenario? Thanks,

Latest Reply
Hkesharwani
Contributor II
  • 1 kudos

Hi, Delta Live Tables focuses on managing data ingestion, transformation, and management of Delta tables using a declarative framework. Job Workflows are designed to orchestrate and schedule various data processing and analysis tasks, including SQL q...

  • 1 kudos
1 More Replies
kazinahian
by New Contributor III
  • 2977 Views
  • 1 replies
  • 0 kudos

Enable or disable Databricks Assistant in the Community Edition.

Hello, good afternoon, great people. I was following the step-by-step instructions to enable or disable Databricks Assistant in my Databricks Community Edition workspace so I could use the AI assistance. However, I couldn't find the option and was unable to enable it...

Latest Reply
kazinahian
New Contributor III
  • 0 kudos

Thank you @Retired_mod 

  • 0 kudos
paritosh_sharma
by New Contributor
  • 893 Views
  • 0 replies
  • 0 kudos

DAB template dbt-sql not working

Hi, we are trying to use the dbt-sql template provided for Databricks Asset Bundles but are getting the error shown below. It looks like it is related to the default catalog configuration. Has anyone faced this previously, or can anyone help with it?

[Attached screenshot: Screenshot 2024-05-17 at 10.25.38.png]
NandiniN
by Databricks Employee
  • 3246 Views
  • 1 replies
  • 2 kudos

How to collect a thread dump from Databricks Spark UI.

If you observe a hung job, thread dumps are crucial to determine the root cause. Hence, it would be a good idea to collect the thread dumps before cancelling the hung job. Here are the instructions to collect the Spark driver/executor thread dump: ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Thank you for sharing @NandiniN

  • 2 kudos
traillog
by New Contributor
  • 1146 Views
  • 0 replies
  • 0 kudos

Response code 400 received when using VSCode on Windows 10 but no issue while using Ubuntu

I use VSCode on Windows 10 for building and deploying a workflow from my system and always encounter response code 400 when trying to deploy it. I am able to deploy the workflows via Ubuntu, but not via Windows. Has anyone encountered this issue befo...

zgreen
by New Contributor
  • 604 Views
  • 0 replies
  • 0 kudos

jobs.python_wheel_task.entry_point can't find entry points defined in dependency packages

Let's say I have packageA with no entry points. packageA depends on the dependencyA package, which has entry points. In order to be able to use those entry points, i.e.
```yaml
python_wheel_task:
  package_name: packageA
  entry_point: dependencyA_entry
```
I ...
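One way to see why such a lookup could fail is to inspect which entry points each installed distribution registers on its own. The following is a minimal sketch, assuming (this is not confirmed by the post) that the task resolves `entry_point` only among the entry points declared by the distribution named in `package_name`. It fabricates two throwaway `.dist-info` directories named after the post's `packageA` and `dependencyA`, then lists entry points per distribution with the standard `importlib.metadata` module. The helpers `make_dist`, `entry_point_names`, and `demo`, and the target `dependencyA.cli:main`, are hypothetical.

```python
import tempfile
from importlib.metadata import distributions
from pathlib import Path


def make_dist(root, name, entry_points_text=""):
    """Create a fake installed distribution (metadata only) under root."""
    info = Path(root) / f"{name}-1.0.dist-info"
    info.mkdir()
    (info / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: 1.0\n"
    )
    (info / "entry_points.txt").write_text(entry_points_text)


def entry_point_names(root, package_name):
    """Return the entry point names registered by one distribution only."""
    for dist in distributions(path=[str(root)]):
        if dist.metadata["Name"] == package_name:
            return sorted(ep.name for ep in dist.entry_points)
    return []


def demo():
    with tempfile.TemporaryDirectory() as root:
        # packageA declares no entry points of its own.
        make_dist(root, "packageA")
        # dependencyA declares the entry point the job wants to call.
        make_dist(
            root,
            "dependencyA",
            "[console_scripts]\ndependencyA_entry = dependencyA.cli:main\n",
        )
        return {
            "packageA": entry_point_names(root, "packageA"),
            "dependencyA": entry_point_names(root, "dependencyA"),
        }


print(demo())
```

Under that assumption, `dependencyA_entry` is registered under `dependencyA` and is not visible when only `packageA` is inspected, so pointing `package_name` at `dependencyA` or re-exporting the entry point from `packageA` would be the options to try.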

Verr
by New Contributor II
  • 1234 Views
  • 2 replies
  • 0 kudos

Child notebook is not displaying output.

I have built a pipeline to execute a Databricks notebook containing SQL scripts. It executes the notebook, but I am not able to see the output for each cell. I am executing the child notebook through a driver notebook.

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

Hi Verr, in short, it depends on how your child notebook is configured, but I would start with the following points: Output Logging Settings: Check the logging settings for your notebook cells. Ensure that the cells are configured to display output. In...

  • 0 kudos
1 More Replies
GeKo
by New Contributor III
  • 3890 Views
  • 6 replies
  • 2 kudos

Resolved! column "storage_sub_directory" is now always NULL in system.information_schema.tables

Hello, I am running a job that depends on the information provided in the storage_sub_directory column of system.information_schema.tables, and it worked until 1-2 weeks ago. Now I discovered in the docs that this column is deprecated and always null, ...

Community Platform Discussions
Unity Catalog
unitycatalog
Latest Reply
GeKo
New Contributor III
  • 2 kudos

Many thanks for the update @NandiniN 

  • 2 kudos
5 More Replies
Lucifer
by New Contributor
  • 976 Views
  • 1 replies
  • 0 kudos

Displaying Unity Catalog metadata and other information in SharePoint

Are there any connectors or APIs that we can use to display metadata stored in Unity Catalog to business users in SharePoint?

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @Lucifer, since the metadata is stored as tables in the system schema, you can query it through Databricks and display the results in SharePoint. Docs: Statement Execution API: Run SQL on warehouses | Databricks on AWS
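As a rough illustration of the suggestion above, the sketch below builds a Statement Execution API request that queries table metadata from `system.information_schema.tables`. It assumes the `POST /api/2.0/sql/statements` endpoint with a `warehouse_id`, a `statement`, and a `wait_timeout`. The workspace host, token, and warehouse ID are placeholders, the function name `build_statement_request` is made up, and the request is only constructed, not sent. How the result is then pushed into SharePoint is outside the scope of this sketch.

```python
import json
import urllib.request


def build_statement_request(workspace_host, token, warehouse_id, statement):
    """Build (but do not send) a Statement Execution API request.

    workspace_host, token and warehouse_id are placeholders supplied by the
    caller; none of them come from the original thread.
    """
    url = f"{workspace_host.rstrip('/')}/api/2.0/sql/statements"
    payload = {
        "warehouse_id": warehouse_id,
        "statement": statement,
        # Wait up to 30 seconds for the result before the call returns.
        "wait_timeout": "30s",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values for illustration only.
metadata_query = (
    "SELECT table_catalog, table_schema, table_name "
    "FROM system.information_schema.tables LIMIT 100"
)
request = build_statement_request(
    "https://example.cloud.databricks.com",
    "my-access-token",
    "my-warehouse-id",
    metadata_query,
)
print(request.get_method(), request.full_url)
print(request.data.decode())
```

A scheduled job or integration service could send this request and hand the returned rows to whatever mechanism populates the SharePoint page.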

  • 0 kudos
