Community Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

Neeraj_Kumar
by New Contributor
  • 472 Views
  • 1 reply
  • 0 kudos

Issues with Runtime 15.1/15.2 Beta in shared access mode

We have been using runtime 14.2 in shared access mode for our compute cluster in Databricks for quite some time. We are now trying to upgrade to Python 3.11 for some dependency management, which requires us to use runtime 15.1/15.2, as runtime 14.2 only ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Neeraj_Kumar, Ensure that the necessary libraries are available in the repository used for installation. Verify that the library versions specified are correct and available. Consider installing the library with a different version or from a diffe...

georgeyjy
by New Contributor II
  • 1295 Views
  • 2 replies
  • 0 kudos

Resolved! Why does saving a pyspark df always convert string fields to numbers?

import pandas as pd
from pyspark.sql.types import StringType, IntegerType
from pyspark.sql.functions import col

save_path = os.path.join(base_path, stg_dir, "testCsvEncoding")
d = [{"code": "00034321"}, {"code": "55964445226"}]
df = pd.Data...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

@georgeyjy Try opening the CSV in a text editor. I bet that Excel is automatically trying to detect the schema of the CSV, so it thinks the column is an integer.

1 More Replies
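The accepted explanation above can be checked without Excel at all. A minimal stdlib sketch (column values taken from the original post) shows that the CSV on disk really does keep the leading zeros; it is the viewer that re-interprets them as numbers:

```python
import csv
import io

# The string codes from the post, including one with leading zeros.
rows = [{"code": "00034321"}, {"code": "55964445226"}]

# Write them out exactly as a CSV writer would put them on disk.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["code"])
writer.writeheader()
writer.writerows(rows)

raw = buf.getvalue()      # what actually lands in the file
print(raw.splitlines())   # ['code', '00034321', '55964445226']
```

The leading zeros survive in the raw text; opening the same file in Excel would strip them, because Excel guesses the column type on import.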
Madhawa
by New Contributor II
  • 664 Views
  • 2 replies
  • 0 kudos

Resolved! Unable to access AWS S3 - Error : java.nio.file.AccessDeniedException

Reading a file like this: Data = spark.sql("SELECT * FROM edge.inv.rm"). Getting this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 441.0 failed 4 times, most recent failure: Lost task 10.3 in stage 441.0 (TID...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Madhawa,  Ensure that the AWS credentials (access key and secret key) are correctly configured in your Spark application. You can set them using spark.conf.set("spark.hadoop.fs.s3a.access.key", "your_access_key") and spark.conf.set("spark.hadoop....

1 More Replies
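The reply above is truncated; a minimal sketch of the settings it names follows. The key/value names are the standard S3A configuration keys; the credential values are placeholders, and hard-coding secrets is shown only for illustration (an instance profile or Databricks secret scope is preferable in practice):

```python
# Placeholder credentials -- do not hard-code real keys in notebooks.
s3_conf = {
    "spark.hadoop.fs.s3a.access.key": "YOUR_ACCESS_KEY",
    "spark.hadoop.fs.s3a.secret.key": "YOUR_SECRET_KEY",
}

# In a live Databricks session, apply them before reading the table:
# for key, value in s3_conf.items():
#     spark.conf.set(key, value)
# Data = spark.sql("SELECT * FROM edge.inv.rm")
```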
Shravanshibu
by New Contributor III
  • 380 Views
  • 1 reply
  • 0 kudos

Unable to install a wheel file which is in my volume to a serverless cluster

I am trying to install a wheel file that is in my volume to a serverless cluster and getting the below error. @ken @Kaniz_Fatma Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use updated packages. WARN...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Shravanshibu, Verify that the wheel file is actually present at the specified location. Double-check the path to ensure there are no typos or missing directories. Remember that Databricks mounts DBFS (Databricks File System) at /dbfs on cluster no...

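The first step the reply suggests, verifying the wheel is really there, can be sketched in a couple of lines. The volume path below is hypothetical; adjust the catalog/schema/volume segments to your own setup:

```python
import os

# Hypothetical Unity Catalog volume path to the wheel.
wheel_path = "/Volumes/main/default/libs/my_pkg-0.1-py3-none-any.whl"

def wheel_exists(path: str) -> bool:
    """Confirm the wheel is present before handing it to %pip."""
    return os.path.isfile(path)

# In a notebook, once wheel_exists(wheel_path) is True:
# %pip install /Volumes/main/default/libs/my_pkg-0.1-py3-none-any.whl
# %restart_python
```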
_databreaks
by New Contributor II
  • 315 Views
  • 1 reply
  • 0 kudos

DLT to push data instead of a pull

I am relatively new to Databricks, and from my recent experience it appears that at every step in a DLT pipeline, we define each LIVE TABLE (streaming or not) to pull data from upstream. I have yet to see an implementation where data from upstream woul...

dlt
DLT pipeline
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @_databreaks, You’re absolutely right! While the typical approach in Databricks involves pulling data from upstream sources into downstream tables, there are scenarios where a push-based architecture could be beneficial.  Pull-Based Architectu...

RobsonNLPT
by Contributor
  • 506 Views
  • 1 reply
  • 0 kudos

Databricks UC Data Lineage Official Limitations

Hi all. I have a huge data migration project using medallion architecture, UC, notebooks and workflows. One of the relevant requirements we have is to capture all data dependencies (upstreams and downstreams) using data lineage. I've followed all re...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @RobsonNLPT, Consider checking the documentation for any updates or upcoming features related to capturing CTEs as upstreams in your chosen solution.

devendra_tomar
by New Contributor
  • 357 Views
  • 1 reply
  • 0 kudos

How to Read Data from Databricks Worker Nodes in Unity Catalog Volume

I am currently working on a similarity search use case where we need to extract text from PDF files and create a vector index. We have stored our PDF files in a Unity Catalog Volume, and I can successfully read these files from the driver node. Here's...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @devendra_tomar, Unity Catalog volumes represent logical storage volumes in a cloud object storage location. They allow governance over non-tabular datasets, providing capabilities for accessing, storing, and organizing files. While tables govern ...

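A common pattern for this question, sketched below under assumptions (the volume path is hypothetical), is that plain Python file access works on the driver via the FUSE mount, while worker-side reads should go through a Spark reader such as the `binaryFile` data source so the files are loaded in a distributed way:

```python
# Hypothetical Unity Catalog volume holding the PDFs.
pdf_dir = "/Volumes/main/default/docs"

# Driver-side access works with plain Python (FUSE mount):
# with open(f"{pdf_dir}/report.pdf", "rb") as f:
#     header = f.read(4)   # b"%PDF"

# Worker-side access should use a Spark reader instead of open():
# df = (spark.read.format("binaryFile")
#       .option("pathGlobFilter", "*.pdf")
#       .load(pdf_dir))
# Each row carries the file path and its `content` bytes, which can then
# be parsed inside a UDF or mapInPandas on the workers.
```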
NarenderKumar
by New Contributor III
  • 1414 Views
  • 3 replies
  • 0 kudos

Resolved! Unable to generate account-level PAT for service principal

I am trying to generate a PAT for a service principal. I am following the documentation as shown below: https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#create-token-in-account I have prepared the below curl command: I am getting below error: Pl...

NarenderKumar_0-1715695724302.png NarenderKumar_1-1715695859890.png NarenderKumar_2-1715695895738.png
Latest Reply
NarenderKumar
New Contributor III
  • 0 kudos

I was able to generate the workspace level token using the databricks cli. I set the following details in the databricks cli profile (.databrickscfg) file:
host = https://myworksapce.azuredatabricks.net/
account_id = (my db account id)
client_id = ...

2 More Replies
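For readers following the accepted answer above, a profile in `~/.databrickscfg` for this flow looks roughly like the fragment below. All values are placeholders; the exact fields depend on your Databricks CLI version and whether you authenticate at the account or workspace level:

```ini
[DEFAULT]
host          = https://myworkspace.azuredatabricks.net/
account_id    = <databricks-account-id>
client_id     = <service-principal-client-id>
client_secret = <service-principal-oauth-secret>
```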
NhanNguyen
by Contributor II
  • 1229 Views
  • 2 replies
  • 1 kudos

[Delta live table vs Workflow]

Hi Community Members, I have been using Databricks for a while, but I have only used Workflow. I have a question about the differences between Delta Live Tables and Workflow. Which one should we use in which scenario? Thanks,

Latest Reply
Hkesharwani
Contributor II
  • 1 kudos

Hi, Delta Live Tables focuses on managing data ingestion, transformation, and management of Delta tables using a declarative framework. Job Workflows are designed to orchestrate and schedule various data processing and analysis tasks, including SQL q...

1 More Replies
kazinahian
by New Contributor III
  • 1338 Views
  • 2 replies
  • 1 kudos

Resolved! Enable or disable Databricks Assistant in the Community Edition.

Hello, Good afternoon great people. I was following the step-by-step instructions to enable or disable Databricks Assistant in my Databricks Community Edition to enable AI assistance. However, I couldn't find the option and was unable to enable it...

datbricks community
Latest Reply
kazinahian
New Contributor III
  • 1 kudos

Thank you @Kaniz_Fatma 

1 More Replies
paritosh_sharma
by New Contributor
  • 544 Views
  • 1 reply
  • 0 kudos

DAB template dbt-sql not working

Hi, We are trying to use the dbt-sql template provided for Databricks asset bundles but are getting an error as follows. It looks like it's about the default catalog configuration. Has anyone faced this previously, or can anyone help with the same?

Screenshot 2024-05-17 at 10.25.38.png
Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @paritosh_sharma, In Databricks, you can use the USE CATALOG command to switch between catalogs. If the default catalog is not set, you might encounter errors. You can try setting the default catalog by using the command USE CATALOG 'your_catalog...

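The reply's suggestion can be sketched as follows. The catalog name is a placeholder, and the `spark.sql` call is shown commented out because it only runs inside a live Databricks session:

```python
# Hypothetical catalog name -- replace with the catalog your dbt-sql
# bundle should resolve objects against.
catalog = "my_catalog"
stmt = f"USE CATALOG {catalog}"

# In a live session, pin the default catalog before running the bundle:
# spark.sql(stmt)
```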
NandiniN
by Honored Contributor
  • 1861 Views
  • 1 reply
  • 2 kudos

How to collect a thread dump from Databricks Spark UI.

If you observe a hung job, thread dumps are crucial to determine the root cause. Hence, it would be a good idea to collect the thread dumps before cancelling the hung job. Here are the instructions to collect the Spark driver/executor thread dump: ...

Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

Thank you for sharing @NandiniN

RobsonNLPT
by Contributor
  • 6447 Views
  • 2 replies
  • 1 kudos

Permissions on Unity Catalog Table Constraints

Hi all. I've used the new options to add constraints to UC tables. Even after granting permissions to a user (ALL PRIVILEGES) on a particular schema, we get errors when trying to add PKs. The message doesn't make sense (PERMISSION_DENIED: User is not an owner of T...

Latest Reply
dmart
New Contributor III
  • 1 kudos

So how does one grant these permissions to non-owners?

1 More Replies
traillog
by New Contributor
  • 730 Views
  • 1 reply
  • 0 kudos

Response code 400 received when using VSCode on Windows 10 but no issue while using Ubuntu

I use VSCode on Windows 10 for building and deploying a workflow from my system and always encounter response code 400 when trying to deploy it. I am able to deploy the workflows via Ubuntu, but not via Windows. Has anyone encountered this issue befo...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @traillog, Windows uses a different path format compared to Unix-based systems like Ubuntu. Make sure that the paths in your script are in the correct format for Windows.

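The path-format point above can be made concrete with `pathlib`. The directory names below are hypothetical; the takeaway is that building paths from parts (via `pathlib` or `os.path.join`) rather than hard-coding separators keeps a deployment script portable between Windows and Ubuntu:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The same project-relative path, rendered per platform.
parts = ("bundles", "my_workflow", "databricks.yml")

win = PureWindowsPath(*parts)
posix = PurePosixPath(*parts)

print(str(win))    # bundles\my_workflow\databricks.yml
print(str(posix))  # bundles/my_workflow/databricks.yml
```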
tim-mcwilliams
by New Contributor III
  • 2016 Views
  • 7 replies
  • 2 kudos

Notebook cell gets hung up but code completes

Have been running into an issue when running a pymc-marketing model in a Databricks notebook. The cell that fits the model gets hung up and the progress bar stops moving, however the code completes and dumps all needed output into a folder. After the...

Latest Reply
Mickel
New Contributor II
  • 2 kudos

This can be a frustrating situation where the notebook cell appears stuck, but the code execution actually finishes in the background. Here are some steps you can take to troubleshoot and resolve this: 1. Restart vs Interrupt: Try using the "Res...

6 More Replies