Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Forum Posts

elsirya
by New Contributor III
  • 794 Views
  • 2 replies
  • 2 kudos

Resolved! unit testing

Currently I am creating unit tests for our ETL scripts, but the test is not able to recognize sc (SparkContext). Is there a way to mock SparkContext for a unit test? Code being tested: df = spark.read.json(sc.parallelize([data])) Error message rec...

Latest Reply
elsirya
New Contributor III
  • 2 kudos

Was able to get this to work. What I had to do was instantiate the "sc" variable in the PySpark notebook. PySpark code: "sc = spark.sparkContext". Then in the PyTest script we add a "@patch()" statement with the "sc" variable and create a "mock_sc" variab...

  • 2 kudos
1 More Replies
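
The approach in the reply above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual test: `load_json` and `run_load_json_with_mocks` are hypothetical names, the module-level `spark` and `sc` placeholders stand in for the notebook globals, and `patch.dict` is used here in place of the `@patch()` decorator the reply mentions so that the sketch does not depend on a module name.

```python
from unittest.mock import MagicMock, patch

# Placeholders for the notebook globals; in a Databricks notebook
# `spark` already exists and `sc` would be set via spark.sparkContext.
spark = None
sc = None


def load_json(data):
    """Code under test, mirroring the line quoted in the question."""
    return spark.read.json(sc.parallelize([data]))


def run_load_json_with_mocks(data):
    """Call load_json with `sc` and `spark` replaced by mocks (hypothetical helper)."""
    mock_sc = MagicMock(name="sc")
    mock_spark = MagicMock(name="spark")
    mock_sc.parallelize.return_value = "fake_rdd"
    mock_spark.read.json.return_value = "fake_df"
    # patch.dict swaps the globals only for the duration of the with-block.
    with patch.dict(load_json.__globals__, {"sc": mock_sc, "spark": mock_spark}):
        result = load_json(data)
    return result, mock_sc, mock_spark


def test_load_json():
    result, mock_sc, mock_spark = run_load_json_with_mocks('{"id": 1}')
    # The input is wrapped in a list and handed to sc.parallelize ...
    mock_sc.parallelize.assert_called_once_with(['{"id": 1}'])
    # ... and whatever parallelize returns is passed on to spark.read.json.
    mock_spark.read.json.assert_called_once_with("fake_rdd")
    assert result == "fake_df"
```

Because both objects are mocks, the test needs no running Spark session; it only checks that the code under test wires `sc.parallelize` into `spark.read.json`.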
aparna_526
by New Contributor
  • 402 Views
  • 0 replies
  • 0 kudos

Databricks exam got suspended

My Databricks Certified Data Engineer Associate exam got suspended on 18 July 2024. I was continuously in front of the camera when an alert appeared, and then my exam resumed. Later a support person told me that my exam had been suspended. I don't kn...

aditi06
by New Contributor III
  • 461 Views
  • 1 replies
  • 1 kudos

Issue while launching the data engineer exam

I started my exam, but it said that due to a technical issue it had been suspended, even though I had completed all prerequisites and system checks. I have already raised a ticket; please resolve this issue as early as possible. Ticket no.: #00504757 #reschedule #issue...

Latest Reply
aditi06
New Contributor III
  • 1 kudos

@Retired_mod, please resolve this issue as early as possible. I have already raised the ticket.

  • 1 kudos
Simon_T
by New Contributor III
  • 1973 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks Bundle Error

I am running this command: databricks bundle deploy --profile DAVE2_Dev --debug And I am getting this error: 10:13:28 DEBUG open dir C:\Users\teffs.THEAA\OneDrive - AA Ltd\Databricks\my_project\dist: open C:\Users\teffs.THEAA\OneDrive - AA Ltd\Databr...

Latest Reply
Simon_T
New Contributor III
  • 1 kudos

So I found a link to a page that said that the databricks bundle command is expecting python3.exe instead of python.exe. So I took a copy of python.exe and renamed it to python3.exe and that seems to work. Thanks for investigating though.

  • 1 kudos
1 More Replies
Puneet096
by New Contributor II
  • 858 Views
  • 3 replies
  • 1 kudos

how to implement delta load when table only has primary columns

I have a table with two columns, and both are part of the primary key. I want to do a delta load when taking data from source to target. Any idea how to implement this?

Latest Reply
szymon_dybczak
Contributor III
  • 1 kudos

But that shouldn't be a problem. In the merge condition you check both keys, as in the example above. If the combination of the two keys already exists in the table, do nothing. If there is a new combination of key1 and key2, just insert it into the target table. It's t...

  • 1 kudos
2 More Replies
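
The insert-only merge described in the reply above could look roughly like this. It is a sketch under assumptions: `build_insert_only_merge` is a hypothetical helper, and the table and view names in the usage note are placeholders. Only `key1` and `key2` come from the thread.

```python
def build_insert_only_merge(target, source, keys):
    """Build a Delta MERGE statement that inserts only unseen key combinations.

    target, source: table or view names (placeholders, not from the thread).
    keys: the columns that together form the primary key.
    """
    if not keys:
        raise ValueError("at least one key column is required")
    # Every key column must match for a row to count as already present.
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        # No WHEN MATCHED clause: existing key combinations are left untouched.
        "WHEN NOT MATCHED THEN INSERT *"
    )
```

In a notebook this might be run as `spark.sql(build_insert_only_merge("target_table", "source_view", ["key1", "key2"]))`, where both names are illustrative.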
InquisitiveGeek
by New Contributor II
  • 945 Views
  • 3 replies
  • 0 kudos

how can I store my cell output as a text file in my local drive?

I want to store the output of my cell as a text file on my local hard drive. I'm getting JSON output, and I need that JSON on my local drive as a text file.

Latest Reply
szymon_dybczak
Contributor III
  • 0 kudos

Hi @InquisitiveGeek, you can do this by following the approach described here: https://learn.microsoft.com/en-us/azure/databricks/notebooks/notebook-outputs#download-results

  • 0 kudos
2 More Replies
himanmon
by New Contributor III
  • 706 Views
  • 2 replies
  • 1 kudos

Can I move a single file larger than 100GB using dbutils fs?

Hello. I have a file over 100GB. Sometimes it is on the cluster's local path, and sometimes it's on the volume. I want to send it to another path on the volume, or to the S3 bucket. dbutils.fs.cp('file:///tmp/test.txt', '/Volumes/catalog/schem...

Latest Reply
szymon_dybczak
Contributor III
  • 1 kudos

Hi @himanmon, this is caused by the S3 limit on segment count: part files can be numbered only from 1 to 10000. After setting spark.hadoop.fs.s3a.multipart.size to 104857600, did you RESTART the cluster? Because it'll only work when the clust...

  • 1 kudos
1 More Replies
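
The arithmetic behind the part limit mentioned in the reply above can be checked with a short sketch. The 10000-part limit and the 104857600-byte part size come from the reply; the function names and the 100 GiB example size are assumptions made for illustration.

```python
S3_MAX_PARTS = 10_000          # parts are numbered 1 to 10000, per the reply
PART_SIZE_BYTES = 104_857_600  # the multipart size quoted in the reply (100 MiB)


def parts_needed(file_size_bytes, part_size_bytes=PART_SIZE_BYTES):
    """Number of parts an upload needs, rounding up (integer ceiling division)."""
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


def min_part_size(file_size_bytes, max_parts=S3_MAX_PARTS):
    """Smallest part size in bytes that keeps the upload within max_parts."""
    return (file_size_bytes + max_parts - 1) // max_parts


def fits_in_part_limit(file_size_bytes, part_size_bytes=PART_SIZE_BYTES):
    """True if the file can be uploaded without exceeding the part limit."""
    return parts_needed(file_size_bytes, part_size_bytes) <= S3_MAX_PARTS


# Example: an assumed 100 GiB file with the 100 MiB part size from the reply.
example_size = 100 * 1024**3
example_parts = parts_needed(example_size)   # 1024 parts, well under the limit
largest_file = PART_SIZE_BYTES * S3_MAX_PARTS  # 1,048,576,000,000 bytes
```

With these numbers the part size from the reply leaves ample headroom for a file of this size, which is consistent with the reply's suggestion that the setting may simply not have taken effect.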
yurib
by New Contributor III
  • 1245 Views
  • 1 replies
  • 0 kudos

Resolved! error creating token when creating databricks_mws_workspaces resource on GCP

 resource "databricks_mws_workspaces" "this" { depends_on = [ databricks_mws_networks.this ] provider = databricks.account account_id = var.databricks_account_id workspace_name = "${local.prefix}-dbx-ws" location = var.google_region clou...

Latest Reply
yurib
New Contributor III
  • 0 kudos

My issue was caused by credentials in `~/.databrickscfg` (generated by the Databricks CLI) taking precedence over the creds set by `gcloud auth application-default login`. Google's application default creds should be used when using the databricks google...

  • 0 kudos
FaizH
by New Contributor III
  • 2208 Views
  • 2 replies
  • 1 kudos

Resolved! Error - Data Masking

Hi, I was testing the masking functionality of Databricks and got the error below: java.util.concurrent.ExecutionException: com.databricks.sql.managedcatalog.acl.UnauthorizedAccessException: PERMISSION_DENIED: Query on table dev_retransform.uc_lineage.test_...

Latest Reply
szymon_dybczak
Contributor III
  • 1 kudos

Hi @FaizH, are you using single-user compute by any chance? If you are, the following limitation applies. Single-user compute limitation: Do not add row filters or column masks to any table that you are accessing from a single-user cluster. During t...

  • 1 kudos
1 More Replies
thilanka02
by New Contributor II
  • 1683 Views
  • 3 replies
  • 1 kudos

Resolved! Spark read CSV does not throw Exception if the file path is not available in Databricks 14.3

We were using this method and this was working as expected in Databricks 13.3.  def read_file(): try: df_temp_dlr_kpi = spark.read.load(raw_path,format="csv", schema=kpi_schema) return df_temp_dlr_kpi except Exce...

Latest Reply
databricks100
New Contributor II
  • 1 kudos

Hi, has this been resolved? I am still seeing this issue with Runtime 14.3 LTS. Thanks in advance.

  • 1 kudos
2 More Replies
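
One possible workaround, not confirmed anywhere in this thread, is to check that the path exists before handing it to the reader, so a missing path fails loudly regardless of how the runtime evaluates the read. The sketch below is hypothetical: `read_csv_checked`, `list_path`, and `load_csv` are illustrative names, and the callables are injected so the logic can be exercised without a cluster.

```python
def read_csv_checked(raw_path, list_path, load_csv):
    """Fail fast when raw_path is missing, then delegate to the real reader.

    list_path: callable that raises if the path does not exist
               (assumed to be something like dbutils.fs.ls in a notebook).
    load_csv:  callable that performs the actual read and returns a DataFrame.
    """
    try:
        list_path(raw_path)
    except Exception as exc:
        # Surface a clear error instead of relying on the reader to raise.
        raise FileNotFoundError(f"Input path not available: {raw_path}") from exc
    return load_csv(raw_path)
```

In a notebook the call might look like `read_csv_checked(raw_path, dbutils.fs.ls, lambda p: spark.read.load(p, format="csv", schema=kpi_schema))`, reusing the names from the question. This assumes `dbutils.fs.ls` raises for a path that does not exist.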
Mrinal16
by New Contributor
  • 776 Views
  • 1 replies
  • 0 kudos

Connection with virtual machine

I have to upload files from an Azure container to a virtual machine using Databricks. I have mounted my files to Databricks. Please help me do this if you have any ideas.

Latest Reply
imsabarinath
New Contributor III
  • 0 kudos

I am not sure whether you need to curate the data further before you upload it to the virtual machine. If not, you can just mount the storage on the VM: Create an SMB Azure file share and connect it to a Windows VM | Microsoft Learn; Azure Storage - Create File Storag...

  • 0 kudos
johnp
by New Contributor III
  • 1063 Views
  • 2 replies
  • 1 kudos

Resolved! Access to "Admin Console" and "System Tables"

I am the contributor and owner of my Databricks workspace. After a recent spike in expenses, I want to check the billing details of my Azure Databricks usage (i.e., per cluster, per VM, etc.). Databricks provides this information through "Admin Conso...

Latest Reply
Rishabh_Tiwari
Databricks Employee
  • 1 kudos

Hi @johnp , Thank you for reaching out to our community! We're here to help you.  To ensure we provide you with the best support, could you please take a moment to review the response and choose the one that best answers your question? Your feedback...

  • 1 kudos
1 More Replies
lbdatauser
by New Contributor II
  • 453 Views
  • 1 replies
  • 0 kudos

Liquid clustering with incremental ingestion

We ingest data incrementally from a database into delta tables using a column updatedUtc. This column is a datetime and is updated when the row in the database table changes. What about using this non-mutable column in "cluster by"? Would it require ...

Latest Reply
greyamber
New Contributor II
  • 0 kudos

It is recommended to run the OPTIMIZE query on a schedule: https://docs.databricks.com/en/delta/clustering.html#how-to-trigger-clustering

  • 0 kudos
