Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

MohammadWasi
by New Contributor II
  • 2094 Views
  • 3 replies
  • 0 kudos

I can list the file using dbutils but cannot read it in Databricks

I can list the file using dbutils, but I am not able to read it in Databricks. Please see the attached screenshot. I can see the file using dbutils.fs.ls, but when I try to read it using read_excel, it shows an error like "FileNotFound...

MohammadWasi_0-1715064354700.png
Data Engineering
Databricks
Latest Reply
MohammadWasi
New Contributor II
  • 0 kudos

Hi @Retired_mod, I changed the file to .xlsx format but got the same error as above.

  • 0 kudos
2 More Replies
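One frequent cause of this symptom, offered here as an assumption because the screenshot and the full error are not shown: dbutils.fs.ls accepts dbfs:/ URIs, while pandas read_excel goes through local file APIs and, on clusters where the /dbfs FUSE mount is available, finds the same file under /dbfs/. A minimal sketch of that path conversion follows. The helper name to_local_dbfs_path and the example path are hypothetical.

```python
def to_local_dbfs_path(path: str) -> str:
    """Convert a dbfs:/ URI to its /dbfs/ local (FUSE mount) equivalent.

    Paths that do not start with 'dbfs:' are returned unchanged.
    """
    prefix = "dbfs:"
    if path.startswith(prefix):
        # Drop the scheme, normalise leading slashes, prepend the mount point.
        return "/dbfs/" + path[len(prefix):].lstrip("/")
    return path


# Hypothetical path, for illustration only.
local_path = to_local_dbfs_path("dbfs:/FileStore/tables/report.xlsx")
print(local_path)
```

The converted path could then be passed to pandas.read_excel(local_path). Reading .xlsx files with pandas also needs the openpyxl package installed on the cluster.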
ashraf1395
by Valued Contributor
  • 2195 Views
  • 2 replies
  • 1 kudos

Resolved! Starting a serverless SQL cluster on GCP

Hello there, I am trying to start a serverless Databricks SQL cluster in GCP. I am following this Databricks doc: https://docs.gcp.databricks.com/en/admin/sql/serverless.html I have checked that all my requirements are fulfilled for activating the clus...

Screenshot 2024-05-07 113120.png Screenshot 2024-05-07 113137.png
Latest Reply
ashraf1395
Valued Contributor
  • 1 kudos

I have another question, though it is not related to this thread: does Databricks have any plan for startups, similar to the regular free trial?

  • 1 kudos
1 More Replies
jainshasha
by New Contributor III
  • 8040 Views
  • 11 replies
  • 2 kudos

Job Cluster in Databricks workflow

Hi, I have configured 20 different workflows in Databricks. All of them are configured with a job cluster, each with a different name. All 20 workflows are scheduled to run at the same time. But even with a different job cluster configured in each of them, they run sequentially w...

Latest Reply
emora
New Contributor III
  • 2 kudos

Honestly, you shouldn't have any kind of limitation executing different workflows. I ran a test case in my Databricks workspace, and if your workflows use a job cluster you shouldn't have any limitation. But I did all my tests in Azure, and just for you to kn...

  • 2 kudos
10 More Replies
Anske
by New Contributor III
  • 2981 Views
  • 3 replies
  • 1 kudos

Resolved! DLT apply_changes applies only deletes and inserts, not updates

Hi, I have a DLT pipeline that applies changes from a source table (cdctest_cdc_enriched) to a target table (cdctest) with the following code: dlt.apply_changes(target = "cdctest", source = "cdctest_cdc_enriched", keys = ["ID"], sequence_by...

Data Engineering
Delta Live Tables
namankhamesara
by New Contributor II
  • 765 Views
  • 0 replies
  • 0 kudos

Error while running Databricks modules

Hi Databricks Community, I am following the course https://customer-academy.databricks.com/learn/course/1266/data-engineering-with-databricks?generated_by=575333&hash=6edddab97f2f528922e2d38d8e4440cda4e5302a provided by Databricks. In it, when I am ...

namankhamesara_0-1715054731073.png
Data Engineering
databrickscommunity
halox6000
by New Contributor III
  • 980 Views
  • 0 replies
  • 0 kudos

How do I stop PySpark from outputting text

I am using a tqdm progress bar to monitor the amount of data records I have collected via API. I am temporarily writing them to a file in the DBFS, then uploading to a Spark DataFrame. Each time I write to a file, I get a message like 'Wrote 8873925 ...

MrD
by New Contributor
  • 884 Views
  • 1 replies
  • 0 kudos

Issue with autoscaling the cluster

Hi all, my job is breaking because the cluster is not able to autoscale. Below is the log. Can it be due to AWS VMs not spinning up, or can it be due to an issue with the Databricks configuration? Has anyone faced this before? TERMINATING Compute terminated. Reason:...

Latest Reply
koushiknpvs
New Contributor III
  • 0 kudos

Hey MrD, I faced this issue while running Azure VMs. Restarting and re-attaching the cluster helped me. Please let me know if that works for you.

  • 0 kudos
Wolfoflag
by New Contributor II
  • 2487 Views
  • 1 replies
  • 0 kudos

Threads vs Processes (Parallel Programming) Databricks

Hi everyone, I am trying to implement parallel processing in Databricks, and all the resources online point to using ThreadPool from Python's multiprocessing.pool library or the concurrent.futures library. These libraries offer methods for creating async...

Latest Reply
Wojciech_BUK
Valued Contributor III
  • 0 kudos

I am not an expert, but I have been using Databricks for a while, and I can say that any Python library like asyncio, ThreadPool and so on is good only for maintenance tasks, small API calls, etc. When you want to leverage s...

  • 0 kudos
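The visible part of the reply above suggests that thread-based libraries suit small tasks such as API calls. A minimal sketch of that pattern with concurrent.futures follows, assuming an I/O-bound task; the function fetch is a hypothetical stand-in and the worker count is arbitrary. In a notebook these threads run in the driver's Python process only and do not distribute work to executors.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item: int) -> int:
    """Hypothetical stand-in for a small I/O-bound task such as an API call."""
    time.sleep(0.01)  # simulate waiting on the network
    return item * 2


def run_in_threads(items, max_workers=4):
    # executor.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, items))


results = run_in_threads(range(5))
print(results)
```

Threads help here because the task spends its time waiting; for CPU-heavy work in pure Python the global interpreter lock limits the gain.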
digui
by New Contributor
  • 5543 Views
  • 3 replies
  • 0 kudos

Issues when trying to modify log4j.properties

Hi y'all. I'm trying to export metrics and logs to AWS CloudWatch, but while following their tutorial to do so, I ended up facing this error when trying to initialize my cluster with an init script they provided. This is the part where the script fail...

Latest Reply
cool_cool_cool
New Contributor II
  • 0 kudos

@digui Did you figure out what to do? We're facing the same issue; the script works for the executors. I was thinking of adding an if that checks whether log4j.properties exists and modifies it only if it does.

  • 0 kudos
2 More Replies
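The guard proposed in the reply above can be sketched as follows. The original init script is a shell script and is not shown here, so this Python version only illustrates the logic; the helper name append_if_exists and the property line are placeholders, not real log4j settings or cluster paths.

```python
import os
import tempfile


def append_if_exists(path, lines):
    """Append lines to the file at path only if that file already exists."""
    if not os.path.isfile(path):
        return False  # nothing to modify, e.g. on a node without this file
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return True


# Demonstration against a temporary directory rather than a real cluster path.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "log4j.properties")
    with open(existing, "w", encoding="utf-8") as f:
        f.write("# original\n")
    missing = os.path.join(tmp, "missing", "log4j.properties")

    changed = append_if_exists(existing, ["example.key=example-value"])
    skipped = append_if_exists(missing, ["example.key=example-value"])
    with open(existing, encoding="utf-8") as f:
        content = f.read()
```

Returning a boolean lets the caller log whether the file was touched instead of failing the whole script when the file is absent.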
ashraf1395
by Valued Contributor
  • 6485 Views
  • 1 replies
  • 1 kudos

Optimising Clusters in Databricks on GCP

Hi there everyone, we are trying to get hands-on with the Databricks Lakehouse for a prospective client's project. Our major aim for the project is to compare a data lakehouse on Databricks with a BigQuery data warehouse in terms of costs and time to set up and run que...

smedegaard
by New Contributor III
  • 1870 Views
  • 1 replies
  • 0 kudos

DLT run fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found"

I've created a streaming live table from a foreign catalog. When I run the DLT pipeline, it fails with "com.databricks.cdc.spark.DebeziumJDBCMicroBatchProvider not found". I haven't seen any documentation that suggests I need to install Debezium manuall...

MartinH
by New Contributor II
  • 13301 Views
  • 7 replies
  • 5 kudos

Resolved! Azure Data Factory and Photon

Hello, we have Databricks Python workbooks accessing Delta tables. These workbooks are scheduled/invoked by Azure Data Factory. How can I enable Photon on the linked services that are used to call Databricks? If I specify a new job cluster, there does n...

Latest Reply
CharlesReily
New Contributor III
  • 5 kudos

When you create a cluster on Databricks, you can enable Photon by selecting the "Photon" option in the cluster configuration settings. This is typically done when creating a new cluster, and you would find the option in the advanced cluster configura...

  • 5 kudos
6 More Replies
dbdude
by New Contributor II
  • 11444 Views
  • 3 replies
  • 1 kudos

AWS Secrets Works In One Cluster But Not Another

Why can I use boto3 to retrieve a secret from Secrets Manager with a personal cluster, but get an error with a shared cluster? NoCredentialsError: Unable to locate credentials

Latest Reply
Husky
New Contributor III
  • 1 kudos

Hey @dbdude, I am facing the same error. Did you find a solution to access the AWS credentials on a Shared Cluster? This article describes a way of storing credentials in a Unity Catalog Volume to fetch by the Shared Cluster: https://medium.com/@amluci...

  • 1 kudos
2 More Replies
mamiya
by New Contributor II
  • 1139 Views
  • 1 replies
  • 0 kudos

ODBC PowerBI 2 commands in one query

Hello everyone, I'm trying to use the ODBC DirectQuery option in PowerBI, but I keep getting an error about another command. The SQL query works while using the SQL Editor. Do I need to change the setup of my ODBC connector? DECLARE dateFrom DATE = DA...

mamiya_0-1714651686806.png mamiya_3-1714651948145.png
Deepak_Kandpal
by New Contributor III
  • 10565 Views
  • 3 replies
  • 3 kudos

Resolved! Invalid configuration value detected for fs.azure.account.key with com.crealytics:spark-excel

I have set up my Databricks notebook to use a Service Principal to access ADLS using the configuration below: service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>")   spark.conf.set("fs.azure.account.auth.type.<storage-accou...

Latest Reply
Harsha_Dbrs
New Contributor II
  • 3 kudos

Below is the implementation of the same code in Scala: spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<accountName>.dfs.core.windows.net",<accountKey>)

  • 3 kudos
2 More Replies
