Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

alejandrofm
by Valued Contributor
  • 5500 Views
  • 7 replies
  • 1 kudos

Improve download speed or see download progress Python-Databricks SQL

Hi! I'm using the code from here to execute a query on Databricks; it runs flawlessly and I can follow it from the Spark UI, etc. The problem is that, at the moment, it seems the download of the result (Spark is idle, there is a green check in the query his...

Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Alejandro Martinez, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you...

6 More Replies
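
One way to make the result download observable, assuming the query is run through the Databricks SQL Connector for Python, is to fetch the result set in fixed-size chunks and log how many rows have arrived. The sketch below is not the original poster's code; the hostname, HTTP path, token, and query are placeholders.

# Sketch only: placeholders throughout, not the poster's actual connection details.
from databricks import sql

with sql.connect(
    server_hostname="<workspace-host>",
    http_path="<sql-warehouse-http-path>",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM my_catalog.my_schema.my_table")  # placeholder query
        rows_seen = 0
        while True:
            batch = cursor.fetchmany(10_000)  # pull 10k rows per round trip
            if not batch:
                break
            rows_seen += len(batch)
            print(f"downloaded {rows_seen} rows so far")

Fetching in chunks does not by itself make the transfer faster, but it turns the otherwise silent download phase into visible progress.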
data_boy_2022
by New Contributor III
  • 18370 Views
  • 7 replies
  • 3 kudos

Data ingest of csv files from S3 using Autoloader is slow

I have 150k small CSV files (~50 MB) stored in S3 which I want to load into a Delta table. All CSV files are stored in the following structure in S3:
bucket/folder/name_00000000_00000100.csv
bucket/folder/name_00000100_00000200.csv
This is the code I use ...

(Attachments: Cluster Metrics, SparkUI_DAG, SparkUI_Job)
Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hi @Jan R, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

6 More Replies
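
For reference, a hedged Auto Loader sketch for an ingest like the one described above. The paths, schema location, and table name are placeholders, and the options shown (file-notification mode and a per-trigger file cap) are common tuning knobs for directories with very many small files, not a confirmed fix for this thread.

# Sketch only: assumes SQS/SNS notifications can be provisioned for the bucket,
# so each micro-batch avoids re-listing 150k objects.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.maxFilesPerTrigger", 5000)   # bound the work per micro-batch
    .option("cloudFiles.schemaLocation", "s3://bucket/_schemas/my_table")
    .option("header", "true")
    .load("s3://bucket/folder/")
)

(
    df.writeStream
    .option("checkpointLocation", "s3://bucket/_checkpoints/my_table")
    .trigger(availableNow=True)                      # batch-style backfill on recent runtimes
    .toTable("my_table")
)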
Nid
by New Contributor
  • 1366 Views
  • 1 reply
  • 0 kudos

badge not received for Databricks Lakehouse Fundamentals Accreditation

Hi, I have cleared the assessment for the Databricks Lakehouse Fundamentals Accreditation but have yet to receive a badge. Kindly assist me with this.

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Nidhi kawale, thank you for reaching out! Let us look into this for you, and we will get back to you with an update. Kindly share your email ID at community@databricks.com.

Bit-Warrior
by New Contributor
  • 935 Views
  • 0 replies
  • 0 kudos

Installing System ML on the cluster

I am trying to install the SystemML package from Maven. I excluded the libraries log4j:log4j, com:sun.jdmk, com:sun.jmx, javax:jms, but when I run one SystemML command, Spark/Databricks can no longer select from tables, effectively breaking somet...

parthsalvi
by Contributor
  • 2004 Views
  • 0 replies
  • 0 kudos

A few Spark APIs not working in DBR 11.2, 10.4 LTS Shared Mode (custom VPC), like df.tail, df.rdd.map

We're trying to use DBR 11.2 & 10.4 LTS in Shared mode on a customer-managed VPC, but we're running into the following issues. Is this issue related to our customer-managed VPC setup, or is it specific to DBR 11.2? The same issue is also seen in DBR 11.1 and 10.4 L...

(Attachment: Screenshot 2022-09-16 at 9.09.58 PM)
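
Shared access mode clusters restrict low-level RDD APIs on recent runtimes, so a commonly suggested direction (not confirmed in this thread) is to express the same logic with DataFrame operations. The sketch below uses invented data and column names purely for illustration.

# Illustrative only: DataFrame-API replacements for the two calls named in the question.
from pyspark.sql import functions as F

df = spark.range(100).withColumn("value", F.col("id") * 2)

# Instead of df.rdd.map(lambda r: r.value + 1): use column expressions.
transformed = df.select((F.col("value") + 1).alias("value_plus_one"))

# Instead of df.tail(5): order explicitly and take the last rows.
last_rows = df.orderBy(F.col("id").desc()).limit(5).collect()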
nancy_g
by New Contributor III
  • 6601 Views
  • 4 replies
  • 5 kudos
Latest Reply
Rostislaw
New Contributor III
  • 5 kudos

Right now the feature seems to be publicly available. It is possible to schedule jobs with ADLS passthrough enabled and not have to provide service principal credentials. However, I ask myself how that works behind the scenes. When working interactive...

3 More Replies
amit
by New Contributor II
  • 1892 Views
  • 2 replies
  • 0 kudos

www.databricks.com

Hi @Lindsay Olson, I attended the virtual instructor-led training on 23-08-2022 (https://www.databricks.com/p/webinar/databricks-lakehouse-fundamentals-learning-plan). As per the conditions mentioned, I have completed all of the steps for getti...

Latest Reply
amit
New Contributor II
  • 0 kudos

Thanks @Lindsay Olson. Yes, the issue has been resolved.

1 More Replies
BradSheridan
by Valued Contributor
  • 3059 Views
  • 1 reply
  • 0 kudos

Using a UDF in a window function

I have created a UDF using:
%sql
CREATE OR REPLACE FUNCTION f_timestamp_max()....
And I've confirmed it works with:
%sql
select f_timestamp_max()
But when I try to use it in a window function (lead over partition), I get:
AnalysisException: Using SQL functi...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, as of now Spark SQL supports three kinds of window functions: ranking functions, analytic functions, and aggregate functions. Please refer to: https://docs.databricks.com/sql/language-manual/sql-ref-window-functions.html#parameters

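
As a companion to the reply above, a small PySpark sketch of a built-in window function (lead over a partition); the sample data and column names are invented for illustration, since the error in the question indicates a SQL UDF cannot be used directly as a window function.

from pyspark.sql import functions as F
from pyspark.sql.window import Window

# Toy data standing in for the poster's table.
df = spark.createDataFrame(
    [("a", 1, "2022-01-01"), ("a", 2, "2022-01-02"), ("b", 1, "2022-01-01")],
    ["key", "value", "event_date"],
)

# lead() over a partition, using only built-in window functions.
w = Window.partitionBy("key").orderBy("event_date")
df_with_next = df.withColumn("next_value", F.lead("value").over(w))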
Haima
by New Contributor
  • 906 Views
  • 0 replies
  • 0 kudos

FileNotFoundError: [Errno 2] /dbfs/fileone.csv

I'm trying to transfer my CSV file from Databricks to SFTP but I'm getting a file-not-found error. Here is my code:
file_size = sftp.stat("/dbfs/fileone.csv").st_size
with open("/dbfs/fileone.csv", "rb") as fl:
    return self.putfo(fl, Destinationpath, file_s...

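
A possible direction for the error above, sketched with paramiko and placeholder hosts, credentials, and paths: sftp.stat() looks for the path on the remote server, so sizing the local /dbfs file with os.path.getsize and then uploading with putfo avoids asking the SFTP host about a file it does not have.

# Sketch only: hostname, credentials, and remote path are placeholders.
import os
import paramiko

local_path = "/dbfs/fileone.csv"            # DBFS path visible on the driver
remote_path = "/upload/fileone.csv"         # placeholder destination

file_size = os.path.getsize(local_path)     # size of the *local* file

transport = paramiko.Transport(("sftp.example.com", 22))
transport.connect(username="user", password="secret")
sftp = paramiko.SFTPClient.from_transport(transport)
try:
    with open(local_path, "rb") as fl:
        sftp.putfo(fl, remote_path, file_size=file_size)
finally:
    sftp.close()
    transport.close()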
brickster_2018
by Databricks Employee
  • 7445 Views
  • 3 replies
  • 0 kudos

Resolved! How many notebooks/jobs can I run in parallel on a Databricks cluster?

Is there a limit on it and is the limit configurable?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

There is a hard limit of 145 active execution contexts on a Cluster. This is to ensure the cluster is not overloaded with too many parallel threads starving for resources. The limit is not configurable. If there are more than 145 parallel jobs to be ...

2 More Replies
data_serf
by New Contributor
  • 11918 Views
  • 3 replies
  • 1 kudos

Resolved! How to integrate Java 11 code in Databricks

Hi all, we're trying to attach Java libraries which are compiled/packaged using Java 11. After doing some research, it looks like even the most recent runtimes use Java 8, which can't run the Java 11 code ("wrong version 55.0, should be 52.0" errors). Is t...

Latest Reply
matthewrj
New Contributor II
  • 1 kudos

I have tried setting JNAME=zulu11-ca-amd64 under Cluster > Advanced options > Spark > Environment variables, but it doesn't seem to work. I still get errors indicating Java 8 is the JRE, and in the Spark UI under "Environment" I still see: Java Home: /u...

2 More Replies
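
One quick way to check whether the JNAME setting actually switched the driver JVM (an assumption about how to verify it, not an official procedure) is to read java.version through the JVM backing the Spark session:

# _jvm is an internal py4j handle, so treat this as a diagnostic sketch only.
java_version = spark.sparkContext._jvm.java.lang.System.getProperty("java.version")
print(java_version)   # roughly "11.0.x" if the Zulu 11 JRE took effect, "1.8.0_x" otherwise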
齐木木
by New Contributor III
  • 2887 Views
  • 1 reply
  • 3 kudos

Resolved! The case class reports an error when running in the notebook

As shown in the figure, the case class and the JSON string are converted through fasterxml.jackson, but an unexpected error occurred while running the code. I think this problem may be related to the class-loading behavior of the notebook. Because...

(Attachment: image.png)
Latest Reply
齐木木
New Contributor III
  • 3 kudos

code:
var str="{\"app_type\":\"installed-app\"}"
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
...

WBM1
by New Contributor
  • 979 Views
  • 0 replies
  • 0 kudos

wbm.com.pk

WBM is the best online Supermarket in Pakistan provides you with Fast home delivery of your complete grocery, Home Cleaning, Skincare, Baby Products, and Mosquito Repellent Collection.https://wbm.com.pk/

Deepak_Kandpal
by New Contributor III
  • 8578 Views
  • 3 replies
  • 2 kudos

Resolved! Enable credential passthrough Option is not available in new UI for Job Cluster

Hi all, I am trying to add a new workflow which requires credential passthrough, but when I try to create a new Job Cluster from Workflow -> Jobs -> My Job, the option "Enable credential passthrough" is not available. Is there any other way t...

Latest Reply
Rostislaw
New Contributor III
  • 2 kudos

Assuming your Excel file is located on ADLS, you can add a service principal to the cluster configuration. See: https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-storage#--access-azure-data-lake-storage-gen2-or-blob-stora...

2 More Replies
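
For context on the reply above, a hedged sketch of the service-principal Spark configuration that the linked Azure docs describe; the storage account, secret scope, client ID, and tenant ID are placeholders.

# Placeholders throughout; the client secret is read from a Databricks secret scope.
storage_account = "mystorageaccount"
client_secret = dbutils.secrets.get(scope="my-scope", key="sp-client-secret")

base = f"{storage_account}.dfs.core.windows.net"
spark.conf.set(f"fs.azure.account.auth.type.{base}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{base}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(f"fs.azure.account.oauth2.client.id.{base}", "<application-client-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{base}", client_secret)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{base}",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
)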
