cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

turagittech
by New Contributor III
  • 8978 Views
  • 2 replies
  • 1 kudos

PYODBC very slow - 30 minutes to write 6000 rows

Along withh several other issues I'm encountering, I am finding pandas dataframe to_sql being very slowI am writing to an Azure SQL database and performance is woeful. This is a test database and it has S3 100DTU and one user, me as it's configuratio...

  • 8978 Views
  • 2 replies
  • 1 kudos
Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Peter McLarty​ Does @Debayan Mukherjee​  response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?We'd love to hear from you.Thanks!

  • 1 kudos
1 More Replies
dibo
by New Contributor II
  • 846 Views
  • 0 replies
  • 0 kudos

I can't login to https://community.cloud.databricks.com/login.html

Now, I can't login to https://community.cloud.databricks.com/login.html with the correct username and password, later I click the button to reset my password and I receive the email for modifying password, I have modified password, But I still can't ...

  • 846 Views
  • 0 replies
  • 0 kudos
SamSteere
by New Contributor III
  • 2387 Views
  • 3 replies
  • 6 kudos

docs.databricks.com

REST API Documentation is out of date since the release of Delta Live TablesWhen using the `2.0/clusters/list` endpoint in an environment with running clusters provisioned by DLTs, the clusters will be returned with a `cluster_source` value of `PIPEL...

  • 2387 Views
  • 3 replies
  • 6 kudos
Latest Reply
Vidula
Honored Contributor
  • 6 kudos

Hi @Sam Steere​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 6 kudos
2 More Replies
alejandrofm
by Valued Contributor
  • 4370 Views
  • 7 replies
  • 1 kudos

Improve dowload speed or see download progress Python-Databricks SQL

Hi! I'm using the code from here to execute a query on Databricks, it goes flawlessly, can follow it from the Spark UI, etc. The problem here is at the moment it seems the download of the result (spark is idle, there is a green check in the query his...

  • 4370 Views
  • 7 replies
  • 1 kudos
Latest Reply
Vidula
Honored Contributor
  • 1 kudos

Hi @Alejandro Martinez​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

  • 1 kudos
6 More Replies
data_boy_2022
by New Contributor III
  • 15788 Views
  • 7 replies
  • 3 kudos

Data ingest of csv files from S3 using Autoloader is slow

I have 150k small csv files (~50Mb) stored in S3 which I want to load into a delta table.All CSV files are stored in the following structure in S3:bucket/folder/name_00000000_00000100.csvbucket/folder/name_00000100_00000200.csvThis is the code I use ...

Cluster Metrics SparkUI_DAG SparkUI_Job
  • 15788 Views
  • 7 replies
  • 3 kudos
Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hi @Jan R​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks!

  • 3 kudos
6 More Replies
Nid
by New Contributor
  • 1141 Views
  • 1 replies
  • 0 kudos

badge not received for Databricks Lakehouse Fundamentals Accreditation

Hi,I have cleared the assessment for Databricks Lakehouse Fundamentals Accreditationbut yet to received a badge. Kindly assist me with this

  • 1141 Views
  • 1 replies
  • 0 kudos
Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Nidhi kawale​ Thank you for reaching out!Let us look into this for you, and we will get back to you with an update.Kindly, share your email id at community@databricks.com.

  • 0 kudos
Bit-Warrior
by New Contributor
  • 781 Views
  • 0 replies
  • 0 kudos

Installing System ML on the cluster

I am trying to install the systemml package from Maven, I ignored the librarieslog4j:log4j, com:sun.jdmk, com:sun.jmx, javax:jmsBut when I run one command of systemml, then spark/databricks can no longer select from tables, effectively breaking somet...

  • 781 Views
  • 0 replies
  • 0 kudos
parthsalvi
by Contributor
  • 1747 Views
  • 0 replies
  • 0 kudos

Few sparks apis not working in DBR 11.2, 10.4 LTS Shared Mode (custom vpc) like df.tail, df.rdd.map

We're trying to use DBR 11.2 & 10.4LTS in Shared mode on a customer managed vpc. But we're running into following issues Is this issue related to our customer managed VPC setup or is it specific to DBR 11.2.Same issue also seen in DBR 11.1 and 10.4 L...

Screenshot 2022-09-16 at 9.09.58 PM
  • 1747 Views
  • 0 replies
  • 0 kudos
nancy_g
by New Contributor III
  • 5902 Views
  • 4 replies
  • 5 kudos
  • 5902 Views
  • 4 replies
  • 5 kudos
Latest Reply
Rostislaw
New Contributor III
  • 5 kudos

Right now the feature seems to be public available. It is possible to schedule jobs with ADLS passthough enabled and do not have to provide service principal credentials.However I ask myself how that works behind the scenses. When working interactive...

  • 5 kudos
3 More Replies
amit
by New Contributor II
  • 1504 Views
  • 2 replies
  • 0 kudos

www.databricks.com

Hi @Lindsay Olson​ ,I have attended the virtual instructor-led training on 23-08-2022 (https://www.databricks.com/p/webinar/databricks-lakehouse-fundamentals-learning-plan). As per the conditions mentioned, I have completed all of the steps for getti...

  • 1504 Views
  • 2 replies
  • 0 kudos
Latest Reply
amit
New Contributor II
  • 0 kudos

Thanks @Lindsay Olson​ . Yes issue has been resolved,

  • 0 kudos
1 More Replies
BradSheridan
by Valued Contributor
  • 2594 Views
  • 1 replies
  • 0 kudos

using a UDF in a Windows function

I have created a UDF using:%sqlCREATE OR REPLACE FUNCTION f_timestamp_max()....And I've confirmed it works with:%sqlselect f_timestamp_max()But when I try to use it in a Window function (lead over partition), I get:AnalysisException: Using SQL functi...

  • 2594 Views
  • 1 replies
  • 0 kudos
Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, As of now, Spark SQL supports three kinds of window functions: ranking functions. analytic functions. aggregate functions. Please refer: https://docs.databricks.com/sql/language-manual/sql-ref-window-functions.html#parameters

  • 0 kudos
Haima
by New Contributor
  • 751 Views
  • 0 replies
  • 0 kudos

FileNotFoundError: [Errno 2] /dbfs/fileone.csv

I'm trying to transfer my csv file from databricks to sftp but i'm getting file not found error.here is my code:file_size = sftp.stat("/dbfs/fileone.csv").st_sizewith open("/dbfs/fileone.csv", "rb") as fl:return self.putfo(fl, Destinationpath, file_s...

  • 751 Views
  • 0 replies
  • 0 kudos
brickster_2018
by Databricks Employee
  • 6776 Views
  • 3 replies
  • 0 kudos

Resolved! How many notebooks/jobs can I run in parallel on a Databricks cluster?

Is there a limit on it and is the limit configurable?

  • 6776 Views
  • 3 replies
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

There is a hard limit of 145 active execution contexts on a Cluster. This is to ensure the cluster is not overloaded with too many parallel threads starving for resources. The limit is not configurable. If there are more than 145 parallel jobs to be ...

  • 0 kudos
2 More Replies

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now
Labels