Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

turagittech
by Contributor
  • 11980 Views
  • 2 replies
  • 1 kudos

PYODBC very slow - 30 minutes to write 6000 rows

Along with several other issues I'm encountering, I am finding pandas DataFrame to_sql very slow. I am writing to an Azure SQL database and performance is woeful. This is a test database and it has S3 100DTU and one user, me as it's configuratio...

Latest Reply
Vidula
Databricks Partner
  • 1 kudos

Hi @Peter McLarty, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies
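
For the slow pandas `to_sql` write through pyodbc described in the post above, one commonly suggested adjustment is to enable pyodbc's `fast_executemany` when creating the SQLAlchemy engine. The following is only a minimal sketch under assumptions: the accepted reply is not shown in this listing, and the helper names `build_mssql_url` and `write_frame`, the driver name, and the chunk size are all hypothetical.

```python
from urllib.parse import quote_plus


def build_mssql_url(server, database, user, password,
                    driver="ODBC Driver 18 for SQL Server"):
    """Build a SQLAlchemy URL for the mssql+pyodbc dialect.

    The driver name is an assumption; use whichever ODBC driver is installed.
    """
    odbc = (
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password}"
    )
    # The raw ODBC connection string is URL-encoded and passed as odbc_connect.
    return "mssql+pyodbc:///?odbc_connect=" + quote_plus(odbc)


def write_frame(df, table, url, chunksize=1000):
    """Append a pandas DataFrame to `table` with fast_executemany enabled."""
    # Imported here so the URL helper above works without SQLAlchemy installed.
    from sqlalchemy import create_engine

    # fast_executemany=True lets pyodbc send parameter rows in bulk rather
    # than issuing one round trip per row.
    engine = create_engine(url, fast_executemany=True)
    df.to_sql(table, engine, if_exists="append", index=False,
              chunksize=chunksize)
```

A call might look like `write_frame(df, "my_table", build_mssql_url(server, database, user, password))`, where every argument is a placeholder. Whether this helps in the poster's case depends on details that the truncated post does not show.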
dibo
by New Contributor II
  • 1090 Views
  • 0 replies
  • 0 kudos

I can't login to https://community.cloud.databricks.com/login.html

I can no longer log in to https://community.cloud.databricks.com/login.html with the correct username and password. I clicked the button to reset my password, received the password-reset email, and changed my password, but I still can't ...

SamSteere
by New Contributor III
  • 3134 Views
  • 3 replies
  • 6 kudos

docs.databricks.com

REST API documentation is out of date since the release of Delta Live Tables. When using the `2.0/clusters/list` endpoint in an environment with running clusters provisioned by DLTs, the clusters will be returned with a `cluster_source` value of `PIPEL...

Latest Reply
Vidula
Databricks Partner
  • 6 kudos

Hi @Sam Steere, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
alejandrofm
by Valued Contributor
  • 6137 Views
  • 7 replies
  • 1 kudos

Improve download speed or see download progress Python-Databricks SQL

Hi! I'm using the code from here to execute a query on Databricks. It runs flawlessly and I can follow it from the Spark UI, etc. The problem seems to be the download of the result (Spark is idle, there is a green check in the query his...

Latest Reply
Vidula
Databricks Partner
  • 1 kudos

Hi @Alejandro Martinez, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

6 More Replies
data_boy_2022
by New Contributor III
  • 19848 Views
  • 7 replies
  • 3 kudos

Data ingest of csv files from S3 using Autoloader is slow

I have 150k small CSV files (~50Mb) stored in S3 which I want to load into a Delta table. All CSV files are stored in the following structure in S3:
bucket/folder/name_00000000_00000100.csv
bucket/folder/name_00000100_00000200.csv
This is the code I use ...

Latest Reply
Vidula
Databricks Partner
  • 3 kudos

Hi @Jan R, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

6 More Replies
Nid
by New Contributor
  • 1463 Views
  • 1 reply
  • 0 kudos

badge not received for Databricks Lakehouse Fundamentals Accreditation

Hi, I have cleared the assessment for the Databricks Lakehouse Fundamentals Accreditation but have yet to receive a badge. Kindly assist me with this.

Latest Reply
Vidula
Databricks Partner
  • 0 kudos

Hi @Nidhi kawale, thank you for reaching out! Let us look into this for you, and we will get back to you with an update. Kindly share your email ID at community@databricks.com.

Bit-Warrior
by New Contributor
  • 1001 Views
  • 0 replies
  • 0 kudos

Installing System ML on the cluster

I am trying to install the systemml package from Maven. I ignored the libraries log4j:log4j, com:sun.jdmk, com:sun.jmx, javax:jms. But when I run one command of systemml, Spark/Databricks can no longer select from tables, effectively breaking somet...

parthsalvi
by Contributor
  • 2147 Views
  • 0 replies
  • 0 kudos

A few Spark APIs (such as df.tail, df.rdd.map) not working in DBR 11.2 and 10.4 LTS Shared mode (custom VPC)

We're trying to use DBR 11.2 and 10.4 LTS in Shared mode on a customer-managed VPC, but we're running into the following issues. Is this issue related to our customer-managed VPC setup, or is it specific to DBR 11.2? The same issue is also seen in DBR 11.1 and 10.4 L...

nancy_g
by New Contributor III
  • 7052 Views
  • 4 replies
  • 5 kudos
Latest Reply
Rostislaw
Databricks Partner
  • 5 kudos

Right now the feature seems to be publicly available. It is possible to schedule jobs with ADLS passthrough enabled without having to provide service principal credentials. However, I ask myself how that works behind the scenes. When working interactive...

3 More Replies
amit
by New Contributor II
  • 2167 Views
  • 2 replies
  • 0 kudos

www.databricks.com

Hi @Lindsay Olson, I attended the virtual instructor-led training on 23-08-2022 (https://www.databricks.com/p/webinar/databricks-lakehouse-fundamentals-learning-plan). As per the conditions mentioned, I have completed all of the steps for getti...

Latest Reply
amit
New Contributor II
  • 0 kudos

Thanks @Lindsay Olson. Yes, the issue has been resolved.

1 More Replies
BradSheridan
by Databricks Partner
  • 3326 Views
  • 1 reply
  • 0 kudos

Using a UDF in a window function

I have created a UDF using:
%sql
CREATE OR REPLACE FUNCTION f_timestamp_max()....
And I've confirmed it works with:
%sql
select f_timestamp_max()
But when I try to use it in a window function (lead over partition), I get:
AnalysisException: Using SQL functi...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, as of now, Spark SQL supports three kinds of window functions: ranking functions, analytic functions, and aggregate functions. Please refer to: https://docs.databricks.com/sql/language-manual/sql-ref-window-functions.html#parameters
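
To illustrate the three kinds of window functions named in this reply, here is a small sketch with one ranking function (`RANK`), one analytic function (`LEAD`), and one aggregate function (`MAX`). It runs on SQLite through Python's standard library purely as a stand-in, not on Spark SQL, although these three functions and the `OVER (PARTITION BY ...)` clause exist in Spark SQL as well. The `events` table and its columns are hypothetical, and the sketch does not reproduce the poster's `f_timestamp_max()` UDF.

```python
import sqlite3

# In-memory stand-in database; requires SQLite 3.25+ for window functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (device TEXT, ts INTEGER, reading REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("a", 1, 10.0), ("a", 2, 12.0), ("a", 3, 11.0),
     ("b", 1, 5.0), ("b", 2, 7.0)],
)

rows = conn.execute(
    """
    SELECT device,
           ts,
           -- ranking function: position of each reading within its device
           RANK() OVER (PARTITION BY device ORDER BY reading DESC) AS reading_rank,
           -- analytic function: timestamp of the next row for the same device
           LEAD(ts) OVER (PARTITION BY device ORDER BY ts) AS next_ts,
           -- aggregate function used as a window function
           MAX(ts) OVER (PARTITION BY device) AS max_ts
    FROM events
    ORDER BY device, ts
    """
).fetchall()

for row in rows:
    print(row)
```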

Haima
by New Contributor
  • 1023 Views
  • 0 replies
  • 0 kudos

FileNotFoundError: [Errno 2] /dbfs/fileone.csv

I'm trying to transfer my CSV file from Databricks to SFTP, but I'm getting a file-not-found error. Here is my code:
file_size = sftp.stat("/dbfs/fileone.csv").st_size
with open("/dbfs/fileone.csv", "rb") as fl:
    return self.putfo(fl, Destinationpath, file_s...
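
One possible cause, offered only as a guess because the post is truncated and has no replies: if `sftp` is a paramiko `SFTPClient`, its `stat` method looks a path up on the remote server, so calling it with the local `/dbfs/...` path can fail even when the local file exists. A minimal sketch that reads the size from the local filesystem instead is shown below. The function name `upload_local_file` and its parameters are hypothetical, and `sftp` is assumed to expose paramiko's `putfo(fl, remotepath, file_size=...)`.

```python
import os


def upload_local_file(sftp, local_path, remote_path):
    """Upload local_path to remote_path through an SFTP client.

    `sftp` is assumed to behave like paramiko.SFTPClient (it needs `putfo`).
    """
    # os.path.getsize reads the local filesystem, whereas SFTPClient.stat
    # would look the path up on the remote server.
    file_size = os.path.getsize(local_path)
    with open(local_path, "rb") as fl:
        return sftp.putfo(fl, remote_path, file_size=file_size)
```

With this helper, the local path (for example `/dbfs/fileone.csv`) is only ever opened locally, and only the destination path is sent to the server.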

brickster_2018
by Databricks Employee
  • 7835 Views
  • 3 replies
  • 0 kudos

Resolved! How many notebooks/jobs can I run in parallel on a Databricks cluster?

Is there a limit on it and is the limit configurable?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

There is a hard limit of 145 active execution contexts on a Cluster. This is to ensure the cluster is not overloaded with too many parallel threads starving for resources. The limit is not configurable. If there are more than 145 parallel jobs to be ...

2 More Replies
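
The reply above is cut off before it says what to do when more than 145 jobs need to run in parallel. One way to stay under such a limit, sketched here as an assumption and not as the reply's recommendation, is to cap concurrency on the client side with a bounded thread pool. The names `run_bounded` and `run_job` are hypothetical, and the sketch assumes each running job occupies one execution context.

```python
from concurrent.futures import ThreadPoolExecutor

# Hard limit on active execution contexts quoted in the reply.
MAX_ACTIVE_CONTEXTS = 145


def run_bounded(jobs, run_job, max_parallel=MAX_ACTIVE_CONTEXTS):
    """Call run_job(job) for every job with at most max_parallel in flight.

    Results are returned in the same order as `jobs`.
    """
    if not 1 <= max_parallel <= MAX_ACTIVE_CONTEXTS:
        raise ValueError(
            f"max_parallel must be between 1 and {MAX_ACTIVE_CONTEXTS}"
        )
    # The pool never runs more than max_workers tasks at once; the remaining
    # jobs wait in its internal queue until a worker is free.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_job, jobs))
```

Here `run_job` would be whatever function submits one notebook or job and waits for it to finish. Choosing a `max_parallel` well below 145 leaves room for other users of the same cluster.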