Data Engineering

Forum Posts

liamod_1
by New Contributor III
  • 30060 Views
  • 11 replies
  • 9 kudos

Resolved! Failure starting repl

Hi, we have several clusters that keep giving this error: Failure starting repl. Try detaching and re-attaching the notebook. All the investigation I've done points to this issue being related to the number of concurrent connections, but we only have 1 ...

Latest Reply
Kareemlowe46
New Contributor II
  • 9 kudos

After some initial skepticism, Barker agreed to give Plinko https://plnkgame.com a try. The game was an instant hit with both the audience and the contestants. The concept was simple but exciting - players would drop a disc down a large pegboard, and...

10 More Replies
ChingizK
by New Contributor II
  • 850 Views
  • 1 reply
  • 0 kudos

Use Python code from a remote Git repository

I'm trying to create a task where the source is a Python script located in a remote GitLab repo. I'm following the instructions HERE, and this is how I have the task set up: However, no matter what path I specify, all I get is the error below: Cannot read ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @ChingizK, The issue you are experiencing might be because you are starting your path with a /. According to the provided information, when you enter the relative path, you should not begin it with / or ./. For example, if the absolute path for th...
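
For illustration, here is a minimal sketch of a Jobs API 2.1 payload that follows this relative-path rule; the repository URL, branch, file path, cluster ID, and token below are placeholders, not values from the original post:

import requests

# Placeholders -- substitute your workspace URL, token, repo, and cluster.
HOST = "https://<workspace-url>"
TOKEN = "<personal-access-token>"

job_spec = {
    "name": "run-script-from-gitlab",
    "git_source": {
        "git_url": "https://gitlab.com/my-group/my-repo",
        "git_provider": "gitLab",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "run_script",
            "existing_cluster_id": "<cluster-id>",
            "spark_python_task": {
                # Path is relative to the repo root: no leading "/" or "./".
                "python_file": "jobs/etl/main.py",
            },
        }
    ],
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json())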

space25
by New Contributor
  • 1304 Views
  • 1 reply
  • 0 kudos

I am trying to use SQL join to combine 3 tables but the execution does not go beyond 93 million rows

Hi all, I ran code to join 3 tables in Azure Databricks using SQL. When I run it, the status shows "93 million rows read (1GB)" and the query does not progress beyond this. Who knows what the issue could be?

Data Engineering
Azure Databricks SQL Databricks Join
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @space25 , The issue you're facing could be due to a variety of reasons. It's hard to pinpoint the exact cause without more details, but here are a few possibilities. 1. **Large Volume of Data**: The operation might be taking a long time due to t...
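
Without the actual query it is hard to be specific, but one way to probe this kind of stall is to inspect the physical plan and broadcast the smaller tables so the large one is not shuffled; the table and column names below are made up for illustration:

# Hypothetical table names; the idea is to check the plan and broadcast the
# small dimension tables so the large fact table does not get shuffled twice.
query = """
SELECT /*+ BROADCAST(d1, d2) */
       f.id, d1.attr_a, d2.attr_b
FROM   fact_table f
JOIN   dim_table_1 d1 ON f.d1_id = d1.id
JOIN   dim_table_2 d2 ON f.d2_id = d2.id
"""

df = spark.sql(query)
df.explain()                    # look for expensive shuffles or skewed stages
df.write.mode("overwrite").saveAsTable("joined_result")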

Arihant
by New Contributor
  • 2357 Views
  • 1 reply
  • 0 kudos

Unable to login to Databricks Community Edition

Hello All, I have successfully created a Databricks account and went to log in to the Community Edition with the exact same login credentials as my account, but it tells me that the email/password are invalid. I can login with these same exact credenti...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Arihant, please look at this link related to Community Edition, which might solve your problem. I appreciate your interest in sharing your Community Edition query with us. However, at this time, we are not entertaining any Community-Edi...

niladri
by New Contributor
  • 476 Views
  • 1 reply
  • 0 kudos

how to connect to aws elasticsearch of another account from databricks

Hi - I have tried my level best to go through both the Elasticsearch documentation and the Databricks documentation to get an answer to my question - is it possible to connect to AWS Elasticsearch in a different AWS account from Databricks? I did no...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @niladri, it's possible to connect to AWS Elasticsearch in a different AWS account from Databricks. - The error is related to permissions, indicating the user or role lacks the necessary access permissions. - To solve this, use the AWS SDK boto3 to assume ...
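
A rough sketch of that assume-role pattern is shown below; the role ARN, region, and domain endpoint are placeholders, and the requests-aws4auth package (assumed to be installed on the cluster) is used to sign the request:

import boto3
import requests
from requests_aws4auth import AWS4Auth  # pip install requests-aws4auth

# Placeholders -- use the role exposed by the other account and your domain.
ROLE_ARN = "arn:aws:iam::123456789012:role/elasticsearch-read-role"
ES_ENDPOINT = "https://search-mydomain-abc123.us-east-1.es.amazonaws.com"
REGION = "us-east-1"

# 1. Assume the cross-account role to obtain temporary credentials.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn=ROLE_ARN, RoleSessionName="databricks-es"
)["Credentials"]

# 2. Sign requests to the Elasticsearch domain with those credentials.
awsauth = AWS4Auth(
    creds["AccessKeyId"],
    creds["SecretAccessKey"],
    REGION,
    "es",
    session_token=creds["SessionToken"],
)

resp = requests.get(f"{ES_ENDPOINT}/_cluster/health", auth=awsauth, timeout=30)
print(resp.json())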

DanBrown
by New Contributor
  • 653 Views
  • 1 reply
  • 0 kudos

Remove WHERE 1=0

I am hoping someone can help me remove the WHERE 1=0 that is constantly getting added onto the end of my query (see below). Please let me know if I can provide more info here. This is running in a notebook in Azure Databricks, against a cluster that has...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @DanBrown , The WHERE 1=0 clause is being added to your query by the Spark SQL engine during the query planning phase. This is a common optimization technique used to create an empty DataFrame with the same schema as the original data source. It'...
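
In other words, the WHERE 1=0 statement is the schema probe Spark issues when the JDBC DataFrame is defined, before any real data is fetched. A typical read where this happens might look like the following sketch (host, secret scope, and query are placeholders):

# The WHERE 1=0 probe runs when the DataFrame is defined; the full query runs
# only when an action (display, write, count, ...) is executed.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>"

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("query", "SELECT col_a, col_b FROM dbo.my_table")  # pushed down as a subquery
    .option("user", dbutils.secrets.get("scope", "sql-user"))
    .option("password", dbutils.secrets.get("scope", "sql-password"))
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)

display(df)  # triggers the real read; the WHERE 1=0 call was only for schema inference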

RiyuLite
by New Contributor III
  • 912 Views
  • 1 reply
  • 0 kudos

How to retrieve cluster IDs of a deleted All Purpose cluster?

I need to retrieve the event logs of deleted All Purpose clusters of a certain workspace. The Databricks list API ({workspace_url}/api/2.0/clusters/list) provides me with the list of all active/terminated clusters, but not the clusters that are deleted. I ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @RiyuLite, To retrieve the event logs of deleted All Purpose clusters without using the root account details, you can use Databricks audit logs. These logs record the activities in your workspace, allowing you to monitor detailed Databricks usage ...
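
As a rough sketch, assuming the Unity Catalog system tables are enabled in the workspace, the audit table can be filtered for cluster-deletion events roughly like this (service and action names follow the documented audit-log schema but may vary):

# Assumes access to the system.access.audit table; adjust action names to
# what actually appears in your workspace's audit logs.
deleted = spark.sql("""
    SELECT event_time,
           request_params['cluster_id'] AS cluster_id,
           user_identity.email          AS deleted_by
    FROM   system.access.audit
    WHERE  service_name = 'clusters'
      AND  action_name IN ('delete', 'permanentDelete')
    ORDER BY event_time DESC
""")
display(deleted)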

Divyanshu
by New Contributor
  • 1098 Views
  • 1 reply
  • 0 kudos

java.lang.ArithmeticException: long overflow Exception while writing to table | pyspark

Hey, I am trying to fetch data from Mongo and write to a Databricks table. I have read data from Mongo using the pymongo library, then flattened nested struct objects along with renaming columns (since there were a few duplicates), and then writing to databrick...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Divyanshu ,  The error message "org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 12.0 failed 4 times, most recent failure: Lost task 2.3 in stage 12.0 (TID 53) (192.168.23.122 executor 0): org.apache.spark.SparkR...
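
One common source of a long overflow is an epoch or numeric value from MongoDB that falls outside the range Spark can convert to a timestamp (for example, milliseconds interpreted as seconds, or far-future dates). A hedged sketch for isolating and nulling such values before the write, using a hypothetical event_time_ms column, could look like this:

from pyspark.sql import functions as F

# "df" is assumed to be the flattened DataFrame from the post;
# "event_time_ms" is a hypothetical column -- adjust to your schema.
MAX_VALID_MS = 253402300799000  # 9999-12-31 23:59:59 UTC in epoch milliseconds

out_of_range = df.filter(
    (F.col("event_time_ms") < 0) | (F.col("event_time_ms") > MAX_VALID_MS)
)
print("suspect rows:", out_of_range.count())

clean_df = (
    df.withColumn(
        "event_time",
        F.when(
            (F.col("event_time_ms") >= 0) & (F.col("event_time_ms") <= MAX_VALID_MS),
            (F.col("event_time_ms") / 1000).cast("timestamp"),
        ),  # out-of-range values become NULL instead of overflowing
    )
    .drop("event_time_ms")
)

clean_df.write.mode("append").saveAsTable("target_table")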

Alex006
by Contributor
  • 452 Views
  • 1 reply
  • 1 kudos

Resolved! Does DLT use one single SparkSession?

Hi! Does DLT use one single SparkSession for all notebooks in a Delta Live Tables Pipeline?

Data Engineering
Delta Live Tables
dlt
SparkSession
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Alex006 , No, a Delta Live Tables (DLT) pipeline does not use a single SparkSession for all notebooks. DLT evaluates and runs all code defined in notebooks but has a different execution model than a notebook 'Run all' command. You cannot rely on ...
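
To make that concrete, the sketch below (with made-up table names and paths) expresses the dependency through dlt.read() rather than through notebook execution order or shared session state:

import dlt
from pyspark.sql import functions as F

# Each table function is evaluated independently by the pipeline, so declare
# dependencies with dlt.read()/dlt.read_stream() instead of relying on cell
# order. Names and paths below are placeholders.

@dlt.table(name="bronze_events")
def bronze_events():
    return spark.read.format("json").load("/mnt/raw/events")

@dlt.table(name="silver_events")
def silver_events():
    # DLT resolves this dependency from the dataset graph, not from run order.
    return dlt.read("bronze_events").where(F.col("event_type").isNotNull())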

Gilg
by Contributor II
  • 492 Views
  • 1 reply
  • 0 kudos

Add data manually to DLT

Hi Team, is there a way that we can add data manually to the tables that are generated by DLT? We have done a PoC using DLT for Sep 15 to current data. Now that they are happy, they want the previous data from Synapse put into Databricks. I can e...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Gilg, Yes, you can add data manually to the tables generated by DLT (Delta Live Tables). However, it would be best to be careful not to directly modify, add, or delete Parquet data files in a Delta table, as this can lead to lost data or table c...
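
One common pattern, sketched below with assumed names, is to backfill by appending the Synapse history to the raw location the pipeline already reads from and letting the next pipeline update process it, rather than writing files into the DLT-managed tables directly:

# All connection details, secret names, and paths are placeholders.
history_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<synapse-host>:1433;database=<db>")
    .option("dbtable", "dbo.historical_events")
    .option("user", dbutils.secrets.get("scope", "synapse-user"))
    .option("password", dbutils.secrets.get("scope", "synapse-password"))
    .load()
)

(
    history_df
    .write
    .mode("append")
    .format("delta")
    .save("/mnt/landing/events")  # the location the DLT bronze table ingests from
)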

mike_engineer
by New Contributor
  • 497 Views
  • 1 reply
  • 1 kudos

Window functions in Change Data Feed

Hello! I am currently exploring the possibility of implementing incremental changes in our company's ETL pipeline and looking into the Change Data Feed option. There are a couple of challenges I'm uncertain about. For instance, we have a piece of logic lik...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @mike_engineer, - Use the Change Data Feed feature in Databricks to track row-level changes in a Delta table. - Change Data Feed records change events for all data written into the table, including row data and metadata. - Use case scenarios: 1. ...
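
For reference, enabling and reading the change feed looks roughly like the sketch below; the table name and starting version are placeholders:

# Enable CDF on an existing Delta table, then read its row-level changes.
spark.sql("""
    ALTER TABLE main.sales.orders
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 5)   # or startingTimestamp
    .table("main.sales.orders")
)
# _change_type / _commit_version / _commit_timestamp describe each change row.
display(changes)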

Gilg
by Contributor II
  • 1109 Views
  • 1 reply
  • 0 kudos

DLT: Waiting for resources took a long time

Hi Team, I have a DLT pipeline that has been running in production for quite some time now. When I check the pipeline, a couple of jobs took longer than expected. Usually, 1 job only takes 10-15 minutes to complete, with 2 to 3 mins to provision a resource. Then I ha...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Gilg, The issue you're experiencing with your DLT pipeline could be due to a couple of factors: 1. **Development Optimizations**: As per the Databricks release notes from September 7-13, 2021, new pipelines run in development mode by default. Thi...
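
To check whether a pipeline is running in development mode and what cluster spec it requests, the settings can be inspected through the Pipelines API; the sketch below uses placeholder values, and the response field names may differ slightly in your workspace:

import requests

# Placeholders for the workspace URL, token, and pipeline ID.
HOST = "https://<workspace-url>"
TOKEN = "<personal-access-token>"
PIPELINE_ID = "<pipeline-id>"

resp = requests.get(
    f"{HOST}/api/2.0/pipelines/{PIPELINE_ID}",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
resp.raise_for_status()
settings = resp.json().get("spec", {})

# Development mode reuses a warm cluster between updates; production mode
# provisions fresh resources, which shows up as "Waiting for resources".
print("development mode:", settings.get("development"))
print("cluster spec:", settings.get("clusters"))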

AB_MN
by New Contributor III
  • 3217 Views
  • 4 replies
  • 1 kudos

Resolved! Read data from Azure SQL DB

I am trying to read data into a dataframe from Azure SQL DB, using JDBC. Here is the code I am using: driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver" database_host = "server.database.windows.net" database_port = "1433" database_name = "dat...
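
For anyone landing here later, a completed version of that JDBC read might look like the sketch below; the host, database, table, and secret names are placeholders rather than the original poster's values:

driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
database_host = "<server>.database.windows.net"
database_port = "1433"
database_name = "<database-name>"
table = "dbo.my_table"
user = dbutils.secrets.get("scope", "sql-user")
password = dbutils.secrets.get("scope", "sql-password")

url = f"jdbc:sqlserver://{database_host}:{database_port};database={database_name}"

df = (
    spark.read.format("jdbc")
    .option("driver", driver)
    .option("url", url)
    .option("dbtable", table)
    .option("user", user)
    .option("password", password)
    .load()
)
display(df)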

Latest Reply
AB_MN
New Contributor III
  • 1 kudos

That did the trick. Thank you!

3 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 683 Views
  • 1 reply
  • 1 kudos

Foreign catalogs

With the introduction of the Unity Catalog in Databricks, many of us have become familiar with creating catalogs. However, did you know that Unity Catalog also allows you to create foreign catalogs? You can register databases from the following s...
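
As a sketch (connection name, host, database, and secret scope are placeholders), registering an external PostgreSQL database as a foreign catalog looks roughly like this:

# Lakehouse Federation: create a connection, then a foreign catalog over it.
spark.sql("""
    CREATE CONNECTION my_postgres_conn
    TYPE postgresql
    OPTIONS (
      host 'pg.example.com',
      port '5432',
      user secret('my_scope', 'pg-user'),
      password secret('my_scope', 'pg-password')
    )
""")

spark.sql("""
    CREATE FOREIGN CATALOG postgres_catalog
    USING CONNECTION my_postgres_conn
    OPTIONS (database 'analytics')
""")

display(spark.sql("SHOW SCHEMAS IN postgres_catalog"))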

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Thank you for sharing @Hubert-Dudek !!!

Hubert-Dudek
by Esteemed Contributor III
  • 650 Views
  • 1 reply
  • 3 kudos

row-level concurrency

With the introduction of Databricks Runtime 14, you can now enable row-level concurrency using these simple techniques!
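
As a sketch with a placeholder table name: on Databricks Runtime 14 and above, row-level concurrency applies to Delta tables that have deletion vectors enabled, which can be turned on as a table property:

# Enable deletion vectors on an existing Delta table (placeholder name).
spark.sql("""
    ALTER TABLE main.sales.orders
    SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')
""")

# Confirm the property took effect.
display(spark.sql("SHOW TBLPROPERTIES main.sales.orders"))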

Latest Reply
jose_gonzalez
Moderator
  • 3 kudos

Thank you for sharing this @Hubert-Dudek 
