Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Coders
by New Contributor II
  • 2591 Views
  • 1 replies
  • 0 kudos

How to perform a deep clone for data migration from one Data Lake to another?

 I'm attempting to migrate data from Azure Data Lake to S3 using deep clone. The data in the source Data Lake is stored in Parquet format and partitioned. I've tried to follow the documentation from Databricks, which suggests that I need to register ...

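A hedged sketch of what such a migration can look like; the catalog, schema, and bucket names below are hypothetical, and for a partitioned Parquet source the table generally has to be registered (or converted) before it can be deep-cloned:

```python
# Hypothetical source/target names -- adjust to your workspace.
source_table = "hive_metastore.raw.events"
target_table = "aws_catalog.raw.events"
target_path = "s3://my-bucket/delta/events"  # hypothetical S3 location

# DEEP CLONE copies both the table metadata and the data files
# to the target location, unlike SHALLOW CLONE which only copies metadata.
clone_sql = (
    f"CREATE OR REPLACE TABLE {target_table} "
    f"DEEP CLONE {source_table} "
    f"LOCATION '{target_path}'"
)
# On a Databricks cluster you would run: spark.sql(clone_sql)
print(clone_sql)
```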
chakradhar545
by New Contributor
  • 980 Views
  • 0 replies
  • 0 kudos

DatabricksThrottledException Error

Hi, our scheduled job occasionally fails with the error below. Any leads or thoughts on why we run into this once in a while and how to fix it? shaded.databricks.org.apache.hadoop.fs.s3a.DatabricksThrottledException: Instantiate s...

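When S3 throttles requests, the S3A connector's retry and connection settings are a common place to start tuning. A hedged sketch (the option names are Hadoop S3A connector settings; the values are hypothetical and should be adjusted to the workload):

```python
# Spark-level overrides of Hadoop S3A client settings (values hypothetical).
# Typically set in the cluster's Spark config rather than in notebook code.
s3a_retry_conf = {
    "spark.hadoop.fs.s3a.retry.limit": "10",       # retries before failing
    "spark.hadoop.fs.s3a.retry.interval": "1s",    # backoff between retries
    "spark.hadoop.fs.s3a.connection.maximum": "200",  # pooled connections
}
for key, value in s3a_retry_conf.items():
    # On a cluster: spark.conf.set(key, value), or set in cluster config.
    print(f"{key}={value}")
```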
Poonam17
by New Contributor II
  • 1156 Views
  • 1 replies
  • 2 kudos

Not able to deploy a cluster in Databricks Community Edition

Hello team, I am not able to launch a Databricks cluster in Community Edition; it gets terminated automatically. Can someone please help here? Regards, Poonam

(attachment: IMG_6296.jpeg)
Latest Reply
kakalouk
New Contributor II
  • 2 kudos

I face the exact same problem. The message I get is this: "Bootstrap Timeout: Node daemon ping timeout in 780000 ms for instance i-062042a9d4be8725e @ 10.172.197.194. Please check network connectivity between the data plane and the control plane."

yatharth
by New Contributor III
  • 1138 Views
  • 1 replies
  • 0 kudos

LZO codec not working for graviton instances

Hi Databricks: I have a job where I am saving my data in JSON format, LZO compressed, which requires the library lzo-codec. On shifting to Graviton instances I noticed that the same job started throwing the exception Caused by: java.lang.RuntimeException: nati...

Latest Reply
yatharth
New Contributor III
  • 0 kudos

For more context, please use the following code to replicate the error: # Create a Python list containing JSON objects json_data = [ {"id": 1, "name": "John", "age": 25}, {"id": 2, "name": "Jane", "...

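A hedged sketch of the write that triggers the codec requirement (path hypothetical). The likely culprit is that hadoop-lzo ships a native `.so` compiled for x86, so a build that works on x86 instances fails to load on ARM-based Graviton nodes:

```python
# Writing JSON with LZO compression requires the lzo-codec / hadoop-lzo
# library, including its *native* component, to be present on every node.
# That native library must match the CPU architecture (x86 vs ARM/Graviton).
write_options = {"compression": "lzo"}

# On a cluster with the codec installed you would run:
# df.write.options(**write_options).json("s3://my-bucket/out/")  # hypothetical path
print(write_options["compression"])
```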
Serhii
by Contributor
  • 9912 Views
  • 7 replies
  • 4 kudos

Resolved! Saving complete notebooks to GitHub from Databricks repos.

When saving a notebook to a GitHub repo, it is stripped to Python source code. Is it possible to save it in the ipynb format?

Latest Reply
GlennStrycker
New Contributor III
  • 4 kudos

When I save+commit+push my .ipynb file to my linked git repo, I noticed that only the cell inputs are saved, not the output.  This differs from the .ipynb file I get when I choose "File / Export / iPython Notebook".  Is there a way to save the cell o...

6 More Replies
GlennStrycker
by New Contributor III
  • 3036 Views
  • 1 replies
  • 0 kudos

Resolved! Saving ipynb notebooks to git does not include output cells -- differs from export

When I save+commit+push my .ipynb file to my linked git repo, I noticed that only the cell inputs are saved, not the output.  This differs from the .ipynb file I get when I choose "File / Export / iPython Notebook".  Is there a way to save the cell o...

Latest Reply
GlennStrycker
New Contributor III
  • 0 kudos

I may have figured this out.  You need to allow output in the settings, which will add a .databricks file to your repo, then you'll need to edit the options on your notebook and/or edit the .databricks file to allow all outputs.

YS1
by Contributor
  • 3218 Views
  • 1 replies
  • 0 kudos

ModuleNotFoundError: No module named 'pulp'

Hello, I'm encountering an issue while running a notebook that utilizes the Pulp library. The library is installed in the first cell of the notebook. Occasionally, I encounter the following error: org.apache.spark.SparkException: Job aborted due to s...

Labels: Data Engineering, Data_Engineering, module_not_found
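A common cause of intermittent ModuleNotFoundError in Spark jobs is that the package was installed only on the driver, so the import fails only when a task lands on a worker. A hedged sketch of how to check module availability, with the notebook-scoped install shown as a comment:

```python
# In a Databricks notebook, a %pip magic installs the package into the
# notebook's Python environment on driver *and* workers, which avoids
# ModuleNotFoundError inside UDFs that run on executors:
# %pip install pulp
#
# A driver-only install (e.g. a shell `pip install` on the driver) leaves
# executors without the module, so failures appear only intermittently.
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` is importable in this environment."""
    return importlib.util.find_spec(name) is not None

print(has_module("json"))  # stdlib module, available everywhere
```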
Martinitus
by New Contributor III
  • 956 Views
  • 1 replies
  • 0 kudos

AnalysisException: [ROW_LEVEL_SECURITY_FEATURE_NOT_SUPPORTED.CHECK_CONSTRAINT]

I just tried to set up a row filter via the following two SQL snippets: create function if not exists foo.my_test.row_filter (batch_id bigint) return TRUE; alter table foo.my_test.some_table set row filter foo.my_test.row_filter on (batch_id); This resu...

Latest Reply
Martinitus
New Contributor III
  • 0 kudos

To be fair, row filters and the check constraints feature are in Public Preview, so I apologize for the slightly harsh words above!

Cblunck
by New Contributor II
  • 3991 Views
  • 3 replies
  • 0 kudos

New to databricks SQL - where clause issue

Hello community, I'm using Databricks SQL for the first time and was hoping I could just copy and paste my queries from SSMS across and update the table names, but it's not working. I found it's the WHERE clause; I updated the ' ' to " " but still ...

(attachment: image.png)
Latest Reply
justinghavami
New Contributor II
  • 0 kudos

Hi, were you able to get this figured out? I am having the same issue.

2 More Replies
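A frequent trip-up when porting T-SQL from SSMS: in Spark SQL, string literals use single quotes and identifiers are quoted with backticks, not [brackets] or double quotes. A hedged sketch with hypothetical table and column names:

```python
# T-SQL (SSMS) habit: [brackets] for identifiers, single quotes for strings.
tsql = "SELECT * FROM [dbo].[sales] WHERE [city] = 'Sydney'"

# Spark SQL equivalent: backticks for identifiers (only needed for special
# characters or reserved words), single quotes for string literals.
# Swapping ' ' for " " around a value can make it parse as an identifier
# or behave differently depending on ANSI settings.
spark_sql = "SELECT * FROM sales WHERE `city` = 'Sydney'"
print(spark_sql)
```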
Martinitus
by New Contributor III
  • 2934 Views
  • 4 replies
  • 1 kudos

Reading a tab-separated CSV quietly drops empty rows

I already reported that as a bug to the official Spark bug tracker: https://issues.apache.org/jira/browse/SPARK-46876. A short summary: when reading a tab-separated file that has lines consisting only of tabs, those lines will not show up in the ...

Latest Reply
Martinitus
New Contributor III
  • 1 kudos

@Lakshay Do you know any way to speed up the GitHub merge/review process? The issue has had a proposed fix for more than 4 weeks now, but no one seems to care...

3 More Replies
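A minimal repro sketch of the reported behavior (path hypothetical). A line consisting only of tabs parses as a row of entirely empty fields, which the CSV reader can silently drop; the cluster-only part is commented:

```python
# Four data lines; the third consists only of tab separators.
tsv = "a\tb\tc\n1\t2\t3\n\t\t\n4\t5\t6\n"
lines = tsv.splitlines()
print(len(lines) - 1)  # 3 data lines before parsing (excluding the header)

# On a cluster, after writing `tsv` to a file, the reported behavior is
# that the tabs-only line does not appear in the DataFrame:
# df = (spark.read
#       .option("sep", "\t")
#       .option("header", True)
#       .csv("/tmp/sample.tsv"))   # hypothetical path
# df.count()  # reported as 2 rather than 3
```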
Maxi1693
by New Contributor II
  • 2924 Views
  • 4 replies
  • 1 kudos

Monitoring Structured Streaming in an external sink

Hi! I am working on collecting some metrics to create a plot for my Spark Structured Streaming job. It is configured with a trigger(processingTime="30 seconds") and I am trying to collect data with the following listener class (just an example). # D...

(attachment: Screenshot 2024-03-08 113453.png)
Latest Reply
MichTalebzadeh
Valued Contributor
  • 1 kudos

Hi, I have done further investigation on this. Below I have tried to illustrate the issue through PySpark code: def onQueryProgress(self, event): print("onQueryProgress") # Access micro-batch data microbatch_data = event.progre...

3 More Replies
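The listener shape under discussion can be sketched as below. On Databricks you would subclass pyspark.sql.streaming.StreamingQueryListener and register it with spark.streams.addListener(...); here a plain class and a simulated event stand in so the callback logic is visible on its own (event field names follow StreamingQueryProgress, but check them against the docs):

```python
# Collected per-batch metrics, appended by the progress callback.
collected = []

class MetricsListener:  # stands in for StreamingQueryListener
    def onQueryStarted(self, event):
        pass  # called once when the query starts

    def onQueryProgress(self, event):
        # Fires after each micro-batch; event.progress carries the metrics.
        collected.append({
            "batchId": event.progress["batchId"],
            "numInputRows": event.progress["numInputRows"],
        })

    def onQueryTerminated(self, event):
        pass  # called when the query stops or fails

# Simulated progress event for illustration (values hypothetical):
class _FakeEvent:
    progress = {"batchId": 7, "numInputRows": 120}

MetricsListener().onQueryProgress(_FakeEvent())
print(collected)
```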
Neha_Gupta
by New Contributor II
  • 1489 Views
  • 1 replies
  • 0 kudos

Job Concurrency Queue not working as expected

Hi, we have created a Databricks job in Workflows where concurrent runs are set to 10 and the queue is enabled. We were trying to perform concurrent-user testing by triggering 100 job runs using a JMeter script. We observed that the first 10 job...

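For reference, the two settings involved can be sketched as a Jobs API payload fragment (field names per the Jobs API; the job name and values here are hypothetical). With queueing enabled, runs beyond the concurrency limit should wait in the queue rather than being skipped:

```python
import json

# Hedged sketch of the relevant job settings (Jobs API 2.1 field names).
job_settings = {
    "name": "load-test-job",          # hypothetical
    "max_concurrent_runs": 10,        # runs allowed to execute in parallel
    "queue": {"enabled": True},       # queue excess runs instead of skipping
}
print(json.dumps(job_settings, indent=2))
```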
_databreaks
by New Contributor II
  • 949 Views
  • 0 replies
  • 0 kudos

Autoloader schemaHints converts valid values to null

I am ingesting JSON files from S3 using Auto Loader and would like to use schemaHints to define the datatype of one of the fields; that is, I want the field id to be of integer type. The DLT code below infers the id as string, with correct values...

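A hedged sketch of the Auto Loader options in question (the S3 path is hypothetical). Note that cloudFiles.schemaHints overrides the inferred type for the named column, and a value that cannot be cast to the hinted type can come back as null, which matches the symptom described:

```python
# Auto Loader read options; schemaHints pins `id` to INT instead of the
# inferred STRING. If the raw JSON holds id as a quoted string like "123",
# the cast behavior is worth checking before relying on the hint.
options = {
    "cloudFiles.format": "json",
    "cloudFiles.schemaHints": "id INT",
}
# On a cluster:
# df = (spark.readStream.format("cloudFiles")
#         .options(**options)
#         .load("s3://my-bucket/raw/"))  # hypothetical path
print(options["cloudFiles.schemaHints"])
```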
PassionateDBD
by New Contributor II
  • 2628 Views
  • 0 replies
  • 0 kudos

MLOps + DLT

What are the best practices for using MLOps and DLT together? This page https://learn.microsoft.com/en-us/azure/databricks/compute/access-mode-limitations states that you cannot use a single user cluster to query tables created by a Unity Catalog-enab...

