Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Avinash_Narala
by Databricks Partner
  • 7190 Views
  • 2 replies
  • 0 kudos

Bootstrap Timeout: DURING CLUSTER START

Hi, when I start a cluster, I get the error below: Bootstrap Timeout:[id: InstanceId(i-05bbcfbb30027ce2c), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-2247916891060257-01b40fb4-3eb1-4a26-99b4-30d6aa0bfe83), lastStatusChangeTime:...

Latest Reply
dhtubong
New Contributor II
  • 0 kudos

Hello - if you're using Databricks Community Edition and hitting the Bootstrap Timeout issue, the resolution below may help. Error: Bootstrap Timeout:Node daemon ping timeout in 780000 ms for instance i-00f21ee2d3ca61424 @ 10.172.245.1. Please check network conne...

1 More Replies
Dick1960
by New Contributor II
  • 4622 Views
  • 3 replies
  • 2 kudos

How to find the domain of my Databricks workspace

Hi, I'm trying to open a support case and it asks me for my domain. In the browser, my URL is https://adb-27xxxx4341636xxx.5.azuredatabricks.net - can you help me?

Latest Reply
Tharun-Kumar
Databricks Employee
  • 2 kudos

@Dick1960 The numeric value in your workspace URL is the domain name. In your case, it would be 27xxxx4341636xxx.
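As a rough illustration of the reply above, the value can be pulled out of the URL programmatically. This is only a sketch: the host pattern `adb-<value>.<number>.azuredatabricks.net` is an assumption based on the URL quoted in the question, and `workspace_value_from_url` is a hypothetical helper name, not part of any Databricks tooling.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumed host shape, inferred from the URL in the question:
#   adb-<value>.<number>.azuredatabricks.net
# The character class is loose on purpose so the masked example ("27xxxx...") still matches.
_ADB_HOST = re.compile(r"^adb-(?P<value>[^.]+)\.\d+\.azuredatabricks\.net$")


def workspace_value_from_url(url: str) -> Optional[str]:
    """Return the part of the host between 'adb-' and the first dot, or None if no match."""
    host = urlparse(url).hostname or ""
    match = _ADB_HOST.match(host)
    return match.group("value") if match else None


if __name__ == "__main__":
    print(workspace_value_from_url("https://adb-27xxxx4341636xxx.5.azuredatabricks.net"))
```

A URL that does not follow the assumed pattern returns `None`, so the caller can fall back to asking the user directly.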

2 More Replies
Coders
by New Contributor II
  • 5578 Views
  • 0 replies
  • 0 kudos

New delta log folder is not getting created

I have the following code, which reads a stream of data, processes it in foreachBatch, and writes to the provided path as shown below. public static void writeToDatalake(SparkSession session, Configuration config, Dataset<Row> data, Entity enti...

MikeGo
by Valued Contributor
  • 2262 Views
  • 1 replies
  • 0 kudos

WAL for structured streaming

Hi, I cannot find a deep dive on this in the latest links. My understanding so far is: Previously, SS (Structured Streaming) copied and cached the data in the WAL. After a certain version, with retrieve less, SS no longer copies the data to the WAL, and only stores ...

lilo_z
by New Contributor III
  • 5816 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks Asset Bundles - job specific "run_as" user/service_principal

Was wondering if this was possible, since a use case came up in my team. Would it be possible to use a different service principal for a single job than what is specified for that target environment? For example: bundle: name: hello-bundle resource...

Latest Reply
lilo_z
New Contributor III
  • 0 kudos

Found a working solution, posting it here for anyone else hitting the same issue. The trick was to redefine "resources" under the target you want to make an exception for: bundle: name: hello_bundle include: - resources/*.yml targets: dev: w...

1 More Replies
dbx-user7354
by New Contributor III
  • 5430 Views
  • 3 replies
  • 4 kudos

Create a Job via SDK with JobSettings Object

Hey, I want to create a Job via the Python SDK with a JobSettings object. import os import time from databricks.sdk import WorkspaceClient from databricks.sdk.service import jobs from databricks.sdk.service.jobs import JobSettings w = WorkspaceClien...

Latest Reply
nenetto
New Contributor II
  • 4 kudos

I just faced the same problem. The issue is that when you call JobSettings.as_dict(), the settings are parsed to a dict where all the values are also parsed recursively. When you pass the parameters as **params, the create method again tries to parse...

2 More Replies
nihar_ghude
by New Contributor II
  • 6002 Views
  • 1 replies
  • 0 kudos

OSError: [Errno 107] Transport endpoint is not connected

Hi, I am facing this error when performing a write operation in foreach() on a dataframe. The code had been working fine for over 3 months but started failing last week. To give some context, I have a dataframe extract_df which contains 2 colum...

Data Engineering
ADLS
azure
python
spark
GOW
by New Contributor II
  • 2562 Views
  • 1 replies
  • 0 kudos

Databricks to S3

I am new to data engineering in Databricks. I need some guidance on moving data from Databricks to S3. Can I get an example job or approach to do this?

Latest Reply
GOW
New Contributor II
  • 0 kudos

Thank you for the reply. Can I apply this to dbt, or use a dbt macro to unload the data? That is, with dbt models running in Databricks?

dasiekr
by New Contributor II
  • 4764 Views
  • 3 replies
  • 0 kudos

Merge operation replaces most of the underlying parquets

Hello, I have the following situation which I would like to fully understand. I have a Delta table that consists of 10k active parquet files. Every day I run a merge operation based on new deliveries, joining on the product_id key attribute. I checked me...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 0 kudos

Hi @dasiekr, please refer to the content below, which might help you. MERGE: Under the hood. Delta Lake completes a MERGE in two steps. (1) Perform an inner join between the target table and source table to select all files that have matches. (2) Perform an outer...

2 More Replies
Gray
by Contributor
  • 60586 Views
  • 24 replies
  • 18 kudos

Resolved! Errors Using Selenium/Chromedriver in Databricks

Hello, I’m programming in a notebook and attempting to use the Python library Selenium to automate Chrome/chromedriver. I’ve successfully managed to install Selenium using %sh pip install selenium. I then attempt the following code, which results in the...

Latest Reply
aa_204
Databricks Partner
  • 18 kudos

I also tried the script and am getting a similar error. Can anyone please suggest a resolution? The errors are "Failed to fetch http://archive.ubuntu.com/ubuntu/pool/main/s/systemd/udev_245.4-4ubuntu3.18_amd64.deb" and "Unable to fetch some archives".

23 More Replies
William_Scardua
by Valued Contributor
  • 6462 Views
  • 3 replies
  • 1 kudos

Magic Pip Install Error

Hi guys, I get this error when I try to use pip install. Any ideas? CalledProcessError Traceback (most recent call last) <command-3492276838775365> in <module> ----> 1 get_ipython().run_line_magic('pip', 'install /dbfs/File...

Latest Reply
Bartosz
Databricks Partner
  • 1 kudos

Hi @William_Scardua! I changed the cluster runtime to 10.4 LTS and the error disappeared. Just letting you know, maybe it will help you too! Cheers!

2 More Replies
MikeGo
by Valued Contributor
  • 2369 Views
  • 1 replies
  • 1 kudos

Colon sign operator for JSON

Hi, I have a streaming source loading data to a raw table, which has a string-type column (whose value is JSON) holding all the data. I want to use the colon sign operator to get fields from the JSON string. Is this going to have some perf issues vs. I use a sch...

Latest Reply
MikeGo
Valued Contributor
  • 1 kudos

Thanks Kaniz. Yes, I did some testing. With some schema, I read the same data source and wrote the parsing results to different tables. For 586K rows, the perf difference is 9 sec vs. 37 sec. For 2.3 million rows, 16 sec vs. 133 sec.
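To put the timings quoted above in proportion, the arithmetic can be sketched as follows. The figures are taken from the reply; the excerpt does not say which number belongs to which parsing approach, so the code only labels them "faster" and "slower", and it treats "586K" as 586,000 rows, which is an assumption.

```python
# Timings as quoted in the reply: (rows, faster_seconds, slower_seconds).
# Which approach produced which number is not stated in the excerpt.
RUNS = [
    (586_000, 9, 37),       # "586K rows" taken as 586,000 (assumption)
    (2_300_000, 16, 133),   # "2.3 million rows"
]


def slowdown(faster_s: float, slower_s: float) -> float:
    """Ratio of the slower run time to the faster one."""
    return slower_s / faster_s


def rows_per_second(rows: int, seconds: float) -> float:
    """Average throughput for one run."""
    return rows / seconds


if __name__ == "__main__":
    for rows, fast, slow in RUNS:
        print(
            f"{rows:,} rows: {slowdown(fast, slow):.1f}x slower, "
            f"{rows_per_second(rows, fast):,.0f} vs {rows_per_second(rows, slow):,.0f} rows/s"
        )
```

On these two data points the gap widens with volume (roughly 4x at the smaller size, roughly 8x at the larger one), though two measurements are too few to conclude how it scales.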

vemash
by New Contributor
  • 3386 Views
  • 1 replies
  • 0 kudos

How to create a Docker image to deploy and run in different environments in Databricks?

I am new to Databricks and am trying to implement the task below. Task: Once code merges to the main branch, the CI pipeline build is successful, and all tests have passed, a docker build should start, create a Docker image, and push it to different environments (fro...

Latest Reply
MichTalebzadeh
Valued Contributor
  • 0 kudos

Hi, this is no different from building a Docker image for various environments. Let us try a simple high-level CI/CD pipeline for building Docker images and deploying them to different environments. It works in all environments, including Databricks ...

Stellar
by New Contributor II
  • 2233 Views
  • 0 replies
  • 0 kudos

DLT DataPlane Error

Hi everyone, I am trying to build the pipeline, but when I run it I receive an error: DataPlaneException: Failed to start the DLT service on the cluster. Please check the driver logs for more details or contact Databricks support. This is from the driver ...
