Corrupted Python installation on Python restart on DBR 13.3

ivanychev
Contributor II

Hey there, we're using DBR 13.3 (no Docker) as a general-purpose cluster and initialize it with the following init script:

```

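#!/usr/bin/env bash
# Cluster init script: installs AWS CLI v2, then the Python requirements.
export DEBIAN_FRONTEND=noninteractive
set -euxo pipefail

# DB_IS_DRIVER is provided by Databricks; default to empty so `set -u` does not abort.
if [[ "${DB_IS_DRIVER:-}" = "TRUE" ]]; then
  echo "I am driver"
else
  echo "I am executor"
fi

# Install AWS CLI v2 from the official bundle, then clean up.
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip -q awscliv2.zip
./aws/install
rm -rf awscliv2.zip aws

# MODE is expected to be set in the environment before this script runs.
REQUIREMENTS=/tmp/all_requirements.txt
aws s3 cp "s3://constructor-analytics-data/deploy/dp_release/${MODE}/latest/dp_requirements.txt" "$REQUIREMENTS"

# --no-deps: install only the packages listed in the requirements file.
/databricks/python/bin/pip install -U pip wheel
/databricks/python/bin/pip install --no-cache-dir --no-deps -r "$REQUIREMENTS"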

```

In particular, the init script installs boto3==1.29.7 (rather than the boto3==1.24.28 that ships with the vanilla distribution, see https://docs.databricks.com/en/release-notes/runtime/13.3lts.html).
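One way to confirm which versions are actually installed, both right after cluster start and after the interpreter restarts, is to read the package metadata from site-packages. The following is a minimal sketch: `installed_versions` is a hypothetical helper name, and it assumes it is run with the same interpreter the notebook uses.

```python
from importlib import metadata


def installed_versions(packages=("boto3", "botocore")):
    """Return {package: version or None} as recorded in site-packages metadata."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package has no metadata in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

If the reported versions differ before and after the restart, or the boto3 and botocore versions do not belong together, that would point at the environment rather than at S3 permissions.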

When an OOM happens on the Python side, the driver doesn't restart the node, but the Python interpreter (apparently) restarts.

After the restart, boto3 stops working: every S3 operation fails with `An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied`. The reason is that the boto3 installation changes (screenshot).

Note that service-2.json was absent before the OOM but appeared after it. Its creation time is 6 minutes earlier than that of the nearby files, so I suspect this service-2.json was somehow taken from an older botocore. botocore uses this file to construct HTTP API requests from Python calls, and when this directory gets corrupted, boto3 stops working.
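To check this without relying on a screenshot, the model files can be listed together with their modification times. The sketch below is an illustration, not a definitive diagnostic: `scan_models` and `suspicious_dirs` are hypothetical helper names, and it assumes that service models live under botocore's `data` directory and that, as far as I know, newer botocore releases ship them gzip-compressed as `service-2.json.gz`. It flags every directory that holds more than one model file variant.

```python
import datetime as dt
from collections import defaultdict
from pathlib import Path


def scan_models(data_dir):
    """Group service-2.json* files by parent directory.

    Returns {relative_dir: [(file_name, mtime), ...]}.
    """
    root = Path(data_dir)
    found = defaultdict(list)
    for path in sorted(root.rglob("service-2.json*")):
        mtime = dt.datetime.fromtimestamp(path.stat().st_mtime)
        found[str(path.parent.relative_to(root))].append((path.name, mtime))
    return dict(found)


def suspicious_dirs(found):
    """Return directories that hold more than one model file variant."""
    return sorted(d for d, files in found.items() if len(files) > 1)


if __name__ == "__main__":
    try:
        import botocore
    except ImportError:
        print("botocore is not importable in this interpreter")
    else:
        # Assumption: models are stored in <botocore package>/data.
        data_dir = Path(botocore.__file__).parent / "data"
        found = scan_models(data_dir)
        for rel_dir in suspicious_dirs(found):
            for name, mtime in found[rel_dir]:
                print(f"{rel_dir}/{name}  {mtime:%Y-%m-%d %H:%M:%S}")
```

Running it once on a fresh cluster and once after the interpreter restart would show which files appeared and whether their timestamps predate the rest of the installation.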

Why does this file suddenly appear among the botocore library files? Why didn't the other files change? What am I doing wrong here?

Sergey