Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Migrate to a new account

zMynxx
New Contributor III

Hey Team,

We're looking into migrating our current Databricks solution from one AWS account (us-east-1 region) to another (eu-central-1 region). I have no documentation left on how the current solution was provisioned, but I can see CloudFormation stacks named "databricks-workspace-stack" and "databricks-workspace-stack2" in the first account, so after going through the docs I guess the quick-start method was used.

Any suggestions on how I should go about this migration? The migration tool appears to be relevant only for ST-to-E2 migration, while we are already on E2.

 

TIA,
Lior

1 ACCEPTED SOLUTION

Accepted Solutions

zMynxx
New Contributor III

It turned out the networking team had provisioned a faulty network setup for me.
Issue resolved.


2 REPLIES

zMynxx
New Contributor III

I ended up using the terraform-databricks-provider tool to export the old workspace and import it into the new one. All that was needed was a PAT in each workspace: export from the old one, sed the region, account and PAT, and apply. This got me about 75% of the way there; now I'm using DataSync to migrate the data across the buckets.

I did, however, encounter an issue when starting a cluster. These are the events:
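
The "sed the region, account and PAT" step amounts to a literal find-and-replace over the exported Terraform files. Below is a minimal Python sketch of that idea, not the exact commands used in this migration. The function names, the sample line, and the account IDs are hypothetical placeholders, and the PAT is deliberately left out of the example. It assumes the old values never appear as substrings of unrelated values.

```python
from pathlib import Path


def rewrite_exported_config(text: str, replacements: dict[str, str]) -> str:
    """Apply literal old -> new substitutions, like a series of `sed 's/old/new/g'`.

    Substitutions run in insertion order, so a new value should not
    contain a later old value.
    """
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text


def rewrite_directory(export_dir: str, replacements: dict[str, str]) -> list[str]:
    """Rewrite every *.tf file under export_dir in place; return the changed paths."""
    changed = []
    for path in sorted(Path(export_dir).rglob("*.tf")):
        original = path.read_text()
        updated = rewrite_exported_config(original, replacements)
        if updated != original:
            path.write_text(updated)
            changed.append(str(path))
    return changed


# Hypothetical values, for illustration only.
REPLACEMENTS = {
    "us-east-1": "eu-central-1",
    "111111111111": "222222222222",  # old AWS account ID -> new AWS account ID
}

sample = "arn:aws:iam::111111111111:instance-profile/example  # us-east-1"
rewritten = rewrite_exported_config(sample, REPLACEMENTS)
print(rewritten)
```

Reviewing the diff of the rewritten files before running `terraform apply` is a cheap way to catch a substitution that matched more than intended.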

 

| Event type | Time | Message |
|---|---|---|
| TERMINATING | 2025-03-19 16:46:21 | Compute terminated. |
| ADD_NODES_FAILED | 2025-03-19 16:46:21 | Failed to add 1 worker to the compute. Will attempt retry: false. |
| STARTING | 2025-03-19 16:33:24 | Started by xxxxx@xxxxx.com. |
 
And here's the error message:
```json
{
  "reason": {
    "code": "BOOTSTRAP_TIMEOUT",
    "type": "SERVICE_FAULT",
    "parameters": {
      "databricks_error_message": "[id: InstanceId(i-0c77814a3f94be628), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-1474780594966847-24034583-38c5-482f-9496-7afbff442b4e), lastStatusChangeTime: 1742394854901, groupIdOpt Some(0),requestIdOpt Some(0319-083358-4m8i3wyq-c85024f5-af12-470d-9-driver),version 1] with threshold 700 seconds timed out after 706428 milliseconds. Instance bootstrap inferred timeout reason: UnknownReason. Please check network connectivity from the data plane to the control plane.",
      "instance_id": "i-0c77814a3f94be628"
    }
  },
  "add_node_failure_details": {
    "failure_count": 1,
    "resource_type": "container",
    "will_retry": false
  }
}
```
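
For anyone triaging many of these failures, the fields that matter can be pulled out of the termination payload programmatically. This is a rough Python sketch: the function name `summarize_termination` is made up for illustration, the regular expressions assume the message wording shown above, and the sample payload is a shortened copy of the one in this post.

```python
import json
import re


def summarize_termination(payload: str) -> dict:
    """Extract the key fields from a cluster termination-reason JSON document."""
    data = json.loads(payload)
    reason = data["reason"]
    params = reason.get("parameters", {})
    message = params.get("databricks_error_message", "")

    # These patterns assume the message wording seen in this thread.
    threshold = re.search(r"threshold (\d+) seconds", message)
    elapsed = re.search(r"timed out after (\d+) milliseconds", message)

    return {
        "code": reason["code"],
        "type": reason["type"],
        "instance_id": params.get("instance_id"),
        "threshold_s": int(threshold.group(1)) if threshold else None,
        "elapsed_s": int(elapsed.group(1)) / 1000 if elapsed else None,
        "will_retry": data.get("add_node_failure_details", {}).get("will_retry"),
    }


# Shortened copy of the payload above (the error message is abbreviated).
SAMPLE = """
{
  "reason": {
    "code": "BOOTSTRAP_TIMEOUT",
    "type": "SERVICE_FAULT",
    "parameters": {
      "databricks_error_message": "with threshold 700 seconds timed out after 706428 milliseconds.",
      "instance_id": "i-0c77814a3f94be628"
    }
  },
  "add_node_failure_details": {"failure_count": 1, "resource_type": "container", "will_retry": false}
}
"""

summary = summarize_termination(SAMPLE)
print(summary)
```

Here the instance spent just over the 700-second threshold in `INSTANCE_INITIALIZING`, and the message itself points at connectivity from the data plane to the control plane.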

 

 

 
I've attached the system logs of this machine. They show it cannot reach the S3 buckets that host the bootstrap script:
 
```log
[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com [FAILED] [Will try other storage accounts]. Seconds Elapsed: 70.507
[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com [FAILED] [Will try other storage accounts]. Seconds Elapsed: 70.0181
[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: databricks-update-oregon.s3.us-west-2.amazonaws.com [FAILED] [Will try other storage accounts]. Seconds Elapsed: 70.0211
[Bootstrap Event] {FAILED_COMMAND:DownloadBootstrapScript}
[Bootstrap Event] {FAILED_MESSAGE:(Base64 encoded)
2025/03/19 14:35:30 Failed get request: All attempts fail:
#1: Head "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana...": dial tcp 3.5.137.184:443: i/o timeout
#2: Head "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana...": dial tcp 52.219.72.144:443: i/o timeout
#3: Head "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana...": dial tcp 52.219.171.98:443: i/o timeout
#4: Head "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana...": dial tcp 3.5.134.253:443: i/o timeout
#5: Head "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana...": dial tcp 3.5.139.152:443: i/o timeout
2025/03/19 14:35:30 unknown http response
[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com [FAILED] [Will try other storage accounts]. Seconds Elapsed: 70.507
2025/03/19 14:36:40 Failed get request: All attempts fail

[Bootstrap Event] SELF BOOTSTRAP FAILED. [Started at: Wed Mar 19 14:34:19 UTC 2025] [Seconds Elapsed:211] [Start Timestamp: 1742394859] [End Timestamp: 1742395070]
[Bootstrap Event] /etc/resolv.conf dump:
# This file is managed by man:systemd-resolved(8). Do not edit.
#
# This is a dynamic resolv.conf file for connecting local clients directly to
# all known uplink DNS servers. This file lists all configured search domains.
#
# Third party programs must not access this file directly, but only through the
# symlink at /etc/resolv.conf. To manage man:resolv.conf(5) in a different way,
# replace this symlink by a static file or a different symlink.
#
# See man:systemd-resolved.service(8) for details about the supported modes of
# operation for /etc/resolv.conf.

nameserver 10.160.126.2
search eu-central-1.compute.internal
[Bootstrap Event] DNS output for frankfurt.cloud.databricks.com:
Server: 10.160.126.2
Address: 10.160.126.2#53

Non-authoritative answer:
frankfurt.cloud.databricks.com canonical name = frankfurt-migration.cloud.databricks.com.
frankfurt-migration.cloud.databricks.com canonical name = k8s-popproxy-popproxy-c6378b8048-95d77c19d3cb5782.elb.eu-central-1.amazonaws.com.
Name: k8s-popproxy-popproxy-c6378b8048-95d77c19d3cb5782.elb.eu-central-1.amazonaws.com
Address: 18.159.44.43
Name: k8s-popproxy-popproxy-c6378b8048-95d77c19d3cb5782.elb.eu-central-1.amazonaws.com
Address: 18.159.44.42
Name: k8s-popproxy-popproxy-c6378b8048-95d77c19d3cb5782.elb.eu-central-1.amazonaws.com
Address: 18.159.44.44

[Bootstrap Event] Can reach frankfurt.cloud.databricks.com: [FAILED]
[Bootstrap Event] DNS output for databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com:
Server: 10.160.126.2
Address: 10.160.126.2#53

Non-authoritative answer:
databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com canonical name = s3-r-w.eu-central-1.amazonaws.com.
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.136.154
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 52.219.171.110
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 52.219.169.94
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.135.208
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.135.14
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.136.178
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.137.209
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.138.184

[Bootstrap Event] Can reach databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com: [FAILED]
[Bootstrap Event] DNS output for databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com:
Server: 10.160.126.2
Address: 10.160.126.2#53

Non-authoritative answer:
databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com canonical name = s3-r-w.us-west-2.amazonaws.com.
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.86.207
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.82.15
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.81.175
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.129.58
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.154.138
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.76.108
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.218.221.49
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.82.107

[Bootstrap Event] Can reach databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com: [FAILED]
[Bootstrap Event] DNS output for databricks-update-oregon.s3.us-west-2.amazonaws.com:
Server: 10.160.126.2
Address: 10.160.126.2#53

Non-authoritative answer:
databricks-update-oregon.s3.us-west-2.amazonaws.com canonical name = s3-r-w.us-west-2.amazonaws.com.
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.82.105
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.136.210
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.196.186
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.85.1
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.218.238.9
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.186.2
Name: s3-r-w.us-west-2.amazonaws.com
Address: 52.92.224.162
Name: s3-r-w.us-west-2.amazonaws.com
Address: 3.5.83.150

[Bootstrap Event] Can reach databricks-update-oregon.s3.us-west-2.amazonaws.com: [FAILED]

```
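
A log this long is easier to read once the reachability verdicts are pulled out. Below is a small Python sketch that does this. The function names are hypothetical, and the pattern only assumes the `Can reach <host>: [<STATUS>]` line format visible in the log above. The log never shows what a success line looks like, so the status is kept as the raw string rather than guessed at.

```python
import re

# Matches lines such as:
# [Bootstrap Event] Can reach frankfurt.cloud.databricks.com: [FAILED]
REACH_RE = re.compile(r"\[Bootstrap Event\] Can reach (\S+): \[(\w+)\]")


def reachability(log_text: str) -> dict[str, str]:
    """Map each probed host to the raw status string reported in the log."""
    return {host: status for host, status in REACH_RE.findall(log_text)}


def unreachable_hosts(log_text: str) -> list[str]:
    """Return the hosts whose status is FAILED, in the order they appear."""
    return [host for host, status in reachability(log_text).items() if status == "FAILED"]


# Lines copied from the system log above.
SAMPLE_LOG = """\
[Bootstrap Event] Can reach frankfurt.cloud.databricks.com: [FAILED]
[Bootstrap Event] Can reach databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com: [FAILED]
[Bootstrap Event] Can reach databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com: [FAILED]
[Bootstrap Event] Can reach databricks-update-oregon.s3.us-west-2.amazonaws.com: [FAILED]
"""

failed = unreachable_hosts(SAMPLE_LOG)
for host in failed:
    print(host)
```

In this log every probed endpoint fails, the control plane host as well as the S3 buckets in two regions, even though DNS resolution works. That pattern suggests a general outbound connectivity problem rather than something specific to one bucket.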

I tried to reproduce the scenario: I used the same AMI and launched a new instance with the same instance profile, in the same VPC and subnet. All tests succeeded:
```log

ubuntu@ip-10-160-254-24:~$ nc -zv databricks-update-oregon.s3.us-west-2.amazonaws.com 443
Connection to databricks-update-oregon.s3.us-west-2.amazonaws.com 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nc -zv databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com 443
Connection to databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ curl -I "https://databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/bootstrap/instance-mana..."
HTTP/1.1 200 OK
x-amz-id-2: /Spm8FbeUVhdDax5dGAQ6jng6t0MwImYF82g0dsqVdFYrmzHtG4GpGpqTCDrNzLTrFi/JE45RkuFSjqLPD3mYg==
x-amz-request-id: GG4ZBE27YXJWV656
Date: Thu, 20 Mar 2025 10:51:42 GMT
Last-Modified: Mon, 10 Mar 2025 20:50:06 GMT
ETag: "faee7f6cf29cffa357f92444b00502e7-3"
x-amz-storage-class: INTELLIGENT_TIERING
x-amz-server-side-encryption: AES256
x-amz-version-id: wD4HHqrOh4XL0J1hFzZi2FwqdsUQHv8B
Accept-Ranges: bytes
Content-Type: binary/octet-stream
Content-Length: 17848708
Server: AmazonS3

ubuntu@ip-10-160-254-24:~$ nc -zv databricks-update-oregon.s3.us-west-2.amazonaws.com 443
Connection to databricks-update-oregon.s3.us-west-2.amazonaws.com 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nc -zv databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/ 443
nc: getaddrinfo for host "databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com/" port 443: Name or service not known
ubuntu@ip-10-160-254-24:~$ nc -zv databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com 443
Connection to databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nslookup databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com
Server: 10.160.126.2
Address: 10.160.126.2#53

Non-authoritative answer:
databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com canonical name = s3-r-w.eu-central-1.amazonaws.com.
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 52.219.169.138
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.135.40
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.139.192
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.138.124
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 52.219.208.26
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.139.121
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 3.5.136.188
Name: s3-r-w.eu-central-1.amazonaws.com
Address: 52.219.75.184

ubuntu@ip-10-160-254-24:~$ nc -zv 3.5.137.184 443
Connection to 3.5.137.184 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nc -zv 52.219.72.144: 443
nc: getaddrinfo for host "52.219.72.144:" port 443: Name or service not known
ubuntu@ip-10-160-254-24:~$ nc -zv 52.219.72.144 443
Connection to 52.219.72.144 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nc -zv 52.219.171.98 443
Connection to 52.219.171.98 443 port [tcp/https] succeeded!
ubuntu@ip-10-160-254-24:~$ nc -zv 3.5.139.152 443
Connection to 3.5.139.152 443 port [tcp/https] succeeded!
```
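
The manual `nc -zv` checks above can be scripted so the same probe runs against every endpoint in one go. The following is a minimal Python sketch using only the standard library. The function names and the 5-second timeout are arbitrary choices for illustration, and the host list is taken from the bootstrap log in this thread.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (like `nc -zv`)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_endpoints(hosts: list[str], port: int = 443) -> dict[str, bool]:
    """Probe each host once and report whether the TCP handshake completed."""
    return {host: can_connect(host, port) for host in hosts}


# Endpoints that appear in the bootstrap log in this thread.
ENDPOINTS = [
    "frankfurt.cloud.databricks.com",
    "databricks-prod-artifacts-eu-central-1.s3.eu-central-1.amazonaws.com",
    "databricks-prod-artifacts-us-west-2.s3.us-west-2.amazonaws.com",
    "databricks-update-oregon.s3.us-west-2.amazonaws.com",
]
```

Calling `check_endpoints(ENDPOINTS)` returns a host-to-boolean map. A result like this only describes the network path of the machine it runs on. A manually launched instance can differ from a cluster node in its subnet, route table, or security groups, so a pass here does not prove the cluster nodes have the same access.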

Any ideas?

The only explanations I can think of are that PrivateLink is essential, or that there is some block in my org that I'm not aware of affecting something essential for this process to complete (for example, ICMP being blocked).

 

zMynxx
New Contributor III

It turned out the networking team had provisioned a faulty network setup for me.
Issue resolved.