Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

Issue with Adding New Members to Existing Groups During Migration of User Groups and Service Principals

Sudheer2
New Contributor III

 

Hi all,

I have implemented a migration process to move groups from a source workspace to a target workspace using the following code. The code successfully migrates groups and their members to the target system, but I am facing an issue when it comes to adding new members to an existing group during subsequent migrations.

Scenario:

  • Initially, a group is created in both the source and target workspaces.
  • When I add new members to an existing source group, I expect that during the next migration, these new members should be added to the corresponding group in the target workspace.

However, the current code does not seem to handle this scenario correctly. It only checks for existing groups and verifies if any members are missing. But it doesn't seem to handle the case when a new member is added to the source group after the first migration.

Code Overview:

The code checks if the group already exists in the target system. If the group exists, it checks for missing members and adds them. But if a new member is added to the source group after the initial migration, the new member is not added during the subsequent migration.

 

python
# Code snippet
def import_group(group, target_host, target_token):
    headers = get_headers(target_token)
    url = f'{target_host}/api/2.0/preview/scim/v2/Groups'
    group_check_url = f'{url}?filter=displayName eq "{group["displayName"]}"'  # Check if group already exists by display name

    response = make_request_with_error_handling(group_check_url, headers)
    existing_groups = response.json().get('Resources', [])

    if existing_groups:
        existing_group = existing_groups[0]
        logging.warning(f"Group {group['displayName']} already exists in target system. Checking members.")
        existing_members = {member['value'] for member in existing_group.get('members', [])}
        new_members = {member['value'] for member in group.get('members', [])}
        missing_members = new_members - existing_members

        if missing_members:
            logging.info(f"Adding missing members to group {group['displayName']}.")
            patch_data = {
                "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
                "Operations": [
                    {
                        "op": "add",
                        "path": "members",
                        "value": [{"value": member} for member in missing_members]
                    }
                ]
            }
            update_url = f'{url}/{existing_group["id"]}'
            make_request_with_error_handling(update_url, headers, method='PATCH', data=patch_data)
            logging.info(f"Group {group['displayName']} updated with missing members.")
        else:
            logging.info(f"No missing members for group {group['displayName']}.")
        return False

    data = {
        "schemas": group.get("schemas", []),
        "displayName": group["displayName"],
        "members": group.get("members", [])
    }

    try:
        make_request_with_error_handling(url, headers, method='POST', data=data)
        logging.info(f"Group {group['displayName']} imported successfully.")
        return True
    except KeyError as ke:
        logging.error(f"KeyError importing group: {group}. Missing key: {ke}")
    except requests.exceptions.HTTPError as err:
        if err.response.status_code == 409:
            logging.warning(f"Group {group['displayName']} already exists in target system. Skipping import.")
        else:
            logging.error(f"HTTP error occurred importing group: {group}. Error: {err}")
    except Exception as e:
        logging.error(f"Error importing group: {group}. Error: {e}")
    return False

Problem:

The new member added to the source group is not getting added to the corresponding group in the target workspace during the second migration.

Question:

Can anyone suggest a solution or approach to ensure that new members are added to the existing target group when I migrate again, even if the group already exists in the target?

Thank you for your help!

5 REPLIES 5

Walter_C
Databricks Employee

To address the issue of adding new members to existing groups during subsequent migrations, you need to modify your import_group function to compare the members of the source group with the existing members in the target group. You could try with the following:

def import_group(group, target_host, target_token):
    headers = get_headers(target_token)
    url = f'{target_host}/api/2.0/preview/scim/v2/Groups'
    group_check_url = f'{url}?filter=displayName eq "{group["displayName"]}"'

    response = make_request_with_error_handling(group_check_url, headers)
    existing_groups = response.json().get('Resources', [])

    if existing_groups:
        existing_group = existing_groups[0]
        logging.info(f"Group {group['displayName']} already exists in target system. Checking members.")
        existing_members = {member['value'] for member in existing_group.get('members', [])}
        source_members = {member['value'] for member in group.get('members', [])}
        
        members_to_add = source_members - existing_members
        members_to_remove = existing_members - source_members

        if members_to_add or members_to_remove:
            logging.info(f"Updating members for group {group['displayName']}.")
            patch_operations = []
            
            if members_to_add:
                patch_operations.append({
                    "op": "add",
                    "path": "members",
                    "value": [{"value": member} for member in members_to_add]
                })
            
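            if members_to_remove:
                # SCIM PATCH "remove" selects each member with a path filter;
                # a bare "members" path would target the whole attribute.
                for member in members_to_remove:
                    patch_operations.append({
                        "op": "remove",
                        "path": f'members[value eq "{member}"]'
                    })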
            
            patch_data = {
                "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
                "Operations": patch_operations
            }
            update_url = f'{url}/{existing_group["id"]}'
            make_request_with_error_handling(update_url, headers, method='PATCH', data=patch_data)
            logging.info(f"Group {group['displayName']} updated with new members.")
        else:
            logging.info(f"No member changes for group {group['displayName']}.")
        return False

    data = {
        "schemas": group.get("schemas", []),
        "displayName": group["displayName"],
        "members": group.get("members", [])
    }

    try:
        make_request_with_error_handling(url, headers, method='POST', data=data)
        logging.info(f"Group {group['displayName']} imported successfully.")
        return True
    except KeyError as ke:
        logging.error(f"KeyError importing group: {group}. Missing key: {ke}")
    except requests.exceptions.HTTPError as err:
        if err.response.status_code == 409:
            logging.warning(f"Group {group['displayName']} already exists in target system. Skipping import.")
        else:
            logging.error(f"HTTP error occurred importing group: {group}. Error: {err}")
    except Exception as e:
        logging.error(f"Error importing group: {group}. Error: {e}")
    return False
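
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the `value` field of a SCIM member entry is an ID, and those IDs can differ between the source and target workspaces. If they do, a set difference on `value` compares unrelated identifiers. The sketch below diffs on a stable key instead and then translates back to target IDs. The key used here is the member's `display` field, which is an assumption; a `userName` or `applicationId` fetched from each workspace would likely be more robust. `plan_member_changes` and `target_id_by_key` are hypothetical names; the lookup would be built by listing users, groups, and service principals in the target workspace.

```python
def plan_member_changes(source_members, target_members, target_id_by_key,
                        key="display"):
    """Return (ids_to_add, ids_to_remove, unresolved_keys) for the target group.

    source_members / target_members: SCIM member dicts, e.g.
        {"value": "<workspace-specific id>", "display": "alice@example.com"}
    target_id_by_key: hypothetical mapping from the stable key to the
        principal's ID in the TARGET workspace.
    """
    source_keys = {m[key] for m in source_members if key in m}
    target_keys = {m[key] for m in target_members if key in m}

    to_add_keys = source_keys - target_keys
    to_remove_keys = target_keys - source_keys

    # Translate keys to target-side IDs; report keys that cannot be resolved.
    ids_to_add = sorted(target_id_by_key[k] for k in to_add_keys
                        if k in target_id_by_key)
    unresolved = sorted(k for k in to_add_keys if k not in target_id_by_key)

    # Members to remove already carry their target-side ID in "value".
    ids_to_remove = sorted(m["value"] for m in target_members
                           if m.get(key) in to_remove_keys)
    return ids_to_add, ids_to_remove, unresolved


# Illustrative data only: bob exists in the source group but not in the target.
source = [{"value": "101", "display": "alice@example.com"},
          {"value": "102", "display": "bob@example.com"}]
target = [{"value": "900", "display": "alice@example.com"}]
lookup = {"alice@example.com": "900", "bob@example.com": "901"}
print(plan_member_changes(source, target, lookup))  # (['901'], [], [])
```

Anything returned in the third element has no counterpart in the target workspace yet, so the user or service principal would have to be migrated before the group membership can be applied.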


Sudheer2
New Contributor III

 

Hi @Walter_C,

Thank you for the suggestion! I tried the updated code, but unfortunately I'm still facing issues. Here's what I encountered:

  • The group add_test already exists in the target system, and the members are being checked.
  • When trying to update the members, I received the following error:

     

     
    2024-12-19 17:23:32,608 - INFO - Group add_test already exists in target system. Checking members.
    2024-12-19 17:23:32,609 - INFO - Updating members for group add_test.
    2024-12-19 17:23:32,609 - ERROR - Error in group import thread: Unsupported HTTP method: PATCH
     

    I also attempted using PUT and POST methods, but the issue persists.

    Could you please provide guidance on how to resolve this, or if there's a different approach I should take to successfully update members in an existing group?

    Thanks in advance!
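
The message `Unsupported HTTP method: PATCH` does not look like a response from the SCIM API; it reads like an error raised by the `make_request_with_error_handling` helper itself, whose body is not shown in this thread. If that is the case (an assumption), the helper needs a PATCH branch. The sketch below is a hypothetical stand-in built on the standard library's `urllib.request` rather than `requests`, so it runs without extra packages; with `requests`, `requests.request(method, url, headers=..., json=...)` accepts any verb. The `application/scim+json` content type follows the Databricks SCIM examples as I understand them and should be verified against the current documentation.

```python
import json
import urllib.request

ALLOWED_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


def build_scim_request(url, headers, method="GET", data=None):
    """Build (but do not send) a request; hypothetical replacement helper."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unsupported HTTP method: {method}")

    body = None
    all_headers = dict(headers)
    if data is not None:
        # Serialize the payload as JSON and label it as a SCIM body.
        body = json.dumps(data).encode("utf-8")
        all_headers["Content-Type"] = "application/scim+json"
    return urllib.request.Request(url, data=body, headers=all_headers,
                                  method=method)


def send_scim_request(url, headers, method="GET", data=None, timeout=30):
    """Send the request and return the decoded JSON body (or {} if empty)."""
    request = build_scim_request(url, headers, method, data)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}
```

Unlike the helper in the thread, which appears to return an object with a `.json()` method, `send_scim_request` returns the parsed dictionary directly, so callers would need a small adjustment.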

Walter_C
Databricks Employee

Just to confirm: the members of the source group are being added manually, is that correct?

Sudheer2
New Contributor III

 

Hi @Walter_C ,

I recently tried a different approach, and it's working well overall, but I am encountering an issue specifically with the Service Principals during migration.

I am currently migrating Service Principals from a non-Unity workspace to a Unity-enabled workspace in Databricks. The Service Principals themselves seem to be migrating correctly, but I am noticing a mismatch in the permissions (entitlements) between the source and target workspaces.

The Service Principals in the source workspace have certain permissions (such as Allow cluster creation, Databricks SQL access, and Workspace access), but after the migration, these entitlements are not matching as expected in the target Unity-enabled workspace.

Here's the code I'm using for migrating the Service Principals:

 

python
def import_service_principal(sp, target_host, target_token):
    headers = get_headers(target_token)
    url = f'{target_host}/api/2.0/preview/scim/v2/ServicePrincipals'
    sp_check_url = f'{url}/{sp["applicationId"]}'  # Check if service principal already exists by application ID

    if resource_exists(sp_check_url, headers):
        logging.warning(f"Service Principal with ID {sp['applicationId']} already exists in target system. Skipping import.")
        return False

    data = {
        "schemas": sp.get("schemas", []),
        "applicationId": sp["applicationId"],
        "displayName": sp.get("displayName", ""),
        "description": sp.get("description", "")
    }

    try:
        make_request_with_error_handling(url, headers, method='POST', data=data)
        logging.info(f"Service Principal {sp['applicationId']} imported successfully.")
        return True
    except KeyError as ke:
        logging.error(f"KeyError importing service principal: {sp}. Missing key: {ke}")
    except requests.exceptions.HTTPError as err:
        if err.response.status_code == 409:
            logging.warning(f"Service Principal with ID {sp['applicationId']} already exists in target system. Skipping import.")
        else:
            logging.error(f"HTTP error occurred importing service principal: {sp}. Error: {err}")
    except Exception as e:
        logging.error(f"Error importing service principal: {sp}. Error: {e}")
    return False
 

I'm specifically facing an issue with the Entitlements for Service Principals in the Unity-enabled workspace not matching the source workspace after migration. The permissions related to Allow cluster creation, Databricks SQL access, and Workspace access are not being transferred properly, resulting in mismatches.

Has anyone encountered this issue when migrating Service Principals between non-Unity and Unity-enabled Databricks workspaces? If so, could you suggest any solutions or steps to ensure that the permissions and entitlements are correctly migrated as well?
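
One possible cause, visible in the code above: the POST payload carries only `schemas`, `applicationId`, `displayName`, and `description`, so any `entitlements` on the source object are never sent. A minimal sketch of a payload builder that copies them across follows. It assumes the source export includes an `entitlements` list in SCIM form (for example `{"value": "allow-cluster-create"}`); `build_sp_payload` is a hypothetical helper, not part of the original script. Whether a Unity-enabled workspace applies each entitlement the same way, or also inherits entitlements from groups, is something to verify separately.

```python
def build_sp_payload(sp):
    """Build a SCIM ServicePrincipals POST body that keeps entitlements."""
    payload = {
        "schemas": sp.get("schemas", []),
        "applicationId": sp["applicationId"],
        "displayName": sp.get("displayName", ""),
        "description": sp.get("description", ""),
    }
    # Copy only the entitlement names; drop any source-specific extras.
    entitlements = [{"value": e["value"]}
                    for e in sp.get("entitlements", []) if "value" in e]
    if entitlements:
        payload["entitlements"] = entitlements
    return payload


# Illustrative source object; the application ID is a placeholder.
source_sp = {
    "applicationId": "00000000-0000-0000-0000-000000000000",
    "displayName": "etl-sp",
    "entitlements": [
        {"value": "allow-cluster-create"},
        {"value": "databricks-sql-access"},
        {"value": "workspace-access"},
    ],
}
print(build_sp_payload(source_sp)["entitlements"])
```

Because `import_service_principal` returns early when the service principal already exists, entitlements that change in the source after the first migration would also need an update call on the existing target object.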

Any help would be greatly appreciated!

Thank you in advance!

Walter_C
Databricks Employee
