<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Connect databricks in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/connect-databricks/m-p/65129#M32745</link>
    <description>&lt;P&gt;I recently discovered MLflow managed by Databricks, so I'm very new to this and I need some help.&lt;/P&gt;&lt;P&gt;Can someone clearly explain the steps I need to follow to track my runs through the Databricks API?&lt;/P&gt;&lt;P&gt;Here are the steps I followed:&lt;/P&gt;&lt;P&gt;1/ I installed the Databricks CLI.&lt;/P&gt;&lt;P&gt;2/ I set up&amp;nbsp;authentication between the Databricks CLI and my Databricks workspaces according to&amp;nbsp;&lt;A href="https://docs.databricks.com/en/dev-tools/cli/authentication.html#token-auth" target="_self" rel="nofollow noreferrer"&gt;these instructions&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;I checked the file with&amp;nbsp;cat ~/.databrickscfg and everything looks fine.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture d’écran 2024-03-30 à 01.20.12.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6870i357C408C0DB9E27E/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="Capture d’écran 2024-03-30 à 01.20.12.png" alt="Capture d’écran 2024-03-30 à 01.20.12.png" /&gt;&lt;/span&gt;3/&amp;nbsp;I'm using PyCharm to write a Python script that uses MLflow, and I want to track the runs in my Databricks workspace. Here is part of my code:&lt;/P&gt;&lt;PRE&gt;# Imports used by this excerpt
import datetime
import mlflow

mlflow.autolog(
    log_input_examples=True,
    log_model_signatures=True,
    log_models=True,
    disable=False,
    exclusive=False,
    disable_for_unsupported_versions=True,
    silent=False
)

# mlflow.login()
mlflow.set_tracking_uri("databricks")

mlflow.set_experiment("mlflowAUS")
with mlflow.start_run() as run:
    bestModel.fit(X_train, y_train)
    y_pred = bestModel.predict(X_test)
    runId = run.info.run_id
    mlflow.set_tag('mlflow.runName', datetime.datetime.now().strftime("%Y%m%d_%H%M%S"))
    mlflow.log_param("model_name", str(bestModel)[: str(bestModel).index("(")])&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;If I add this call to my code:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;mlflow.login()&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;I get the same error again, but before the error I get this message (I replaced part of the URL with xxxx):&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;2024/03/30 01:40:44 INFO mlflow.utils.credentials: Successfully connected to MLflow hosted tracking server! Host: https://dbc-xxxxxx-xxxx.cloud.databricks.com.&lt;/PRE&gt;&lt;P&gt;What I don't understand is why it says that I'm successfully connected after adding mlflow.login(), yet I still get the same error and I can't track my runs. Please help me; thank you in advance for your support.&lt;/P&gt;&lt;P&gt;The code returns this error:&lt;/P&gt;&lt;PRE&gt;Traceback (most recent call last):
  File "/Users/kevin/PycharmProjects/Projects2024/nov23_continu_mlops_meteo/src/models/bestModel.py", line 184, in &amp;lt;module&amp;gt;
    mlflow.set_experiment("mlflowAUS")
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/fluent.py", line 142, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/client.py", line 539, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/utils/rest_utils.py", line 220, in call_endpoint
    response = verify_rest_response(response, endpoint)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/utils/rest_utils.py", line 170, in verify_rest_response
    raise MlflowException(f"{base_msg}. Response body: '{response.text}'")
mlflow.exceptions.MlflowException: API request to endpoint was successful but the response body was not in a valid JSON format. Response body: '&amp;lt;!doctype html&amp;gt;&lt;/PRE&gt;&lt;P&gt;As you can see, if I use mlflow ui, everything works without any issues.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture d’écran 2024-03-31 à 16.54.55.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6871i97D44B576EADCA71/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="Capture d’écran 2024-03-31 à 16.54.55.png" alt="Capture d’écran 2024-03-31 à 16.54.55.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 31 Mar 2024 14:57:24 GMT</pubDate>
    <dc:creator>Khaled75</dc:creator>
    <dc:date>2024-03-31T14:57:24Z</dc:date>
    <item>
      <title>Connect databricks</title>
      <link>https://community.databricks.com/t5/data-engineering/connect-databricks/m-p/65129#M32745</link>
      <description>&lt;P&gt;I recently discovered MLflow managed by Databricks, so I'm very new to this and I need some help.&lt;/P&gt;&lt;P&gt;Can someone clearly explain the steps I need to follow to track my runs through the Databricks API?&lt;/P&gt;&lt;P&gt;Here are the steps I followed:&lt;/P&gt;&lt;P&gt;1/ I installed the Databricks CLI.&lt;/P&gt;&lt;P&gt;2/ I set up&amp;nbsp;authentication between the Databricks CLI and my Databricks workspaces according to&amp;nbsp;&lt;A href="https://docs.databricks.com/en/dev-tools/cli/authentication.html#token-auth" target="_self" rel="nofollow noreferrer"&gt;these instructions&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;I checked the file with&amp;nbsp;cat ~/.databrickscfg and everything looks fine.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture d’écran 2024-03-30 à 01.20.12.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6870i357C408C0DB9E27E/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="Capture d’écran 2024-03-30 à 01.20.12.png" alt="Capture d’écran 2024-03-30 à 01.20.12.png" /&gt;&lt;/span&gt;3/&amp;nbsp;I'm using PyCharm to write a Python script that uses MLflow, and I want to track the runs in my Databricks workspace. Here is part of my code:&lt;/P&gt;&lt;PRE&gt;# Imports used by this excerpt
import datetime
import mlflow

mlflow.autolog(
    log_input_examples=True,
    log_model_signatures=True,
    log_models=True,
    disable=False,
    exclusive=False,
    disable_for_unsupported_versions=True,
    silent=False
)

# mlflow.login()
mlflow.set_tracking_uri("databricks")

mlflow.set_experiment("mlflowAUS")
with mlflow.start_run() as run:
    bestModel.fit(X_train, y_train)
    y_pred = bestModel.predict(X_test)
    runId = run.info.run_id
    mlflow.set_tag('mlflow.runName', datetime.datetime.now().strftime("%Y%m%d_%H%M%S"))
    mlflow.log_param("model_name", str(bestModel)[: str(bestModel).index("(")])&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;If I add this call to my code:&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;mlflow.login()&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;I get the same error again, but before the error I get this message (I replaced part of the URL with xxxx):&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;2024/03/30 01:40:44 INFO mlflow.utils.credentials: Successfully connected to MLflow hosted tracking server! Host: https://dbc-xxxxxx-xxxx.cloud.databricks.com.&lt;/PRE&gt;&lt;P&gt;What I don't understand is why it says that I'm successfully connected after adding mlflow.login(), yet I still get the same error and I can't track my runs. Please help me; thank you in advance for your support.&lt;/P&gt;&lt;P&gt;The code returns this error:&lt;/P&gt;&lt;PRE&gt;Traceback (most recent call last):
  File "/Users/kevin/PycharmProjects/Projects2024/nov23_continu_mlops_meteo/src/models/bestModel.py", line 184, in &amp;lt;module&amp;gt;
    mlflow.set_experiment("mlflowAUS")
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/fluent.py", line 142, in set_experiment
    experiment = client.get_experiment_by_name(experiment_name)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/client.py", line 539, in get_experiment_by_name
    return self._tracking_client.get_experiment_by_name(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/tracking/_tracking_service/client.py", line 236, in get_experiment_by_name
    return self.store.get_experiment_by_name(name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/store/tracking/rest_store.py", line 323, in get_experiment_by_name
    response_proto = self._call_endpoint(GetExperimentByName, req_body)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/store/tracking/rest_store.py", line 60, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/utils/rest_utils.py", line 220, in call_endpoint
    response = verify_rest_response(response, endpoint)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kevin/opt/anaconda3/envs/env_2023/lib/python3.11/site-packages/mlflow/utils/rest_utils.py", line 170, in verify_rest_response
    raise MlflowException(f"{base_msg}. Response body: '{response.text}'")
mlflow.exceptions.MlflowException: API request to endpoint was successful but the response body was not in a valid JSON format. Response body: '&amp;lt;!doctype html&amp;gt;&lt;/PRE&gt;&lt;P&gt;As you can see, if I use mlflow ui, everything works without any issues.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture d’écran 2024-03-31 à 16.54.55.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6871i97D44B576EADCA71/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="Capture d’écran 2024-03-31 à 16.54.55.png" alt="Capture d’écran 2024-03-31 à 16.54.55.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 31 Mar 2024 14:57:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/connect-databricks/m-p/65129#M32745</guid>
      <dc:creator>Khaled75</dc:creator>
      <dc:date>2024-03-31T14:57:24Z</dc:date>
    </item>
  </channel>
</rss>

