MLOps
I’m here to learn more about Databricks MLOps. I’ve learned so much about how to build and maintain production-level ML models. I will apply this knowledge to build scalable ML solutions for my company.
- 1334 Views
- 0 replies
- 0 kudos
How to efficiently use AutoML
Can Databricks feature tables be stored outside of DBFS?
Yes, Databricks feature tables can be stored outside of Databricks File System (DBFS). You can store your feature tables in external storage systems such as Amazon S3, Azure Blob Storage, Azure Data Lake Storage, or Hadoop Distributed File System (HD...
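As a rough sketch of one way to do this, assuming the databricks.feature_store client and that the feature table's underlying Delta data follows the LOCATION of the database it is created in; the storage account, database, and column names below are made up:

```python
from databricks.feature_store import FeatureStoreClient

# Point the database at external storage (ADLS Gen2 here), so feature tables
# created inside it land outside DBFS.
spark.sql("""
    CREATE DATABASE IF NOT EXISTS feature_store_ext
    LOCATION 'abfss://features@mystorageacct.dfs.core.windows.net/feature_store'
""")

fs = FeatureStoreClient()

# Illustrative feature DataFrame computed earlier in the notebook.
features_df = spark.table("raw.customers").selectExpr(
    "customer_id", "total_orders", "avg_order_value"
)

fs.create_table(
    name="feature_store_ext.customer_features",
    primary_keys=["customer_id"],
    df=features_df,
    description="Customer aggregates stored in ADLS Gen2 rather than DBFS",
)
```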
The Databricks Data + AI Summit 2023 has been great so far. I just completed the two-day Data Management Training, where I learned a lot of practical tips on making my pipelines more efficient and robust. After these two days of sessions I got a good id...
I get an exception when attempting to run the following line of code, which filters a Spark DataFrame based on the geometry:
df_tx = df_zip.filter(st_intersects(st_aswkt("zip_code_geom"), tx_poly))
df_tx.show()
where `tx_poly` is `tx_poly = shapely....`
I am not familiar with st_intersects, but it seems that it runs solely on the driver (as Python code, not Spark). Does Mosaic work in PySpark? If not: try to use a larger driver.
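Mosaic does work in PySpark. A sketch of the usual pattern, assuming the databricks-mosaic (Databricks Labs Mosaic) package is installed and that zip_code_geom is stored as WKT; the exact enable call and function names can vary slightly between Mosaic versions:

```python
from pyspark.sql import functions as F
import mosaic as mos

# Register Mosaic's Spark expressions for this session.
mos.enable_mosaic(spark, dbutils)

# Ship the shapely polygon to the executors as a WKT literal so the
# intersection is evaluated inside Spark rather than on the driver.
tx_poly_wkt = F.lit(tx_poly.wkt)

df_tx = df_zip.filter(
    mos.st_intersects(
        mos.st_geomfromwkt(F.col("zip_code_geom")),  # skip this if the column is already a geometry
        mos.st_geomfromwkt(tx_poly_wkt),
    )
)
df_tx.show()
```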
Today, many R packages are pre-installed on the standard clusters on Databricks. Libraries like "tidyverse", "ggplot2", etc. are there, as is the great library "readxl" for loading Excel files. But unfortunately, its counterpart "writexl" is not pre-instal...
I just need to figure out who decides which R packages are pre-installed on the cluster.
Please check the attached image, which shows the issue that needs to be resolved. Has anyone come across this kind of issue?
Hi @sherbin w, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
My session timezone is Australia/Sydney. If I run the below query, my expectation is that the first column and the third column should show the same value. But it is not working as expected for the 1753-01-01 00:00:00 timestamp. spark.conf.set("spark.sql.session.timeZone...
Hi @Rahul Lalwani, in an interactive cluster spark.sql.datetime.java8API.enabled is disabled by default. When we enable spark.sql.datetime.java8API.enabled (set it to true), we can see correct values for 1753-01-01 as well. The reason for enabling the above config ...
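A small repro of that suggestion, assuming the two configs can be set on the session (on some clusters spark.sql.datetime.java8API.enabled may need to go into the cluster's Spark config instead); the query is illustrative rather than the exact one from the question:

```python
# Use java.time values so timestamps before the Gregorian cutover,
# such as 1753-01-01, are not shifted during conversion.
spark.conf.set("spark.sql.datetime.java8API.enabled", "true")
spark.conf.set("spark.sql.session.timeZone", "Australia/Sydney")

spark.sql(
    "SELECT TIMESTAMP'1753-01-01 00:00:00' AS ts, "
    "CAST(TIMESTAMP'1753-01-01 00:00:00' AS STRING) AS ts_str"
).show(truncate=False)
```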
I get the following error when getting a list of files stored in an Azure Storage account using the "dbutils.fs.ls" command in Databricks: Failure to initialize configuration for storage account AAAAA.dfs.core.windows.net: Invalid configuration value dete...
Hi @Mohammad Saber, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
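That error usually means the abfss auth settings for the account are missing or malformed. A sketch of the service-principal configuration that has to be in place before dbutils.fs.ls can list the account; the secret scope, key names, and container are hypothetical:

```python
# Hypothetical storage account and secret scope names for illustration.
storage_account = "AAAAA"
host = f"{storage_account}.dfs.core.windows.net"

tenant_id = dbutils.secrets.get("my-scope", "tenant-id")
client_id = dbutils.secrets.get("my-scope", "sp-client-id")
client_secret = dbutils.secrets.get("my-scope", "sp-client-secret")

spark.conf.set(f"fs.azure.account.auth.type.{host}", "OAuth")
spark.conf.set(
    f"fs.azure.account.oauth.provider.type.{host}",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
)
spark.conf.set(f"fs.azure.account.oauth2.client.id.{host}", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{host}", client_secret)
spark.conf.set(
    f"fs.azure.account.oauth2.client.endpoint.{host}",
    f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
)

# With the account configured, listing should succeed.
dbutils.fs.ls(f"abfss://mycontainer@{host}/")
```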
If I clone an existing job without making any changes, I am able to reconfigure the compute successfully. If I remove or add a Spark environment variable to test modifications, such as using secrets, and I confirm the changes to the job, ...
Hi @Marvin Ginns, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
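For reference, secrets are referenced in Spark environment variables with the {{secrets/&lt;scope&gt;/&lt;key&gt;}} placeholder, which Databricks resolves at cluster start. A sketch of the relevant new_cluster fragment for a job, with made-up scope and key names:

```python
# Fragment of a job's new_cluster spec (e.g. sent through the Jobs API).
# The {{secrets/...}} placeholder is resolved at cluster start, so the raw
# secret value never appears in the job definition.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "spark_env_vars": {
        "MY_API_TOKEN": "{{secrets/my-scope/my-api-token}}",
    },
}
```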
I have a collection of fixed-width files that I would like to ingest monthly with Auto Loader, but I can't seem to find an example. I can read the files into DataFrames using a Python function to map the index and length of each field with no issues, but ...
I found a way to get what I needed, and I can apply this to any fixed-width file. Will share for anyone trying to do the same thing. I accomplished this in a Python notebook and will explain the code: import the libraries needed and define a schema. i...
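A condensed sketch of that approach: read each line as plain text with Auto Loader, then slice the fields out with substring(). The landing path, table name, and the (start, length) offsets are placeholders for your own layout:

```python
from pyspark.sql import functions as F

# (start, length) offsets for each fixed-width field -- illustrative only.
layout = {"customer_id": (1, 10), "name": (11, 30), "balance": (41, 12)}

raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "text")   # each line arrives as a single 'value' column
    .load("abfss://landing@mystorageacct.dfs.core.windows.net/fixed_width/")
)

# Slice each field out of the line and trim the padding.
parsed = raw.select(
    *[
        F.trim(F.substring("value", start, length)).alias(name)
        for name, (start, length) in layout.items()
    ]
)

(
    parsed.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/fixed_width")
    .toTable("bronze.fixed_width_monthly")
)
```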
I have a problem with reading a file from ADLS Gen2. I have done the mounting properly, as after executing dbutils.fs.ls('/mnt/bronze') I can see the file path. The way I did the mounting: # dbutils.fs.mount( # source = "abfss://"+container_r...
Hi @Givi Salu​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.
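Since the original mount snippet is cut off, here is a generic sketch of the OAuth mount plus a read through the mount point, with hypothetical container, account, and secret names; the usual gotcha is to read via the /mnt/... path rather than the abfss:// URI once the mount exists:

```python
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": dbutils.secrets.get("my-scope", "sp-client-id"),
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("my-scope", "sp-client-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://bronze@mystorageacct.dfs.core.windows.net/",
    mount_point="/mnt/bronze",
    extra_configs=configs,
)

# Read through the mount point, not the abfss:// URI.
df = spark.read.option("header", "true").csv("/mnt/bronze/myfile.csv")
df.show()
```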
Is it possible to use the merge command when the source file is Parquet and the destination is a Delta file? Or must both files be Delta files? Currently, I'm using this code and I transform the Parquet into Delta and it works. But I want to avoid this transformation. T...
Hi @Ales ventus, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to others...
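On the original question: only the MERGE target has to be a Delta table; the source can be any DataFrame, so the Parquet files can be merged in directly without converting them first. A sketch with made-up paths and a single-column join key:

```python
from delta.tables import DeltaTable

# Source stays Parquet; only the target needs to be Delta.
updates = spark.read.parquet("/mnt/raw/updates.parquet")

target = DeltaTable.forPath(spark, "/mnt/silver/customers")

(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```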
Hi ML practitioners, I want to ask you all: how are you productionizing your ML workloads? Are you using MLflow? What's your take on MLflow Recipes? Let's get the conversation started. MLflow Recipes (previously known as MLflow Pipelines) is a framewo...
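For anyone who has not tried Recipes, the driver code is small. A sketch assuming MLflow 2.x and a recipe repository already scaffolded from one of the templates (so recipe.yaml and profiles/*.yaml exist):

```python
from mlflow.recipes import Recipe

# The profile name maps to profiles/<name>.yaml in the recipe repository.
recipe = Recipe(profile="local")

recipe.run()                # run all steps: ingest -> split -> transform -> train -> evaluate -> register
recipe.run(step="train")    # or re-run a single step
model = recipe.get_artifact("model")   # fetch an artifact produced by the run
```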
I want to connect to my Azure Synapse database using Spark. I can do this in pyodbc no problem, but that is not what I want. Here is how I get my credentials:
credential = AzureCliCredential()
databaseToken = credential.get_token('https://database.window...
Hi @Patrick Grover, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to ot...
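On the original question, one approach that is sometimes used is to pass the AAD token to the Spark connector for SQL Server / Azure Synapse via its accessToken option. A sketch assuming the com.microsoft.sqlserver.jdbc.spark connector is installed on the cluster, with a hypothetical workspace and table name; the token scope shown is the usual one for Azure SQL, since the original snippet is cut off:

```python
from azure.identity import AzureCliCredential

# Same credential flow as in the question; the scope here is an assumption.
credential = AzureCliCredential()
token = credential.get_token("https://database.windows.net/.default").token

df = (
    spark.read.format("com.microsoft.sqlserver.jdbc.spark")
    .option("url", "jdbc:sqlserver://myworkspace.sql.azuresynapse.net:1433;database=mydb")
    .option("dbtable", "dbo.my_table")
    .option("accessToken", token)
    .load()
)
df.show()
```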