by
MCosta
• New Contributor III
- 11809 Views
- 10 replies
- 19 kudos
Hi ML folks,
We are using Databricks to train deep learning models. The code, however, has a complex structure of classes. This would work fine in a perfect bug-free world like Alice in Wonderland.
Debugging in Databricks is awkward. We ended up do...
Latest Reply
Has this been solved yet? Is there a mature way to debug code on Databricks? I'm running into the same kind of issue. The variable explorer and pdb can be used, but it's not really the same.
9 More Replies
- 17386 Views
- 3 replies
- 4 kudos
Hello, I am doing the Data Science and Machine Learning course.
The Boston housing dataset has unintuitive column names. I want to rename them, e.g. so 'zn' becomes 'Zoning'.
When I run this command:
df_bostonLegible = df_boston.rename({'zn':'Zoning'}, axi...
Latest Reply
If df_boston is a DataFrame but you still face issues, try an alternative syntax: df_boston = df_boston.rename(columns={'zn': 'Zoning'}). Make sure df_boston is a proper DataFrame and that you're using a recent version of Pandas.
2 More Replies
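The syntax suggested in the reply can be tried on a small frame first. This is a minimal sketch, assuming pandas is installed; the sample values and the `indus` column are made up for illustration, and only the `zn` to `Zoning` mapping comes from the thread.

```python
import pandas as pd

# Hypothetical stand-in for the course data; only the 'zn' column name is from the thread.
df_boston = pd.DataFrame({"zn": [18.0, 0.0], "indus": [2.31, 7.07]})

# rename returns a new DataFrame; the original keeps its column names
# unless the result is assigned back.
df_boston_legible = df_boston.rename(columns={"zn": "Zoning"})

print(list(df_boston.columns))
print(list(df_boston_legible.columns))
```

If df_boston turns out to be a Spark DataFrame rather than a pandas one, it has no rename method; withColumnRenamed is the Spark counterpart.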
- 10248 Views
- 17 replies
- 28 kudos
Hi, on December 28th I attempted the Databricks Certified Machine Learning Professional exam for the first time; unfortunately, I ended up with a failing grade. The passing grade was 70%, and I received 68.33%. I am planning to reattempt the exam. Could you kindl...
- 2298 Views
- 2 replies
- 2 kudos
Hello Everyone, I'm interested to learn about the certifications you're pursuing to enhance your skills. Sharing your goals can inspire those who may have started their certification journey but struggled with motivation. Personally, I recently comple...
Latest Reply
I'm trying the Data Engineering professional exam at the end of the month. It's like a shot in the dark because no practice exams are available and, from what I've seen online from people who already passed it, the Advanced Data Engineering with ...
1 More Replies
- 2635 Views
- 3 replies
- 0 kudos
Hi! I am training a Random Forest (pyspark.ml.classification.RandomForestClassifier) on Databricks with 1,000,000 training examples and 25 features. I employ a cluster with one driver (16 GB Memory, 4 Cores), 2-6 workers (32-96 GB Memory, 8-24 Cores),...
Latest Reply
Hi @John B, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...
2 More Replies
- 1620 Views
- 2 replies
- 5 kudos
"Hey everyone, it seems like there's some confusion about enhanced autoscaling in Databricks lately. If you're feeling lost or unsure about how it works, don't worry - you're not"Enhanced autoscaling is a feature in Databricks that enables dynamic sc...
by
Deiry
• New Contributor III
- 989 Views
- 1 reply
- 3 kudos
Hi, I'm Deiry. I'm 25 (almost 26) years old and I'm a Databricks expert. Or at least that's my goal. I work at Celerik. My goal is to be a certified Machine Learning professional, so here we go.
Latest Reply
Very confident, go ahead. :D
by
Sri_H
• New Contributor III
- 1848 Views
- 2 replies
- 1 kudos
Hi All, I attended a 2-day ML training during the Data & AI 2022 summit and I received an email from the events team (ataaisummit@typeaevents.com) saying that the recordings for the training and related material will be available in my Databricks Academy...
Latest Reply
Hi @Sri H ! I am checking on this for you - hang tight! I'll try and get an update asap from the Academy Team.
1 More Replies
- 4042 Views
- 4 replies
- 6 kudos
Project_Details.csv
ProjectNo|ProjectName|EmployeeNo
100|analytics|1
100|analytics|2
101|machine learning|3
101|machine learning|1
101|machine learning|4
How do I find the list of employees working on each project?
Output:
ProjectNo|employeeNo
100|[1,2]
101|...
Latest Reply
@SANJEEV BANDRU You can simply do this. Just change the file path:
CREATE TEMPORARY VIEW readcsv USING CSV OPTIONS (path "dbfs:/docs/test.csv", header "true", delimiter "|", mode "FAILFAST");
select ProjectNo, collect_list(EmployeeNo) Employees from re...
3 More Replies
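The grouping that collect_list performs in the reply can be checked by hand on the sample rows. This is a plain-Python sketch, not Spark code: it uses only the standard library, the rows are copied from the question, and the function name employees_per_project is a hypothetical helper introduced here.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question (pipe-delimited, with a header line).
PROJECT_DETAILS = """ProjectNo|ProjectName|EmployeeNo
100|analytics|1
100|analytics|2
101|machine learning|3
101|machine learning|1
101|machine learning|4
"""


def employees_per_project(text):
    """Group EmployeeNo values by ProjectNo, keeping the input order."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    grouped = defaultdict(list)
    for row in reader:
        grouped[int(row["ProjectNo"])].append(int(row["EmployeeNo"]))
    return dict(grouped)


result = employees_per_project(PROJECT_DETAILS)
print(result)  # {100: [1, 2], 101: [3, 1, 4]}
```

On a real cluster the SQL in the reply is the more natural route; this version is only meant to make the expected shape of the output concrete.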
- 3132 Views
- 3 replies
- 0 kudos
Hi Guys. I've implemented a Machine Learning model on Databricks and have registered it with a Model URL. I wanted to enquire if I could use this model on Power BI. Basically the model predicts industries based on client demographics. Ideally I would...
Latest Reply
Thank you @Werner Stinckens and @Joseph Kambourakis for your replies.
2 More Replies
- 13014 Views
- 2 replies
- 2 kudos
I built a machine learning model:
lr = LinearRegression()
lr.fit(X_train, y_train)
which I can save to the filestore by:
filename = "/dbfs/FileStore/lr_model.pkl"
with open(filename, 'wb') as f:
    pickle.dump(lr, f)
Ideally, I wanted to save the model ...
Latest Reply
Workspace and Repos are not fully available via DBFS, as they have separate access rights. It is better to use MLflow for your models, as it is like git but for ML. I think that with MLOps you can then also put your model into git.
1 More Replies
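The question saves the model with pickle.dump; loading it back is the mirror image. Below is a minimal round-trip sketch under loud assumptions: a plain dict of made-up coefficients stands in for the fitted model, and a temporary directory stands in for /dbfs/FileStore so the snippet runs anywhere. It does not show MLflow, which the reply recommends.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted model: any picklable object works the same way.
model = {"coef": [2.0, -1.5], "intercept": 0.25}

with tempfile.TemporaryDirectory() as tmp_dir:
    # On Databricks the question uses "/dbfs/FileStore/lr_model.pkl" here instead.
    filename = os.path.join(tmp_dir, "lr_model.pkl")

    # Write in binary mode, as in the question.
    with open(filename, "wb") as f:
        pickle.dump(model, f)

    # Read back in binary mode to get an equal, independent copy.
    with open(filename, "rb") as f:
        restored = pickle.load(f)

print(restored)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.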
- 784 Views
- 0 replies
- 0 kudos
Online IT Training: ERP/SAP Online Training | JAVA Online Training | C++ Online Training | ORACLE Online Training | Online Python Training | Machine Learning Training. If you need more details and information regarding IT online training, please visi...
- 2051 Views
- 1 reply
- 0 kudos
My dataset has an "item" column which groups the rows into many groups. (Think of these groups as items in a store.) I want to fit 1 ML model per group. Should I tune hyperparameters for each group separately? Or should I tune them for the entire...
Latest Reply
For the first question ("which option is better?"), you need to answer that via your understanding of the problem domain. Do you expect similar behavior across the groups (items)? If so, that's a +1 in favor of sharing hyperparameters. And vice versa....
- 1977 Views
- 1 reply
- 0 kudos
Is there a way to automatically distribute training and model tuning across a Spark cluster, if I want to keep using scikit-learn?
Latest Reply
It depends on what you mean by "automagically." If you want to keep using scikit-learn, there are ways to distribute parts of training and tuning with minimal effort. However, there is no "magic" way to distribute training an individual model in scik...
- 3549 Views
- 1 reply
- 0 kudos
Normalization typically means rescaling the values into a range of [0, 1]. Standardization typically means rescaling the data to have a mean of 0 and a standard deviation of 1 (unit variance).
Latest Reply
Normalization typically means rescaling the values into a range of [0, 1]. Standardization typically means rescaling the data to have a mean of 0 and a standard deviation of 1 (unit variance). A link that explains this better is https://towardsdatascience.com...
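The two definitions are easy to compare on a few numbers. This is a minimal standard-library sketch; the function names and the sample data are illustrative, the population standard deviation is an assumption, and the input is assumed to contain at least two distinct values so that neither denominator is zero.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


data = [2.0, 4.0, 6.0, 8.0]
normalized = normalize(data)
standardized = standardize(data)

print(normalized)
print(standardized)
```

Normalized values always stay within [0, 1], while standardized values are unbounded and centered on 0.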