Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

MCosta
by New Contributor III
  • 11809 Views
  • 10 replies
  • 19 kudos

Resolved! Debugging!

Hi ML folks, We are using Databricks to train deep learning models. The code, however, has a complex structure of classes. This would work fine in a perfect bug-free world like Alice in Wonderland. Debugging in Databricks is awkward. We ended up do...

Latest Reply
petern
New Contributor II
  • 19 kudos

Has this been solved yet? Is there a mature way to debug code on Databricks? I'm running into the same kind of issue. The variable explorer and pdb can be used, but it's not really the same.

  • 19 kudos
9 More Replies
PHorniak
by New Contributor II
  • 17386 Views
  • 3 replies
  • 4 kudos

Resolved! AttributeError: 'DataFrame' object has no attribute 'rename'

Hello, I am doing the Data Science and Machine Learning course. The Boston housing has unintuitive column names. I want to rename them, e.g. so 'zn' becomes 'Zoning'. When I run this command: df_bostonLegible = df_boston.rename({'zn':'Zoning'}, axi...

Latest Reply
KrunalLathiya
New Contributor II
  • 4 kudos

If df_boston is a DataFrame but you still face issues, try an alternative syntax: df_boston = df_boston.rename(columns={'zn': 'Zoning'}). Make sure df_boston is a proper pandas DataFrame and that you're using a recent version of pandas.
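A minimal sketch of the rename discussed in this thread. The data below is a hypothetical stand-in (only the 'zn' column name comes from the post). One plausible cause of the error, offered as an assumption since the original post is truncated, is that df_boston is a Spark DataFrame, which has no rename method; the comments note the Spark equivalent.

```python
import pandas as pd

# Hypothetical stand-in for the Boston housing data; only 'zn' is taken from the post.
df_boston = pd.DataFrame({"zn": [18.0, 0.0, 12.5], "crim": [0.006, 0.027, 0.030]})

# pandas: rename() returns a new DataFrame, so assign the result.
df_boston_legible = df_boston.rename(columns={"zn": "Zoning"})
print(list(df_boston_legible.columns))

# If df_boston were a Spark DataFrame instead (an assumption about the poster's setup),
# the corresponding call would be:
#   df_boston.withColumnRenamed("zn", "Zoning")
# or convert first with df_boston.toPandas() and then use rename() as above.
```

Checking type(df_boston) before renaming shows which of the two APIs applies.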

  • 4 kudos
2 More Replies
varunsaagar
by New Contributor III
  • 10248 Views
  • 17 replies
  • 28 kudos

Request for reattempt voucher. Databricks Certified Machine Learning Professional exam

Hi, On December 28th, I attempted the Databricks Certified Machine Learning Professional exam for the first time; unfortunately, I ended up with a failing grade. The passing grade was 70%, and I received 68.33%. I am planning to reattempt the exam. Could you kindl...

Latest Reply
girl_chan
New Contributor II
  • 28 kudos

What is the next event where they will give a voucher?

  • 28 kudos
16 More Replies
Anonymous
by Not applicable
  • 2298 Views
  • 2 replies
  • 2 kudos

Hello Everyone, I'm interested to learn about the certifications you're pursuing to enhance your skills. Sharing your goals can inspire those ...

Hello Everyone, I'm interested to learn about the certifications you're pursuing to enhance your skills. Sharing your goals can inspire those who may have started their certification journey but struggled with motivation. Personally, I recently comple...

Latest Reply
FJ
Contributor III
  • 2 kudos

I'm trying the Data Engineering professional exam at the end of the month. It's like a shot in the dark because no practice exams are available and, from what I've seen online from people who already passed it, the Advanced Data Engineering with ...

  • 2 kudos
1 More Replies
johnb1
by Contributor
  • 2635 Views
  • 3 replies
  • 0 kudos

Cluster Configuration for ML Model Training

Hi! I am training a Random Forest (pyspark.ml.classification.RandomForestClassifier) on Databricks with 1,000,000 training examples and 25 features. I employ a cluster with one driver (16 GB Memory, 4 Cores), 2-6 workers (32-96 GB Memory, 8-24 Cores),...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @John B, Hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...

  • 0 kudos
2 More Replies
Rishabh-Pandey
by Esteemed Contributor
  • 1620 Views
  • 2 replies
  • 5 kudos

"Hey everyone, it seems like there's some confusion about enhanced autoscaling in Databricks lately. If you're feeling lost or unsure abo...

"Hey everyone, it seems like there's some confusion about enhanced autoscaling in Databricks lately. If you're feeling lost or unsure about how it works, don't worry - you're not" Enhanced autoscaling is a feature in Databricks that enables dynamic sc...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Very informative. Thanks for sharing!

  • 5 kudos
1 More Replies
Deiry
by New Contributor III
  • 989 Views
  • 1 reply
  • 3 kudos

Hi, I'm Deiry. I'm 25 (almost 26) years old and I'm a Databricks expert. Or at least that's my goal. I work at Celerik....

Hi, I'm Deiry. I'm 25 (almost 26) years old and I'm a Databricks expert. Or at least that's my goal. I work at Celerik. My goal is to be a certified Machine Learning professional, so here we go.

Latest Reply
NhatHoang
Valued Contributor II
  • 3 kudos

Very confident, go ahead. :D​

  • 3 kudos
Sri_H
by New Contributor III
  • 1848 Views
  • 2 replies
  • 1 kudos

Databricks Academy - Access to training recording attended during Data & AI Summit 2022

Hi All, I attended a 2-day ML training during the Data & AI Summit 2022 and I received an email from the events team (ataaisummit@typeaevents.com) saying that the recordings for the training and related material will be available in my Databricks Academy...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sri H​ ! I am checking on this for you - hang tight! I'll try and get an update asap from the Academy Team.

  • 1 kudos
1 More Replies
sannycse
by New Contributor II
  • 4042 Views
  • 4 replies
  • 6 kudos

Resolved! read the csv file as shown in description

Project_Details.csv
ProjectNo|ProjectName|EmployeeNo
100|analytics|1
100|analytics|2
101|machine learning|3
101|machine learning|1
101|machine learning|4
Find each employee in the form of list working on each project?
Output:
ProjectNo|employeeNo
100|[1,2]
101|...

Latest Reply
User16764241763
Honored Contributor
  • 6 kudos

@SANJEEV BANDRU You can simply do this. Just change the file path:
CREATE TEMPORARY VIEW readcsv USING CSV OPTIONS (path "dbfs:/docs/test.csv", header "true", delimiter "|", mode "FAILFAST");
select ProjectNo, collect_list(EmployeeNo) Employees
from re...
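For readers who want to see what the collect_list grouping produces, here is a plain-Python sketch of the same idea. It is not the Spark SQL from the reply: it uses only the standard library, the sample rows are copied from the question, and io.StringIO stands in for the real file.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question; io.StringIO stands in for Project_Details.csv.
raw = """ProjectNo|ProjectName|EmployeeNo
100|analytics|1
100|analytics|2
101|machine learning|3
101|machine learning|1
101|machine learning|4
"""

# Group employee numbers by project, preserving file order within each group.
employees_by_project = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw), delimiter="|"):
    employees_by_project[row["ProjectNo"]].append(int(row["EmployeeNo"]))

for project_no, employees in employees_by_project.items():
    print(f"{project_no}|{employees}")
```

Unlike this sketch, collect_list in Spark does not guarantee the order of elements within each list.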

  • 6 kudos
3 More Replies
adnanzak
by New Contributor II
  • 3132 Views
  • 3 replies
  • 0 kudos

Resolved! Deploy Databricks Machine Learning Models On Power BI

Hi Guys. I've implemented a Machine Learning model on Databricks and have registered it with a Model URL. I wanted to enquire if I could use this model on Power BI. Basically the model predicts industries based on client demographics. Ideally I would...

Latest Reply
adnanzak
New Contributor II
  • 0 kudos

Thank you @Werner Stinckens​  and @Joseph Kambourakis​  for your replies.

  • 0 kudos
2 More Replies
MichaelO
by New Contributor III
  • 13014 Views
  • 2 replies
  • 2 kudos

Resolved! Transfer files saved in filestore to either the workspace or to a repo

I built a machine learning model:
lr = LinearRegression()
lr.fit(X_train, y_train)
which I can save to the FileStore by:
filename = "/dbfs/FileStore/lr_model.pkl"
with open(filename, 'wb') as f:
    pickle.dump(lr, f)
Ideally, I wanted to save the model ...
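The save step in this post can be sketched as a pickle round trip. This is an illustration under assumptions: a plain dict stands in for the fitted LinearRegression object, and a temporary directory replaces the /dbfs/FileStore path so the snippet runs outside Databricks.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted model from the post.
model = {"coef": [0.5, -1.2], "intercept": 3.0}

# The post writes to "/dbfs/FileStore/lr_model.pkl"; a temp directory is used here instead.
filename = os.path.join(tempfile.mkdtemp(), "lr_model.pkl")

# Serialize the object to disk.
with open(filename, "wb") as f:
    pickle.dump(model, f)

# Load it back to confirm the file is readable.
with open(filename, "rb") as f:
    restored = pickle.load(f)

print(restored == model)
```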

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Workspace and Repos are not fully available via DBFS, as they have separate access rights. It is better to use MLflow for your models, as it is like Git but for ML. I think that with MLOps you can then also put your model into Git.

  • 2 kudos
1 More Replies
NextIT
by New Contributor
  • 784 Views
  • 0 replies
  • 0 kudos

www.nextitvision.com

Online IT Training: ERP/SAP Online Training | JAVA Online Training | C++ Online Training | ORACLE Online Training | Online Python Training | Machine Learning Training. If you Need more Details and Information Regarding IT Online Training. Please Visi...

Joseph_B
by Databricks Employee
  • 2051 Views
  • 1 reply
  • 0 kudos

How should I tune hyperparameters when fitting models for every item?

My dataset has an "item" column which groups the rows into many groups. (Think of these groups as items in a store.) I want to fit 1 ML model per group. Should I tune hyperparameters for each group separately? Or should I tune them for the entire...

Latest Reply
Joseph_B
Databricks Employee
  • 0 kudos

For the first question ("which option is better?"), you need to answer that via your understanding of the problem domain. Do you expect similar behavior across the groups (items)? If so, that's a +1 in favor of sharing hyperparameters. And vice versa....

  • 0 kudos
Joseph_B
by Databricks Employee
  • 1977 Views
  • 1 reply
  • 0 kudos

How can I use Databricks to "automagically" distribute scikit-learn model training?

Is there a way to automatically distribute training and model tuning across a Spark cluster, if I want to keep using scikit-learn?

Latest Reply
Joseph_B
Databricks Employee
  • 0 kudos

It depends on what you mean by "automagically." If you want to keep using scikit-learn, there are ways to distribute parts of training and tuning with minimal effort. However, there is no "magic" way to distribute training an individual model in scik...

  • 0 kudos
User15787040559
by Databricks Employee
  • 3549 Views
  • 1 reply
  • 0 kudos

What's the difference between Normalization and Standardization?

Normalization typically means rescaling the values into the range [0, 1]. Standardization typically means rescaling the data to have a mean of 0 and a standard deviation of 1 (unit variance).
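The two definitions above can be sketched in a few lines of standard-library Python. The function names and sample values are illustrative assumptions; the population standard deviation is used so the standardized values have exactly unit variance.

```python
import statistics


def min_max_normalize(values):
    """Rescale values into [0, 1]; assumes the values are not all identical."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std; would be 0 for constant input
    return [(v - mean) / std for v in values]


data = [2.0, 4.0, 6.0, 8.0]  # illustrative sample values
normalized = min_max_normalize(data)
standardized = standardize(data)

print(normalized)
print(standardized)
```

Normalization is bounded by the observed minimum and maximum, so a single outlier compresses the other values; standardization is unbounded but centers the data.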

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Normalization typically means rescaling the values into the range [0, 1]. Standardization typically means rescaling the data to have a mean of 0 and a standard deviation of 1 (unit variance). A link that explains this better is: https://towardsdatascience.com...

  • 0 kudos