- 4639 Views
- 7 replies
- 13 kudos
Resolved! Getting started with Databricks Machine Learning
hello all,I am fairly new to Databricks technologies and I have taken the Lakehouse Fundamentals course but I am interested in Machine Learning technologies. I will appreciate any help with materials and curated free study paths and packs that can he...
- 4639 Views
- 7 replies
- 13 kudos

- 13 kudos
https://pages.databricks.com/rs/094-YMS-629/images/LearningSpark2.0.pdf is a free book and has some machine learning examples. The way I learned was mostly from the docs, which are good and have good coding examples.
- 13 kudos
- 3507 Views
- 2 replies
- 1 kudos
MLflow stopped working after one a few successful runs
Both the mlflow.log_metrics() call and the web UI worked in the first few days but started failing at some point. The log doesn't give any clue why this is happening. It's suspicious but is there a limit of mlflow requests? It's quite annoying that t...
- 3507 Views
- 2 replies
- 1 kudos

- 1 kudos
The MLflow slack may be a good place to ask:
- 1 kudos
- 6677 Views
- 3 replies
- 15 kudos
Resolved! Do One-Hot-Encoding (OHE) before or after split data to train and test dataframe
Hi,I wonder that I should do OHE before or after I split data to build up a ML model.Please give some advise.
- 6677 Views
- 3 replies
- 15 kudos
- 15 kudos
Hi @Nhat Hoang​ ,While not Databricks-specific, here's a good answer:"If you perform the encoding before the split, it will lead to data leakage (train-test contamination). In this sense, you will introduce new data (integers of Label Encoders) and u...
- 15 kudos
- 686 Views
- 0 replies
- 0 kudos
Databricks-hosted MLFlow ignores the `mlflow.user` tag when set in the "runs/create" REST API call and doesn't let me change it after a ...
Databricks-hosted MLFlow ignores the `mlflow.user` tag when set in the "runs/create" REST API call and doesn't let me change it after a run is created.Open source MLFlow respects the tag.Could you please change the server to respect this field when i...
- 686 Views
- 0 replies
- 0 kudos
- 5196 Views
- 7 replies
- 15 kudos
What programming frameworks and languages can be used with Databricks Feature Store
To leverage Databricks feature store, can only Python be utilized? If otherwise, what other language frameworks are supported. Below is my question in 2 partsPart 1) What languages can be utilized to write data frames as feature tables in the Feature...
- 5196 Views
- 7 replies
- 15 kudos
- 15 kudos
you can use any of these languages Python, SQL, Scala and R
- 15 kudos
- 2221 Views
- 3 replies
- 10 kudos
I am Avi, a Solutions Architect at Databricks. We have built an application to demonstrate how AI-capabilities could be easily integrated to deliver n...
I am Avi, a Solutions Architect at Databricks. We have built an application to demonstrate how AI-capabilities could be easily integrated to deliver novel user experiences. The application allows users to submit images and text, and uses these inputs...
- 2221 Views
- 3 replies
- 10 kudos
- 10 kudos
Hi @Avinash Sooriyarachchi​ Thanks for sharing it.
- 10 kudos
- 2089 Views
- 3 replies
- 1 kudos
inability to consume model
Hello, I would like to ask where the problem may be. I want to create a real time endpoint to real time model infering. . i have created a simple cluster but i am not able to deploy the model i still get a yellow status - pending. the whole pro...
- 2089 Views
- 3 replies
- 1 kudos
- 1 kudos
Hi, already solved.... it was just wrong selected runtime
- 1 kudos
- 1233 Views
- 0 replies
- 5 kudos
youtu.be
I'm Avi, a Solutions Architect at Databricks working at the intersection of Data Engineering and Machine Learning.Streaming data processing has moved from niche to mainstream, and deploying machine learning models in such data streams opens up a mult...
- 1233 Views
- 0 replies
- 5 kudos
- 8072 Views
- 3 replies
- 3 kudos
Resolved! Spark Error/Exception Handling
I am creating new application and looking for ideas how to handle exceptions in Spark, for example ThreadPoolExecution. Are there any good practice in terms of error handling and dealing with specific exceptions ?
- 8072 Views
- 3 replies
- 3 kudos
- 3 kudos
@Krzysztof Nojman​ Can you please click on the "Select As Best" button if you find the information provided helps resolve your question.
- 3 kudos
- 14192 Views
- 7 replies
- 16 kudos
Resolved! Way of using pymc.model_to_graphviz into a Databricks notebook
Hi everybody,I created a simple bayesian model using the pymc library in Python. I would like to graphically represent my model using the pymc.model_to_graphviz(model=model) method.However, it seems it does not work within a databrcks notebook, even ...
- 14192 Views
- 7 replies
- 16 kudos
- 5192 Views
- 1 replies
- 4 kudos
Resolved! Insert into delta table fails
Hello experts. We are trying to execute an insert command with less columns than the target table:Insert into table_name( col1, col2, col10)Select col1, col2, col10from table_name2However the above fails with:Error in SQL statement: DeltaAnalysisExce...
- 5192 Views
- 1 replies
- 4 kudos
- 4 kudos
Hi @ELENI GEORGOUSI​ Yes. When you are doing an insert, your provided schema should match with the target schema else it would throw an error.But you can still insert the data using another approach. Create a dataframe with your data having less colu...
- 4 kudos
- 2433 Views
- 3 replies
- 1 kudos
How does mlflow determine if a pyfunc model uses SparkContext?
I've been getting this error pretty regularly while working with mlflow:"It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that ...
- 2433 Views
- 3 replies
- 1 kudos

- 1 kudos
I checked the page and it looks like there is no integration with Datarobot and Datarobot doesn't contribute to mlflow. https://mlflow.org/ has all the integrations listed
- 1 kudos
- 2848 Views
- 1 replies
- 2 kudos
Resolved! Unable to create feature table using databricks API .FeatureStoreClient()
I am following example steps from databricks documentation https://docs.databricks.com/_static/notebooks/machine-learning/feature-store-taxi-example.htmlI am using Feature Store client v0.3.6 and above.However on trying to create feature table with f...
- 2848 Views
- 1 replies
- 2 kudos
- 2 kudos
After much digging, observed i was using standard runtime. Once i switched to ML runtime of databricks, issue was resolved. To use Feature Store capability, ensure that you select a Databricks Runtime ML version from the Databricks Runtime Version dr...
- 2 kudos
- 1631 Views
- 1 replies
- 4 kudos
Stream data from Delta tables replicated with Fivetran into DLT
I'm attempting to stream into a DLT pipeline with data replicated from Fivetran directly into Delta tables in another database than the one that the DLT pipeline uses. This is not an aggregate, and I don't want to recompute the entire data model eac...
- 1631 Views
- 1 replies
- 4 kudos

- 4 kudos
Hi @M A​ Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else bricksters will get back to you soon. Thanks
- 4 kudos
- 819 Views
- 0 replies
- 2 kudos
What is "git source" supposed to be in the run page for an mlflow experiment?
I have an mlflow experiment with runs in it. When I go to a run's page (with the parameters/metrics/logged artifacts), there is a part that says `git source: my_project_name@some_letters` and I was wondering what that was supposed to point to.When I ...
- 819 Views
- 0 replies
- 2 kudos
Join Us as a Local Community Builder!
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now-
Access control
3 -
Access Data
2 -
AccessKeyVault
1 -
ADB
2 -
Airflow
1 -
Amazon
2 -
Apache
1 -
Apache spark
3 -
APILimit
1 -
Artifacts
1 -
Audit
1 -
Autoloader
6 -
Autologging
2 -
Automation
2 -
Automl
32 -
AWS
7 -
Aws databricks
1 -
AWSSagemaker
1 -
Azure
32 -
Azure active directory
1 -
Azure blob storage
2 -
Azure data lake
1 -
Azure Data Lake Storage
3 -
Azure data lake store
1 -
Azure databricks
32 -
Azure event hub
1 -
Azure key vault
1 -
Azure sql database
1 -
Azure Storage
2 -
Azure synapse
1 -
Azure Unity Catalog
1 -
Azure vm
1 -
AzureML
2 -
Bar
1 -
Beta
1 -
Better Way
1 -
BI Integrations
1 -
BI Tool
1 -
Billing and Cost Management
1 -
Blob
1 -
Blog
1 -
Blog Post
1 -
Broadcast variable
1 -
Business Intelligence
1 -
CatalogDDL
1 -
CatalogPricing
1 -
Centralized Model Registry
1 -
Certification
2 -
Certification Badge
1 -
Change
1 -
Change Logs
1 -
Chatgpt
2 -
Check
2 -
CHUNK
1 -
Classification Model
1 -
Cloud Storage
1 -
Cluster
10 -
Cluster policy
1 -
Cluster Start
1 -
Cluster Tags
1 -
Cluster Termination
2 -
Clustering
1 -
ClusterMemory
1 -
CNN HOF
1 -
Column names
1 -
Community Edition
1 -
Community Edition Password
1 -
Community Members
1 -
Company Email
1 -
Concat Ws
1 -
Conda
1 -
Condition
1 -
Config
1 -
Configure
3 -
Confluent Cloud
1 -
Container
2 -
ContainerServices
1 -
Control Plane
1 -
ControlPlane
1 -
Copy
1 -
Copy into
2 -
CosmosDB
1 -
Cost
1 -
Courses
2 -
Csv files
1 -
Dashboards
1 -
Data
8 -
Data Engineer Associate
1 -
Data Engineer Certification
1 -
Data Explorer
1 -
Data Ingestion
2 -
Data Ingestion & connectivity
11 -
Data Quality
1 -
Data Quality Checks
1 -
Data Science & Engineering
2 -
databricks
5 -
Databricks Academy
3 -
Databricks Account
1 -
Databricks AutoML
9 -
Databricks Cluster
3 -
Databricks Community
5 -
Databricks community edition
4 -
Databricks connect
1 -
Databricks dbfs
1 -
Databricks Feature Store
1 -
Databricks Job
1 -
Databricks Lakehouse
1 -
Databricks Mlflow
4 -
Databricks Model
2 -
Databricks notebook
10 -
Databricks ODBC
1 -
Databricks Platform
1 -
Databricks Pyspark
1 -
Databricks Python Notebook
1 -
Databricks Runtime
9 -
Databricks SQL
8 -
Databricks SQL Permission Problems
1 -
Databricks Terraform
1 -
Databricks Training
2 -
Databricks Unity Catalog
1 -
Databricks V2
1 -
Databricks version
1 -
Databricks Workflow
2 -
Databricks Workflows
1 -
Databricks workspace
2 -
Databricks-connect
1 -
DatabricksContainer
1 -
DatabricksML
6 -
Dataframe
3 -
DataSharing
1 -
Datatype
1 -
DataVersioning
1 -
Date Column
1 -
Dateadd
1 -
DB Notebook
1 -
DB Runtime
1 -
DBFS
5 -
DBFS Rest Api
1 -
Dbt
1 -
Dbu
1 -
DDL
1 -
DDP
1 -
Dear Community
1 -
DecisionTree
1 -
Deep learning
4 -
Default Location
1 -
Delete
1 -
Delt Lake
4 -
Delta
24 -
Delta lake table
1 -
Delta Live
1 -
Delta Live Tables
6 -
Delta log
1 -
Delta Sharing
3 -
Delta-lake
1 -
Deploy
1 -
DESC
1 -
Details
1 -
Dev
1 -
Devops
1 -
Df
1 -
Different Notebook
1 -
Different Parameters
1 -
DimensionTables
1 -
Directory
3 -
Disable
1 -
Distribution
1 -
DLT
6 -
DLT Pipeline
3 -
Dolly
5 -
Dolly Demo
2 -
Download
2 -
EC2
1 -
Emr
2 -
Ensemble Models
1 -
Environment Variable
1 -
Epoch
1 -
Error handling
1 -
Error log
2 -
Eventhub
1 -
Example
1 -
Experiments
4 -
External Sources
1 -
Extract
1 -
Fact Tables
1 -
Failure
2 -
Feature Lookup
2 -
Feature Store
52 -
Feature Store API
2 -
Feature Store Table
1 -
Feature Table
6 -
Feature Tables
4 -
Features
2 -
FeatureStore
2 -
File Path
2 -
File Size
1 -
Fine Tune Spark Jobs
1 -
Forecasting
2 -
Forgot Password
2 -
Garbage Collection
1 -
Garbage Collection Optimization
1 -
Github
2 -
Github actions
2 -
Github Repo
2 -
Gitlab
1 -
GKE
1 -
Global Init Script
1 -
Global init scripts
4 -
Governance
1 -
Hi
1 -
Horovod
1 -
Html
1 -
Hyperopt
4 -
Hyperparameter Tuning
2 -
Iam
1 -
Image
3 -
Image Data
1 -
Inference Setup Error
1 -
INFORMATION
1 -
Input
1 -
Insert
1 -
Instance Profile
1 -
Int
2 -
Interactive cluster
1 -
Internal error
1 -
INVALID STATE
1 -
Invalid Type Code
1 -
IP
1 -
Ipython
1 -
Ipywidgets
1 -
JDBC Connections
1 -
Jira
1 -
Job
4 -
Job Parameters
1 -
Job Runs
1 -
JOBS
5 -
Jobs & Workflows
6 -
Join
1 -
Jsonfile
1 -
Kafka consumer
1 -
Key Management
1 -
Kinesis
1 -
Lakehouse
1 -
Large Datasets
1 -
Latest Version
1 -
Learning
1 -
Limit
3 -
LLM
3 -
Local computer
1 -
Local Machine
1 -
Log Model
2 -
Logging
1 -
Login
1 -
Logs
1 -
Long Time
2 -
Low Latency APIs
2 -
LTS ML
3 -
Machine
3 -
Machine Learning
24 -
Machine Learning Associate
1 -
Managed Table
1 -
Max Retries
1 -
Maximum Number
1 -
Medallion Architecture
1 -
Memory
3 -
Merge Into
1 -
Metadata
1 -
Metrics
3 -
Microsoft azure
1 -
ML Lifecycle
4 -
ML Model
4 -
ML Pipeline
2 -
ML Practioner
3 -
ML Runtime
1 -
MlFlow
75 -
MLflow API
5 -
MLflow Artifacts
2 -
MLflow Experiment
6 -
MLflow Experiments
3 -
Mlflow Model
10 -
Mlflow project
4 -
Mlflow registry
3 -
Mlflow Run
1 -
Mlflow Server
5 -
MLFlow Tracking Server
3 -
MLModel
1 -
MLModels
2 -
Model Deployment
4 -
Model Lifecycle
6 -
Model Loading
2 -
Model Monitoring
1 -
Model registry
5 -
Model Serving
1 -
Model Serving Cluster
2 -
Model Serving REST API
6 -
Model Training
2 -
Model Tuning
1 -
Models
8 -
Module
3 -
Modulenotfounderror
1 -
MongoDB
1 -
Mount Point
1 -
Mounts
1 -
Multi
1 -
Multiline
1 -
Multiple users
1 -
Nested
1 -
New
2 -
New Feature
1 -
New Features
1 -
New Workspace
1 -
Nlp
3 -
Note
1 -
Notebook
6 -
Notebooks
5 -
Notification
2 -
Object
3 -
Onboarding
1 -
Online Feature Store Table
1 -
OOM Error
1 -
Open source
2 -
Open Source MLflow
4 -
Optimization
2 -
Optimize Command
1 -
OSS
3 -
Overwatch
1 -
Overwrite
2 -
Packages
2 -
Pandas udf
4 -
Pandas_udf
1 -
Parallel
1 -
Parallel processing
1 -
Parallel Runs
1 -
Parallelism
1 -
Parameter
2 -
PARAMETER VALUE
2 -
Partner Academy
1 -
Pending State
2 -
Performance Tuning
1 -
Permission
1 -
Photon Engine
1 -
Pickle
1 -
Pickle Files
2 -
Pip
2 -
Pipeline Model
1 -
Points
1 -
Possible
1 -
Postgres
1 -
Presentation
1 -
Pricing
2 -
Primary Key
1 -
Primary Key Constraint
1 -
Production
2 -
Progress bar
2 -
Proven Practices
2 -
Public
2 -
Pycharm IDE
1 -
Pymc3 Models
2 -
PyPI
1 -
Pyspark
6 -
Python
21 -
Python API
1 -
Python Code
1 -
Python Function
3 -
Python Libraries
1 -
Python Packages
1 -
Python Project
1 -
Python script
1 -
Pytorch
3 -
R Shiny
1 -
Reading-excel
2 -
Redis
2 -
Region
1 -
Remote RPC Client
1 -
Rest
1 -
Rest API
14 -
RESTAPI
1 -
Result
1 -
Runtime update
1 -
S3bucket
2 -
Sagemaker
1 -
Salesforce
1 -
SAP
1 -
Scalability
1 -
Scalable Machine
2 -
Schema
4 -
Schema evolution
1 -
Script
1 -
Search
1 -
Security
2 -
Security Exception
1 -
Self Service Notebooks
1 -
Server
1 -
Serverless
1 -
Serverless Real
1 -
Serving
1 -
Sf Username
1 -
Shap
2 -
Similar Issue
1 -
Similar Support
1 -
Size
1 -
Sklearn
1 -
Slow
1 -
Small Scale Experimentation
1 -
Source Table
1 -
Spark
13 -
Spark config
1 -
Spark connector
1 -
Spark Error
1 -
Spark MLlib
2 -
Spark Pandas Api
1 -
Spark ui
1 -
Spark Version
2 -
Spark-submit
1 -
SparkML Models
2 -
Sparknlp
3 -
Spot
1 -
SQL
19 -
SQL Editor
1 -
SQL Queries
1 -
SQL Visualizations
1 -
Stage failure
2 -
Storage
3 -
Stream
2 -
Stream Data
1 -
Structtype
1 -
Structured streaming
2 -
Study Material
1 -
Summit23
2 -
Support
1 -
Support Team
1 -
Synapse
1 -
Synapse ML
1 -
Table
4 -
Table access control
1 -
Tableau
1 -
Task
1 -
Temporary View
1 -
Tensor flow
1 -
Test
1 -
Timeseries
1 -
Timestamps
1 -
TODAY
1 -
Training
6 -
Transaction Log
1 -
Trying
1 -
Tuning
2 -
UAT
1 -
Ui
1 -
Unexpected Error
1 -
Unity Catalog
12 -
Use Case
2 -
Use cases
1 -
UTC
2 -
Uuid
1 -
Validate ML Model
2 -
Values
1 -
Variable
1 -
Vector
1 -
Versioncontrol
1 -
Visualization
2 -
Web App Azure Databricks
1 -
Web ui
1 -
Weekly Release Notes
2 -
Whl
1 -
Worker Nodes
1 -
Workflow
2 -
Workflow Jobs
1 -
Workspace
2 -
Write
1 -
Writing
1 -
Xgboost
2 -
Z-ordering
1 -
Zorder
1
- « Previous
- Next »
User | Count |
---|---|
89 | |
39 | |
36 | |
25 | |
25 |