JAHNAVI
Databricks Employee

@tkfm_s
Yes. SynapseML's LightGBMClassifier / LightGBMRegressor train directly on a Spark DataFrame, so no pandas conversion is required. Also make sure the number of partitions matches the total number of executor cores so that LightGBM uses all of them. If your dataset has a large number of columns, consider reducing them to avoid OOM errors.
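
As a rough illustration of the advice above, here is a minimal sketch in PySpark. It assumes SynapseML is installed on the cluster and that you already know your executor count and cores per executor. The helper names (`target_partitions`, `train_lightgbm`) and the column arguments are hypothetical placeholders, not part of any library.

```python
def target_partitions(num_executors: int, cores_per_executor: int) -> int:
    """Return one partition per executor core (hypothetical helper)."""
    if num_executors < 1 or cores_per_executor < 1:
        raise ValueError("executor and core counts must be positive")
    return num_executors * cores_per_executor


def train_lightgbm(df, feature_cols, label_col, num_executors, cores_per_executor):
    """Fit a LightGBM classifier on a Spark DataFrame without converting to pandas.

    Assumes a binary label; feature_cols and label_col are placeholders
    for your own column names.
    """
    # Imported here so the partition helper works even without Spark installed.
    from pyspark.ml.feature import VectorAssembler
    from synapse.ml.lightgbm import LightGBMClassifier

    # Keep only the columns you need: fewer columns lowers memory pressure.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    assembled = assembler.transform(df).select("features", label_col)

    # Match the partition count to the total number of executor cores.
    assembled = assembled.repartition(
        target_partitions(num_executors, cores_per_executor)
    )

    classifier = LightGBMClassifier(
        featuresCol="features",
        labelCol=label_col,
        objective="binary",
    )
    return classifier.fit(assembled)
```

For example, on a cluster with 4 executors of 8 cores each, `target_partitions(4, 8)` gives 32 partitions. For regression, LightGBMRegressor can be swapped in with a suitable objective.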

Here is the LightGBM documentation on distributed training:
https://lightgbm.readthedocs.io/en/latest/Parallel-Learning-Guide.html

Jahnavi N