Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
The shapley progress bar or tqdm progress bar in general doesn't show in notebooks. Do I need to set something special to get this or any other similar widgets to work?
Wondering if anyone is willing to share their project ideas here. It would be great to know how things are going and if anyone has a good open-source dataset they are willing to share.
Hi, I stream data from PostGIS to S3 using Debezium: PostGIS -> Debezium -> S3 -> Spark (Databricks). Once I read it, I decode it and can see that the binary representation is similar to what I have in PostGIS, in a WKB-formatted column. Once I try to read it ei...
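For anyone comparing the decoded bytes against PostGIS, here is a minimal sketch of inspecting a WKB point using only the standard library. It assumes the value is a PostGIS-style EWKB Point (optional SRID flag in the type word) and that it arrives base64-encoded, as binary fields often do in JSON payloads. The function name `parse_wkb_point` and the sample payload are made up for illustration, and other geometry types are not handled.

```python
import base64
import struct

# Flag bits that PostGIS ORs into the geometry type word (EWKB).
EWKB_Z = 0x80000000
EWKB_M = 0x40000000
EWKB_SRID = 0x20000000


def parse_wkb_point(raw: bytes):
    """Parse a WKB/EWKB Point and return (srid_or_None, coordinates)."""
    # First byte is the byte order: 1 = little endian, 0 = big endian.
    endian = "<" if raw[0] == 1 else ">"
    (type_word,) = struct.unpack_from(endian + "I", raw, 1)
    offset = 5

    # EWKB stores the SRID right after the type word when the flag is set.
    srid = None
    if type_word & EWKB_SRID:
        (srid,) = struct.unpack_from(endian + "I", raw, offset)
        offset += 4

    # Strip the flag bits; base type 1 is Point.
    if type_word & 0x1FFFFFFF != 1:
        raise ValueError("this sketch only handles Point geometries")

    # 2 coordinates, plus one each for Z and M if flagged.
    dims = 2 + bool(type_word & EWKB_Z) + bool(type_word & EWKB_M)
    coords = struct.unpack_from(endian + "d" * dims, raw, offset)
    return srid, coords


# Hypothetical payload: a little-endian point (1.0, 2.0) with SRID 4326.
encoded = base64.b64encode(
    struct.pack("<BIIdd", 1, 1 | EWKB_SRID, 4326, 1.0, 2.0)
)
print(parse_wkb_point(base64.b64decode(encoded)))
```

If the header parses cleanly like this, the bytes are likely intact and the problem is more likely in the reader than in the stream.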
Hello, I know how to create a .shp file from a GeoPandas dataframe using code similar to this, also mentioned on SO:
gpd_df = geopandas.GeoDataFrame(pandas_df, geometry='geom')
gpd_df.to_file("username/nh.shp")
However, I have .parquet files that I can load...
@Bartosz Maciejewski: Spark does not have native support for writing Shapefiles directly. However, you can use a third-party library such as GeoPandas or PyShp to write your Spark DataFrame to a Shapefile. Here's an example of how to use GeoPandas to...
Hi, I'm facing an issue when writing to a Salesforce object. I'm using the springml/spark-salesforce library. I have the above libraries installed as recommended based on my research. I try to write like this:
(_sqldf .write .format("com.springml.spar...
I have files in Azure Data Lake. I am using Auto Loader to read the incremental files. The files don't have a primary key to load on. In this case I want to use some columns to generate a hash key and use it as the primary key to apply changes. In this case I want to ...
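As a rough illustration of the hash-key idea, here is a minimal plain-Python sketch that derives a deterministic key from a chosen set of columns. The function name `row_hash_key`, the column names, and the `||` separator are all assumptions for the example, not something from the post. In Spark the same idea is usually expressed with `sha2` and `concat_ws` from `pyspark.sql.functions`; note that `concat_ws` skips nulls, whereas this sketch maps them to empty strings.

```python
import hashlib


def row_hash_key(row: dict, key_columns: list, sep: str = "||") -> str:
    """Build a SHA-256 hex digest from the selected columns of one row."""
    # Missing or None values become empty strings so the key stays stable.
    parts = ["" if row.get(col) is None else str(row[col]) for col in key_columns]
    return hashlib.sha256(sep.join(parts).encode("utf-8")).hexdigest()


# Hypothetical record and key columns.
record = {"customer": "acme", "order_date": "2023-01-15", "amount": 10}
key = row_hash_key(record, ["customer", "order_date"])
print(key)
```

The separator matters: without it, ("ab", "c") and ("a", "bc") would hash to the same key. The column order must also stay fixed between loads.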
Happy to share that #WAVICLE was able to do a hands-on workshop on #[Databricks notebook] #[Databricks SQL] #[Databricks cluster] Fundamentals with KCT College, Coimbatore, India.
I am looking to display SHAP plots, here is the code:
import xgboost
import shap
shap.initjs()  # load JS visualization code to notebook
X, y = shap.datasets.boston()
# train XGBoost model
model = xgboost.train({"learning_rate": 0.01}, xgboost.DMatri...
As @Vinh dqvinh87 noted, the accepted solution only works for force_plot. For other plots, the following trick works for me:
import matplotlib.pyplot as plt
p = shap.summary_plot(shap_values, test_df, show=False)
display(p)