I am trying to insert 6 GB of data into Azure Cosmos DB using the Spark OLTP connector.
Container throughput: 40,000 RU/s
Cluster Config:
cfg = {
    "spark.cosmos.accountEndpoint": cosmosdbendpoint,
    "spark.cosmos.accountKey": cosmosdbmasterkey,
    "spark.cosmos.database": cosmosdatabase,
    "spark.cosmos.container": cosmosdbcontainer,
}

# Register the Cosmos DB catalog on the Spark session
spark.conf.set("spark.sql.catalog.cosmosCatalog", "com.azure.cosmos.spark.CosmosCatalog")
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountEndpoint", cosmosdbendpoint)
spark.conf.set("spark.sql.catalog.cosmosCatalog.spark.cosmos.accountKey", cosmosdbmasterkey)

# Enable bulk ingestion for writes
spark.conf.set("spark.cosmos.write.bulk.enabled", "true")

json_df.write.format("cosmos.oltp").options(**cfg).mode("append").save()
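As far as I understand, the bulk flag can also be passed directly in the write options instead of through spark.conf; below is a minimal sketch of what I assume is an equivalent write path (it reuses the cfg dict and json_df defined above).

# Sketch: same write, with the bulk flag passed as a write option
write_cfg = {
    **cfg,
    "spark.cosmos.write.bulk.enabled": "true",
}

(json_df
    .write
    .format("cosmos.oltp")
    .options(**write_cfg)
    .mode("append")
    .save())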
It is taking around 3 hours to load this data into Cosmos DB.
1. Is increasing RUs the only way to decrease the execution time?
2. Other than the OLTP connector, are there other ways to insert bulk data in less time?
3. How do I calculate the required RUs based on data size? (see my rough attempt below)
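For question 3, here is my back-of-envelope attempt, assuming an average document size of ~1 KB and the roughly 5 RU charge per 1 KB insert mentioned in the Cosmos DB docs (both are assumptions; I have not measured my actual per-document charge):

# Rough lower-bound estimate for the ingestion time
total_bytes = 6 * 1024**3          # 6 GB of data
doc_size_bytes = 1024              # assumed average document size (~1 KB)
ru_per_write = 5                   # assumed RU charge per 1 KB insert
provisioned_ru_per_sec = 40000     # container throughput

num_docs = total_bytes / doc_size_bytes          # ~6.3 million documents
total_ru = num_docs * ru_per_write               # ~31 million RUs
min_seconds = total_ru / provisioned_ru_per_sec  # ~790 s (~13 min) if throughput is fully used

print(num_docs, total_ru, min_seconds)

That estimate is far below the ~3 hours I am seeing, so I suspect either my per-document charge is much higher than assumed or the write is not saturating the provisioned throughput.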