Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

CrisCampos
by New Contributor II
  • 3173 Views
  • 1 replies
  • 1 kudos

How to load a "pickle/joblib" file on Databricks

Hi Community, I am trying to load a joblib file on Databricks, but it doesn't seem to be working. I'm getting an error message: "Incompatible format detected". Any idea how to load this type of file on db? Thanks!

Latest Reply
tapash-db
Databricks Employee
  • 1 kudos

You can import the joblib or joblibspark package to load joblib files.
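A minimal sketch of the idea, assuming the file is read with plain Python file APIs through a local-style path rather than with a Spark reader. It uses the standard-library `pickle` module so it runs anywhere; `joblib.load(path)` takes the same kind of path. The file name and the `save_object`/`load_object` helpers are hypothetical, and on Databricks the path would typically sit under the `/dbfs/` mount.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Serialize obj to path with pickle (binary mode is required)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_object(path):
    """Read a pickled object back from a local-style file path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical location; on Databricks this might look like "/dbfs/tmp/example_model.pkl".
path = os.path.join(tempfile.gettempdir(), "example_model.pkl")

# Stand-in for a trained model object.
save_object({"weights": [0.1, 0.2], "bias": 0.5}, path)
restored = load_object(path)
```

Only load pickle or joblib files from sources you trust, since unpickling can execute arbitrary code.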

UmaMahesh1
by Honored Contributor III
  • 1805 Views
  • 1 replies
  • 2 kudos

Checkpoint issue when loading data from confluent kafka

I have a streaming notebook which fetches messages from a Confluent Kafka topic and loads them into ADLS. It is a streaming notebook with the trigger set to continuous processing. Before loading the message (which is in Avro format), I'm flattening out the...

Latest Reply
Avinash_94
New Contributor III
  • 2 kudos

The best approach is not to depend on Kafka’s commit mechanism! We can store the processing result and the message offset in an external data store in the same database transaction. So, if the database transaction fails, both commit and processing will fail and ...
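The pattern described in this reply could be sketched as follows. This is an illustration only: it uses an in-memory SQLite database as a stand-in for the external data store, and the table names and the `make_store`, `process_message`, and `resume_offset` helpers are hypothetical. No Kafka client is involved; the point is only that the result row and the offset are written in one transaction.

```python
import sqlite3


def make_store():
    """Create an in-memory store with one table for results and one for offsets."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (topic TEXT, part INTEGER, msg_offset INTEGER, value TEXT)"
    )
    conn.execute(
        "CREATE TABLE offsets (topic TEXT, part INTEGER, next_offset INTEGER, "
        "PRIMARY KEY (topic, part))"
    )
    conn.commit()
    return conn


def process_message(conn, topic, part, offset, payload, transform):
    """Record the next offset and the processed result atomically."""
    # 'with conn' commits if the block succeeds and rolls back if it raises,
    # so the offset never advances without its result being stored.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO offsets VALUES (?, ?, ?)",
            (topic, part, offset + 1),
        )
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?)",
            (topic, part, offset, transform(payload)),
        )


def resume_offset(conn, topic, part):
    """Return the offset to resume from, or 0 if nothing was processed yet."""
    row = conn.execute(
        "SELECT next_offset FROM offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0
```

On restart, a consumer would read `resume_offset(...)` from the store and seek to that position instead of relying on the offsets committed to Kafka.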

Arunsundar
by New Contributor III
  • 2918 Views
  • 4 replies
  • 4 kudos

The possibility of finding the workload dynamically and spin up the cluster based on the workload

Hi Team, Good morning. I would like to understand if there is a possibility to determine the workload automatically through code (data load from a file to a table, determine the file size, kind of a benchmark that we can check), based on which we can ...

Latest Reply
pvignesh92
Honored Contributor
  • 4 kudos

Hi @Arunsundar Muthumanickam, when you say workload, I believe you might be handling various volumes of data between Dev and Prod environments. If you are using a Databricks cluster and do not have much idea of how the volumes might turn out in differ...
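For the "determine the file size" part of the question, a rough sketch could look like the following. The size tiers, worker counts, and helper names are all hypothetical placeholders to be replaced with numbers from your own benchmarks, and the sketch only suggests a worker count; it does not create or resize a cluster.

```python
import os

# Hypothetical tiers: (upper bound on input size in bytes, suggested workers).
SIZE_TIERS = [
    (1 * 1024**3, 2),    # up to 1 GiB
    (10 * 1024**3, 4),   # up to 10 GiB
    (100 * 1024**3, 8),  # up to 100 GiB
]
MAX_WORKERS = 16  # hypothetical cap for anything larger


def total_input_bytes(paths):
    """Sum the on-disk size of the input files."""
    return sum(os.path.getsize(p) for p in paths)


def suggest_workers(total_bytes):
    """Map a total input size to a worker count using the tiers above."""
    for limit, workers in SIZE_TIERS:
        if total_bytes <= limit:
            return workers
    return MAX_WORKERS
```

The suggested number could then be passed to whatever mechanism launches the job cluster.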

3 More Replies
RamaSantosh
by New Contributor II
  • 3953 Views
  • 2 replies
  • 3 kudos

Data load from Azure databricks dataframe to cosmos db container

I am trying to load data from an Azure Databricks dataframe to a Cosmos DB container using the command below: cfg = { "spark.cosmos.accountEndpoint" : cosmosEndpoint, "spark.cosmos.accountKey" : cosmosMasterKey, "spark.cosmos.database" : cosmosDatabaseName, "sp...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hey @Rama Santosh Ravada, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from y...

1 More Replies