Machine Learning

Forum Posts

Anonymous
by Not applicable
  • 1007 Views
  • 1 replies
  • 0 kudos

Resolved! Best practice for Image manipulation

Can you recommend an approach for image manipulation once the data has been read as an image? Is there a specific library to use?

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

Spark has a built-in 'image' data source that reads a directory of image files as a DataFrame: spark.read.format("image").load(...). The resulting DataFrame has the pixel data, dimensions, channels, etc. You can also read image files 'manually' ...

  • 0 kudos
User16826994223
by Honored Contributor III
  • 3595 Views
  • 2 replies
  • 0 kudos

Can I access Delta tables outside of Databricks Runtime?

Is it possible to write to the same table from both Databricks and OSS? Also, what if I want to read the data from MapReduce or Hive?

Latest Reply
sean_owen
Honored Contributor II
  • 0 kudos

Yes. The Delta client is open source, and lets you read/write Delta tables if you add it to your external application. See https://docs.delta.io/latest/index.html

  • 0 kudos
1 More Replies
User16826994223
by Honored Contributor III
  • 332 Views
  • 0 replies
  • 0 kudos

Databricks Certified Professional Data Scientist: Does this exam require Databricks-specific or Spark-specific knowledge? No. Test-takers will be asse...

Databricks Certified Professional Data Scientist: Does this exam require Databricks-specific or Spark-specific knowledge? No. Test-takers will be assessed on their understanding of the basics of machine learning and data science, how to complete each ...

User16826994223
by Honored Contributor III
  • 281 Views
  • 0 replies
  • 0 kudos

Python vs. Scala in Spark Databricks. We are seeing that the Databricks platform is used more with Python than with Scala, and Databricks is also e...

Python vs. Scala in Spark Databricks. We are seeing that the Databricks platform is used more with Python than with Scala, and Databricks is also enhancing its Python API more than the Scala API, so will Scala become a thing of the past for Spark? Thanks

User15986662700
by New Contributor III
  • 3455 Views
  • 1 replies
  • 0 kudos
Latest Reply
User15986662700
New Contributor III
  • 0 kudos

If your data frame has complex fields, there's no standard way to convert it to a CSV file and enable exporting, so the option is disabled. Try to flatten/map the data frame before displaying it; this will enable the "download full results" option aga...
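To illustrate the flattening idea in the reply above, here is a minimal sketch in plain Python rather than Spark. The `flatten` and `to_csv` helpers and the sample record are hypothetical and not part of any Databricks API. Nested fields become dotted column names and lists are serialized as JSON strings, so every column holds a plain CSV value. On a real Spark DataFrame the analogous step would be selecting nested struct fields into top-level columns before displaying.

```python
import csv
import io
import json


def flatten(record: dict, prefix: str = "") -> dict:
    """Turn nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested structures, extending the dotted prefix.
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, (list, tuple)):
            # CSV has no array type, so store sequences as JSON text.
            flat[name] = json.dumps(list(value))
        else:
            flat[name] = value
    return flat


def to_csv(records: list) -> str:
    """Flatten every record and render the result as CSV text."""
    rows = [flatten(r) for r in records]
    # Collect column names in first-seen order across all rows.
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample record with one nested field and one list field.
sample = {"id": 1, "address": {"city": "Oslo"}, "tags": ["a", "b"]}
print(to_csv([sample]))
```

Serializing lists as JSON is only one choice; exploding them into one row per element is another, at the cost of changing the row count.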

  • 0 kudos
User16753724663
by Valued Contributor
  • 1397 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16753724663
Valued Contributor
  • 0 kudos

We can use the API below to list the jobs and then use the delete job API: https://docs.databricks.com/dev-tools/api/latest/jobs.html#list

List
Endpoint: 2.0/jobs/list
HTTP Method: GET

Once we have listed the jobs, we can use the API below to delete them:...
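A minimal sketch of the list-then-delete flow, using only the Python standard library. The `2.0/jobs/list` GET endpoint comes from the reply; the `2.0/jobs/delete` POST endpoint with a `job_id` body is my reading of the Jobs API 2.0 docs and should be checked against them, since the reply is cut off before naming it. The helper names, the `host` value, and the bearer-token header are assumptions for illustration.

```python
import json
import urllib.request

API_ROOT = "/api/2.0/jobs"  # Jobs API 2.0 path prefix


def build_list_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for 2.0/jobs/list."""
    return urllib.request.Request(
        url=f"{host.rstrip('/')}{API_ROOT}/list",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def build_delete_request(host: str, token: str, job_id: int) -> urllib.request.Request:
    """Build (but do not send) the POST request for 2.0/jobs/delete (assumed endpoint)."""
    body = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host.rstrip('/')}{API_ROOT}/delete",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def job_ids(list_response: dict) -> list:
    """Extract job IDs from a parsed jobs/list response; empty if there are no jobs."""
    return [job["job_id"] for job in list_response.get("jobs", [])]


def delete_all_jobs(host: str, token: str) -> list:
    """List every job, delete each one, and return the deleted IDs. Destructive."""
    with urllib.request.urlopen(build_list_request(host, token)) as resp:
        listing = json.load(resp)
    deleted = []
    for job_id in job_ids(listing):
        with urllib.request.urlopen(build_delete_request(host, token, job_id)):
            deleted.append(job_id)
    return deleted
```

Nothing is sent until `delete_all_jobs` is called with a real workspace URL and token. Filtering the output of `job_ids` before deleting is a sensible safeguard.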

  • 0 kudos
User16753724663
by Valued Contributor
  • 2744 Views
  • 1 replies
  • 1 kudos

Unable to install sf and rgeos R packages on the cluster

Got the following error:
java.lang.RuntimeException: Installation failed with message:
Error installing R package: Could not install package with error: installation of package ‘rgdal’ had non-zero exit status
Full error log available at /databricks/drive...

Latest Reply
User16753724663
Valued Contributor
  • 1 kudos

We can use the init script below to install the packages on the cluster:
%scala
dbutils.fs.put("dbfs:/databricks/init_scripts/rlib.sh", """
#!/bin/bash
sudo apt-get install -y libudunits2-dev
sudo add-apt-repository ppa:ubuntugis/ubuntugis-uns...

  • 1 kudos
User16753724663
by Valued Contributor
  • 5084 Views
  • 1 replies
  • 0 kudos

Error importing pip package s3fs

A job recently began failing with the following error when a Python notebook imports the pip package s3fs.
ImportError: cannot import name 'maybe_sync' from 'fsspec.asyn' (/databricks/python/lib/python3.8/site-packages/fsspec/asyn.py)
ImportError Tr...

Latest Reply
User16753724663
Valued Contributor
  • 0 kudos

On checking, the init script is installing s3fs version 0.5.2. This version from PyPI has issues at the moment. I have tested version 0.6.0, which works fine. Please update your requirement.txt file to a newer version of s3fs. Below is the p...

  • 0 kudos
Joseph_B
by New Contributor III
  • 1691 Views
  • 1 replies
  • 1 kudos
Latest Reply
Joseph_B
New Contributor III
  • 1 kudos

You can find a lot more info on this at this MLflow product page, including a comparison table at the bottom. I'd summarize that comparison as: Databricks provides three key things in its managed MLflow service. Security: MLflow experiments, models, ...

  • 1 kudos
Anonymous
by Not applicable
  • 736 Views
  • 0 replies
  • 0 kudos

Feature Discovery

How would one discover features here and also know how to make sense of these features? Ideally, we can trace the usage of features in code as well.

Joseph_B
by New Contributor III
  • 1830 Views
  • 1 replies
  • 0 kudos
Latest Reply
Joseph_B
New Contributor III
  • 0 kudos

You can find the MLflow version in the runtime release notes, along with a list of every other library provided. E.g., for DBR 8.3 ML, you can look at the release notes for AWS, Azure, or GCP. The MLflow client API (i.e., the API provided by installi...

  • 0 kudos
User16826994223
by Honored Contributor III
  • 1275 Views
  • 1 replies
  • 0 kudos

Multiple Where conditions vs AND && in PySpark

.where((col('state')==state) & (col('month')>startmonth))
I can write the where conditions both ways. I think the one below adds readability. Is there any other difference, and which is best?
.where(col('state')==state).where(col('month')>startmonth)

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

You can use explain to see which logical and physical plans are created. This is the best way to see the difference, but as mentioned in the question, both forms should give the same physical plan.

  • 0 kudos
User16788317466
by New Contributor II
  • 1129 Views
  • 2 replies
  • 0 kudos

How do I efficiently read image data for a deep learning model?

How do I efficiently read image data for a deep learning model?

Latest Reply
Joseph_B
New Contributor III
  • 0 kudos

Our documentation provides nice examples of preparing image data for training and inference.
Training: See docs for AWS, Azure, GCP
Inference: See reference solution for AWS, Azure, GCP

  • 0 kudos
1 More Replies
User16789201666
by Contributor II
  • 1503 Views
  • 4 replies
  • 0 kudos

How do you control the cost of provisioning a cluster?

How do you govern the cost of running clusters in Databricks so you're not sticker shocked?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Using interactive clusters less and job clusters more can be one way, in addition to the others above.

  • 0 kudos
3 More Replies