- 944 Views
- 1 replies
- 1 kudos
Which NLP APIs work best with Spark and give the best performance?
Latest Reply
By far the most popular and comprehensive library, to my knowledge, for Spark-native distributed NLP, is spark-nlp from John Snow Labs. https://nlp.johnsnowlabs.com/ It is open source (but with commercial support options) and has a whole lot of funct...
- 949 Views
- 1 replies
- 0 kudos
Can you recommend an approach for image manipulation once the data has been read as an image? Is there a specific library to use?
Latest Reply
Spark has a built-in 'image' data source which will read a directory of image files as a DataFrame: spark.read.format("image").load(...). The resulting DataFrame has the pixel data, dimensions, channels, etc. You can also read image files 'manually' ...
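To make the reply above concrete, here is a minimal sketch in Python. It assumes an existing SparkSession passed in as `spark` and a hypothetical directory path; the helper names `load_images` and `bgr_bytes_to_rgb_rows` are illustrative, not part of Spark. The image source returns a single `image` struct column (origin, height, width, nChannels, mode, data), and for 3-channel images the `data` bytes are typically stored row-wise in BGR order, so the second helper reorders them to RGB using only the standard library.

```python
from typing import List, Tuple


def load_images(spark, path: str):
    """Read a directory of image files with Spark's built-in image source.

    `spark` is an existing SparkSession; `path` is a hypothetical directory.
    """
    return spark.read.format("image").load(path)


def bgr_bytes_to_rgb_rows(
    data: bytes, height: int, width: int, n_channels: int = 3
) -> List[List[Tuple[int, int, int]]]:
    """Convert row-wise BGR bytes into rows of (R, G, B) tuples.

    Only 3-channel images are handled in this sketch.
    """
    if n_channels != 3:
        raise ValueError("this sketch only handles 3-channel images")
    if len(data) != height * width * n_channels:
        raise ValueError("data length does not match the given dimensions")
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * n_channels
            b, g, r = data[i], data[i + 1], data[i + 2]
            row.append((r, g, b))
        rows.append(row)
    return rows


# Possible usage on a cluster (not executed here):
# df = load_images(spark, "/path/to/images")
# img = df.select("image.height", "image.width",
#                 "image.nChannels", "image.data").first()
# pixels = bgr_bytes_to_rgb_rows(bytes(img["data"]), img["height"],
#                                img["width"], img["nChannels"])
```

For anything beyond small images, a vectorized library would likely be a better fit than a pure-Python loop; the loop is used here only to keep the sketch dependency-free.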
- 3438 Views
- 2 replies
- 0 kudos
Is it possible to write to the same table from both Databricks and OSS? Also, what if I want to read the data from MapReduce or Hive?
Latest Reply
Yes. The Delta client is open source, and lets you read/write Delta tables if you add it to your external application. See https://docs.delta.io/latest/index.html
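As a rough illustration of "adding it to your external application", the sketch below configures an open-source PySpark session for Delta and then appends to and reads back a table. It assumes the `pyspark` and `delta-spark` pip packages are installed; the function names, the app name, and the table path are hypothetical. The Spark imports are done lazily inside the function so the configuration itself can be inspected without a Spark installation.

```python
from typing import Dict

# Session settings that enable Delta in open-source Spark.
DELTA_SESSION_CONF: Dict[str, str] = {
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension",
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog",
}


def build_delta_session(app_name: str = "delta-oss-example"):
    """Create a SparkSession with Delta enabled (assumes delta-spark is installed)."""
    from pyspark.sql import SparkSession
    from delta import configure_spark_with_delta_pip

    builder = SparkSession.builder.appName(app_name)
    for key, value in DELTA_SESSION_CONF.items():
        builder = builder.config(key, value)
    # Adds the matching Delta jars to the session before it is created.
    return configure_spark_with_delta_pip(builder).getOrCreate()


def append_and_read(spark, df, table_path: str):
    """Append `df` to the Delta table at `table_path`, then read the table back."""
    df.write.format("delta").mode("append").save(table_path)
    return spark.read.format("delta").load(table_path)


# Possible usage (not executed here; the path is a placeholder):
# spark = build_delta_session()
# result = append_and_read(spark, spark.range(5), "/tmp/example_delta_table")
```

Whether a given Delta release can read or write a table produced by a particular Databricks runtime depends on the table's protocol version, so it is worth checking the Delta documentation linked above for compatibility.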
- 313 Views
- 0 replies
- 0 kudos
Databricks Certified Professional Data Scientist: Does this exam require Databricks-specific or Spark-specific knowledge? No. Test-takers will be assessed on their understanding of the basics of machine learning and data science, how to complete each ...
- 255 Views
- 0 replies
- 0 kudos
Python vs. Scala in Spark on Databricks. We are seeing the Databricks platform used more with Python than with Scala, and Databricks is also enhancing its Python API more than the Scala API, so will Scala become a thing of the past for Spark? Thanks.
- 2652 Views
- 1 replies
- 1 kudos
Got the following error:
java.lang.RuntimeException: Installation failed with message:
Error installing R package: Could not install package with error: installation of package ‘rgdal’ had non-zero exit status
Full error log available at /databricks/drive...
Latest Reply
We can use the init script below to install the packages on the cluster:
%scala
dbutils.fs.put("dbfs:/databricks/init_scripts/rlib.sh", """
#!/bin/bash
sudo apt-get install -y libudunits2-dev
sudo add-apt-repository ppa:ubuntugis/ubuntugis-uns...
- 4753 Views
- 1 replies
- 0 kudos
A job recently began failing with the following error when a Python notebook imports the pip package s3fs.
ImportError: cannot import name 'maybe_sync' from 'fsspec.asyn' (/databricks/python/lib/python3.8/site-packages/fsspec/asyn.py)
ImportError Tr...
Latest Reply
On checking, the init script is installing s3fs version 0.5.2. This version from PyPI currently has issues. I have tested version 0.6.0, which works fine. Please update your requirement.txt file to a newer version of s3fs. Below is the p...
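One way to confirm which versions actually ended up on the cluster is to query the installed package metadata from a notebook cell. The sketch below uses only the Python standard library; the helper names are hypothetical, the version parsing is deliberately simplified (numeric components only, no pre-release handling), and the 0.6.0 threshold comes from the version the reply reports as tested.

```python
from importlib import metadata
from itertools import takewhile
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components of a version string (simplified)."""
    parts = []
    for piece in version.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(installed: str, minimum: str) -> bool:
    """Check whether `installed` is at least `minimum`, padding with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)) >= b + (0,) * (n - len(b))


# Report what is installed for the two packages named in the error.
for name in ("s3fs", "fsspec"):
    print(name, installed_version(name))

s3fs_version = installed_version("s3fs")
if s3fs_version is not None and not meets_minimum(s3fs_version, "0.6.0"):
    print("s3fs is older than 0.6.0; consider pinning s3fs==0.6.0 in the requirements file")
```

Running this before and after changing the pinned version makes it easy to see whether the init script picked up the new requirement.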
- 700 Views
- 0 replies
- 0 kudos
How would one discover features here and also know how to make sense of these features? Ideally, we can trace the usage of features in code as well.