For example, let's say I have a file called some-file, which is a gzipped text file. If I try spark.read.text('some-file'), Spark returns a bunch of gibberish since it doesn't know that the file is gzipped. I'm looking to manually tell Spark the fil...
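A quick way to confirm outside Spark that such a file really is gzip-compressed is to look at its first two bytes, since every gzip stream starts with 0x1f 0x8b. The sketch below uses only the Python standard library; the helper names (is_gzipped, decode_text, read_text_file) are made up for illustration and are not part of any Spark API.

```python
import gzip

# Every gzip stream begins with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(raw: bytes) -> bool:
    """Return True if the byte string starts with the gzip magic number."""
    return raw[:2] == GZIP_MAGIC


def decode_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decompress the bytes if they look gzipped, then decode them as text."""
    if is_gzipped(raw):
        raw = gzip.decompress(raw)
    return raw.decode(encoding)


def read_text_file(path: str) -> str:
    """Read a file whose name gives no hint about compression."""
    with open(path, "rb") as handle:
        return decode_text(handle.read())
```

Calling read_text_file("some-file") would then return readable text whether or not the file is compressed. This only helps for files small enough to read on the driver; it does not change how Spark itself chooses a codec.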
Is there a way to use SQL desktop tools? Delta OSS and Databricks do not provide a desktop client (similar to Azure Data Studio) to browse and query Delta Lake objects. I currently use Databricks SQL, a web UI in the Databricks workspace, but se...
DSR is Delta Standalone Reader. See more here: https://docs.delta.io/latest/delta-standalone.html It's a crate (and also now a Python library) that allows you to connect to Delta tables without using Spark (e.g. directly from Python and not using pyspa...
Yes, you just need to create a service account for Databricks and then assign the Storage Admin role on the bucket. After that you can mount GCS the standard way:
bucket_name = "<bucket-name>"
mount_name = "<mount-name>"
dbutils.fs.mount("gs://%s" % bucket_name, "/m...
Hi, I have a Databricks database that has been created in the DBFS root S3 bucket, containing managed tables. I am looking for a way to move/migrate it to a mounted S3 bucket instead, and keep the database name. Any good ideas on how this can be done? T...
Hi Folks, I have installed and configured the Databricks CLI on my local machine. I tried to move a local file from my personal computer to a dbfs:/ path using dbfs cp. I can see that the file is copied locally, but it is only visible locally. I am not able to ...
Hi, could you try to save the file from your local machine to the dbfs:/FileStore location?
# Put local file test.py to dbfs:/FileStore/test.py
dbfs cp test.py dbfs:/FileStore/test.py
Hi, I'm using concat_ws in Scala to calculate a checksum for the dataframe, i.e.:
df.withColumn("CHECKSUM", sha2(functions.concat_ws("", dataframe.columns.map(col): _*), 512))
I have one example here with just 24 columns that already throws the followin...
At least one of the column names may contain a strange character, whitespace or something similar, or at least one of the column types is not compatible (for example StructType). You can separate your code into two or more steps. First generate the list of columns as some variable...
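The first suggestion, checking the column names before building the expression, can be sketched in plain Python. This is only an illustration: the pattern treats anything other than ASCII letters, digits and underscores as suspicious, which is an assumption rather than a Spark rule, and the column list is invented sample data.

```python
import re

# Assumption: a "safe" name is ASCII letters, digits and underscores only.
SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def suspicious_columns(columns):
    """Return the column names that contain whitespace or unusual characters."""
    return [name for name in columns if not SAFE_NAME.match(name)]


# Hypothetical column names, e.g. the result of df.columns.
columns = ["id", "first name", "amount ", "total_2023"]

for name in suspicious_columns(columns):
    # repr() makes trailing whitespace visible.
    print(repr(name))
```

Keeping the column list in its own variable, as the reply suggests, also makes it easy to print or filter the list before passing it to concat_ws.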
I am using a Docker recipe to configure my Databricks cluster. It works fine for everything else; however, when I try to display any image data using a Python utility such as matplotlib, PIL or OpenCV, the image does not get displayed o...
I have a large time series with many measuring stations recording the same 5 measurements (Temperature, Humidity, etc.). I want to predict a future moment with a time series model, for which I pass the data from all the measuring stations to the Deep Learning...
Below are the steps that I followed. I still get an error message.
Create a repo in GitLab Enterprise Edition
In GitLab, create a personal access token that allows access to your repositories (with read_repository and write_repository permissions)
Save...
Hi @Sarvagna Mahakali, the repository which you are trying to add might be behind a VPN; our service cannot access it since it has no access to the VPN network. You may need the Enterprise Git / VPC to connect to the repository. Kindly check and let...
The user is trying to cast a string to decimal when encountering zeros. The cast function displays the '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s?
from pyspark.sql import functions as F
df = spark.s...
If the scale of the decimal type is greater than 6, scientific notation kicks in, which is why you see 0E-16. This behavior is described in the existing OSS Spark issue: https://issues.apache.org/jira/browse/SPARK-25177 Kindly cast the column to a decimal type les...
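The same threshold can be seen outside Spark. Python's decimal module uses a comparable string-conversion rule (plain notation only while the adjusted exponent is at least -6), so it serves as a small analogy here. This is not Spark's code path, only an illustration of why a zero with scale 16 prints in scientific notation while a zero with scale 6 does not.

```python
from decimal import Decimal

# A zero carrying 16 decimal places versus a zero carrying 6 decimal places.
zero_scale_16 = Decimal("0E-16")
zero_scale_6 = Decimal("0E-6")

print(str(zero_scale_16))          # 0E-16
print(str(zero_scale_6))           # 0.000000

# Explicit fixed-point formatting avoids the scientific notation.
print(format(zero_scale_16, "f"))  # 0.0000000000000000
```

In Spark the corresponding options are the ones the reply names (a smaller scale) or formatting the value as a string for display.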
Hello, I want to run some notebooks from notebook "A". Regardless of the contents of the child notebook, each run takes a long time (20 seconds). It is a constant value and I do not know why it takes so long. I tried running a simple notebook with one input p...
Okay, I am not able to set the same session for both notebooks (parent and children), so my solution is to use %run ./notebook_name. I put all the code into functions and now I can use them. Example:
# Children notebook
def do_something(param1, param2):
...
OK, so I'm trying to determine if a timestamp column has a regular interval or not, i.e. the difference between each consecutive value is the same across the entire column. I tried something like this:
val timeColumn: String =
val groupByColumn: String...
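The check itself, whether every consecutive difference is the same, can be sketched in plain Python on a small list of timestamps. This is not the Scala approach from the post (which is cut off above); the function name and the sample values are invented for illustration.

```python
from datetime import datetime, timedelta


def has_regular_interval(timestamps):
    """Return True if all consecutive differences between sorted timestamps are equal."""
    ordered = sorted(timestamps)
    # Collect the distinct gaps between neighbouring values.
    deltas = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    # Zero or one distinct gap means the spacing is regular.
    return len(deltas) <= 1


start = datetime(2023, 1, 1)
regular = [start + timedelta(minutes=5 * i) for i in range(4)]
irregular = regular + [start + timedelta(minutes=17)]

print(has_regular_interval(regular))
print(has_regular_interval(irregular))
```

On a Spark dataframe the same idea is usually expressed per group: compute the difference to the previous row and count the distinct differences.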
Hi, is there any way to get alerts automatically from Databricks Ganglia? That way a developer wouldn't need to review the logs manually but would get a notification that, for example, resources are underutilized.
Hi @Md Tahseen Anam, you can install Datadog agents on cluster nodes to send Datadog metrics to your Datadog account. The following notebook demonstrates how to install a Datadog agent on a cluster using a cluster-scoped init script. To install the ...