Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Hello and welcome to our wonderful Community! Whether you are here by chance or intention, we're thrilled to have you join us. Before you dive into the plethora of discussions and activities happening here, we'd love to get to know you better!
Can I get some help from Databricks to understand how these timestamps are being interpreted? Some are really confusing me. I have a timestamp coming into AWS Databricks as a String type, and the string timestamp is represented in UTC. I ran below qu...
Hi All, I am testing the SQL generated by our ETL software to see if it can run on Databricks SQL, which I believe is Delta tables underneath. This is the statement we are testing. As far as I can tell from the manual, the FROM clause is not supported ...
Hello, I am trying to set the max batch size for a pandas UDF in a Databricks notebook, but in my tests it doesn’t have any effect on the size. spark.conf.set("spark.sql.execution.arrow.enabled", "true") spark.conf.set('spark.sql.execution.arrow.maxRecordsPerBatch...
Hi, I'm using Databricks Connect to run Scala code from IntelliJ on a Databricks single-node cluster. Even with the simplest code, I'm experiencing this error: org.apache.spark.SparkException: grpc_shaded.io.grpc.StatusRuntimeException: INTERNAL: org.ap...
I have found that the output of the bitmap_count() function differs significantly between Databricks and Snowflake. E.g., Snowflake returns a value of '1' for this code: "select bitmap_count(X'0001056c000000000000')", while Databricks returns a...
Hi @vigneshp, good day!
In Databricks, the bitmap_count function returns the number of bits set in a BINARY string representing a bitmap. This function is typically used to count distinct values in combination with the bitmap_bucket_number() and the bi...
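To make the "number of bits set" definition concrete, here is a minimal sketch in plain Python. It only illustrates that definition on the literal from the question; it is not the Databricks or Snowflake implementation, and the helper name `popcount` is hypothetical.

```python
def popcount(data: bytes) -> int:
    """Count the bits set to 1 across all bytes (hypothetical helper)."""
    return sum(bin(byte).count("1") for byte in data)


# Same bytes as the literal X'0001056c000000000000' in the question.
bitmap = bytes.fromhex("0001056c000000000000")

# 0x01 has 1 bit set, 0x05 has 2, 0x6c has 4; all other bytes are zero.
print(popcount(bitmap))  # prints 7
```

A different result from another engine would suggest that it reads the same bytes under a different bitmap encoding rather than counting raw set bits.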
Hi Guys, I am a complete newbie to Databricks, and we are trying to figure out if our data models and ETL can run on it. I have got the failure to launch message. I have read this message as well. https://community.databricks.com/t5/data-engineering/cluste...
Hi Team, could you please help me understand the best way/best practices to copy around 3 TB of data (Parquet) from HDFS to Databricks Delta format and create external tables on top of it? Regards, Phanindra
How do committed-use discounts work for Databricks? Do I purchase a chunk of DBUs for a flat fee and then draw down on them until exhausted? Or am I purchasing a % discount on all DBUs I use until the time period ends? In either case, is this reflec...
Hi, I wonder if you could help me with the below, please. We tried the Databricks Data Intelligence Platform for one of our clients and found that it's very expensive when compared to AWS EMR. I understand it's not an apples-to-apples comparison as one being platform...
Hi @Retired_mod, thanks for getting back with such valuable information. System | File size | Duration | System | Duration | Comments | Comments | 1 | EMR | 225 GB | 22 mins | Databricks | 63 mins | EMR is cheaper than Databricks by 5 times | This involves various S3 writes with m5d4xlarge | EMR | 225...
I created a 14-day trial account on Databricks.com and linked it to my AWS. I'm aware that DBUs are free for 14 days, but any AWS charges are my own. I created one workspace, and the CloudFormation was successful. I haven't used it for two days and t...
Hi, I am using a Spark pipeline with the stages VectorAssembler, StandardScaler, StringIndexer, VectorAssembler, GBTClassifier, and then logging this pipeline using the feature store log_model function as follows: fe = FeatureStoreClient() // I have tried ...
I'm creating a series of runs using /api/2.1/jobs/runs/submit. I wanted to add some tags for more control over cost and usage, but I notice it's not an option. My first idea was using /api/2.1/jobs/update but it returns that it doesn't have any...
It could be, but I can still list the job permissions, so it's creating some kind of job... Is there a way of adding tags to that job from the beginning, or updating them afterwards?
Hi community, is it possible to use Databricks service principals for authentication on Databricks Connect 12.2 to connect my notebook or code to Databricks compute, rather than using a personal access token? I checked the docs and got to know that upgr...
Hi @Retired_mod, thanks for your response. I was able to generate the token of the service principal following this doc, and later saved it in the <Databricks Token> variable prompted when running the databricks-connect configure command in the terminal. And was a...
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.