Delta Live Tables
is DLT supported for Scala? Any reference implementations or wikis to get started?
- 1797 Views
- 2 replies
- 0 kudos
is DLT supported for Scala? Any reference implementations or wikis to get started?
I am facing issue in while accessing python data frame in Scala shell and vice versa. I am getting error variable not defined.
The context is not shared between Scala and Python so you won't be able to access the same variables directly. However you can use createOrReplaceTempView to create a temporary view of your dataframe and read it in the other language with read_df = s...
I need to retrieve job id and run id of the job from a jar file in Scala.When I try to compile below code in IntelliJ, below error is shown.import com.databricks.dbutils_v1.DBUtilsHolder.dbutils object MainSNL { @throws(classOf[Exception]) de...
Maybe its worth going through the Task Parameter variables section of the below dochttps://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-variables
I read somewhere that Python code is converted to SQL at the end. So is it true or there is any difference in performance while working with Scala, Python or SQL ?
To add on the consideration of UDFs, try to consider using HOFs (Higher Order Functions) whenever possible first as there is a signifcant performance benefit as seen here.
I want to get current date in scala as a string for example today current date is 3rd jan, want to store it as a new variable dynamically as below, how to get it.val currdate : String = “20220103”when I am using val currdate = Calendar.getInstance.ge...
Hey @Vibhor Sethi Hope you are well!Thank you for posting your question and letting us know that you were able to resolve the issue. Would you be happy to mark it as the best solution? It would be really helpful for the other members too.Cheers!
Hello,I want to have a high concurrency cluster with table access control and I want to use R language on it.I know that the documentation says that R and Scala is not available with table access control.But maybe you have some tricks or best practic...
@Łukasz Jaremek, Currently it is only available in Python and SQL.
I expected the code below to print "hello" for each partition, and "world" for each record. But when I ran it the code ran but had no print outs of any kind. No errors either. What is happening here?%scala val rdd = spark.sparkContext.parallelize(S...
Is it lazy evaluated so you need to trigger action I guess
Hello,I'm learning Scala / Spark and try to understand what's wrong with my function:I have a spark.sql query, stored in a variable:val uViewName = spark.sql(""" SELECT v.Data_View_Name FROM apoHierarchy AS h INNER JOIN apoView AS v ON h.View_N...
try add .first()(0) it will return only value from first row/column as currently you are returning Dataset: var uViewName = spark.sql(s""" SELECT v.Data_View_Name FROM apoHierarchy AS h INNER JOIN apoView AS v ON h.View_Name = v.Context_View_N...
Currently I am learning how to use databricks-connect to develop Scala code using IDE (VS Code) locally. The set-up of the databricks-connect as described here https://docs.microsoft.com/en-us/azure/databricks/dev-tools/databricks-connect was succues...
Hi everybody,Trigger.AvailableNow is released within the databricks 10.1 runtime and we would like to use this new feature with autoloader.We write all our data pipeline in scala and our projects import spark as a provided dependency. If we try to sw...
You can switch to python. Depending on what you're doing and if you're using UDFs, there shouldn't be any difference at all in terms of performance.
I've been trying to use the HiveMetastoreClient class in Scala to extract some metadata from Databricks internal Metastore, without success. I'm currently using the 7.3 LTS runtime.The error seems to be related to some kind of inconsistency between...
Thanks for the reference, @Atanu Sarkar .Seems a little odd to me that I'd need to change the internal Databricks Metastore table to add a column expected by the client default Scala client. I'm afraid this could cause issues with other users/jobs ...
Hi Team, I was trying to call/run multiple notebooks in one notebook concurrent. But the caller notebooks are getting executing one by one whereas I need to run all the caller notebooks concurrently. I have also tried using Threading in Scala Databri...
Auto Loader provides Python and Scala methods to ingest new data from a folder location into a Delta Lake table by using directory listing or file notifications. Here's a quick video (7:00) on how to use Auto Loader for Databricks on AWS with Databri...
I have a million in rows that I need to update which looks for the highest count of the predecessor from the same source data and replaces the same value on a different row. For example.Original DF.sno Object Name shape rating1 Fruit apple round ...
basically you have to create a dataframe (or use a window function, that will also work) which gives you the group combination with the most occurances. So a window/groupby on object, name, shape with a count().Then you have to determine which shape...
I was reading an excel file with one column,country india India india India indiadataframe i got from this data : df.show()+-------+ |country| +-------+ | india | | India | | india | | India | | india | +-------+In the next step i removed last value ...
@sarvesh singh - Thank you for letting us know. Would you be happy to mark the best answer so others can find the solution easily?