Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am running a k-means algorithm. My features are DoubleType and have no nulls, but I get: raise TypeError("Params must be either a param map or a list/tuple of param maps but got %s." % type(params)). Does anyone have any idea how to solve this? File /datab...
I found the answer just by trying several things, although I do not understand exactly what the problem was. All I had to do was to cache the input data before fitting the model: assemble=VectorAssembler(inputCols=columns_input, outputCol='features')...
I have a Delta table created by: %sql
CREATE TABLE IF NOT EXISTS dev.bronze.test_map (
id INT,
table_updates MAP<STRING, TIMESTAMP>,
CONSTRAINT test_map_pk PRIMARY KEY(id)
) USING DELTA
LOCATION "abfss://bronze@Table Path" With initi...
Hi @Mohammad Saber, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedba...
I have a table with latitude and longitude for a few addresses (no more than 10 at the moment), but when I select the appropriate columns in the visualization editor for Map (Markers) I get a message that states "error while rendering visualization"....
I've tried multiple variations of the following code. It seems like the map parameters are being completely ignored. CREATE LIVE TABLE a_raw2
TBLPROPERTIES ("quality" = "bronze")
AS SELECT * FROM cloud_files("dbfs:/mnt/c-raw/a/c_medcheck_export*.csv"...
I have a Delta table in Databricks with a single column of type map<string, string>, and I have a data file in JSON format created by Hive 3 for the table with the column of the same type. I want to load data from the file into the Databricks table using COPY IN...
Hi Alexey, Just a friendly follow-up. Did any of the responses help you to resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.
Hello, We are new to Databricks and we would like to know if our working methods are good. Currently, we are working like this: spark.sql("CREATE TABLE Temp (SELECT avg(***), sum(***) FROM aaa LEFT JOIN bbb WHERE *** >= ***)") With this method, are we us...
Spark will handle the map/reduce for you. So as long as you use Spark-provided functions, be it in Scala, Python or SQL (or even R), you will be using distributed processing. You just care about what you want as a result. And afterwards when you are more...
I need to pass data between SQL Azure databases, but the columns of some tables are different: the information must go into the corresponding column, which has a different name in the target table.
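One possible way to approach this is to keep an explicit source-to-target column mapping and build the SELECT and INSERT statements from it. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 with two in-memory databases as a stand-in for the SQL Azure databases, and the table names, column names, and the `copy_with_renamed_columns` helper are all hypothetical. With real Azure databases the connection setup would differ, but the mapping idea should carry over.

```python
import sqlite3

# Hypothetical mapping: source column name -> target column name.
COLUMN_MAP = {"cust_id": "customer_id", "cust_name": "full_name"}


def copy_with_renamed_columns(src, dst, src_table, dst_table, column_map):
    """Copy rows from src_table to dst_table, renaming columns via column_map.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted configuration, never from user input.
    """
    src_cols = list(column_map.keys())
    dst_cols = [column_map[c] for c in src_cols]

    # Read the source columns in a fixed order...
    select_sql = f"SELECT {', '.join(src_cols)} FROM {src_table}"
    rows = src.execute(select_sql).fetchall()

    # ...and write them to the matching target columns in the same order.
    placeholders = ", ".join("?" for _ in dst_cols)
    insert_sql = (
        f"INSERT INTO {dst_table} ({', '.join(dst_cols)}) "
        f"VALUES ({placeholders})"
    )
    dst.executemany(insert_sql, rows)
    dst.commit()
    return len(rows)


def demo():
    # Two separate in-memory databases stand in for source and target.
    src = sqlite3.connect(":memory:")
    dst = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (cust_id INTEGER, cust_name TEXT)")
    src.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Luis")]
    )
    dst.execute("CREATE TABLE clients (customer_id INTEGER, full_name TEXT)")

    copy_with_renamed_columns(src, dst, "customers", "clients", COLUMN_MAP)
    return dst.execute(
        "SELECT customer_id, full_name FROM clients ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Keeping the mapping in one dictionary per table pair means a renamed column only has to be changed in one place, and columns that are not listed are simply not copied.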