Both `hive_metastore` and `spark_catalog`?
Hi, is it possible for a workspace to have both a `hive_metastore` catalog and a `spark_catalog` catalog?
Hi all, I am new to Spark and am trying to write the code below, but I am getting an error. Code: `df1 = df.filter(df.col1 > 60 and df.col2 != 'abc')` Any suggestions?
Is anyone able to advise why I am getting the error "not a delta table"? The table was created in Unity Catalog. I've also tried DeltaTable.forName, and tried both 13.3 LTS and 14.3 LTS clusters. Any advice would be much appreciated.
@Stogpon I believe if you are using DeltaTable.forPath then you have to pass the path where the table is. You can get this path from the Catalog; it is available in the Details tab of the table. Example: delta_table_path = "dbfs:/user/hive/warehouse/xyz...
I have an Azure Function that receives files (not volumes) and dumps them to cloud storage. One to five files are received per second, approximately. I want to create a partitioned table in Databricks to work with. How should I do this? E.g.: register the cont...
Hi, I have a PySpark DataFrame and a PySpark UDF that calls an MLflow model for each row, but its performance is too slow. Here is sample code: def myfunc(input_text): result = mlflowmodel.predict(input_text) return result; myfuncUDF = udf(myfunc,StringType(...
Team, I am trying to understand how the parquet files and the JSON under the delta log folder store the data behind the scenes. Table creation: `from delta.tables import *` then `DeltaTable.create(spark).tableName("employee").addColumn("id", "INT").addColumn("na...
@Ramakrishnan83 - Kindly go through the blog post https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html, which discusses Delta's transaction log in detail.
Hi, I am struggling to truly understand how to work with external locations. As far as I have been able to read, you have: 1) managed catalogs, 2) managed schemas, 3) managed tables/volumes etc., and 4) external locations that contain external tables and/or volum...
Hello friends! I have a project where I need Databricks to train and evaluate a model, then put it into production. I trained and evaluated the model in Databricks using MLflow, and everything is good. Now I have another two steps that I have no clue how to do: usag...
This repo has examples that you can use in your Databricks workspace for FastAPI and Streamlit. I recommend only using these for development or lightweight use cases.
In a Databricks database table I was able to set permissions for groups, but now I get this error when using a cluster: Error getting permissions summary: SparkException: Trying to perform permission action on Hive Metastore /CATALOG/`hive_metastore`/DATAB...
I was going through this tutorial: https://mlflow.org/docs/latest/getting-started/tracking-server-overview/index.html#method-2-start-your-own-mlflow-server. I ran the whole script, and when I try to open the experiment on the Databricks website I get t...
Hi, did you resolve that? I encountered the same error.
Hi dear Databricks community, we tried to use databricks-jdbc inside an Oracle stored procedure to load something from Hive. However, Oracle marked databricks-jdbc invalid because some classes (for example com.databricks.client.jdbc42.internal.io.netty.ut...
Hi everyone! I've set up an Azure cloud environment for the analytical team that I am part of, and everything is working wonderfully except Databricks Repos. Whenever we open Databricks, we find ourselves in the branch that the most recent person work...
Use a separate Databricks Git folder mapped to a remote Git repo for each user who works in their own development branch. See: Run Git operations on Databricks Repos | Databricks on AWS
Yesterday I created a ton of csv files via `joined_df.write.partitionBy("PartitionColumn").mode("overwrite").csv(output_path, header=True)`. Today, when working with them, I realized that they were not loaded. Upon investigation I saw...
Then removing the "_committed_" file stops Spark from reading in the other files.
I am trying to get Azure Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API from PySpark code. It always shows CPU utilization and memory usage as N/A, whereas data...
Hi @databricksdev, you can use system tables for Azure Databricks cluster metrics. Please refer to this reference for the same: Compute system tables reference | Databricks on AWS
Hello, I have code on Databricks (Scala) that constructs a DataFrame and then writes it to a database table. It works fine for almost all of the tables, but there is one table with a problem. It says No module named 'delta.connect' - TASK_WRITE_FAILED. In...