Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices...
I recently tried applying Liquid Clustering to a partitioned table in Databricks and encountered the following error: [DELTA_ALTER_TABLE_CLUSTER_BY_ON_PARTITIONED_TABLE_NOT_ALLOWED] ALTER TABLE CLUSTER BY cannot be applied to a partitioned table. I u...
Hi @Akshay_Petkar, since we cannot use Liquid Clustering with a partitioned table, the only way I can think of is migrating from partitioning to Liquid Clustering. The same partitioning key columns and the additional columns you wanted to add can be ...
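A minimal sketch of that migration, assuming hypothetical table names sales_partitioned and sales_clustered and clustering columns region and event_date; a CTAS with a CLUSTER BY clause copies the data into an unpartitioned, liquid-clustered table:

```python
# Hypothetical names throughout; run on a runtime that supports Liquid Clustering.
spark.sql("""
    CREATE TABLE sales_clustered
    CLUSTER BY (region, event_date)      -- former partition key plus the new column
    AS SELECT * FROM sales_partitioned
""")

# After validating the copy, swap names so downstream readers are unaffected.
spark.sql("ALTER TABLE sales_partitioned RENAME TO sales_partitioned_backup")
spark.sql("ALTER TABLE sales_clustered RENAME TO sales_partitioned")
```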
Hi All, I have registered on Databricks successfully. However, SQL is not enabled. Please help me understand how to activate SQL. Thank you very much,
I don't quite understand your question. If you haven't executed a query yet, then you won't see a query history. Have you ever executed a query in the SQL Editor?
This question is on the Databricks Certified Data Engineer Professional exam in section 1: "Implement Delta tables optimized for Databricks SQL service". I do not understand what is being asked by this question. I would assume that there are different way...
Hi @joseph_sf, I assume you are referring to the exam guide PDF file. As you assumed, there are different techniques to optimize a Delta table. Some of them are already mentioned in the other bullet points in the same section 1, such as partitioning...
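For concreteness, a few of the commonly cited Delta optimization techniques, sketched against a hypothetical events table:

```python
# Compact small files and co-locate rows on a frequently filtered column.
spark.sql("OPTIMIZE events ZORDER BY (user_id)")

# Remove data files no longer referenced by the table (default 7-day retention).
spark.sql("VACUUM events")

# Enable optimized writes and auto-compaction for future writes to this table.
spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact'   = 'true'
    )
""")
```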
Hi, I am getting this error when I am trying to take the exam Fundamentals of the Databricks Lakehouse Platform: 403 FORBIDDEN You don't have permission to access this page 2023-05-20 12:37:41 | Error 403 | https://customer-academy.databricks.com/ I al...
Hi, I am currently using PySpark version 3.5.0 on my Databricks cluster. Despite setting the required configuration using the command spark.conf.set("spark.databricks.ml.whitelist", "true"), I am still encountering an issue while trying to use the Ve...
Glad to hear it works for you now! The ML runtime has a variety of preinstalled integrations, such as MLflow, which provides ML lifecycle management, MLOps, etc. Please explore them if you haven't done so already, to establish the benefits of the extra...
I have some questions regarding Databricks Apps. 1) Can we use frameworks other than those mentioned in the documentation (Streamlit, Flask, Dash, Gradio, Shiny)? 2) Can we allocate compute of more than 2 vCPUs and 6 GB memory to any App? 3) Any other programming language o...
1) You can use most Python-based application frameworks, including some beyond those mentioned above (reference here). 2) Currently, app capacity is limited to 2 vCPUs and 6 GB of RAM. However, future updates may introduce options for scaling out an...
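As a minimal sketch of the pattern, any WSGI-style Python framework follows the same shape (Flask shown here; the port environment variable name below is an assumption, so check the generated config of your app):

```python
# app.py - minimal Databricks App sketch (Flask assumed).
import os

from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "Hello from a Databricks App"

if __name__ == "__main__":
    # Databricks Apps inject the port to bind to; the variable name is assumed here.
    app.run(host="0.0.0.0", port=int(os.environ.get("DATABRICKS_APP_PORT", "8000")))
```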
I have a pipeline in Databricks with this flow: SQL SERVER (source) -> Staging (Parquet) -> Bronze (DLT) -> Silver (DLT) -> Gold (DLT). The pipeline has been up and running smoothly for months, but recently there was a schema update at my source level and one o...
Having an issue getting UDFs to work within a DLT pipeline where the UDF is externalized outside of the notebook and it attempts to call other functions. The end goal is to put unit test coverage around the various functions, hence the pattern. For test purposes I cre...
Hi @drollason. In DLT pipelines, I would try packaging your code as a wheel and then installing it via pip. I had the same scenario as you and was able to bring in my custom code this way.
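A minimal sketch of that packaging step, assuming a hypothetical package my_dlt_utils that holds the UDFs and their helper functions:

```python
# setup.py - build with `python -m build` or `python setup.py bdist_wheel`.
from setuptools import find_packages, setup

setup(
    name="my_dlt_utils",
    version="0.1.0",
    packages=find_packages(),  # picks up the my_dlt_utils/ package with your UDFs
)
```

The resulting wheel can then be installed at the top of the DLT notebook, e.g. %pip install /dbfs/path/to/my_dlt_utils-0.1.0-py3-none-any.whl, after which the UDFs import normally and the same package can be unit tested outside DLT.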
I have created a custom transformer to be used in an ML pipeline. I was able to write the pipeline to storage by extending the transformer class with DefaultParamsWritable. Reading the pipeline back in, however, does not seem possible in Scala. I have...
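For context, Spark ML persistence needs a read path as well as a write path: in Scala the usual fix is a companion object extending DefaultParamsReadable[T], and in PySpark the equivalent is mixing DefaultParamsReadable into the class alongside DefaultParamsWritable. A minimal PySpark sketch of the pattern, using a hypothetical transformer that doubles a column:

```python
import pyspark.sql.functions as F
from pyspark import keyword_only
from pyspark.ml import Transformer
from pyspark.ml.param.shared import HasInputCol, HasOutputCol
from pyspark.ml.util import DefaultParamsReadable, DefaultParamsWritable

class ColumnDoubler(Transformer, HasInputCol, HasOutputCol,
                    DefaultParamsReadable, DefaultParamsWritable):
    """Doubles inputCol into outputCol; both readable and writable."""

    @keyword_only
    def __init__(self, inputCol=None, outputCol=None):
        super().__init__()
        self._set(**self._input_kwargs)  # populated by @keyword_only

    def _transform(self, df):
        return df.withColumn(self.getOutputCol(), F.col(self.getInputCol()) * 2)
```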
Hi Everyone, I have created a Spark pipeline in which I have a stage which is a custom Transformer. Now I am using Feature Store to log my model. But the issue is that the custom Transformer stage is not serialized properly and is not logged along wi...
Hi @NaeemS, did you ever get a solution to this problem? I've now encountered this myself. When I save the pipeline using MLflow log_model, I am able to load the model fine. When I log it with the Databricks Feature Engineering package, it throws an erro...
I have a fairly simple ETL pipeline that uses DLT. It streams data from an ADLS Gen2 storage account and creates a materialized view using two tables. It works fine when I execute it on its own; the materialized view is properly refreshed. Now I wanted to add this as a task to...
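For reference, a minimal sketch of the flow described, with hypothetical paths and table names (the older dlt.read API is used here for readability):

```python
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Bronze: files streamed in from an ADLS Gen2 staging area")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load("abfss://container@account.dfs.core.windows.net/staging/")  # hypothetical
    )

@dlt.table(comment="Materialized view built from two upstream tables")
def gold_summary():
    orders = dlt.read("silver_orders")        # hypothetical upstream tables
    customers = dlt.read("silver_customers")
    return (
        orders.join(customers, "customer_id")
        .groupBy("region")
        .agg(F.count("*").alias("order_count"))
    )
```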
I'm trying to use Iceberg's SQL extensions in my Databricks Notebook, but I get a syntax error. Specifically, I'm trying to run 'ALTER TABLE my_iceberg_table WRITE LOCALLY ORDERED BY timestamp;'. This command is listed as part of Iceberg's SQL extens...
Hi @samanthacr, could you please share the exact syntax error? I will try to reproduce it in my environment.
Hello, we are trying to evaluate the Databricks solution to extract the data from an existing Cloudera schema hosted on a physical server. We are using the Databricks serverless compute provided by the Databricks express setup, and we assume we will not need t...
Also, stay tuned to what happens with the BladeBridge acquisition. That has connections to Cloudera Impala, and might help your situation.
Say we have completed the migration of tables from Hive Metastore to UC. All the users, jobs, and clusters are switched to UC. There is no more activity on the legacy Hive Metastore. What is the best recommendation on deleting or cleaning the Hive Metastor...
The ability to do this will be in public preview this quarter, so sometime soon you should be able to disable this.
Hi, I see that with Unity Catalog we have the workflow and now the lakeflow schema. I guess the intention is to capture audit logs of changes and monitor runs, but I wonder why we don't have all the metadata info on the jobs/tasks too for a given job =...
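The lakeflow system tables do expose job-level metadata, though availability varies by workspace and region. A sketch of pulling recent job definitions, with column names following the documented system.lakeflow.jobs table:

```python
# List the ten most recently changed job definitions recorded in the system schema.
spark.sql("""
    SELECT job_id, name, creator_id, change_time
    FROM system.lakeflow.jobs
    ORDER BY change_time DESC
    LIMIT 10
""").show(truncate=False)
```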