Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mayank_gupta
by Databricks Partner
  • 2661 Views
  • 2 replies
  • 0 kudos

Trying to create external table in Hive Metastore

Receiving this error: KeyProviderException: Failure to initialize configuration for storage account adlspersonal.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.key. I used the Hive metastore to save my table: %python spark....

Latest Reply
amr
Databricks Employee
  • 0 kudos

I'm not sure about that error, but try it with SQL and see if it works: CREATE TABLE hive_metastore.default.annual_enterprise_survey AS SELECT * FROM catalog.default.table

1 More Replies
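That KeyProviderException usually means the access key for the storage account named in the error was never set (or was set to an invalid value) in the Spark config. A minimal sketch, assuming hypothetical secret scope, container, and path names:

```python
def account_key_conf(storage_account: str) -> str:
    # Spark config key that must carry the ADLS Gen2 access key for
    # <account>.dfs.core.windows.net (the account named in the error).
    return f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"

# In a Databricks notebook (hypothetical secret scope/key and path):
# spark.conf.set(account_key_conf("adlspersonal"),
#                dbutils.secrets.get(scope="my-scope", key="adls-key"))
# spark.sql("""
#     CREATE TABLE hive_metastore.default.annual_enterprise_survey
#     USING DELTA
#     LOCATION 'abfss://mycontainer@adlspersonal.dfs.core.windows.net/survey/'
# """)
```

Setting the key via a secret scope avoids hard-coding the account key in the notebook.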
User16752244127
by Databricks Employee
  • 21247 Views
  • 2 replies
  • 1 kudos

Databricks public roadmap

Where can we see which features are expected and when? Can you share the schedule?

Latest Reply
amr
Databricks Employee
  • 1 kudos

Subscribe to the Databricks newsletter and tune in to the Quarterly Product Roadmap webinars.

1 More Replies
AhmedAlnaqa
by Contributor
  • 2110 Views
  • 1 replies
  • 1 kudos

Resolved! Enhancements: interact with DBFS breadcrumbs

Hi there, this is my first thread and a first baby step into the Databricks Community, especially the Data Engineering section. I'm working in the Community Edition, and I found an enhancement that needs to be implemented: the need is to make the breadcrumbs...

DBFS.png
Latest Reply
amr
Databricks Employee
  • 1 kudos

Good feedback, thank you. We are actually looking to completely revamp the Databricks Community Edition, and the experience will be much simpler. Stay tuned.

Sid_SBA
by New Contributor
  • 1430 Views
  • 1 replies
  • 0 kudos

Resolved! How to integrate the CI/CD process with Databricks using Azure Devops on Catalog level.

How can the CI/CD process be integrated with Databricks using Azure DevOps at the catalog level instead of the workspace level? I would like to understand the process, if this is possible, given that the catalog is used in different workspaces in the same subscript...

Latest Reply
amr
Databricks Employee
  • 0 kudos

CI/CD is not related to catalogs; it is related to environments (workspaces). There are lots of tutorials on YouTube on how to set up Azure DevOps CI/CD to move assets from one workspace to another and start a job. You will need to use the Databricks plugin ...

anoopdk
by New Contributor II
  • 2687 Views
  • 1 replies
  • 1 kudos

Add option to skip or deactivate a task

It would be beneficial to have an option like a toggle to activate or deactivate a Task in the Job graph interface. This mainly helps to skip execution of a task and reactivate it as required. Currently there is no option to say I want this task to b...

Latest Reply
amr
Databricks Employee
  • 1 kudos

Maybe point the task at an empty notebook, and once you have decided, swap in the right notebook. Not ideal, but it should do the job.

Priya_Data_Eng
by New Contributor
  • 1638 Views
  • 1 replies
  • 0 kudos

Special character data preservation

This DataFrame has two columns, name and info. Name has the value John, and info has the value 1® VOC. After writing this data, I can read the correct values in Databricks, but when I download the CSV file and load it in Notepad (UTF-8), it shows no va...

Latest Reply
amr
Databricks Employee
  • 0 kudos

Try to read the file back into Databricks using spark.read. Do you see the characters? If yes, it is an editor problem: use another editor such as Notepad++. If not, the data was not written with the right encoding; try a different encoding o...

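One way to separate an editor problem from an encoding problem, sketched in plain Python: the bytes of "1® VOC" survive a UTF-8 round trip, but decoding those same bytes as Latin-1 (a common editor guess) produces mojibake.

```python
# "1® VOC" written as UTF-8; ® is the two-byte sequence 0xC2 0xAE.
value = "1\u00ae VOC"
encoded = value.encode("utf-8")

# Round-trips cleanly when the reader also assumes UTF-8 ...
assert encoded.decode("utf-8") == value

# ... but an editor guessing Latin-1 shows the classic mojibake instead.
print(encoded.decode("latin-1"))  # 1Â® VOC

# In Databricks you can force the encoding on read (hypothetical path):
# df = spark.read.option("encoding", "UTF-8").csv("/mnt/out/data.csv", header=True)
```

If the downloaded file shows the Latin-1 style garbling above, the data itself is fine and only the viewer's encoding needs changing.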
alesventus
by Contributor
  • 2596 Views
  • 2 replies
  • 0 kudos

Tasks in job are in pending state

I have a Databricks job with around 70 notebooks. When the job starts, only one notebook gets executed, and the rest of the notebooks at the front of the queue are in the PENDING state (not blocked). It looks like the notebooks cannot run in parallel for t...

job_start.jpg job_middle.jpg
Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Maybe something related to autoscaling options? When Databricks detects an increased workload, it scales up the number of workers, and then the rest of the notebooks get executed. Do you use DLT?

1 More Replies
alesventus
by Contributor
  • 8436 Views
  • 4 replies
  • 2 kudos

Unity Catalog metastore is down error

When I want to run a notebook in Databricks, all queries, saves, and reads take a really long time, and I found an error message in the cluster event log that says: Metastore is down. So I think the cluster is not able to connect to the metastore right now. Could be t...

Data Engineering
metastore
Unity Catalog
Latest Reply
alesventus
Contributor
  • 2 kudos

This issue is solely related to the VNet; an Azure engineer must set up the connection within the VNet correctly.

3 More Replies
jwilliam
by Contributor
  • 6177 Views
  • 3 replies
  • 2 kudos

Resolved! How to mount Azure Blob Storage with OAuth2?

We already know that we can mount Azure Data Lake Gen2 with OAuth2 using this: configs = {"fs.azure.account.auth.type": "OAuth", "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider", ...

Latest Reply
dssatpute
New Contributor II
  • 2 kudos

Try replacing wasbs with abfss and blob with dfs in the URI; it should work!

2 More Replies
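Putting the reply together: the ClientCredsTokenProvider OAuth config works with the ABFS driver, so the mount source must use the abfss scheme against the dfs endpoint. A sketch with hypothetical helper names; the actual mount call is left commented since it only runs inside Databricks:

```python
def oauth_configs(client_id: str, client_secret: str, tenant_id: str) -> dict:
    # Client-credentials OAuth settings for the ABFS driver (ADLS Gen2).
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }

def abfss_source(container: str, account: str) -> str:
    # abfss://<container>@<account>.dfs.core.windows.net/ -- dfs, not blob.
    return f"abfss://{container}@{account}.dfs.core.windows.net/"

# In Databricks (hypothetical names; secrets should come from a secret scope):
# dbutils.fs.mount(source=abfss_source("data", "mystorage"),
#                  mount_point="/mnt/data",
#                  extra_configs=oauth_configs(app_id, app_secret, tenant_id))
```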
ayush25091995
by New Contributor III
  • 5127 Views
  • 6 replies
  • 0 kudos

Resolved! how to get schema and catalog name in sql warehouse query history API

Hi, we are using the SQL query history API. When the catalog and schema names are selected directly in the SQL editor instead of being passed through the query, we do not get the schema name and catalog name in the query text for that particular ID. So, how can we get the s...

Latest Reply
mtajmouati
Contributor
  • 0 kudos

True! Try this:

import requests
import json

# Define your Databricks workspace URL and API token
databricks_instance = "https://<your-databricks-instance>"
api_token = "dapi<your-api-token>"

# Fetch SQL query history
def get_query_history():
    ...

5 More Replies
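The reply above is cut off; a sketch of the request it appears to be building, with the instance URL and token as placeholder assumptions and the network call left commented:

```python
def history_request(instance: str, token: str, max_results: int = 25):
    # Assemble URL, headers, and params for the SQL warehouse
    # query-history endpoint (GET /api/2.0/sql/history/queries).
    url = f"{instance}/api/2.0/sql/history/queries"
    headers = {"Authorization": f"Bearer {token}"}
    params = {"max_results": max_results}
    return url, headers, params

# import requests
# url, headers, params = history_request("https://<your-instance>", "dapi<token>")
# resp = requests.get(url, headers=headers, params=params)
# for q in resp.json().get("res", []):
#     print(q.get("query_id"), q.get("query_text"))
```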
Anonymous
by Not applicable
  • 40450 Views
  • 7 replies
  • 0 kudos

Resolved! Tuning shuffle partitions

Is the best practice for tuning shuffle partitions to have the config "autoOptimizeShuffle.enabled" on? I see it is not switched on by default. Why is that?

Latest Reply
mtajmouati
Contributor
  • 0 kudos

AQE applies to all queries that are:
  • Non-streaming
  • Contain at least one exchange (usually when there's a join, aggregate, or window), one sub-query, or both
Not all AQE-applied queries are necessarily re-optimized. The re-optimization might or might no...

6 More Replies
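For reference, the knobs discussed above, as a notebook config sketch (`spark` is the session Databricks provides; treat the exact config keys as assumptions to verify against your runtime's docs):

```python
# Opt in to auto-optimized shuffle explicitly (not on by default, as noted):
spark.conf.set("spark.databricks.adaptive.autoOptimizeShuffle.enabled", "true")

# AQE itself (on by default in recent runtimes) coalesces shuffle
# partitions at runtime; otherwise the static count below applies.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.shuffle.partitions", "200")  # default; tune to cluster size
```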
ayush25091995
by New Contributor III
  • 1409 Views
  • 1 replies
  • 0 kudos

how to pass page_token while calling API to get query history in SQL warehouse

Hi, each query ID is duplicated on the next page when calling the query history API for the SQL warehouse, even though the page token is different for each page. How should we pass the page token? In the Databricks doc, it is mentioned w...

Latest Reply
ayush25091995
New Contributor III
  • 0 kudos

Any help on this, please?

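One common cause of duplicated IDs, offered here as an assumption to check: re-sending the original filter parameters together with page_token on follow-up requests. A sketch that sends only page_token (plus max_results) after the first call; the requests loop is commented since it needs a live workspace:

```python
def next_params(resp: dict, page_size: int = 100):
    # Build the params for the next page, or return None when done.
    # Follow-up calls should carry only page_token (+ max_results);
    # repeating the original filters alongside it can duplicate results.
    token = resp.get("next_page_token")
    if not token:
        return None
    return {"page_token": token, "max_results": page_size}

# import requests
# params = {"max_results": 100}  # first call: filters allowed here
# while params is not None:
#     resp = requests.get(f"{instance}/api/2.0/sql/history/queries",
#                         headers={"Authorization": f"Bearer {token}"},
#                         params=params).json()
#     for q in resp.get("res", []):
#         print(q.get("query_id"))
#     params = next_params(resp)
```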