- 19302 Views
- 2 replies
- 0 kudos
Is Spark case sensitive? Spark is not case sensitive by default. If you have the same column name in different cases (Name, name) and try to select either the "Name" or the "name" column, you will get a column ambiguity error. There is a way to handle this issue b...
Latest Reply
Hi, I had similar issues with Parquet files when querying Athena. The fix was to inspect the Parquet file, since it contained columns such as "Name" and "name", which the AWS crawler / Athena would interpret as a duplicate column since it would se...
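For anyone hitting the same error, a minimal sketch of the behavior described above, using a toy DataFrame with two columns that differ only by case; spark.sql.caseSensitive is the session setting involved:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy DataFrame with two columns that differ only by case.
df = spark.createDataFrame([(1, 2)], ["Name", "name"])

# With the default (spark.sql.caseSensitive = false), selecting either
# column is ambiguous and raises an AnalysisException.
spark.conf.set("spark.sql.caseSensitive", "true")
df.select("Name").show()  # now resolves only the exact-case column
```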
1 More Replies
- 10080 Views
- 2 replies
- 1 kudos
Hi all, in the Spark config for a cluster, it works well to refer to an Azure Key Vault secret in the "value" part of the name/value combo on a config row/setting. For example, this works fine (I've removed the string that is our specific storage account name...
Latest Reply
Hello, is there any update on this issue, please? Databricks no longer recommends mounting external locations, so the other way to access Azure storage is to use the Spark config as mentioned in this document: https://learn.microsoft.com/en-us/azure/databri...
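For reference, the working pattern the question describes (a secret referenced on the value side of a cluster Spark config row) looks roughly like the lines below; the storage account, scope, and secret names are placeholders. To my knowledge the {{secrets/...}} reference is only resolved on the value side, not in the key/name part.

```
fs.azure.account.auth.type.mystorageaccount.dfs.core.windows.net OAuth
fs.azure.account.oauth2.client.secret.mystorageaccount.dfs.core.windows.net {{secrets/my-scope/sp-client-secret}}
```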
1 More Replies
- 7764 Views
- 2 replies
- 2 kudos
A partner wants to use an ADF managed identity to connect to my Databricks cluster, access my Azure storage, and copy the data from my Azure storage to their Azure storage.
Latest Reply
Hi @SAI PUSALA, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...
1 More Replies
by KVNARK • Honored Contributor II
- 3818 Views
- 4 replies
- 6 kudos
How can we parameterize the key of the Spark config in the job cluster linked service from Azure Data Factory? We can parameterize the values, but any idea how we can parameterize the key, so that when deploying to a further environment it takes the PROD/QA v...
Latest Reply
@KVNARK, you can use Databricks Secrets (create a secret scope from AKV: https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes) and then reference a secret in the Spark configuration (https://learn.microsoft.com/en-us/azure/d...
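When the key itself has to vary by environment (for example, a storage account name embedded in the property name), the cluster-UI Spark config cannot substitute a secret into the key part, so one hedged workaround is to build the key at session level in a notebook instead. The scope and secret names below are placeholders, and dbutils and spark are the globals available in a Databricks notebook:

```python
# Placeholders: replace the scope and secret names with your own.
storage_account = dbutils.secrets.get(scope="my-scope", key="storage-account-name")

# Compose the config key dynamically, so the environment-specific part
# never has to be hard-coded in the cluster or linked-service definition.
spark.conf.set(
    f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="sp-client-id"),
)
```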
3 More Replies
- 3039 Views
- 3 replies
- 2 kudos
Hi Team, I am trying to configure access to ADLS through a service principal via the Spark config in a Databricks job cluster, like: fs.azure.account.oauth2.client.id.<adls_account_name>.dfs.core.windows.net {{secrets/scopeName/clientID}} The above stateme...
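For context, the documented service-principal setup usually involves the full set of OAuth properties rather than the client id alone. A sketch of what the job cluster's Spark config might contain, with the account, scope, secret, and tenant values as placeholders; depending on how the cluster picks up Hadoop properties, a spark.hadoop. prefix may also be needed (as used in another thread on this page):

```
fs.azure.account.auth.type.<adls_account_name>.dfs.core.windows.net OAuth
fs.azure.account.oauth.provider.type.<adls_account_name>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
fs.azure.account.oauth2.client.id.<adls_account_name>.dfs.core.windows.net {{secrets/scopeName/clientID}}
fs.azure.account.oauth2.client.secret.<adls_account_name>.dfs.core.windows.net {{secrets/scopeName/clientSecret}}
fs.azure.account.oauth2.client.endpoint.<adls_account_name>.dfs.core.windows.net https://login.microsoftonline.com/<tenant-id>/oauth2/token
```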
Latest Reply
@Kaniz Fatma, we are blocked on this issue. Can you please look into the thread and suggest a workaround?
2 More Replies
- 3958 Views
- 4 replies
- 0 kudos
Is it possible to temporarily disable Photon? I have a large workload that greatly benefits from Photon, apart from a specific operation therein that is actually slowed by Photon. It's not worth creating a separate cluster for this operation, however, s...
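As far as I know, Photon is controlled per cluster rather than per query, and the usual knob is the spark.databricks.photon.enabled cluster Spark config (an assumption worth verifying against current Databricks docs). A sketch of disabling it for a whole cluster, set in the cluster's Spark config field before start-up, would be:

```
spark.databricks.photon.enabled false
```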
Latest Reply
Hi @Aaron Morgan, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thank...
3 More Replies
by Cassio • New Contributor II
- 3821 Views
- 4 replies
- 3 kudos
In Databricks 10.1 it is possible to define in the "Spark Config" of the cluster something like: spark.fernet {{secrets/myscope/encryption-key}}. In my case my scopes are tied to Azure Key Vault. With that I can make a query as follows: %sql
SELECT d...
Latest Reply
This solution exposes the entire secret if I use commands like the one below:
sql("""explain select upper("${spark.fernet.email}") as data """).display()
Please don't use this.
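One hedged alternative to ${...} substitution in SQL text is to pull the key with dbutils.secrets.get and keep it inside a Python UDF, so it never appears in a query plan. This sketch assumes the cryptography package is installed on the cluster and uses placeholder scope/secret names:

```python
from cryptography.fernet import Fernet  # assumed to be installed on the cluster
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# Placeholder scope/secret names; dbutils is the Databricks notebook global.
fernet_key = dbutils.secrets.get(scope="myscope", key="encryption-key")

@udf(StringType())
def decrypt_value(token: str) -> str:
    # The key lives only in this closure, not in any SQL string or plan.
    return Fernet(fernet_key.encode()).decrypt(token.encode()).decode()
```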
3 More Replies
- 16910 Views
- 6 replies
- 6 kudos
Hi, I am executing a simple job in Databricks for which I am getting the below error. I increased the driver size but still faced the same issue. Spark config:
from pyspark.sql import SparkSession
spark_session = SparkSession.builder.appName("Demand Forecasting...
Latest Reply
I am getting the above issue while writing a Spark DF as a parquet file to AWS S3. Not doing any broadcast join actually.
5 More Replies
- 4425 Views
- 4 replies
- 0 kudos
Wondering if there is a way to parameterize the Azure storage account name part in the Spark cluster config in Databricks? I have a working example where the values reference secret scopes: spark.hadoop.fs.azure.account.oauth2.client.id.<azurestorageacc...
- 1189 Views
- 0 replies
- 0 kudos
I am trying to read a 16 MB Excel file and I was getting a GC overhead limit exceeded error. To resolve that I tried to increase my executor memory with spark.conf.set("spark.executor.memory", "8g"), but I got the following stack: Using Spark's default l...
- 4590 Views
- 3 replies
- 4 kudos
I am trying to read a 16 MB Excel file and I was getting a GC overhead limit exceeded error. To resolve that I tried to increase my executor memory with spark.conf.set("spark.executor.memory", "8g"), but I got the following stack: Using Spark's default l...
Latest Reply
On the cluster configuration page, go to the advanced options and click to expand the section. There you will find the Spark tab, where you can set the values in the "Spark config" field.
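In other words, spark.conf.set cannot change executor memory on an already running cluster; the value from the question would go into that "Spark config" text box (one space-separated key/value pair per line) before the cluster starts, roughly:

```
spark.executor.memory 8g
```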
2 More Replies
- 48773 Views
- 2 replies
- 1 kudos
set: spark.conf.set("spark.driver.maxResultSize", "20g")
get: spark.conf.get("spark.driver.maxResultSize") // returns 20g, as expected, in the notebook; I did not set this at the cluster level
I am still getting 4g while executing the Spark job. Why?
Because of th...
Latest Reply
Hi @sachinmkp1@gmail.com, you need to add this Spark configuration at the cluster level, not at the notebook level. When you add it at the cluster level, it applies the setting properly. For more details on this issue, please check our knowledge...
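Concretely, the cluster-level setting from this answer, using the 20g value from the question, is a single line in the cluster's Spark config field:

```
spark.driver.maxResultSize 20g
```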
1 More Replies
- 8763 Views
- 1 replies
- 0 kudos
I am using a Databricks Spark cluster and want to add a customized Spark configuration. There is Databricks documentation on this, but I am not getting any clue about what changes I should make and how. Can someone please share an example to configure the Da...
Latest Reply
You can set the configurations on the Databricks cluster UI: https://docs.databricks.com/clusters/configure.html#spark-configuration
To see the default configuration, run the below code in a notebook:
%sql
set;
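As a purely illustrative sketch, the "Spark config" field under Advanced options takes one space-separated key/value pair per line; the specific keys and values below are examples, not recommendations:

```
spark.sql.shuffle.partitions 200
spark.sql.session.timeZone UTC
```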
- 1664 Views
- 2 replies
- 0 kudos
Is there any way to add a Spark config that reverts the default format for table writes from Delta to Parquet in DBR 8.0+? I know you can simply specify .format("parquet"), but that could involve a decent amount of code change for some client...
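If I recall correctly, the default data source format is governed by spark.sql.sources.default (which DBR 8.0+ sets to delta), so a cluster-level Spark config line like the one below should revert format-less writes to Parquet; please verify against the DBR 8.0 release notes before relying on it:

```
spark.sql.sources.default parquet
```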