- 87 Views
- 4 replies
- 0 kudos
I see two articles in the Databricks documentation: https://docs.databricks.com/en/archive/azure/synapse-polybase.html#language-python and https://docs.databricks.com/en/connect/external-systems/synapse-analytics.html#service-principal. The PolyBase one is legacy o...
Latest Reply
Hi @dilkushpatel, Thank you for sharing your confusion regarding PolyBase and the COPY INTO command in Databricks when working with Azure Synapse.
PolyBase (Legacy):
PolyBase was previously used for data loading and unloading operations in Azure...
- 55 Views
- 2 replies
- 0 kudos
Dear Members, I need your help with the scenario below. I am passing a few parameters from an ADF pipeline to a Databricks notebook. If I execute the ADF pipeline to run my Databricks notebook and use these variables as is in my code (Python), it works fine. But as s...
Latest Reply
Hi @Abhi0607, can you tell me whether you are reading or defining these parameter values inside the try/except block or outside it?
- 1329 Views
- 4 replies
- 1 kudos
I renamed our service principal in Terraform, which forces a replacement: the old service principal is removed and a new principal with the same permissions is created. The Terraform apply succeeds, but when I try to run dbt that creates tab...
Latest Reply
This is also true for removing groups before unassigning them (removing and unassigning in Terraform):
│ Error: cannot update grants: Could not find principal with name <My Group Name>
- 89 Views
- 4 replies
- 0 kudos
We have a data feed with files whose filenames stay the same but whose contents change over time (brand_a.csv, brand_b.csv, brand_c.csv, ...). COPY INTO seems to ignore the files when they change. If we set the force flag to true and run it, we end up w...
Latest Reply
That's the question: short of treating the initial COPY INTO target as a temp table and then running a MERGE statement from it into another table, where we can do the insert and update operations, is there another option (with COPY INTO, Auto Loader, or DLT) t...
- 54 Views
- 1 replies
- 0 kudos
Hello, I'm using Auto Loader to stream a table of data and have added schema hints to specify field values. I've observed that when my initial data file is missing fields specified in the schema hint, Auto Loader correctly identifies this and ad...
Latest Reply
Hi @my_super_name,
Default Schema Inference: By default, Auto Loader schema inference aims to avoid schema evolution issues due to type mismatches. For formats like JSON, CSV, and XML that don’t encode data types explicitly, Auto Loader infers a...
- 69 Views
- 1 replies
- 0 kudos
I want to confirm whether this understanding is correct. To calculate the number of parallel tasks that can be executed in a Databricks PySpark cluster with the given configuration, we need to consider the number of executors that can run on each node a...
Latest Reply
Hi @manish1987c, Your understanding is almost correct!
Node Configuration:
You have 10 nodes in your Databricks PySpark cluster. Each node has 16 CPU cores and 64 GB RAM.
Executor Size:
Each executor requires 5 CPU cores and 20 GB RAM. Additional...
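The sizing arithmetic from this thread can be sketched in a few lines of Python. This is a rough sketch, not the full answer: it assumes all 10 nodes are workers, that no cores or memory are reserved for the OS or for executor memory overhead (the rest of the reply is truncated here, so any reservation it describes is not reflected), and that each task uses one core. The function name `parallel_task_slots` and its parameters are illustrative, not part of any Databricks or Spark API.

```python
def parallel_task_slots(nodes, cores_per_node, ram_gb_per_node,
                        cores_per_executor, ram_gb_per_executor,
                        cpus_per_task=1):
    """Estimate executors per node, total executors, and parallel task slots.

    Assumption: nothing is reserved for the OS or memory overhead,
    so this is an upper bound rather than an exact figure.
    """
    # An executor must fit within both the node's cores and its memory.
    fit_by_cores = cores_per_node // cores_per_executor
    fit_by_ram = ram_gb_per_node // ram_gb_per_executor
    executors_per_node = min(fit_by_cores, fit_by_ram)

    total_executors = nodes * executors_per_node

    # Each executor runs one task per (cores_per_executor / cpus_per_task) slot.
    slots = total_executors * (cores_per_executor // cpus_per_task)
    return executors_per_node, total_executors, slots


# Configuration from the post: 10 nodes, 16 cores and 64 GB each,
# executors sized at 5 cores and 20 GB.
per_node, total, slots = parallel_task_slots(10, 16, 64, 5, 20)
print(per_node, total, slots)  # 3 30 150
```

Under these assumptions, one core and 4 GB per node are left unused; reserving resources for the OS or for overhead would lower the result.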
- 46 Views
- 1 replies
- 0 kudos
We have a table that uses the timestampNtz type for its timestamp column, which is also a cluster key for this table using liquid clustering. I ran OPTIMIZE <table-name>, and it failed with the error Unsupported datatype 'TimestampNTZType'. But the failed optimization also broke ...
Latest Reply
Hi @Jennifer,
Since TimestampNTZType is not currently supported for optimization, you can try a workaround by converting the timestamp column to a different data type before running the OPTIMIZE command. For example, you could convert the timestampNt...
- 97 Views
- 1 replies
- 0 kudos
When trying to set up databricks-connect on WSL2 using a 13.3 cluster, I receive the following error regarding OpenSSL CERTIFICATE_VERIFY_FAILED. The authentication is done via the SPARK_REMOTE env. variable. E0415 11:24:26.646129568 142172 ssl_transport_sec...
Latest Reply
Hi @jp_allard,
One approach to resolve this is to disable SSL certificate verification. However, keep in mind that this approach may compromise security. In your Databricks configuration file (usually located at ~/.databrickscfg), add the following l...
- 58 Views
- 1 replies
- 0 kudos
Hi! As suggested by Databricks, we are working with Databricks from VSCode using Databricks bundles for our deployment and using the VSCode Databricks Extension and Databricks Connect during development. However, there are some limitations that we are ...
Latest Reply
Hi @pernilak, It’s great that you’re using Databricks with Visual Studio Code (VSCode) for your development workflow!
Let’s address the limitations you’ve encountered when working with files from Unity Catalog using native Python.
When running Python...
- 13 Views
- 0 replies
- 0 kudos
When I run a query on Databricks itself from a notebook, it runs fine and returns results. But the same query, when executed from FastAPI (Python, using the databricks library), gives me "TypeError: 'NoneType' object is not iterable". I can...
- 52 Views
- 1 replies
- 0 kudos
I have been able to perform a selective overwrite using replaceWhere to a hive_metastore table, but when I use the same code for the same table in a Unity Catalog, no data is written. Has anyone else had this issue, or are there common mistakes that ar...
Latest Reply
Hi @jp_allard ,
The Unity Catalog is a newer feature in Databricks, designed to replace the traditional Hive Metastore. When transitioning from Hive Metastore to Unity Catalog, there might be differences in behavior due to underlying architectural ch...
by SG • New Contributor II
- 443 Views
- 1 replies
- 1 kudos
Hi guys, I am running my Databricks jobs on a job cluster from Azure Data Factory using a Databricks Python activity. When I monitor my jobs in Workflows -> Job runs, I see that the run name is a concatenation of ADF pipeline name, Databricks python ac...
Latest Reply
Hi Hamza, did you find a solution for this issue? Thanks
- 35 Views
- 0 replies
- 0 kudos
Hi all! I need to copy multiple tables from one workspace to another, along with their metadata. Is there any way to do it? Please reply as soon as possible.
- 23 Views
- 0 replies
- 0 kudos
Hi all! I need to migrate multiple notebooks from one workspace to another. Is there any way to do it without using Git? Since manual import and export is difficult for multiple notebooks and folders, I need an alternative solution. Please reply as so...
- 68 Views
- 2 replies
- 0 kudos
We are running into errors when running workflows with multiple jobs that use the same notebook with different parameters. They read from tables we still have in hive_metastore; there are no Unity Catalog tables or functionality referenced anywhere. We'...
Latest Reply
Hi @Kayla,
This may happen if your Unity Catalog is not configured properly. In that case, consider setting it up following the appropriate guidelines for your Databricks environment. Check if any environment variables related to your workflow (especi...