- 524 Views
- 2 replies
- 0 kudos
Hello all, The official documentation for Databricks Connect states that, for Databricks Runtime versions 13.0 and above, my cluster needs to have Unity Catalog enabled for me to use Databricks Connect and use a Databricks cluster through an IDE like...
Latest Reply
Hi, I'm currently using Databricks Connect without Unity Catalog in VS Code. Although I have connected Unity Catalog separately on multiple occasions, I don't think it's required. Here is the doc: https://docs.databricks.com/en/dev-tools/databrick...
1 More Replies
- 551 Views
- 2 replies
- 1 kudos
I managed to extract the Google Analytics data via Lakehouse Federation and the BigQuery connection, but the events table values are in a weird JSON format: {"v":[{"v":{"f":[{"v":"ga_session_number"},{"v":{"f":[{"v":null},{"v":"2"},{"v":null},{"v":null...
Latest Reply
@AnaMocanu I was using this function, with a few modifications on my end: https://gist.github.com/shreyasms17/96f74e45d862f8f1dce0532442cc95b2 Maybe this will be helpful for you.
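The linked gist handles BigQuery's nested export encoding, where every scalar is wrapped in {"v": ...} and every record in {"f": [...]}. As a rough illustration of the idea only (not the gist itself), a minimal recursive unwrapper could look like:

```python
def unwrap(node):
    # BigQuery's export encoding wraps every scalar in {"v": ...} and
    # every record in {"f": [...]}; strip the wrappers recursively.
    if isinstance(node, dict):
        if set(node) == {"v"}:
            return unwrap(node["v"])
        if set(node) == {"f"}:
            return [unwrap(field) for field in node["f"]]
        return {k: unwrap(v) for k, v in node.items()}
    if isinstance(node, list):
        return [unwrap(item) for item in node]
    return node

# A trimmed-down sample in the shape quoted in the question.
sample = {"v": [{"v": {"f": [{"v": "ga_session_number"},
                             {"v": {"f": [{"v": None}, {"v": "2"}]}}]}}]}
print(unwrap(sample))  # [['ga_session_number', [None, '2']]]
```

On Databricks you would typically apply such a function in a UDF or parse the column with from_json once the target schema is known.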
1 More Replies
- 154 Views
- 1 reply
- 0 kudos
I am brand new to Databricks and am working on connecting a Power BI semantic model to our Databricks instance. I have successfully connected it to an all-purpose compute, but was wondering if there was a way I could see the queries that Power BI is ...
Latest Reply
For all-purpose compute, the best bet would be to use the system tables, specifically the system.access.audit table.
https://docs.databricks.com/en/administration-guide/system-tables/index.html
- 307 Views
- 1 reply
- 0 kudos
Hello Databricks Community, I am currently working in a Databricks environment and trying to set up custom logging using Log4j in a Python notebook. However, I've run into a problem due to the use of Spark Connect, which does not support the _jvm attr...
Latest Reply
import logging

# Spark Connect does not expose the JVM (_jvm), so use Python's standard
# logging module instead of Log4j.
logging.getLogger().setLevel(logging.WARNING)
log = logging.getLogger("DATABRICKS-LOGGER")
log.warning("Hello")
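For a bit more control, the same stdlib logger can get its own handler and formatter; a minimal sketch (the StringIO stream is only here to make the formatted output easy to inspect, in a notebook you would use the default stderr stream or a file handler):

```python
import io
import logging

# Attach a dedicated handler and formatter to the named logger.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))

log = logging.getLogger("DATABRICKS-LOGGER")
log.setLevel(logging.WARNING)
log.addHandler(handler)

log.warning("Hello")
print(stream.getvalue().strip())  # WARNING DATABRICKS-LOGGER Hello
```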
- 440 Views
- 4 replies
- 1 kudos
In my notebook, I am performing a few join operations that take more than 30 s on a 14.3 LTS cluster, while the same operations take less than 4 s on a 13.3 LTS cluster. Can someone help me with how I can optimize PySpark operations like joins and withColum...
by SG • New Contributor II
- 662 Views
- 3 replies
- 1 kudos
Hi guys, I am running my Databricks jobs on a job cluster from Azure Data Factory using a Databricks Python activity. When I monitor my jobs in Workflows -> Job runs, I see that the run name is a concatenation of the ADF pipeline name, the Databricks Python ac...
Latest Reply
I don't think that level of customisation is provided. However, I can suggest some workarounds. REST API: create a job on the fly with the desired name within ADF and trigger it using the REST API in a Web activity. This way you can track job completion status ...
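To sketch that REST-API workaround: the field names below follow the Databricks Jobs API 2.1 runs/submit schema, while the task key, notebook path, and cluster id are placeholders. ADF's Web activity would POST a payload like this (with a bearer token) to /api/2.1/jobs/runs/submit:

```python
import json

def build_run_submit_payload(run_name, notebook_path, cluster_id, params=None):
    # Build a runs/submit request body with a caller-chosen run name.
    return {
        "run_name": run_name,
        "tasks": [
            {
                "task_key": "adf_task",
                "existing_cluster_id": cluster_id,
                "notebook_task": {
                    "notebook_path": notebook_path,
                    "base_parameters": params or {},
                },
            }
        ],
    }

payload = build_run_submit_payload("my-adf-run", "/Repos/etl/main", "0123-456789-abcdef")
print(json.dumps(payload, indent=2))
```

The run then shows up under Job runs with exactly the run_name you passed, instead of the concatenated ADF default.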
2 More Replies
- 1329 Views
- 2 replies
- 3 kudos
A user is running a job triggered from ADF in Databricks. In this job they need to use custom libraries that are in jars. Most of the time the jobs run fine; however, sometimes they fail with java.lang.NoClassDefFoundError: Could not initialize. Any s...
Latest Reply
Can you please check whether more than one jar contains this class? If multiple jars of the same type are available on the cluster, there is no guarantee of the JVM picking the proper classes for processing, which results in the intermittent...
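One offline way to check for such clashes is to scan the jars for .class entries that appear in more than one archive. A jar is just a zip, so this needs only the standard library; the in-memory jars below are purely for the demonstration, on a real cluster you would read the installed library jars from their paths instead:

```python
import collections
import io
import zipfile

def find_duplicate_classes(jar_bytes_by_name):
    # Map each .class entry to the jars that contain it; report clashes.
    owners = collections.defaultdict(list)
    for jar_name, data in jar_bytes_by_name.items():
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for entry in zf.namelist():
                if entry.endswith(".class"):
                    owners[entry].append(jar_name)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}

def make_jar(entries):
    # Build a tiny in-memory jar for the demo.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in entries:
            zf.writestr(name, b"")
    return buf.getvalue()

jars = {
    "libA.jar": make_jar(["com/example/Foo.class"]),
    "libB.jar": make_jar(["com/example/Foo.class", "com/example/Bar.class"]),
}
print(find_duplicate_classes(jars))  # {'com/example/Foo.class': ['libA.jar', 'libB.jar']}
```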
1 More Replies
by Jorge3 • New Contributor III
- 470 Views
- 3 replies
- 2 kudos
Hi everyone! I'm setting up a workflow using Databricks Asset Bundles (DABs), and I want to configure my workflow to be triggered on file arrival. However, all the examples I've found in the documentation use schedule triggers. Does anyone know if it is...
Latest Reply
Hi @Jorge3, yes, you can also use continuous mode. Please find the syntax below:

resources:
  jobs:
    dbx_job:
      name: continuous_job_name
      continuous:
        pause_status: UNPAUSED
      queue:
        enabled: true
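For the original question about file arrival specifically, a trigger block can also be expressed in a bundle. A sketch, where the storage URL is a placeholder and the exact trigger syntax should be checked against the current DABs documentation:

```yaml
resources:
  jobs:
    dbx_job:
      name: file_arrival_job
      trigger:
        pause_status: UNPAUSED
        file_arrival:
          url: abfss://landing@mystorageaccount.dfs.core.windows.net/incoming/
```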
2 More Replies
- 875 Views
- 2 replies
- 2 kudos
I'm using Databricks Asset Bundles, and I have pipelines that contain "if all done" rules. When running in CI/CD, if a task fails, the pipeline returns a message like "the job xxxx SUCCESS_WITH_FAILURES" and passes, potentially deploying a broken p...
Latest Reply
Awesome answer, I will try the first approach. I think it is a less intrusive solution than changing the rules of my pipeline in development scenarios. This way, I can maintain a general pipeline for deployment across all environments. We plan to imp...
1 More Replies
- 209 Views
- 2 replies
- 1 kudos
I've defined a streaming Delta Live Table in a notebook using Python, running on the "preview" channel with delta-cache-accelerated (Standard_D4ads_v5) compute. It fails with org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = xxx, ru...
Latest Reply
Hi @smedegaard,
You’re encountering a StreamingQueryException with the message: “getPrimaryKeys not implemented for debezium SQLSTATE: XXKST.”
This error suggests that the getPrimaryKeys operation is not supported for the Debezium connector in your ...
1 More Replies
by Phani1 • Valued Contributor
- 181 Views
- 1 reply
- 0 kudos
Hi Team, Is there any impact when integrating Databricks with Boomi as opposed to Azure Event Hub? Could you offer some insights on the integration of Boomi with Databricks? https://boomi.com/blog/introducing-boomi-event-streams/ Regards, Janga
Latest Reply
Hi @Phani1, Let’s explore the integration of Databricks with Boomi and compare it to Azure Event Hub.
Databricks Integration with Boomi:
Databricks is a powerful data analytics platform that allows you to process large-scale data and build machin...
- 660 Views
- 1 reply
- 0 kudos
Hello All, My scenario requires me to create code that reads tables from the source catalog and writes them to the destination catalog using Spark. Doing it one by one is not a good option when there are 300 tables in the catalog. So I am trying the pr...
Latest Reply
Hi @ETLdeveloper, you can use multithreading to help you run notebooks in parallel. Attaching code for your reference:

from concurrent.futures import ThreadPoolExecutor

class NotebookData:
    def __init__(self, path, timeout, parameters = Non...
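A generic version of that pattern, with dbutils.notebook.run swapped for a plain function so the sketch is self-contained (the table names are made up for the demo):

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(fn, items, max_workers=8):
    # Run fn over items concurrently; results come back in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, items))

# On Databricks, fn would wrap dbutils.notebook.run(path, timeout, params)
# so each worker thread copies one table.
tables = ["src.orders", "src.customers", "src.payments"]
print(run_parallel(lambda t: f"copied {t}", tables))
```

Threads work well here because each notebook run is I/O-bound from the driver's point of view; the actual work happens on the cluster.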
- 184 Views
- 1 reply
- 0 kudos
Hi All! I'm in a project where I need to connect Azure DevOps and Databricks using a managed identity, to avoid using a service account, PAT, etc. The thing is, I can't move forward with the connection since I cannot take ownership of the files wh...
Latest Reply
Hi @TitaMn, Connecting Azure DevOps and Azure Databricks using managed identity is a great approach to enhance security and avoid using service accounts or personal access tokens (PATs).
Let’s explore some options:
Azure Managed Identity for Dat...
by Anske • New Contributor III
- 221 Views
- 4 replies
- 0 kudos
Hi, Would anyone happen to know whether it's possible to cache a dataframe in memory that is the result of a query on a federated table? I have a notebook that queries a federated table, does some transformations on the dataframe, and then writes this data...
Latest Reply
Anske • New Contributor III
@daniel_sahal, this is the code snippet:

lsn_incr_batch = spark.sql(f"""
    select start_lsn, tran_begin_time, tran_end_time, tran_id, tran_begin_lsn,
           cast('{current_run_ts}' as timestamp) as appended
    from externaldb.cdc.lsn_time_mapping
    where tran_end_time > '...
3 More Replies
- 337 Views
- 4 replies
- 1 kudos
Hi Community, I was trying to load an ML model from an Azure storage account (abfss://....) with model = PipelineModel.load(path). I set the Spark config:

spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provi...
Latest Reply
@daniel_sahal using the settings above did indeed work.
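For reference, the usual complete set of keys for OAuth (service principal) access to abfss paths is sketched below. The key names come from the Hadoop ABFS driver; all values are placeholders, and on Databricks the secret would come from a secret scope (dbutils.secrets.get) rather than being hard-coded:

```python
tenant_id = "<tenant-id>"

# Standard ABFS OAuth client-credentials configuration (placeholder values).
abfss_oauth_conf = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-client-id>",
    "fs.azure.account.oauth2.client.secret": "<client-secret>",
    "fs.azure.account.oauth2.client.endpoint":
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
}

# On a cluster you would then apply each setting:
#   for key, value in abfss_oauth_conf.items():
#       spark.conf.set(key, value)
print(len(abfss_oauth_conf))  # 5
```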
3 More Replies