I am trying to read a table that is hosted in a different workspace. We have been told to establish a connection to said workspace and consume the table. The code I am using is:

from databricks import sql
connection = sql.connect(server_hostnam...
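For reference, a minimal sketch of the full pattern with the Databricks SQL Connector (`pip install databricks-sql-connector`). The hostname, HTTP path, and token below are placeholders, not values from the original post:

```python
def select_all(table: str) -> str:
    """Build the query to run against the remote table."""
    return f"SELECT * FROM {table}"

def read_remote_table(hostname: str, http_path: str, token: str, table: str):
    # Imported lazily so the helper above can be used even without the
    # connector installed; the call itself needs a reachable SQL warehouse.
    from databricks import sql

    with sql.connect(
        server_hostname=hostname,   # e.g. "adb-1234.azuredatabricks.net" (placeholder)
        http_path=http_path,        # the warehouse's HTTP path (placeholder)
        access_token=token,         # a PAT with access to the remote table
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(select_all(table))
            return cursor.fetchall()
```

The principal behind the token must have SELECT on the table in the remote workspace; otherwise the connector raises a permissions error rather than a connection error.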
In Databricks, we can see the live log of an execution on the driver log page of the cluster. But there we can't search by keyword; instead we need to download each one-hour log file, and live logs are ...
Hi @shan_chandra, that approach puts our driver logs into another cloud platform. What I want is to check the live log with tools on my local machine; is this possible?
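There is no built-in live keyword search, but one workable route is to configure cluster log delivery so the driver logs land in object storage you can tail or grep from your own machine. A sketch of the relevant fragment of the cluster spec (bucket name and region are placeholders; logs are delivered periodically, not in real time):

```json
{
  "cluster_log_conf": {
    "s3": {
      "destination": "s3://my-log-bucket/cluster-logs",
      "region": "us-west-2"
    }
  }
}
```

With that in place you can point local tools (grep, lnav, etc.) at the synced files instead of downloading hourly archives from the UI.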
I am trying to load data from a table in a SQL warehouse using spark.sql("SELECT * FROM <table>") in a spark-submit job, but the job is failing with [TABLE_OR_VIEW_NOT_FOUND] The table or view . The same statement works in a notebook but not in a jo...
- When you query the table manually and when the job runs, do both of those actions happen in the same Databricks workspace?
- What is the job configuration? Who is the job Owner or Run As account, and does that principal/persona have access to the table?
I have two workspaces, one in us-west-2 and the other in ap-southeast-1. I have configured the same instance profile for both workspaces. I followed the documentation to set up the instance profile for Databricks SQL Warehouse Serverless by adding th...
Hi @Tam, hope you are doing well!
I checked the error in detail, and it would be because the Instance Profile Name and the Role ARN name don't match exactly. Please see points 3 and 4 here in the docs: https://docs.databricks.com/sql/admin/serverle...
We are trying to build a data quality process: initial file-level or data-ingestion-level checks for bronze, more specific business rules for silver, and business-related aggregates for the gold layer.
Hi @arun laksh, Hope all is well! Just wanted to check in on whether you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!
Hi all, I would appreciate some clarity regarding DLT and CDC. My first question: when it comes to the "source" table in the syntax, is that the CDC table? Further, if we want to use only Databricks, would mounting a foreign catalog be a g...
Hi @Stellar, Let’s dive into your questions about Delta Live Tables (DLT) and Change Data Capture (CDC).
CDC Implementation with Delta Live Tables (DLT):
DLT simplifies CDC using the APPLY CHANGES API. Previously, the commonly used method was the...
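A minimal Python sketch of how APPLY CHANGES is typically wired up. All names are placeholders, and the `dlt` module only exists inside a running Delta Live Tables pipeline, so the definitions are wrapped in a function here:

```python
def define_cdc_pipeline():
    # Only runs inside a Delta Live Tables pipeline, where `dlt` is available.
    import dlt
    from pyspark.sql.functions import col

    # The target streaming table that APPLY CHANGES keeps up to date.
    dlt.create_streaming_table("customers")

    dlt.apply_changes(
        target="customers",
        source="customers_cdc_feed",   # hypothetical CDC source table/view
        keys=["customer_id"],          # primary key used to match rows
        sequence_by=col("event_ts"),   # ordering column from the CDC feed
        stored_as_scd_type=1,          # 1 = overwrite in place, 2 = keep history
    )
```

So to the question above: yes, `source` is the feed of change events (your CDC table or view), and `target` is the table DLT materializes from those changes.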
Hello, recently I tried to upgrade my runtime env to 13.3 LTS ML and found that it breaks my workload during applyInPandas. My job started to hang during applyInPandas execution. A thread dump shows that it hangs on direct memory allocation: ...
We experienced similar issues and after an extensive back-and-forth with customer support from Azure and Databricks we gave up. Our current "solution" is to stick with version 12.2 LTS ML also for new projects until they maybe release a version where...
Hello, I want to refactor the notebook programmatically, so I have written the code as follows:

import requests
import base64

# Databricks Workspace API URLs
workspace_url = f"{host}/api/2.0/workspace"
export_url = f"{workspace_url}/export"
import_url = f"{worksp...
Hi @Avinash_Narala, Let’s create a new notebook in your Databricks workspace using the modified JSON content you have. Below are the steps to achieve this programmatically:
Create a New Notebook:
To create a new notebook, you’ll need to use the D...
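The steps above can be sketched as follows. The payload shape matches the Workspace API's `import` endpoint; the path and notebook source are hypothetical, and the network call is kept in a separate helper so it only runs against a real workspace:

```python
import base64

def build_import_payload(path: str, source: str) -> dict:
    """Build the JSON body for POST /api/2.0/workspace/import.

    `source` is the notebook source text (e.g. after your refactoring),
    which the API expects base64-encoded.
    """
    return {
        "path": path,             # e.g. "/Users/me@example.com/refactored_nb"
        "format": "SOURCE",
        "language": "PYTHON",
        "overwrite": True,
        "content": base64.b64encode(source.encode("utf-8")).decode("ascii"),
    }

def import_notebook(host: str, token: str, payload: dict) -> None:
    # Deferred import so the payload helper works without requests installed.
    import requests

    resp = requests.post(
        f"{host}/api/2.0/workspace/import",
        headers={"Authorization": f"Bearer {token}"},
        json=payload,
    )
    resp.raise_for_status()
```

The same base64 step in reverse (`base64.b64decode`) applies when you read the `content` field returned by the `export` endpoint.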
Working on an asset bundle/yml file to deploy a job and some notebooks. How do I specify within the yml file a trigger to run the job when files arrive at a specified location? Thanks in advance!
Hi @DatabricksDude, To trigger a job in your YAML file when specific files arrive at a specified location, you can use the following approaches based on the context of your deployment:
GitLab CI/CD:
If you’re using GitLab CI/CD, you can achieve t...
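Beyond external CI/CD, Databricks jobs also have a native file arrival trigger that can be declared directly on the job resource in the bundle. A sketch, with placeholder names and a placeholder Unity Catalog volume path:

```yaml
resources:
  jobs:
    ingest_job:
      name: ingest-on-arrival
      trigger:
        pause_status: UNPAUSED
        file_arrival:
          url: /Volumes/main/landing/incoming/   # placeholder volume path to watch
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./notebooks/ingest.py
```

With this in the bundle's `resources` section, `databricks bundle deploy` creates the job with the trigger attached, and the job runs whenever new files land in the watched location.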
I'm trying to adapt a code base to use asset bundles. I was trying to come up with a folder structure that will work for our bundles and came up with the layout below:

common/ (source code)
services/ (source code)
dist/ (here artifacts from monorepo are bu...
Hi @kamilmuszynski, When working with Databricks Asset Bundles, there are specific rules and guidelines for structuring your configuration files.
Let’s break down the key points to address your concerns:
Bundle Configuration File (databricks.yml...
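For orientation, a minimal `databricks.yml` skeleton at the repo root (names are placeholders; the `include` globs let you keep per-service job definitions in separate files, which tends to suit a monorepo layout like the one described above):

```yaml
bundle:
  name: my-monorepo-bundle

include:
  - resources/*.yml   # job/pipeline definitions split out per service

targets:
  dev:
    mode: development
    default: true
  prod:
    mode: production
```

Paths referenced from the included resource files are resolved relative to the bundle root, so `common/` and `services/` can stay where they are.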
Hello. I am creating a table to monitor the usage of All-Purpose Compute and SQL Warehouses. From the tables in the 'system' catalog, I can get cluster_name and cluster_id. However, only warehouse_id is available, not the warehouse name. Is there a way to g...
Hi @valjas, To monitor and manage SQL warehouses in your Databricks workspace, you can utilize the warehouse events system table. This table records events related to warehouse activity, including when a warehouse starts, stops, scales up, or scales ...
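A sketch of querying that events table, assuming it lives at `system.compute.warehouse_events`. To my knowledge it carries `warehouse_id` but not the warehouse name, so the name still has to come from the SQL Warehouses API (`GET /api/2.0/sql/warehouses`) and be joined in on `warehouse_id`:

```sql
SELECT
  warehouse_id,
  event_type,
  cluster_count,
  event_time
FROM system.compute.warehouse_events
ORDER BY event_time DESC
LIMIT 100;
```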
When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Servers have dynamic ports and the port number keeps changing. Is there a solution for this? In most common cases, it should allow an instance name instea...
Hi @jim12321, Dealing with dynamic ports in SQL Server connections can be tricky, but there are ways to address this challenge.
Let’s explore a couple of options:
Static Port Configuration:
By default, SQL Server named instances are configured t...
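Once the instance is pinned to a static port, the Lakehouse Federation connection can reference it explicitly. A sketch of the `CREATE CONNECTION` statement (host, scope, and key names are placeholders):

```sql
-- Connection for a SQL Server instance fixed to port 1433
CREATE CONNECTION sqlserver_conn TYPE sqlserver
OPTIONS (
  host 'myserver.example.com',
  port '1433',
  user secret('my-scope', 'sqlserver-user'),
  password secret('my-scope', 'sqlserver-password')
);
```

The foreign catalog is then created against `sqlserver_conn`, so if the port ever has to change, only the connection's options need updating.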
Hi @NT911, Ensure that you have the correct versions of Spark, GeoPandas, and other relevant libraries installed. Sometimes compatibility issues arise due to outdated or incompatible dependencies.
Hi @Avinash_Narala , You can use Python to convert JSON content into a DataFrame in Databricks.
To do this, you'll first convert the JSON content into a list of JSON strings, then parallelize the list to create an RDD, and finally use spark.read.jso...
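The steps above can be sketched as follows. The JSON strings are hypothetical sample data, and the Spark step is kept in a function since it needs a live SparkSession (e.g. inside Databricks):

```python
import json

# Hypothetical sample rows standing in for the notebook's JSON content.
json_rows = [
    '{"id": 1, "name": "alice"}',
    '{"id": 2, "name": "bob"}',
]

def json_strings_to_df(spark, rows):
    """Parallelize the strings into an RDD and let Spark infer the schema."""
    rdd = spark.sparkContext.parallelize(rows)
    return spark.read.json(rdd)

# Outside Spark, you can sanity-check that each string parses as JSON:
parsed = [json.loads(r) for r in json_rows]
```

In a notebook you would call `json_strings_to_df(spark, json_rows)` and get a DataFrame with columns `id` and `name` inferred from the JSON keys.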