Data Engineering

Forum Posts

addy
by New Contributor III
  • 752 Views
  • 3 replies
  • 2 kudos

Reading a table from a catalog that is in a different/external workspace

I am trying to read a table that is hosted on a different workspace. We have been told to establish a connection to said workspace using a table and consume the table. The code I am using is: from databricks import sql; connection = sql.connect(server_hostnam...

Data Engineering
catalog
Databricks
sql
Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hey there! Thanks a bunch for being part of our awesome community!  We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution...

2 More Replies
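For readers hitting the same cross-workspace read, here is a minimal sketch of the databricks-sql-connector call the post truncates. The hostname, HTTP path, token, and table name are placeholders you must supply from the remote workspace, not confirmed values from the thread.

# Minimal sketch using databricks-sql-connector (pip install databricks-sql-connector).
# All connection values below are placeholders for the remote workspace.
from databricks import sql

connection = sql.connect(
    server_hostname="<remote-workspace>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
)
with connection.cursor() as cursor:
    cursor.execute("SELECT * FROM <catalog>.<schema>.<table> LIMIT 10")
    for row in cursor.fetchall():
        print(row)
connection.close()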
Data_Engineer3
by Contributor II
  • 867 Views
  • 3 replies
  • 0 kudos

Live Spark driver log analysis

In Databricks, if we want to see the live log of the execution, we can view it on the cluster's driver log page. But there we can't search by keyword; instead, we need to download each one-hour log file, and live logs are ...

Latest Reply
Data_Engineer3
Contributor II
  • 0 kudos

Hi @shan_chandra, that is like putting our driver log into another cloud platform. Here I want to check the live log with local-machine tools; is this possible?

2 More Replies
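There is no built-in keyword search over live driver logs, but if cluster log delivery to DBFS is enabled, a local script can poll the delivered files through the DBFS REST API and grep them. This is a hedged workaround sketch, not an official feature; the host, token, log path, and keyword are all placeholders.

# Poll delivered driver logs from a local machine via the DBFS REST API and search
# for a keyword. Assumes cluster log delivery to dbfs:/cluster-logs is configured.
import base64
import requests

HOST = "https://<workspace>.cloud.databricks.com"    # placeholder
TOKEN = "<personal-access-token>"                    # placeholder
LOG_DIR = "dbfs:/cluster-logs/<cluster-id>/driver"   # placeholder

def dbfs_get(endpoint, **params):
    resp = requests.get(f"{HOST}/api/2.0/dbfs/{endpoint}",
                        headers={"Authorization": f"Bearer {TOKEN}"}, params=params)
    resp.raise_for_status()
    return resp.json()

for f in dbfs_get("list", path=LOG_DIR).get("files", []):
    # Read up to 1 MB per file; rotated logs may be gzipped and need decompression.
    chunk = dbfs_get("read", path=f["path"], offset=0, length=1048576)
    text = base64.b64decode(chunk["data"]).decode("utf-8", errors="replace")
    for line in text.splitlines():
        if "ERROR" in line:  # keyword to search for
            print(f["path"], line)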
akhileshp
by New Contributor III
  • 746 Views
  • 6 replies
  • 0 kudos

Query Serverless SQL Warehouse from Spark Submit Job

I am trying to load data from a table in SQL warehouse using spark.sql("SELECT * FROM <table>") in a spark submit job, but the job is failing with [TABLE_OR_VIEW_NOT_FOUND] The table or view. The same statement works in a notebook but not in a jo...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

- When you query the table manually and when you run the job, do both of those actions happen in the same Databricks workspace?
- What is the job configuration: who is the job Owner or Run As account, and does that principal/persona have access to the table?

5 More Replies
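One thing worth ruling out here (an assumption, not a confirmed diagnosis from the thread): a notebook session inherits a default catalog and schema that a spark-submit job may not, so fully qualifying the table name makes the lookup unambiguous. A minimal sketch, with the three-level name as a placeholder:

# In a spark-submit job, build the session explicitly and use the fully qualified
# three-level name instead of relying on a notebook's default catalog/schema.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.sql("SELECT * FROM <catalog>.<schema>.<table>")  # placeholder name
df.show()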
User16826987838
by Contributor
  • 1105 Views
  • 2 replies
  • 0 kudos

Convert PDFs into structured data

Is there anything on Databricks to help read PDF (payment invoices and receipts for example) and convert it to structured data?

Latest Reply
SoniaFoster
New Contributor II
  • 0 kudos

Thanks! Converting PDFs is sometimes a difficult task, as not all converters are accurate. I want to share with you one interesting tool I recently discovered that can make your work even more efficient. I recently came across an amazing onl...

1 More Replies
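Databricks has no built-in PDF reader; a common pattern is to extract raw text with an open-source library first and parse fields afterwards. A minimal sketch using pypdf, which is an illustrative library choice, not a thread recommendation; the file path is a placeholder:

# Extract raw text from an invoice PDF with pypdf (pip install pypdf), then parse
# fields downstream. The path is a placeholder.
from pypdf import PdfReader

reader = PdfReader("/dbfs/tmp/invoice.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)
print(text[:500])  # inspect the raw text before writing field-level parsing rules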
Tam
by New Contributor III
  • 873 Views
  • 4 replies
  • 0 kudos

Resolved! Error on Starting Databricks SQL Warehouse Serverless with Instance Profile

I have two workspaces, one in us-west-2 and the other in ap-southeast-1. I have configured the same instance profile for both workspaces. I followed the documentation to set up the instance profile for Databricks SQL Warehouse Serverless by adding th...

Latest Reply
Ayushi_Suthar
Honored Contributor
  • 0 kudos

Hi @Tam, hope you are doing well! I checked the error in detail, and it appears to be because the Instance Profile name and the Role ARN name don't match exactly. Please see points 3 and 4 here in the docs: https://docs.databricks.com/sql/admin/serverle...

3 More Replies
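A quick way to check the mismatch described above is to fetch the instance profile and compare its ARN with the role ARN it wraps. A hedged sketch using boto3; the profile name is a placeholder and AWS credentials are assumed to be configured:

# Compare the instance profile ARN with its role ARN; per the docs referenced above,
# the trailing names should match exactly. The profile name is a placeholder.
import boto3

iam = boto3.client("iam")
resp = iam.get_instance_profile(InstanceProfileName="<profile-name>")
profile = resp["InstanceProfile"]
print("profile ARN:", profile["Arn"])
print("role ARN:   ", profile["Roles"][0]["Arn"])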
laksh
by New Contributor II
  • 1441 Views
  • 4 replies
  • 3 kudos

What kind of data quality rules can be run using Unity Catalog

We are trying to build a data quality process at the initial file or data-ingestion level for bronze, add more specific business rules for silver, and business-related aggregates for the gold layer.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @arun laksh, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
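For ingestion-level checks on bronze like the poster describes, Delta Live Tables expectations are the usual mechanism. A minimal sketch; the table, path, and rule names are illustrative placeholders:

# DLT expectations as data quality rules on a bronze ingestion table.
import dlt

@dlt.table(name="bronze_orders")
@dlt.expect_or_drop("valid_id", "order_id IS NOT NULL")  # drop rows failing the rule
@dlt.expect("valid_amount", "amount >= 0")               # record violations, keep rows
def bronze_orders():
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/<catalog>/<schema>/<volume>/orders/"))  # placeholder path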
Stellar
by New Contributor II
  • 1267 Views
  • 2 replies
  • 0 kudos

CDC DLT

Hi all, I would appreciate some clarity regarding DLT and CDC. My first question: when it comes to the "source" table in the syntax, is that the CDC table? Further, if we want to use only Databricks, would mounting a foreign catalog be a g...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Stellar, Let’s dive into your questions about Delta Live Tables (DLT) and Change Data Capture (CDC). CDC Implementation with Delta Live Tables (DLT): DLT simplifies CDC using the APPLY CHANGES API. Previously, the commonly used method was the...

1 More Replies
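To make the APPLY CHANGES flow above concrete: the "source" argument is the stream of change records, not the target. A minimal sketch; table names, keys, and the sequencing column are placeholders:

# DLT APPLY CHANGES: cdc_source carries the change feed; the target streaming table
# is declared separately. All names are placeholders.
import dlt
from pyspark.sql.functions import col

dlt.create_streaming_table("customers_silver")

dlt.apply_changes(
    target="customers_silver",
    source="cdc_source",          # the "source" in the syntax = the CDC feed
    keys=["customer_id"],
    sequence_by=col("event_ts"),  # orders out-of-order change events
    stored_as_scd_type=1,
)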
DmitriyLamzin
by New Contributor
  • 2408 Views
  • 2 replies
  • 0 kudos

applyInPandas started to hang on the runtime 13.3 LTS ML and above

Hello, recently I tried to upgrade my runtime env to 13.3 LTS ML and found that it breaks my workload during applyInPandas. My job started to hang during the applyInPandas execution. A thread dump shows that it hangs on direct memory allocation: ...

Data Engineering
pandas udf
Latest Reply
julia
New Contributor II
  • 0 kudos

We experienced similar issues and, after an extensive back-and-forth with customer support from Azure and Databricks, we gave up. Our current "solution" is to stick with version 12.2 LTS ML also for new projects until they maybe release a version where...

1 More Replies
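For anyone trying to isolate the hang, here is a minimal applyInPandas job of the affected shape (columns and sizes are illustrative); if this stalls on 13.3 LTS ML but completes on 12.2 LTS ML, that narrows it to the runtime:

# Minimal applyInPandas workload; column names and data volume are illustrative.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000).withColumn("g", col("id") % 100)

def center(pdf: pd.DataFrame) -> pd.DataFrame:
    # Arbitrary per-group pandas work.
    pdf["id"] = pdf["id"] - pdf["id"].mean()
    return pdf

out = df.groupBy("g").applyInPandas(center, schema="id double, g long")
out.count()  # forces execution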
Avinash_Narala
by New Contributor III
  • 1425 Views
  • 8 replies
  • 1 kudos

Rewrite Notebooks Programmatically

Hello, I want to refactor the notebook programmatically, so I have written the code as follows: import requests; import base64; # Databricks Workspace API URLs; workspace_url = f"{host}/api/2.0/workspace"; export_url = f"{workspace_url}/export"; import_url = f"{worksp...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Avinash_Narala, Let’s create a new notebook in your Databricks workspace using the modified JSON content you have. Below are the steps to achieve this programmatically: Create a New Notebook: To create a new notebook, you’ll need to use the D...

7 More Replies
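For reference, here is the export/import round trip the truncated code was building, sketched against the documented Workspace API; the host, token, and notebook paths are placeholders, and the string replacement stands in for whatever refactoring you apply:

# Export a notebook's source, transform it, and import it back as a new notebook.
import base64
import requests

host = "https://<workspace>.cloud.databricks.com"              # placeholder
headers = {"Authorization": "Bearer <personal-access-token>"}  # placeholder

# Export the notebook source (returned base64-encoded).
exp = requests.get(f"{host}/api/2.0/workspace/export", headers=headers,
                   params={"path": "/Users/<me>/old_notebook", "format": "SOURCE"})
exp.raise_for_status()
source = base64.b64decode(exp.json()["content"]).decode()

source = source.replace("old_table", "new_table")  # illustrative refactor

# Import the modified source as a new notebook.
imp = requests.post(f"{host}/api/2.0/workspace/import", headers=headers,
                    json={"path": "/Users/<me>/new_notebook",
                          "format": "SOURCE",
                          "language": "PYTHON",
                          "content": base64.b64encode(source.encode()).decode(),
                          "overwrite": True})
imp.raise_for_status()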
DatabricksDude
by New Contributor
  • 233 Views
  • 1 reply
  • 0 kudos

How to set a job trigger in a yml deployment asset bundle?

Working on an asset bundle/yml file to deploy a job and some notebooks. How do I specify, within the yml file, a trigger that runs the job when files arrive at a specified location? Thanks in advance!

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @DatabricksDude, To trigger a job in your YAML file when specific files arrive at a specified location, you can use the following approaches based on the context of your deployment: GitLab CI/CD: If you’re using GitLab CI/CD, you can achieve t...

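Aside from CI/CD approaches, the Jobs API itself exposes a file-arrival trigger that asset bundles can declare directly on the job resource. A hedged YAML sketch; the job name, task, and storage URL are placeholders, and the field names mirror the Jobs API trigger settings:

# databricks.yml fragment: file-arrival trigger on a bundle-deployed job.
resources:
  jobs:
    ingest_job:
      name: ingest-on-arrival           # placeholder
      trigger:
        pause_status: UNPAUSED
        file_arrival:
          url: abfss://landing@<storage-account>.dfs.core.windows.net/incoming/  # placeholder
      tasks:
        - task_key: ingest
          notebook_task:
            notebook_path: ./notebooks/ingest  # placeholder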
kamilmuszynski
by New Contributor
  • 595 Views
  • 1 reply
  • 0 kudos

Asset Bundles - path is not contained in bundle root path

I'm trying to adapt a code base to use asset bundles. I was trying to come up with a folder structure that will work for our bundles and came up with the layout below: common/ (source code), services/ (source code), dist/ (here artifacts from the monorepo are bu...

Data Engineering
asset-bundles
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @kamilmuszynski, When working with Databricks Asset Bundles, there are specific rules and guidelines for structuring your configuration files. Let’s break down the key points to address your concerns: Bundle Configuration File (databricks.yml...

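One layout that avoids the "path is not contained in bundle root path" error (an assumption about the setup above, not a confirmed fix) is to place databricks.yml at the monorepo root, so common/, services/, and dist/ all sit under the bundle root:

# Sketch: databricks.yml at the repo root, making every referenced path part of the bundle.
#
# repo/
#   databricks.yml
#   common/      (source code)
#   services/    (source code)
#   dist/        (built artifacts)
bundle:
  name: monorepo-bundle   # placeholder

include:
  - services/*/resources/*.yml   # per-service job definitions, if you split them out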
valjas
by New Contributor III
  • 364 Views
  • 1 reply
  • 0 kudos

Warehouse Name in System Tables

Hello. I am creating a table to monitor the usage of all-purpose compute and SQL warehouses. From the tables in the 'system' catalog, I can get cluster_name and cluster_id; however, only warehouse_id is available, not the warehouse name. Is there a way to g...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @valjas, To monitor and manage SQL warehouses in your Databricks workspace, you can utilize the warehouse events system table. This table records events related to warehouse activity, including when a warehouse starts, stops, scales up, or scales ...

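A hedged sketch of querying the warehouse events system table mentioned above; note it carries warehouse_id only, so mapping IDs to names (for example via the SQL Warehouses API) remains a join you maintain yourself. Column names follow the documented system.compute.warehouse_events schema:

# Recent warehouse activity from the system table.
events = spark.sql("""
    SELECT warehouse_id, event_type, cluster_count, event_time
    FROM system.compute.warehouse_events
    WHERE event_time >= current_date() - INTERVAL 7 DAYS
    ORDER BY event_time DESC
""")
events.show(20, truncate=False)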
jim12321
by New Contributor II
  • 505 Views
  • 1 reply
  • 0 kudos

Resolved! Foreign Catalog SQL Server Dynamic Port

When creating a Foreign Catalog SQL Server connection, a port number is required. However, many SQL Servers have dynamic ports and the port number keeps changing. Is there a solution for this? In most common cases, it should allow an instance name instea...

Data Engineering
Foreign Catalog
JDBC
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @jim12321, Dealing with dynamic ports in SQL Server connections can be tricky, but there are ways to address this challenge. Let’s explore a couple of options: Static Port Configuration: By default, SQL Server named instances are configured t...

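Once a static port is pinned as suggested, the foreign catalog connection can reference it explicitly. A sketch of the CREATE CONNECTION / CREATE FOREIGN CATALOG pair; host, port, credentials, and names are placeholders:

# Create the SQL Server connection against a pinned static port, then a foreign
# catalog on top of it. All option values are placeholders.
spark.sql("""
    CREATE CONNECTION sqlserver_conn TYPE sqlserver
    OPTIONS (
      host '<server-hostname>',
      port '1433',
      user '<username>',
      password '<password>'
    )
""")
spark.sql("""
    CREATE FOREIGN CATALOG sqlserver_cat
    USING CONNECTION sqlserver_conn
    OPTIONS (database '<database-name>')
""")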
NT911
by New Contributor II
  • 425 Views
  • 2 replies
  • 0 kudos

Databricks Error while executing this line of code

import geopandas as gpd; from shapely.geometry import *; Pd_csv_sel_pq_gg = gpd.GeoDataFrame(Points_csv_sel_pq_gg.toPandas(), geometry="geometry"). The error is given below: /databricks/spark/python/pyspark/sql/pandas/utils.py:37: DeprecationWarning: distutil...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @NT911, Ensure that you have the correct versions of Spark, GeoPandas, and other relevant libraries installed. Sometimes compatibility issues arise due to outdated or incompatible dependencies.

1 More Replies
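If the Spark column holds WKT strings rather than shapely geometries (an assumption about the data, not something the thread confirms), converting explicitly avoids GeoDataFrame guessing at the dtype:

# Build the GeoDataFrame from WKT explicitly instead of passing a raw pandas column
# as the geometry. DataFrame and column names follow the post.
import geopandas as gpd

pdf = Points_csv_sel_pq_gg.toPandas()
gdf = gpd.GeoDataFrame(pdf, geometry=gpd.GeoSeries.from_wkt(pdf["geometry"]))
print(gdf.head())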
Avinash_Narala
by New Contributor III
  • 698 Views
  • 2 replies
  • 0 kudos

Create notebook programmatically

Hello, I have the JSON content of a notebook. Is there a way to create a notebook with that content using Python?

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Avinash_Narala , You can use Python to convert JSON content into a DataFrame in Databricks. To do this, you'll first convert the JSON content into a list of JSON strings, then parallelize the list to create an RDD, and finally use spark.read.jso...

1 More Replies
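Beyond the DataFrame approach in the reply, the Databricks SDK's workspace import is the usual route for turning in-memory notebook content into a new notebook. A hedged sketch; the target path and source string are placeholders, and if your JSON is Jupyter-format, ImportFormat.JUPYTER would be used instead of SOURCE:

# Create a notebook from in-memory source via the Databricks SDK (pip install databricks-sdk).
import base64
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ImportFormat, Language

w = WorkspaceClient()  # reads host/token from the environment or a config profile
source = "print('hello from a generated notebook')"  # placeholder content
w.workspace.import_(
    path="/Users/<me>/generated_notebook",  # placeholder
    format=ImportFormat.SOURCE,
    language=Language.PYTHON,
    content=base64.b64encode(source.encode()).decode(),
    overwrite=True,
)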