Data Engineering
Forum Posts

nyehia
by Contributor
  • 4354 Views
  • 19 replies
  • 1 kudos

Cannot access SQL files in the Shared workspace

Hey, we have an issue: we can access the SQL files whenever the notebook is in the repo path, but whenever the CI/CD pipeline imports the repo notebooks and SQL files to the shared workspace, we can list the SQL files but cannot read them. We cha...

Latest Reply
karthik_p
Esteemed Contributor

@Nermin Yehia yes, since you are moving files to a different location manually, just update the permission to Can Manage on the target, and that should take care of everything.

18 More Replies
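
For readers hitting the same issue: on recent runtimes, files imported into the workspace can be read straight from the /Workspace path once permissions allow it. A minimal sketch, assuming a hypothetical file at /Workspace/Shared/project/query.sql that was imported as a workspace file rather than a notebook:

    # Hypothetical path; needs at least Can View permission on the file
    with open("/Workspace/Shared/project/query.sql") as f:
        query = f.read()
    df = spark.sql(query)
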
kinsun
by New Contributor II
  • 906 Views
  • 3 replies
  • 0 kudos

Resolved! Delta Live Table Service Upgrade

Dear experts, may I know what will happen to a Delta Live Tables pipeline that is in a cancelled state when there is a runtime service upgrade? Thanks!

Latest Reply
Anonymous
Not applicable

@KS LAU: When a runtime service upgrade occurs in Databricks, any running tasks or pipelines may be temporarily interrupted while the upgrade is being applied. A cancelled Delta Live Tables pipeline will not be impacted by the upgr...

2 More Replies
GuMart
by New Contributor III
  • 2481 Views
  • 5 replies
  • 2 kudos

Resolved! DLT target schema - get value during run time

Hi, I would like to know if it is possible to get the target schema programmatically inside a DLT pipeline (the one set under destination / target schema in the DLT pipeline settings). I want to run more idempotent pipelines. For example, my target table has the fields: reference_da...

Latest Reply
GuMart
New Contributor III

Thank you @Suteja Kanuri, looks like your solution is working. Regards,

4 More Replies
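
A common workaround, sketched here under assumptions: pass the schema name as a pipeline configuration value in the DLT settings and read it at runtime with spark.conf.get. The key my_pipeline.target_schema is a hypothetical example, not a built-in setting:

    import dlt

    # Hypothetical key set under Configuration in the DLT pipeline settings,
    # e.g. my_pipeline.target_schema = analytics
    target_schema = spark.conf.get("my_pipeline.target_schema", "default")

    @dlt.table
    def my_table():
        # Use the schema name in the transformation logic as needed
        return spark.read.table(f"{target_schema}.reference_table")
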
amitca71
by Contributor II
  • 1424 Views
  • 2 replies
  • 2 kudos

Resolved! sedona/shapely error Unknown WKB type 16

Hi, I stream data from PostGIS to S3 using Debezium (postgis -> debezium -> s3 -> spark (databricks)). Once I read it I decode it, and I can see that the binary representation is similar to what I have in PostGIS in a WKB-formatted column. Once I try to read it ei...

Latest Reply
Kaniz
Community Manager

Hi @Amit Cahanovich​, The error message "Unknown WKB type 16" indicates that the WKB data you are trying to read has a geometry type that the library does not recognize. WKB type 16 is not valid in the Simple Feature Access (SFA) standard, the most w...

1 More Reply
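
One frequent cause, offered as an assumption since the thread excerpt is truncated: Debezium's geometry type carries the PostGIS payload as base64-encoded EWKB, so the bytes must be base64-decoded before shapely parses them. A minimal sketch with hypothetical field names:

    import base64
    from shapely import wkb

    # Debezium emits {"wkb": "<base64 EWKB>", "srid": ...} for PostGIS columns
    raw_b64 = record["geom"]["wkb"]                # hypothetical layout
    geom = wkb.loads(base64.b64decode(raw_b64))    # shapely reads PostGIS EWKB
    print(geom.geom_type)
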
ivanychev
by Contributor
  • 2932 Views
  • 7 replies
  • 5 kudos

DBR 12.2: DeltaOptimizedWriter: Resolved attribute(s) missing from in operator

After upgrading from DBR 11.3 LTS to DBR 12.2 LTS we started to observe the following error in our "read from parquet and write to delta" logic: AnalysisException: Resolved attribute(s) group_id#72,display_name#73,parent_id#74,path#75,path_li...

Latest Reply
Valtor
New Contributor II

I can confirm that this issue is resolved for us as well in the latest 12.2 release.

6 More Replies
playermanny2
by New Contributor II
  • 923 Views
  • 2 replies
  • 1 kudos

Reading data in Azure Databricks Delta Lake from AWS Redshift

We have Databricks set up and running on Azure. Now we want to connect it with Redshift (AWS) to perform further downstream analysis for our Redshift users. I could find documentation on how to do it within the same cloud (either AWS or Azure) but...

Latest Reply
Anonymous
Not applicable

@Manny Cato: To allow Redshift to read data from Delta Lake hosted on Azure, you can use the AWS Glue Data Catalog as an intermediary. The Glue Data Catalog is a fully managed metadata catalog that integrates with a variety of data sources, including De...

1 More Reply
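
One concrete pattern behind this answer, sketched under the assumption that the Delta table (or a replicated copy) lives in S3 where Redshift Spectrum and the Glue Data Catalog can reach it: generate the symlink manifest that Spectrum reads through an external table. The table path is hypothetical:

    # Build the symlink manifest Redshift Spectrum queries;
    # regenerate it after writes (or enable automatic manifest updates).
    spark.sql(
        "GENERATE symlink_format_manifest "
        "FOR TABLE delta.`s3://my-bucket/delta/events`"
    )
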
405041
by New Contributor II
  • 868 Views
  • 2 replies
  • 0 kudos

Securing the Account Owner

Hey, as I understand it, you cannot enable SSO and MFA for the Account Owner. Is there any way on the Databricks side to secure the Account Owner beyond username/password? Is there a lockout that is set up automatically for this user? What are the best pra...

Latest Reply
Anonymous
Not applicable

@Domonkos Rozsa: You are correct that Databricks does not support SSO and MFA for the Account Owner. However, there are several built-in mechanisms that can help secure the Account Owner account and protect it from unauthorized access: password polic...

1 More Reply
source2sea
by Contributor
  • 1863 Views
  • 1 reply
  • 0 kudos

Resolved! What is the deploy-mode when calling Spark in Databricks?

See https://spark.apache.org/docs/latest/submitting-applications.html. I mainly want to know whether an extra class path can be used when I submit a job.

Latest Reply
Anonymous
Not applicable

@min shi: In Databricks, when you run a job, you are submitting a Spark application to run on the cluster. The deploy mode used by default depends on the type of job you are running: for interactive clusters, the deploy mode is client. This m...

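
Since the driver effectively runs in client mode on Databricks clusters, extra class path entries are normally supplied through the cluster's Spark configuration (Advanced Options > Spark) rather than through spark-submit flags. A sketch; the jar path is hypothetical:

    spark.driver.extraClassPath /dbfs/FileStore/jars/my-lib.jar
    spark.executor.extraClassPath /dbfs/FileStore/jars/my-lib.jar
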
Hubert-Dudek
by Esteemed Contributor III
  • 579 Views
  • 2 replies
  • 8 kudos

Databricks has added new metrics to its control panel, replacing the outdated Ganglia tool. These new metrics allow users to monitor the following clu...

Databricks has added new metrics to its control panel, replacing the outdated Ganglia tool. These new metrics allow users to monitor the following cluster performance metrics easily:
- CPU utilization
- Memory usage
- Free filesystem space
- Network traf...

Latest Reply
jose_gonzalez
Moderator

Thank you for sharing @Hubert Dudek​ !!!

1 More Reply
source2sea
by Contributor
  • 1677 Views
  • 1 reply
  • 0 kudos

Resolved! ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp)

ERROR RetryingHMSHandler: NoSuchObjectException(message:There is no database named global_temp). Should one create it in the workspace manually via the UI, and how? Would it get overwritten if the workspace is created via Terraform? I use the 10.4 LTS runtime.

Latest Reply
jose_gonzalez
Moderator

"global_temp" is a special database used for global temp tables that are shared across spark sessions. This error is harmless. You can ignore it.

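
For context, these are the calls that create and read such views; the view name is a hypothetical example:

    # Register a global temp view; it lives in the special global_temp database
    df.createOrReplaceGlobalTempView("shared_lookup")

    # Any notebook attached to the same cluster can read it
    spark.sql("SELECT * FROM global_temp.shared_lookup").show()
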
Erik_L
by Contributor II
  • 4611 Views
  • 2 replies
  • 2 kudos

Joining a big amount of data causes "Out of disk space error", how to ingest?

What I am trying to do:

    df = None
    # For all of the IDs that are valid
    for id in ids:
        # Get the parts of the data from different sources
        df_1 = spark.read.parquet(url_for_id)
        df_2 = spark.read.parquet(url_for_id)
        ...
        # Join together the pa...

Latest Reply
Anonymous
Not applicable

@Erik Louie: There are several strategies that you can use to handle large joins like this in Spark. Use a broadcast join: if one of your DataFrames is relatively small (small enough to fit in memory on each executor), you can use a broadcast join to avoid shuffling data. A bro...

1 More Reply
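
A minimal sketch of the broadcast-join suggestion, assuming hypothetical paths and a shared id column:

    from pyspark.sql.functions import broadcast

    large_df = spark.read.parquet("s3://bucket/large")   # hypothetical paths
    small_df = spark.read.parquet("s3://bucket/small")

    # Ship the small side to every executor instead of shuffling both sides
    joined = large_df.join(broadcast(small_df), "id")
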
Khalil
by Contributor
  • 3267 Views
  • 6 replies
  • 5 kudos

Resolved! Pivot a DataFrame in Delta Live Table DLT

I want to apply a pivot on a DataFrame in DLT, but I'm getting the following warning: "Notebook: XXXX used `GroupedData.pivot` function that will be deprecated soon. Please fix the notebook." I get the same warning if I use the function collect. Is it risk...

Latest Reply
Khalil
Contributor

Thanks @Kaniz Fatma for your support. The solution was to do the pivot outside of the views or tables, and the warning disappeared.

5 More Replies
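
A sketch of that workaround, with hypothetical table and column names: compute the pivot eagerly outside the decorated function and return the precomputed DataFrame from the DLT table:

    import dlt
    from pyspark.sql import functions as F

    # Pivot computed outside any @dlt.table/@dlt.view body, so DLT's
    # analysis no longer flags GroupedData.pivot
    pivoted_df = (
        spark.read.table("source_table")     # hypothetical source
        .groupBy("id")
        .pivot("category")
        .agg(F.first("value"))
    )

    @dlt.table
    def pivoted():
        return pivoted_df
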
Tico23
by Contributor
  • 8605 Views
  • 12 replies
  • 10 kudos

Connecting SQL Server (on-premise) to Databricks via jdbc:sqlserver

Is it possible to connect to SQL Server on-premises (not Azure) from Databricks? I tried to ping my VirtualBox VM (with Windows Server 2022) from within Databricks, using %sh ping 122.138.0.14, and the request timed out. This is what my connection might look l...

Latest Reply
DBXC
Contributor

You need to set up the VNet and wire up the connection between Databricks and on-prem via VPN or ExpressRoute.

11 More Replies
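
Once network connectivity is in place, the JDBC read itself looks roughly like this; the host, database, and secret names are hypothetical:

    df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://10.0.0.4:1433;databaseName=mydb")
        .option("dbtable", "dbo.mytable")
        .option("user", "dbuser")
        .option("password", dbutils.secrets.get("my-scope", "mssql-password"))
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
        .load()
    )
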
moski
by New Contributor II
  • 831 Views
  • 3 replies
  • 1 kudos

How to import a data table from SQLQuery2 into Databricks notebook

Can anyone show me a few commands to import a table, say "mytable2", from Microsoft SQL Server into a Databricks notebook using a Spark DataFrame, or at least a pandas DataFrame? Cheers!

Latest Reply
irfanaziz
Contributor II

You can read any table from MSSQL. You would need to authenticate to the DB, so you would need the connection properties:

    def dbProps():
        return {
            "user": "db-user",
            "password": "your password",
            "driver": "com.microsoft.sqlserver.jdbc.SQLServerD...

2 More Replies
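
A completed version of that snippet, under the assumption that the truncated driver string is the standard com.microsoft.sqlserver.jdbc.SQLServerDriver; the server and table names are hypothetical:

    def dbProps():
        return {
            "user": "db-user",
            "password": "your-password",
            "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        }

    url = "jdbc:sqlserver://sql-host:1433;databaseName=mydb"  # hypothetical
    df = spark.read.jdbc(url=url, table="dbo.mytable2", properties=dbProps())
    display(df)
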