Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

GCera
by New Contributor II
  • 3233 Views
  • 2 replies
  • 1 kudos

Can we use "Access Connector for Azure Databricks" to access Azure SQL Server?

Is it possible to avoid using a Service Principal (and managing its secrets via the Python MSAL library) and, instead, use the "Access Connector for Azure Databricks" to access Azure SQL Server (just like we do for connecting to Azure Data Lake Stora...

Latest Reply
GCera
New Contributor II
  • 1 kudos

Unfortunately, I guess the answer is no (as of today; see @Wojciech_BUK's reply).

1 More Replies
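
For context, a minimal sketch of the Service Principal + MSAL pattern the poster is trying to avoid (the client ID, tenant ID, secret, and server names are placeholders, and the ODBC driver must be present on the cluster):

import struct
import msal
import pyodbc

# Acquire an Entra ID (AAD) token for Azure SQL via the client-credentials flow.
app = msal.ConfidentialClientApplication(
    client_id="<client-id>",
    client_credential="<client-secret>",
    authority="https://login.microsoftonline.com/<tenant-id>",
)
token = app.acquire_token_for_client(scopes=["https://database.windows.net/.default"])

# Pack the token the way the ODBC driver expects (attribute 1256 = SQL_COPT_SS_ACCESS_TOKEN).
raw = token["access_token"].encode("utf-16-le")
token_struct = struct.pack(f"<I{len(raw)}s", len(raw), raw)

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<database>",
    attrs_before={1256: token_struct},
)
print(conn.execute("SELECT 1").fetchone())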
Ruby8376
by Valued Contributor
  • 1830 Views
  • 2 replies
  • 1 kudos

Query endpoint on Azure SQL or Databricks?

Hi, currently all the data required resides in an Azure SQL database. We have a project in which we need to query this data on demand from Salesforce Data Cloud, to be further used for reporting in a CRMA dashboard. Do we need to move this data from Azure SQL to Delta L...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

It depends. If Salesforce Data Cloud has a connector for Azure SQL (either a native one or ODBC/JDBC), you can query directly. MS also has something like OData. AFAIK Azure SQL does not have a query API, only one for DB-management purposes. If all of the above is no...

1 More Replies
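
If the data does end up having to move into Databricks, a minimal sketch of pulling an Azure SQL table over JDBC into a Delta table (server, credentials, and table names are placeholders):

# Read the Azure SQL table over JDBC; the SQL Server JDBC driver ships with recent Databricks runtimes.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;database=<database>"
df = (spark.read.format("jdbc")
      .option("url", jdbc_url)
      .option("dbtable", "dbo.source_table")
      .option("user", "<user>")
      .option("password", "<password>")
      .load())

# Land it in Delta so downstream tools (e.g. Salesforce Data Cloud via a connector) can query it.
df.write.format("delta").mode("overwrite").saveAsTable("reporting.source_table")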
hv129
by New Contributor
  • 5353 Views
  • 0 replies
  • 0 kudos

java.lang.OutOfMemoryError on Data Ingestion and Storage Pipeline

I have around 25 GB of data in my Azure storage. I am performing data ingestion using Auto Loader in Databricks. Below are the steps I am performing: setting enableChangeDataFeed to true; reading the complete raw data using readStream; writing as del...

vroste
by New Contributor III
  • 14825 Views
  • 8 replies
  • 5 kudos

Resolved! Unsupported Azure Scheme: abfss

Using Databricks Runtime 12.0, when attempting to mount an Azure blob storage container, I'm getting the following exception: `IllegalArgumentException: Unsupported Azure Scheme: abfss` dbutils.fs.mount( source="abfss://container@my-storage-accoun...

Latest Reply
AdamRink
New Contributor III
  • 5 kudos

What configs did you tweak? I'm having the same issue.

7 More Replies
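
For anyone else landing here: a bare dbutils.fs.mount of an abfss URI typically needs OAuth extra_configs to go with it. A minimal sketch, assuming a service principal that has access to the container (IDs and the secret scope/key are placeholders):

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<client-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("<scope>", "<key>"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://container@my-storage-account.dfs.core.windows.net/",
    mount_point="/mnt/container",
    extra_configs=configs,
)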
NLearn
by New Contributor II
  • 1030 Views
  • 1 reply
  • 0 kudos

How can I programmatically get my notebook default language?

I'm writing some code to perform regression testing which requires the notebook path and its default language. Based on the default language it will perform further analysis. So how can I programmatically get my notebook's default language and save it in some vari...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can get the default language of a notebook using dbutils.notebook.get_notebook_language(). Try this example:

%python
import dbutils
default_language = dbutils.notebook.get_notebook_language()
print(default_language)

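
If that helper is not available on your runtime, the Workspace API's get-status endpoint also reports a notebook's language. A minimal sketch, assuming the workspace URL and a PAT in environment variables and a hypothetical notebook path:

import os
import requests

resp = requests.get(
    f"{os.environ['DATABRICKS_HOST']}/api/2.0/workspace/get-status",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    params={"path": "/Users/me@example.com/my_notebook"},  # hypothetical path
)
default_language = resp.json().get("language")  # e.g. PYTHON, SQL, SCALA, R
print(default_language)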
wissamimad
by New Contributor
  • 8732 Views
  • 1 reply
  • 1 kudos

Writing to Delta tables/files is taking a long time

I have a DataFrame that is a series of transformations of big data (167 million rows) and I want to write it to Delta files and tables using the below: try: (df_new.write.format('delta') .option("delta.minReaderVersion", "2") .optio...

Latest Reply
prasu1222
New Contributor II
  • 1 kudos

Hi @Retired_mod, I am having the same issue. I did an inner join on two Spark DataFrames and they run on only a single node; not sure how to modify this to run on many nodes. The same thing happens when I write 30 GB of data to a Delta table: it is almost 3...

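
A first diagnostic for both symptoms (the single-node join and the slow write) is the partition count, since one partition means one task on one core. A minimal sketch, not tied to the original code; the target table name is a placeholder:

# How parallel is the DataFrame right now?
print(df_new.rdd.getNumPartitions())

# Repartition so the join and write can spread across the cluster; 200 is a starting guess.
df_new = df_new.repartition(200)

(df_new.write.format("delta")
    .mode("overwrite")
    .saveAsTable("my_schema.my_table"))  # placeholder target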
quakenbush
by Contributor
  • 2166 Views
  • 3 replies
  • 1 kudos

Resolved! Time out importing DBC

Importing or cloning the .dbc folder from "advanced-data-engineering-with-databricks" into my own workspace fails with a timeout, and the folder is incomplete. How can I fix this? I tried downloading and importing the file, and also via URL...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Great news and congrats on your exam!!!

2 More Replies
JonW
by New Contributor
  • 3696 Views
  • 2 replies
  • 0 kudos

Pandas finds parquet file, Spark does not

I am having an issue with Databricks (Community Edition) where I can use Pandas to read a parquet file into a dataframe, but when I use Spark it states the file doesn't exist. I have tried reformatting the file path for Spark but I can't seem to find...

[Screenshot attached: JonW_1-1703880035484.png]
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Are you getting any error messages? What happens when you do an "ls /dbfs/"? Are you able to list all the parquet files?

1 More Replies
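
The usual cause on DBFS: pandas reads through the local FUSE mount while Spark expects the dbfs: scheme, so the same file has two different spellings. A sketch with a hypothetical path:

import pandas as pd

# pandas goes through the FUSE mount under /dbfs...
pdf = pd.read_parquet("/dbfs/FileStore/tables/data.parquet")

# ...while Spark wants the dbfs: URI (or the absolute path without the /dbfs prefix).
sdf = spark.read.parquet("dbfs:/FileStore/tables/data.parquet")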
karthik_p
by Esteemed Contributor
  • 13643 Views
  • 3 replies
  • 1 kudos

Does Delta Live Tables support identity columns?

We are able to test identity columns using SQL/Python, but when we try the same using DLT, we are not seeing values under the identity column. It is always empty for the column we created: "id BIGINT GENERATED ALWAYS AS IDENTITY".

Latest Reply
karthik_p
Esteemed Contributor
  • 1 kudos

@Retired_mod thank you for the quick response. We are able to generate them for streaming and materialized views; the only confusion I am seeing is in the limitations mentioned for DLT: identity columns are not supported with tables that are...

2 More Replies
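
For reference, a minimal sketch of declaring an identity column in a DLT streaming table via the schema string (table and source names are hypothetical; per the limitation discussed above, avoid this on APPLY CHANGES targets):

import dlt

@dlt.table(
    name="customers",
    schema="id BIGINT GENERATED ALWAYS AS IDENTITY, name STRING",
)
def customers():
    # id is generated by the engine; select only the non-identity columns here.
    return spark.readStream.table("raw_customers").select("name")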
AzaharNadaf
by New Contributor III
  • 1495 Views
  • 2 replies
  • 0 kudos

Resolved! Using FOR XML RAW in Spark SQL

How can I convert the below SQL Server query to a Spark SQL query? SELECT DISTINCT HashBytes('md5', (SELECT a, b, c FOR XML RAW)) AS xyzzy FROM table name. Need help here, community.

Latest Reply
AzaharNadaf
New Contributor III
  • 0 kudos

The idea here was to create a unique ID based upon a number of columns; I used dense_rank to resolve this issue. Thanks.

1 More Replies
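
For anyone who still wants the direct hash translation rather than dense_rank: Spark SQL's md5 over concat_ws is a close analogue of HashBytes over FOR XML RAW (the table name and separator are assumptions; note concat_ws skips NULLs, unlike the XML form):

# Hypothetical table name; the '||' separator avoids collisions like ('ab','c') vs ('a','bc').
df = spark.sql("""
    SELECT DISTINCT md5(concat_ws('||', a, b, c)) AS xyzzy
    FROM my_table
""")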
