Data Engineering

Forum Posts

Anotech
by New Contributor II
  • 3402 Views
  • 2 replies
  • 1 kudos

How can I fix this error? ExecutionError: An error occurred while calling o392.mount: java.lang.NullPointerException

Hello, I'm trying to mount my Azure Gen2 data lake in Databricks to read in data from the container, but I get an error when executing this line of code: dbutils.fs.mount( source = "abfss://resumes@choisysresume.dfs.core.windows.net/", mount_poin...

Latest Reply
WernerS
New Contributor III
  • 1 kudos

I checked it with my mount script and that is exactly the same, except that I do not put a '/' after dfs.core.windows.net. You might want to try that. Also, is Unity Catalog enabled? Because Unity Catalog does not allow mounts.
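
A minimal sketch of what the full OAuth mount for ADLS Gen2 might look like, assuming a service principal and a secret scope (all names, the secret scope, and the mount point below are placeholders, not taken from the original post); note the source has no trailing '/' after dfs.core.windows.net, as suggested above:

# All values below are placeholders for illustration; replace with your own
# service principal details and secret scope.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://resumes@choisysresume.dfs.core.windows.net",  # no trailing '/'
    mount_point="/mnt/resumes",  # placeholder mount point
    extra_configs=configs,
)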

  • 1 kudos
1 More Replies
gtaspark
by New Contributor II
  • 38974 Views
  • 8 replies
  • 4 kudos

Resolved! How to get the total directory size using dbutils

Is there a way to get the directory size in ADLS (Gen2) using dbutils in Databricks? If I run dbutils.fs.ls("/mnt/abc/xyz") I get the file sizes inside the xyz folder (there are about 5000 files). I want to get the size of the XYZ folder, how ca...

Latest Reply
User16788316720
New Contributor III
  • 4 kudos

File size is only specified for files. So, if you specify a directory as your source, you have to iterate through the directory. The below snippet should work (and should be faster than the other solutions): import glob ... def get_directory_size_in_byt...
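
The reply's snippet is cut off above; a minimal recursive sketch of the same idea using only dbutils.fs.ls (the path is the one from the question, the function name is assumed from the truncated reply):

def get_directory_size_in_bytes(path):
    # Sum file sizes, recursing into subdirectories returned by dbutils.fs.ls.
    total = 0
    for entry in dbutils.fs.ls(path):
        if entry.isDir() and entry.path != path:
            total += get_directory_size_in_bytes(entry.path)
        else:
            total += entry.size
    return total

print(get_directory_size_in_bytes("/mnt/abc/xyz"))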

  • 4 kudos
7 More Replies
g96g
by New Contributor III
  • 989 Views
  • 3 replies
  • 0 kudos

data is not written back to data lake

I have this strange case where data is not written back to the data lake. I have 3 containers: Bronze, Silver, and Gold. I have done the mounting and have no problem reading the source data and writing it to the Bronze layer (using the Hive metastore catalog). T...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Givi Salu​, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

  • 0 kudos
2 More Replies
Rishabh264
by Honored Contributor II
  • 567 Views
  • 1 reply
  • 2 kudos

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's un...

Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's understandable, as these terms can be easily misunderstood or used interchangeably. Here is the summary for all ...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Rishabh Pandey​, thank you for sharing your thoughts and information about Databricks. We appreciate your interest in and engagement with Databricks. Please let us know if you have any further questions or comments, as we are always happy to h...

  • 2 kudos
JesseS
by New Contributor
  • 2424 Views
  • 2 replies
  • 1 kudos

Resolved! How to extract source data from on-premise databases into a data lake and load with AutoLoader?

Here is the situation I am working with. I am trying to extract source data with the Databricks JDBC connector, using SQL Server databases as my data source. I want to write those into a directory in my data lake as JSON files, then have AutoLoader ing...

Latest Reply
Aashita
Contributor III
  • 1 kudos

To add to @werners' point, I would use ADF to load the SQL Server data into ADLS Gen2 as JSON, then load these raw JSON files from your ADLS base location into a Delta table using Auto Loader. Delta Live Tables can be used in this scenario. You can also reg...
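
A minimal Auto Loader sketch for the flow described above (ADF lands raw JSON in ADLS, Databricks streams it into a Delta table); the paths, schema location, and table name are hypothetical placeholders:

raw_df = (spark.readStream
    .format("cloudFiles")                                              # Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/datalake/_schemas/raw_orders")
    .load("/mnt/datalake/raw/orders/"))

(raw_df.writeStream
    .option("checkpointLocation", "/mnt/datalake/_checkpoints/raw_orders")
    .trigger(availableNow=True)   # process all available files, then stop; use trigger(once=True) on older runtimes
    .toTable("bronze.orders"))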

  • 1 kudos
1 More Replies
DB_developer
by New Contributor III
  • 1496 Views
  • 2 replies
  • 3 kudos

How to optimize storage for sparse data in data lake?

I have a lot of tables with 80% of the columns filled with nulls. I understand SQL Server provides a way to handle this kind of data in the data definition of the tables (with the SPARSE keyword). Do data lakes provide something similar?

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

The data lake itself does not, but the file format you use to store the data does. For example, Parquet uses column compression, so sparse data will compress pretty well. CSV, on the other hand: total disaster.
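
A small illustration of that point with made-up data (the paths and the null ratio below are arbitrary): the same mostly-null DataFrame written as Parquet ends up far smaller on disk than as CSV.

from pyspark.sql import functions as F

# ~80% nulls in one column, mirroring the question above (synthetic data).
df = spark.range(1_000_000).select(
    F.col("id"),
    F.when(F.rand() < 0.2, F.rand()).alias("mostly_null_col"),
)

df.write.mode("overwrite").parquet("/tmp/sparse_demo_parquet")            # column-compressed
df.write.mode("overwrite").option("header", True).csv("/tmp/sparse_demo_csv")

# Compare the on-disk size of the two output directories, e.g. with dbutils.fs.ls.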

  • 3 kudos
1 More Replies
DB_developer
by New Contributor III
  • 778 Views
  • 3 replies
  • 0 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

There is no single answer to this. If you look at Parquet, which is a very common format on data lakes: https://parquet.apache.org/docs/file-format/nulls/ and on SO

  • 0 kudos
2 More Replies
rt2
by New Contributor III
  • 679 Views
  • 2 replies
  • 3 kudos

Resolved! Fundamentals of Databricks Lakehouse Badge not received.

I passed the Databricks fundamentals exam and, like many others, I did not receive my badge. I am very interested in putting this badge on my LinkedIn profile, please help. My email id is: rahul.psit.ec@gmail.com Which Databricks is resolving as: ...

Latest Reply
rt2
New Contributor III
  • 3 kudos

I got the badge now. Thanks.

  • 3 kudos
1 More Replies
Direo
by Contributor
  • 916 Views
  • 3 replies
  • 6 kudos
Latest Reply
Kaniz
Community Manager
  • 6 kudos

Hi @Direo Direo​, does the above suggestion help you? Were you able to write tables to Delta Lake using upsert mode?
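
For reference, a minimal Delta Lake upsert (MERGE) sketch of the kind the reply refers to; the table and column names are hypothetical, since the original question is not shown in this listing.

from delta.tables import DeltaTable

updates_df = spark.table("staging.customer_updates")      # incoming batch (assumed)
target = DeltaTable.forName(spark, "silver.customers")    # existing Delta table (assumed)

(target.alias("t")
    .merge(updates_df.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())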

  • 6 kudos
2 More Replies
stramzik
by New Contributor II
  • 898 Views
  • 1 reply
  • 1 kudos

Unable to mount datalake gen1 to databricks

I was mounting the Data Lake Gen1 to Databricks for accessing and processing files. The below code was working great for the past year, and all of a sudden I'm getting an error: configs = {"df.adl.oauth2.access.token.provider.type": "ClientCredential"...
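
For comparison, the ADLS Gen1 mount pattern looks roughly like the sketch below (all values are placeholders). The config key prefix has differed across Databricks Runtime versions (fs.adl. on newer runtimes, dfs.adl. on older ones), so it may be worth checking the truncated "df.adl." key above against the docs for your runtime.

configs = {
    "fs.adl.oauth2.access.token.provider.type": "ClientCredential",   # dfs.adl. on older runtimes
    "fs.adl.oauth2.client.id": "<application-id>",
    "fs.adl.oauth2.credential": dbutils.secrets.get(scope="<scope>", key="<key>"),
    "fs.adl.oauth2.refresh.url": "https://login.microsoftonline.com/<directory-id>/oauth2/token",
}

dbutils.fs.mount(
    source="adl://<storage-account>.azuredatalakestore.net/<directory>",
    mount_point="/mnt/<mount-name>",
    extra_configs=configs,
)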

Latest Reply
stramzik
New Contributor II
  • 1 kudos

bumping up the thread

  • 1 kudos