- 1098 Views
- 1 reply
- 0 kudos
I want to read the last-modified datetime of files in the data lake from a Databricks script. If I could read it efficiently as a column while loading the data from the data lake, it would be perfect. Thank you :)
Latest Reply
Efficiently reading data lake files involves:
- Choosing the Right Tools: Select tools optimized for data lake file formats (e.g., Parquet, ORC) and distributed computing frameworks (e.g., Apache Spark, Apache Flink).
- Partitioning and Indexing: Partition...
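For the original question, a minimal sketch of one way to do this in a Databricks notebook: recent Spark/Databricks runtimes expose a hidden _metadata column on file-based reads, so the file's modification time can be selected as a regular column without a second pass over storage. The format and mount path here are hypothetical.
# Minimal sketch, assuming JSON files at a hypothetical mount path.
df = (
    spark.read.format("json")
    .load("/mnt/lake/raw/events")  # hypothetical path
    .select("*", "_metadata.file_modification_time", "_metadata.file_path")
)
df.show()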
- 55226 Views
- 9 replies
- 5 kudos
Is there a way to get the directory size in ADLS (Gen2) using dbutils in Databricks?
If I run this
dbutils.fs.ls("/mnt/abc/xyz")
I get the file sizes inside the xyz folder (there are about 5,000 files), but I want to get the size of the xyz folder itself.
how ca...
Latest Reply
File size is only reported for files, so if you specify a directory as your source, you have to iterate through the directory. The snippet below should work (and should be faster than the other solutions).
import glob
def get_directory_size_in_byt...
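The snippet above is cut off in the preview; here is a minimal sketch of the same idea using only dbutils, recursing through subfolders. The function name and path are hypothetical, and FileInfo.size is reported in bytes.
def get_directory_size_in_bytes(path):
    total = 0
    for f in dbutils.fs.ls(path):
        if f.isDir():
            total += get_directory_size_in_bytes(f.path)  # recurse into subfolder
        else:
            total += f.size  # file size in bytes
    return total

print(get_directory_size_in_bytes("/mnt/abc/xyz"))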
8 More Replies
- 8814 Views
- 2 replies
- 1 kudos
Hello, I'm trying to mount my Azure Gen2 data lake in Databricks to read data from the container, but I get an error when executing this line of code: dbutils.fs.mount(
source = "abfss://resumes@choisysresume.dfs.core.windows.net/",
mount_poin...
Latest Reply
I checked it against my mount script and it is exactly the same, except that I do not put a '/' after dfs.core.windows.net. You might wanna try that. Also, is Unity Catalog enabled? Because Unity Catalog does not allow mounts.
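For reference, a minimal sketch of the documented OAuth mount pattern for ADLS Gen2, with no trailing '/' on the source. The secret scope/key, mount point, and placeholder IDs are hypothetical; replace them with your service principal's values.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get("my-scope", "sp-secret"),  # hypothetical scope/key
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
dbutils.fs.mount(
    source="abfss://resumes@choisysresume.dfs.core.windows.net",  # no trailing '/'
    mount_point="/mnt/resumes",
    extra_configs=configs,
)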
1 More Replies
by g96g • New Contributor III
- 2303 Views
- 3 replies
- 0 kudos
I have this strange case where data is not written back to the data lake. I have 3 containers: Bronze, Silver, and Gold. I have done the mounting and have no problem reading the source data and writing it to the Bronze layer (using the Hive metastore catalog). T...
Latest Reply
Hi @Givi Salu, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...
2 More Replies
- 1166 Views
- 0 replies
- 3 kudos
Hey there! I've noticed that many people seem to be confused about the differences between databases, data warehouses, and data lakes. It's understandable, as these terms can be easily misunderstood or used interchangeably. Here is the summary for all ...
by JesseS • New Contributor II
- 6518 Views
- 2 replies
- 1 kudos
Here is the situation I am working with. I am trying to extract source data with the Databricks JDBC connector, using SQL Server databases as my data source. I want to write those into a directory in my data lake as JSON files, then have AutoLoader ing...
Latest Reply
To add to @werners' point, I would use ADF to load the SQL Server data into ADLS Gen2 as JSON, then load these raw JSON files from your ADLS base location into a Delta table using Auto Loader. Delta Live Tables can be used in this scenario. You can also reg...
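A minimal Auto Loader sketch for that second step, under the assumption that ADF has landed JSON files in the lake; all paths and the table name are hypothetical.
raw = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/lake/_schemas/orders")  # for schema inference/evolution
    .load("/mnt/lake/raw/orders")  # hypothetical landing path
)
(
    raw.writeStream
    .option("checkpointLocation", "/mnt/lake/_checkpoints/orders")
    .trigger(availableNow=True)  # process everything available, then stop (batch-style)
    .toTable("bronze.orders")
)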
1 More Replies
- 9603 Views
- 2 replies
- 3 kudos
I have a lot of tables with 80% of the columns filled with nulls. I understand SQL Server provides a way to handle this kind of data in the table definition (with the SPARSE keyword). Do data lakes provide something similar?
Latest Reply
The data lake itself does not, but the file format you use to store the data does. E.g. Parquet uses column compression, so sparse data will compress pretty well. CSV, on the other hand: total disaster.
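A small illustration of that point, assuming hypothetical output paths: the same mostly-null column written as Parquet and as CSV, then compared on disk. Parquet's run-length and dictionary encoding make long null runs nearly free.
from pyspark.sql import functions as F

df = spark.range(1_000_000).select(
    "id",
    F.when(F.col("id") % 100 == 0, F.col("id")).alias("mostly_null"),  # ~99% nulls
)
df.write.mode("overwrite").parquet("/tmp/sparse_parquet")
df.write.mode("overwrite").option("header", True).csv("/tmp/sparse_csv")

for path in ("/tmp/sparse_parquet", "/tmp/sparse_csv"):
    size = sum(f.size for f in dbutils.fs.ls(path))
    print(path, size, "bytes")  # expect Parquet to be far smaller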
1 More Replies
- 1224 Views
- 0 replies
- 0 kudos
I inherited this environment, and my question is: we have a job that mines the data lake and creates a table grouped by unit number and its data points. The job runs every 10 minutes. We then connect to that table with a DirectQuery Power BI ...
by rt2 • New Contributor III
- 1597 Views
- 2 replies
- 3 kudos
I passed the Databricks fundamentals exam and, like many others, I too did not receive my badge. I am very much interested in putting this badge on my LinkedIn profile, please help. My email id is: rahul.psit.ec@gmail.com, which Databricks is resolving as: ...
- 1593 Views
- 1 reply
- 1 kudos
I was mounting Data Lake Gen1 to Databricks for accessing and processing files. The code below had been working great for the past year, and all of a sudden I'm getting an error:
configs = {"df.adl.oauth2.access.token.provider.type": "ClientCredential"...
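For comparison, a minimal sketch of the documented ADLS Gen1 mount pattern. Note that the config keys start with "fs.adl." (the excerpt above shows "df.adl...", which may just be a transcription artifact). The placeholder IDs and secret scope/key are hypothetical.
configs = {
    "fs.adl.oauth2.access.token.provider.type": "ClientCredential",
    "fs.adl.oauth2.client.id": "<application-id>",
    "fs.adl.oauth2.credential": dbutils.secrets.get("my-scope", "sp-secret"),  # hypothetical scope/key
    "fs.adl.oauth2.refresh.url": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
dbutils.fs.mount(
    source="adl://<datalake-store-name>.azuredatalakestore.net/",
    mount_point="/mnt/gen1",
    extra_configs=configs,
)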