Data Engineering

Forum Posts

nyehia
by Contributor
  • 2833 Views
  • 9 replies
  • 0 kudos

Cannot access a SQL file from a notebook

Hey, I have a repo of notebooks and SQL files. The typical workflow is to create or update notebooks in the repo and push them, after which the CI/CD pipeline deploys the notebooks to the Shared workspace. The issue is that I can access the SQL files in the Repo but can not ...

Latest Reply
ok_1
New Contributor II
  • 0 kudos

ok

8 More Replies
Jiri_Koutny
by New Contributor III
  • 2945 Views
  • 10 replies
  • 3 kudos

Delay in files update on filesystem

Hi, I noticed that there is quite a significant delay (2-10 s) between making a change to a file in Repos via the Databricks file editor and the propagation of that change to the filesystem. Our engineers and scientists use YAML config files. If the...

Latest Reply
DaniyarZ
New Contributor II
  • 3 kudos

There is a trick: if you execute the "%sh ls" command, it forces an immediate update of the filesystem.

9 More Replies
SimhadriRaju
by New Contributor
  • 41646 Views
  • 7 replies
  • 0 kudos

How to check whether a file exists in Databricks

I have a while loop in which I have to check whether a file exists. If it exists, read the file into a DataFrame; otherwise, move on to another file.

Latest Reply
Amit_Dass
New Contributor II
  • 0 kudos

How to check if a file exists in DBFS? Let's write a Python function to check whether the file exists:

def file_exists(path):
    try:
        dbutils.fs.ls(path)
        return True
    except ...
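The truncated function in this reply could be completed along the following lines. This is a minimal sketch, not the poster's full code: the `ls` parameter and the `first_existing` helper are hypothetical additions so the logic can be exercised outside a notebook (in a notebook you would pass `dbutils.fs.ls`), and matching on `java.io.FileNotFoundException` in the error message is an assumption about how a missing path is reported.

```python
from typing import Callable, Iterable, Optional


def file_exists(path: str, ls: Callable[[str], object]) -> bool:
    """Return True if ls(path) succeeds, False if it reports a missing path."""
    try:
        ls(path)
        return True
    except Exception as exc:
        # Assumption: a missing path surfaces as an error whose message
        # mentions java.io.FileNotFoundException. Anything else is re-raised
        # so permission or network problems are not mistaken for "no file".
        if "java.io.FileNotFoundException" in str(exc):
            return False
        raise


def first_existing(paths: Iterable[str], ls: Callable[[str], object]) -> Optional[str]:
    """Return the first path that exists, or None if none of them do."""
    for path in paths:
        if file_exists(path, ls):
            return path
    return None
```

For the loop described in the question, something like `first_existing(candidate_paths, dbutils.fs.ls)` would give the first file that can be read into a DataFrame, with `None` meaning that no candidate was found.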

6 More Replies
learnerbricks
by New Contributor II
  • 3623 Views
  • 4 replies
  • 0 kudos

Unable to save file in DBFS

I took the Azure datasets that are available for practice. I extracted 10 days of data from that dataset, and now I want to save this data to DBFS in CSV format. I am facing an error: "No such file or directory: 'No such file or directory: '/dbfs...

Latest Reply
pardosa
New Contributor II
  • 0 kudos

Hi, after some experimenting: be aware that a folder created with dbutils.fs.mkdirs("/dbfs/tmp/myfolder") ends up in /dbfs/dbfs/tmp/myfolder. If you want to write to the path to_csv("/dbfs/tmp/myfolder/mytest.csv"), you should create the folder with this script: dbutils.fs...
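The doubled /dbfs/dbfs prefix described in this reply comes from mixing two ways of naming the same location: DBFS paths (as taken by dbutils.fs) and local paths under the /dbfs mount (as taken by pandas or plain Python file APIs). A small sketch of the mapping, assuming the cluster exposes DBFS at /dbfs; both helper names are hypothetical:

```python
def to_local_path(dbfs_path: str) -> str:
    """Map a DBFS path ('dbfs:/tmp/x' or '/tmp/x') to the local path '/dbfs/tmp/x'."""
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    return "/dbfs/" + dbfs_path.lstrip("/")


def to_dbfs_path(local_path: str) -> str:
    """Map a local path under the /dbfs mount to a 'dbfs:/...' URI."""
    if local_path != "/dbfs" and not local_path.startswith("/dbfs/"):
        raise ValueError(f"not under the /dbfs mount: {local_path!r}")
    return "dbfs:/" + local_path[len("/dbfs"):].lstrip("/")
```

Passing an already-local path such as "/dbfs/tmp/myfolder" to `to_local_path` yields "/dbfs/dbfs/tmp/myfolder", which is the same doubling the reply observed when a /dbfs-prefixed path is handed to dbutils.fs.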

3 More Replies
RantoB
by Valued Contributor
  • 16644 Views
  • 8 replies
  • 4 kudos

Resolved! Read CSV directly from URL with PySpark

I would like to load a CSV file directly into a Spark DataFrame in Databricks. I tried the following code:
url = "https://opendata.reseaux-energies.fr/explore/dataset/eco2mix-national-tr/download/?format=csv&timezone=Europe/Berlin&lang=fr&use_labels_fo...

Latest Reply
MartinIsti
New Contributor III
  • 4 kudos

I know it's a two-year-old thread, but I needed to find a solution to this very thing today. I had one notebook using SparkContext:
from pyspark import SparkFiles
from pyspark.sql.functions import *
sc.addFile(url)
But according to the runtime 14 release n...
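One alternative that does not rely on SparkContext is to fetch the file with the Python standard library first. The sketch below is an assumption about a workable approach, not the solution from the thread; `download_to_local` and its default file name are hypothetical.

```python
import os
import shutil
import tempfile
import urllib.request


def download_to_local(url: str, filename: str = "download.csv") -> str:
    """Download url into a fresh temporary directory and return the local path."""
    target_dir = tempfile.mkdtemp(prefix="csv_download_")
    local_path = os.path.join(target_dir, filename)
    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path
```

The downloaded file lives on the driver only, so it would typically be loaded with pandas and passed to `spark.createDataFrame`, or copied to shared storage before Spark reads it.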

7 More Replies
kinsun
by New Contributor II
  • 8740 Views
  • 5 replies
  • 0 kudos

Resolved! DBFS and Local File System Doubts

Dear Databricks Expert,
I have some doubts when dealing with DBFS and the local file system.
Case 01: Copy a file from ADLS to DBFS. I am able to do so with the Python code below:
#spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.a...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @KS LAU, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your q...

4 More Replies
Venky
by New Contributor III
  • 50003 Views
  • 19 replies
  • 20 kudos

Resolved! I am trying to read a CSV file using Databricks and I am getting an error like: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/world_bank.csv'

I am trying to read a CSV file using Databricks and I am getting an error like: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/world_bank.csv'

Latest Reply
Alexis
New Contributor III
  • 20 kudos

Hi, you can try:
my_df = spark.read.format("csv")
      .option("inferSchema","true")  # to get the types from your data
      .option("sep",",")            # if your file is using "," as separator
      .option("header","true")       # if you...

18 More Replies
CrisCampos
by New Contributor II
  • 2062 Views
  • 1 replies
  • 1 kudos

How to load a "pickle/joblib" file on Databricks

Hi Community, I am trying to load a joblib file on Databricks, but it doesn't seem to be working. I am getting an error message: "Incompatible format detected". Any idea how to load this type of file on Databricks? Thanks!

Latest Reply
User16859817863
New Contributor II
  • 1 kudos

You can import the joblib/joblibspark package to load joblib files.

Data_Engineer_3
by New Contributor III
  • 11674 Views
  • 17 replies
  • 7 kudos

Resolved! FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/tables/flight_data.zip' The data and file exists in location mentioned above

I am new to Spark and working through some practice exercises. I have uploaded a zip file to the DBFS /FileStore/tables directory and am trying to run Python code to unzip the file. The Python code is:
from zipfile import *
with ZipFile("/FileStore/tables/fli...

Latest Reply
883022
New Contributor II
  • 7 kudos

What if changing the runtime is not an option? I'm experiencing a similar issue using the following:
%pip install -r /dbfs/path/to/file.txt
This worked for a while, but now I'm getting the Errno 2 mentioned above. I am still able to print the same file...

16 More Replies
GC-James
by Contributor II
  • 6738 Views
  • 17 replies
  • 5 kudos

Resolved! Lost memory when using dbutils

Why does copying a 9 GB file from a container to /dbfs cost me 50 GB of memory (which doesn't come back until I restart the cluster)?

Latest Reply
AdrianP
New Contributor II
  • 5 kudos

Hi James, did you get to the bottom of this? We are experiencing the same issue, and none of the suggested solutions seem to work. Thanks, Adrian

16 More Replies
f2008700
by New Contributor III
  • 4759 Views
  • 7 replies
  • 7 kudos

Configuring average parquet file size

I have S3 as a data source containing a sample TPC dataset (10G, 100G). I want to convert that into Parquet files with an average size of about 256 MiB. What configuration parameter can I use to set that? I also need the data to be partitioned. And withi...
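One rough way to approach the sizing question is to control the number of output files rather than look for a single size setting. The sketch below only does the arithmetic; it is an illustration, not an answer from the thread. `target_file_count` is a hypothetical helper, and `total_bytes` is assumed to be an estimate of the Parquet output size, which usually differs from the raw input size because of encoding and compression.

```python
import math

MIB = 1024 * 1024


def target_file_count(total_bytes: int, target_file_bytes: int = 256 * MIB) -> int:
    """Number of output files needed so each is roughly target_file_bytes."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Round up so no file has to exceed the target; always write at least one.
    return max(1, math.ceil(total_bytes / target_file_bytes))
```

The result could then be used as the argument to `df.repartition(n)` before writing. When the output is also partitioned by a column, each partition directory gets its own files, so the estimate would need to be applied per partition.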

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Vikas Goel, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to o...

6 More Replies
jch
by New Contributor III
  • 1988 Views
  • 4 replies
  • 5 kudos

Resolved! Why does spark.read.csv come back with an error: com.databricks.sql.io.FileReadException: Error while reading file dbfs:/mnt/cntnr/demo/circuits.csv ?

I need help understanding why I can't open a file. In a Databricks notebook, I use this code:
%fs   ls /mnt/cntnr/demo
I get back dbfs:/mnt/cntnr/demo/circuits.csv as one of the path values. When I use this code, I get an error:
circuits_df = spark.read....

Latest Reply
jch
New Contributor III
  • 5 kudos

It turns out my Spark config was wrong:
#Set Spark configuration
configs = {"fs.azure.account.auth.type": "OAuth",
          "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
          "fs.azu...

3 More Replies
teng_shin_lim
by New Contributor
  • 963 Views
  • 1 replies
  • 1 kudos

Having an issue downloading a CSV file from a website using Firefox Selenium

Hi, I click the download button on a website through Firefox Selenium using element.click(), with the download destination set to Azure Data Lake Storage. Then, after the download started, those .csv and .csv.part files never gotten m...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Brandon Lim, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

Kaijser
by New Contributor II
  • 1037 Views
  • 1 replies
  • 2 kudos

Installing private python Azure DevOps repository without revealing personal access token in pyproject.toml

I want to install a .whl file on my Databricks cluster which includes a private Azure DevOps repository as a dependency in its pyproject.toml file, i.e.:
[project]
name = "test"
description = "test_description."
version = "0.1.0"
authors = [ { name ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Aaron Kaijser, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

Ajay-Pandey
by Esteemed Contributor III
  • 6968 Views
  • 7 replies
  • 11 kudos

Resolved! Unzip Files

Hi all, I am trying to unzip a file in Databricks but am facing an issue. Please help me if you have any docs or code to share.

Latest Reply
vivek_rawat
New Contributor III
  • 11 kudos

Hey Ajay, you can follow this module to unzip your zip file. To give you a brief idea: it unzips your file directly into your driver node storage. So if your compressed data is inside DBFS, you first have to move it to the driver node and...
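The flow this reply describes (copy the archive to the driver's local disk, extract it there, then move the results onward) can be sketched with the standard library. This is an illustration under assumptions, not the module the reply refers to: `unzip_via_local_disk` is a hypothetical helper, and both arguments are assumed to be paths reachable through ordinary file APIs (for DBFS that would mean paths under the /dbfs mount).

```python
import os
import shutil
import tempfile
import zipfile


def unzip_via_local_disk(zip_path: str, dest_dir: str) -> list:
    """Extract zip_path on local scratch space, then copy the result to dest_dir."""
    with tempfile.TemporaryDirectory() as scratch:
        # 1. Copy the archive to local disk so zipfile can seek in it freely.
        local_zip = os.path.join(scratch, os.path.basename(zip_path))
        shutil.copy(zip_path, local_zip)

        # 2. Extract locally; create the folder first so an empty archive
        #    still produces a directory to copy.
        extract_dir = os.path.join(scratch, "extracted")
        os.makedirs(extract_dir)
        with zipfile.ZipFile(local_zip) as archive:
            names = archive.namelist()
            archive.extractall(extract_dir)

        # 3. Copy the extracted tree to its final location (Python 3.8+).
        shutil.copytree(extract_dir, dest_dir, dirs_exist_ok=True)
    return names
```

The function returns the member names found in the archive, which gives a quick way to check what was extracted before reading the files.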

6 More Replies