- 11002 Views
- 12 replies
- 0 kudos
Using spark-csv to write data to DBFS, which I plan to move to my laptop via standard S3 copy commands.
The default for spark-csv is to write output into partitions. I can force it to a single partition, but would really like to know if there is a ge...
Latest Reply
Without access to bash, it would be highly appreciated if an option existed within Databricks (e.g. via dbfsutils).
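For reference, a minimal sketch of forcing a single output file from a notebook (df and the output path are hypothetical; coalesce(1) funnels all data through one partition, so it only suits modestly sized results):
# Hypothetical DataFrame and output path; coalesce(1) yields one part file
(df.coalesce(1)
 .write
 .option("header", "true")
 .csv("dbfs:/FileStore/single_csv_output"))
The output is still a directory containing a single part-*.csv file, which can then be copied out.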
11 More Replies
- 6681 Views
- 3 replies
- 1 kudos
Hello,
Currently, I'm facing a problem with the line separator inside CSV files exported from a data frame in Azure Databricks (Spark version 2.4.3) to Azure Blob storage. All those CSV files contain LF as the line separator. I need to have CRLF (\r\n...
Latest Reply
Hi,
Have you found a solution for the above problem? Kindly let me know.
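For what it's worth, Spark 3.0 and later add a lineSep option to the CSV writer; a minimal sketch (df and the path are hypothetical, and the option does not exist in 2.4.3, where rewriting the exported files afterwards is the usual fallback):
# Requires Spark 3.0+: write CRLF line endings directly
(df.write
 .option("header", "true")
 .option("lineSep", "\r\n")
 .csv("dbfs:/mnt/blob/output"))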
2 More Replies
- 6035 Views
- 2 replies
- 0 kudos
I understand that plots in R notebooks are captured by a png graphics device. Is there a way to set the size or the aspect ratio of the canvas? I understand that I can resize the rendered .png by dragging the handle in the notebook, but that means I...
Latest Reply
Hi @sdaza, the answer above somehow didn't change the size, or perhaps I was putting it in the wrong place? I entered it in a new cell before the %sql cell with the plot chart.
1 More Replies
- 8136 Views
- 9 replies
- 0 kudos
I have the following two data frames, which have just one column each and the exact same number of rows. How do I merge them so that I get a new data frame with the two columns and all rows from both data frames? For example,
df1:
+-----+...
Latest Reply
@bhosskie
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Spark SQL basic example").enableHiveSupport().getOrCreate()
sc = spark.sparkContext
sqlDF1 = spark.sql("select count(*) as Total FROM user_summary")
sqlDF2 = sp...
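To sketch one common answer to the original question (an illustration, not the truncated reply above): attach a row number to each frame and join on it. Note that row order in a distributed DataFrame is only well-defined when you order by an explicit column.
from pyspark.sql import SparkSession
from pyspark.sql.window import Window
from pyspark.sql.functions import row_number, monotonically_increasing_id

spark = SparkSession.builder.getOrCreate()
df1 = spark.createDataFrame([(1,), (2,), (3,)], ["a"])
df2 = spark.createDataFrame([("x",), ("y",), ("z",)], ["b"])

# Number the rows of each frame, join on the number, then drop it
w = Window.orderBy(monotonically_increasing_id())
merged = (df1.withColumn("rn", row_number().over(w))
          .join(df2.withColumn("rn", row_number().over(w)), "rn")
          .drop("rn"))
merged.show()  # two columns, one row per original pair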
8 More Replies
by nmud19 • New Contributor II
- 56334 Views
- 8 replies
- 6 kudos
I have a folder at location dbfs:/mnt/temp
I need to delete this folder. I tried using
%fs rm mnt/temp
&
dbutils.fs.rm("mnt/temp")
Could you please help me out with what I am doing wrong?
Latest Reply
use this (note that the last line must not be double-indented):
def delete_mounted_dir(dirname):
    files = dbutils.fs.ls(dirname)
    for f in files:
        if f.isDir():
            delete_mounted_dir(f.path)
        dbutils.fs.rm(f.path, recurse=True)
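As a usage note (an assumption, not stated in the thread), recent runtimes make the helper above mostly unnecessary, because dbutils.fs.rm deletes recursively on its own:
# One-call recursive delete of the folder
dbutils.fs.rm("dbfs:/mnt/temp", recurse=True)
# If /mnt/temp is a mount point, detach it instead of deleting through it:
dbutils.fs.unmount("/mnt/temp")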
7 More Replies
- 25160 Views
- 2 replies
- 2 kudos
Hi all,
"Driver is up but is not responsive, likely due to GC."
This is the message in the cluster event logs. Can anyone help me with this? What does GC mean? Garbage collection? Can we control it externally?
Latest Reply
spark.catalog.clearCache() solved the problem for me.
1 More Replies
- 5729 Views
- 5 replies
- 0 kudos
Getting an error when trying to load the uploaded file in a Python notebook.
# File location and type
file_location = "//FileStore/tables/data/d1.csv"
file_type = "csv"
# CSV options
infer_schema = "true"
first_row_is_header = "false"
delimiter = ","
# The app...
Latest Reply
@naughtonelad If your issue is solved, please let me know, as I am facing the same problem.
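For anyone comparing against their own notebook, the usual continuation of that template is sketched below (an assumption based on the standard Databricks CSV import snippet). Also note the doubled leading slash in file_location, which may itself be the cause; DBFS paths are normally written as "/FileStore/..." or "dbfs:/FileStore/...":
# Read the CSV with the options defined above; note the single leading slash
df = (spark.read.format(file_type)
      .option("inferSchema", infer_schema)
      .option("header", first_row_is_header)
      .option("sep", delimiter)
      .load("/FileStore/tables/data/d1.csv"))
display(df)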
4 More Replies
- 4972 Views
- 5 replies
- 0 kudos
I'm using a broadcast variable about 100 MB pickled in size, which I'm approximating with:
>>> data = list(range(int(10*1e6)))
>>> import cPickle as pickle
>>> len(pickle.dumps(data))
98888896
Running on a cluster with 3 c3.2xlarge executors, ...
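For context, a minimal broadcast sketch (the driver-side lookup below is hypothetical): executors read the value through .value, and unpersist() releases the copies afterwards.
# Ship one read-only copy of the data to each executor
bc = sc.broadcast(set(data))  # a set makes membership tests fast
hits = sc.parallelize(range(100)).filter(lambda x: x in bc.value).collect()
bc.unpersist()  # free the executors' copies when done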
4 More Replies
- 5340 Views
- 2 replies
- 0 kudos
The following code produces no output. It seems as if the print(x) is not being executed for each "words" element:
words = sc.parallelize (
["scala",
"java",
"hadoop",
"spark",
"akka",
"spark vs hadoop",
"pyspark",
"pysp...
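For context (an explanation not taken from the thread): print inside an RDD action such as foreach runs on the executors, so its output goes to the executor logs, not the notebook. A sketch of printing on the driver instead, which is fine for small RDDs:
# collect() brings the elements back to the driver, where print is visible
words = sc.parallelize(["scala", "java", "hadoop", "spark"])
for x in words.collect():
    print(x)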
1 More Replies
- 12129 Views
- 8 replies
- 0 kudos
Is there a way to indicate to dbutils.fs.mount to not throw an error if the mount is already mounted?
And vice versa, for unmount to not throw an error if it is already unmounted?
I am trying to run my notebook as a job, and it has an init section that...
Latest Reply
If you use Scala to mount a Gen2 data lake, you could try something like this:
// Gather relevant keys
var ServicePrincipalID = ""
var ServicePrincipalKey = ""
var DirectoryID = ""
// Create configurations for our connection
var configs = Map (...
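In Python, a common way to make the init section idempotent (a sketch; source, mount_point, and configs are placeholders) is to consult dbutils.fs.mounts() before mounting and to swallow the error on unmount:
mount_point = "/mnt/mydata"  # hypothetical mount point

# Mount only if this mount point is not already present
if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(source=source, mount_point=mount_point, extra_configs=configs)

# Unmount without failing if it is already gone
try:
    dbutils.fs.unmount(mount_point)
except Exception:
    pass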
7 More Replies
by Barb • New Contributor III
- 4204 Views
- 6 replies
- 0 kudos
Hi all, I need to use the SQL charindex function, but I'm getting a Databricks error that this doesn't exist. That can't be true, right? Thanks for any ideas about how to make this work! Barb
Latest Reply
The best option I found to replace CHARINDEX was LOCATE; examples from the Spark documentation are below:
> SELECT locate('bar', 'foobarbar', 5);
7
> SELECT POSITION('bar' IN 'foobarbar');
4
5 More Replies
- 5411 Views
- 1 replies
- 0 kudos
Hi,
I have a Parquet file with complex column types, with nested structs and arrays.
I am using the script from the link below to flatten my Parquet file.
https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema
I am able ...
Latest Reply
Hello, please check out the docs and notebook below, which have similar examples:
https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema
https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/transform-comple...
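As a minimal illustration of the flattening pattern those docs describe (df and the column names are hypothetical): explode() turns each array element into its own row, and selecting "struct.*" lifts struct fields into top-level columns.
from pyspark.sql.functions import col, explode

# One row per array element, then promote the struct's fields to columns
flat = (df
        .withColumn("item", explode(col("items")))
        .select("id", "item.*"))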
- 2729 Views
- 3 replies
- 0 kudos
My organization has an S3 bucket mounted to the Databricks filesystem under /dbfs/mnt. When using Databricks Runtime 5.5 and below, the following logging code works correctly:
log_file = '/dbfs/mnt/path/to/my/bucket/test.log'
logger = logging.getLogg...
Latest Reply
It's probably worth trying to rewrite emit ... https://docs.python.org/3/library/logging.html#handlers
This works for me:
class OurFileHandler(logging.FileHandler):
    def emit(self, record):
        # copied from https://github.com/python/cpython/bl...
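For reference, the usual shape of such an override (a sketch, assuming the FUSE mount rejects appends to an already-open file handle) reopens the log file for every record:
import logging

class ReopeningFileHandler(logging.FileHandler):
    # Hypothetical handler: open, write, and close per record so each
    # emit appends through a fresh handle on the /dbfs mount
    def emit(self, record):
        self.stream = self._open()
        logging.StreamHandler.emit(self, record)
        self.close()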
2 More Replies
- 21204 Views
- 16 replies
- 0 kudos
I couldn't find in the documentation a way to export an RDD as a text file to a local folder using Python. Is it possible?
Latest Reply
To export a file to the local desktop:
Workaround: basically you have to do a "Create a table in notebook" with DBFS.
The steps are:
Click the "Data" icon > click the "Add Data" button > click the "DBFS" button > click the "FileStore" folder icon in the 1st pane "Sele...
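Another commonly used route (an assumption, not part of this reply) is to write the RDD under /FileStore, whose contents the workspace serves for download over HTTPS:
# Save as a single part file under /FileStore
rdd.coalesce(1).saveAsTextFile("dbfs:/FileStore/my_export")
# Then download it in a browser from:
#   https://<databricks-instance>/files/my_export/part-00000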
15 More Replies