Data Engineering

Forum Posts

Sorted by:

by hravilla • New Contributor

07-19-2021 7:20:08 AM

2147 Views
1 replies
0 kudos

Upload file to DBFS fails with error code 0

When trying to upload to DBFS from local machine getting error as "Error occurred when processing file ... : Server responded with 0 code" DBR 7.3 LTSSpark 3.0.1 Scala 2.12 Uploading the file using the "upload" in the Databricks cloud console, the c...

Data Engineering

2147 Views
1 replies
0 kudos

07-19-2021 7:20:08 AM

View Replies

Latest Reply

PramodNaik
New Contributor II

07-22-2021 12:20:25 AM

0 kudos

Even I am facing the same issue with GCP databricks. I am able to upload files with smaller size. When i tried with 3MB file, databricks chokes. I get the above error. I tried with aws databricks, it works good even for bigger size files.

0 kudos

07-22-2021 12:20:25 AM

by PraveenVenkates • New Contributor

07-21-2021 2:00:42 PM

363 Views
0 replies
0 kudos

Code to automate provisioning of databricks in Azure

Hi, I would like to provision a Databricks environment in Azure and looking at options to create a workspace, cluster, notebook using code. Could you please point me to the documentation around this. Thank you.

Data Engineering

363 Views
0 replies
0 kudos

07-21-2021 2:00:42 PM

by Anonymous • Not applicable

06-05-2021 10:03:08 PM

1221 Views
1 replies
0 kudos

What are the advantages of using Delta Live tables over dbt?

Data Engineering

1221 Views
1 replies
0 kudos

06-05-2021 10:03:08 PM

View Replies

Latest Reply

Anonymous
Not applicable

07-21-2021 1:12:46 PM

0 kudos

By "dbt" do you mean Database Transactions? If not, what do you mean?

0 kudos

07-21-2021 1:12:46 PM

by aemorina • New Contributor

07-21-2021 7:08:22 AM

1783 Views
0 replies
0 kudos

Can Databricks Connect work on a cluster with port 443?

I am attempting to use Databricks Connect with a cluster in Azure Government with a port of 443 but get the following error when running databricks-connect test.The port you specified is either being used already or invalid. Port: The port that Data...

Data Engineering

1783 Views
0 replies
0 kudos

07-21-2021 7:08:22 AM

by daniil_terentye • New Contributor III

07-19-2021 12:52:34 PM

1340 Views
3 replies
0 kudos

EXISTS statement works incorrectly

Hi everybody. Looks like EXISTS statement works incorrectly. If i execute the following statement in SQL Server it returns one row, as it should WITH a AS ( SELECT '1' AS id, 'Super Company' AS name UNION SELECT '2' AS id, 'SUPER COMPANY...

Data Engineering

1340 Views
3 replies
0 kudos

07-19-2021 12:52:34 PM

View Replies

Latest Reply

daniil_terentye
New Contributor III

07-21-2021 6:52:23 AM

0 kudos

In newer versions of spark it's possible to use ANTI JOIN and SEMI JOIN It looks this way:WITH a AS ( SELECT '1' AS id, 'Super Company' AS name UNION SELECT '2' AS id, 'SUPER COMPANY' AS name ), b AS ( SELECT 'a@b.com' AS user_username, 'Super Co...

0 kudos

07-21-2021 6:52:23 AM

2 More Replies

by nickmaco • New Contributor II

07-21-2021 1:26:27 AM

627 Views
1 replies
0 kudos

Databricks - autostart from jdbc query

Hi team, New to Databricks and trying to understand if there is a "True" auto-start capability with Databricks. We are evaluating Databricks Delta lake as an alternative cloud based datawarehouse solution but the biggest problem I see is the inabili...

Data Engineering

627 Views
1 replies
0 kudos

07-21-2021 1:26:27 AM

View Replies

Latest Reply

nickmaco
New Contributor II

07-21-2021 5:53:02 AM

0 kudos

Just adding on to this. Using DBeaver as a client and using a singlenode cluster and a pool of idling VM, it was possible to get the autostart time of the cluster down to 35 seconds, + 17 seconds for the query time on top to show the first 200 rows ...

0 kudos

07-21-2021 5:53:02 AM

by vishavgupta988 • New Contributor

11-28-2019 2:43:37 AM

3438 Views
2 replies
0 kudos

How to set font-size of values in each cell of dataframe?

I am working on pandas and python.After processing a particular dataframe in my program , I am appending that dataframe below an existing Excel file. Now problem is my excel has font size of 11 pt but dataframe has font size of 12 pt.I want to set f...

Data Engineering

3438 Views
2 replies
0 kudos

11-28-2019 2:43:37 AM

View Replies

Latest Reply

DominicFHelms
New Contributor II

07-20-2021 9:29:55 AM

0 kudos

I like sharp fonts.

0 kudos

07-20-2021 9:29:55 AM

1 More Replies

by almogg • New Contributor

07-20-2021 2:41:50 AM

705 Views
0 replies
0 kudos

filter push down into redis when querying using spark connector

HiI'm loading df from redis using this code:df = (spark.read.format("org.apache.spark.sql.redis") .option("table", f"state_store_ready_to_sell") .option("key.column", "msid").option("infer.schema", "true").load()and then i'm running f...

Data Engineering

705 Views
0 replies
0 kudos

07-20-2021 2:41:50 AM

by okmich • New Contributor II

07-20-2021 12:45:59 AM

1242 Views
0 replies
1 kudos

S3 connection reset error :: Removing Spark Config on Cluster

Hi guys, I am running a production pipeline (Databricks Runtime 7.3 LTS) that keeps failing for some delta file reads with the error: 21/07/19 09:56:02 ERROR Executor: Exception in task 36.1 in stage 2.0 (TID 58) com.databricks.sql.io.FileReadExcept...

Data Engineering

1242 Views
0 replies
1 kudos

07-20-2021 12:45:59 AM

by talegari • New Contributor

07-19-2021 9:16:52 PM

338 Views
0 replies
0 kudos

sparkR.session() from web terminal

Question: sparkR.session() gives an error when run on web terminal, while it runs in a notebook. What parameters should be provided to create a spark session on web terminal? PS: I am trying to run a .R file using Rscript call on terminal instead ...

Data Engineering

338 Views
0 replies
0 kudos

07-19-2021 9:16:52 PM

by DanSiegel • New Contributor

07-19-2021 5:51:35 AM

546 Views
0 replies
0 kudos

Access an external table from another workspace

What's the best way to add an external table so another cluster/workspace can access an existing external table on S3? I need to redeploy my workspace into a new VPC, so I am not expecting any collisions of the warehouses. Is it as simple as adding ...

Data Engineering

546 Views
0 replies
0 kudos

07-19-2021 5:51:35 AM

by mnziza • New Contributor

07-18-2021 7:01:32 PM

1440 Views
0 replies
0 kudos

Is there a way to programmatically clear the notebook state in Azure databricks?

I have a scenario where I have a series of jobs that are triggered in ADF, the jobs are not linked as such but the resulting temporally tables from each job takes up memory of the databricks cluster. If I can clear the notebook state, that would fre...

Data Engineering

1440 Views
0 replies
0 kudos

07-18-2021 7:01:32 PM

by CalvinCalvert_ • New Contributor

07-18-2021 1:47:08 PM

435 Views
0 replies
0 kudos

How does FSCK work and does it have any negative effects on subsequent notebook executions?

In my environment, there are 3 groups of notebooks that run on their own schedules, however they all use the same underlying transaction logs (auditlogs, as we call them) in S3. From time to time, various notebooks from each of the 3 groups fail wit...

Data Engineering

435 Views
0 replies
0 kudos

07-18-2021 1:47:08 PM

by AnandNair • New Contributor

07-15-2021 4:45:44 AM

518 Views
0 replies
0 kudos

Load an explicit schema from an external metadata.csv file or a json file for reading csv's into dataframe

Hi, I have a metadata csv file which contains column name, and datatype such as Colm1: INT Colm2: String. I can also get the same in a json format as shown: I can store this on ADLS. How can I convert this into a schema like: "Myschema" that I can ...

Data Engineering

518 Views
0 replies
0 kudos

07-15-2021 4:45:44 AM

by Devaraj • New Contributor

07-14-2021 11:22:56 PM

2679 Views
0 replies
0 kudos

Not able to fetch data from Simba Spark Jdbc Driver

We are getting below error when we tried to set the date in preparedstatement using Simba Spark Jdbc Driver. Exception: Query execution failed: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: org.apache.h...

Data Engineering

2679 Views
0 replies
0 kudos

07-14-2021 11:22:56 PM

User

Count

1602

736

343

284

247

Databricks

Forum Posts

Upload file to DBFS fails with error code 0

Code to automate provisioning of databricks in Azure

What are the advantages of using Delta Live tables over dbt?

Can Databricks Connect work on a cluster with port 443?

EXISTS statement works incorrectly

Databricks - autostart from jdbc query

How to set font-size of values in each cell of dataframe?

filter push down into redis when querying using spark connector

S3 connection reset error :: Removing Spark Config on Cluster

sparkR.session() from web terminal

Access an external table from another workspace

Is there a way to programmatically clear the notebook state in Azure databricks?

How does FSCK work and does it have any negative effects on subsequent notebook executions?

Load an explicit schema from an external metadata.csv file or a json file for reading csv's into dataframe

Not able to fetch data from Simba Spark Jdbc Driver

Best way to parse Google Analytics data in Databri...

DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Not able to set run_as service_principal_name

Pyspark operations slowness in CLuster 14.3LTS as ...

[Databricks Assets Bundles] Workflow trigger on fi...