Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

vishavgupta988
by New Contributor
  • 6286 Views
  • 2 replies
  • 0 kudos

How to set font-size of values in each cell of dataframe?

I am working with pandas and Python. After processing a particular dataframe in my program, I append that dataframe below the existing contents of an Excel file. The problem is that my Excel file has a font size of 11 pt but the dataframe has a font size of 12 pt. I want to set f...

Latest Reply
DominicFHelms
New Contributor II
  • 0 kudos

I like sharp fonts.

1 More Replies
okmich
by New Contributor II
  • 2746 Views
  • 0 replies
  • 1 kudos

S3 connection reset error :: Removing Spark Config on Cluster

Hi guys, I am running a production pipeline (Databricks Runtime 7.3 LTS) that keeps failing for some delta file reads with the error: 21/07/19 09:56:02 ERROR Executor: Exception in task 36.1 in stage 2.0 (TID 58) com.databricks.sql.io.FileReadExcept...

talegari
by New Contributor
  • 792 Views
  • 0 replies
  • 0 kudos

sparkR.session() from web terminal

Question: sparkR.session() gives an error when run from the web terminal, although it works in a notebook. What parameters should be provided to create a Spark session from the web terminal? PS: I am trying to run a .R file using an Rscript call on the terminal instead ...

DanSiegel
by New Contributor
  • 1163 Views
  • 0 replies
  • 0 kudos

Access an external table from another workspace

What's the best way to add an external table so another cluster/workspace can access an existing external table on S3? I need to redeploy my workspace into a new VPC, so I am not expecting any collisions of the warehouses. Is it as simple as adding ...

CalvinCalvert_
by New Contributor
  • 1174 Views
  • 0 replies
  • 0 kudos

How does FSCK work and does it have any negative effects on subsequent notebook executions?

In my environment, there are 3 groups of notebooks that run on their own schedules; however, they all use the same underlying transaction logs (auditlogs, as we call them) in S3. From time to time, various notebooks from each of the 3 groups fail wit...

MohitAnchlia
by New Contributor II
  • 1597 Views
  • 0 replies
  • 1 kudos

Change AWS storage setting and account

I am seeing some very odd behaviour in Databricks. We initially configured the following: 1. Account X in Account Console -> AWS Account arn:aws:iam::X:role/databricks-s3 2. We set up databricks-s3 as the S3 bucket in Account Console -> AWS Storage 3. W...

TrinaDe
by Databricks Partner
  • 6125 Views
  • 1 replies
  • 1 kudos

How can we join two PySpark dataframes side by side (without using join, equivalent to pd.concat() in pandas)? I am trying to join two extremely large dataframes, each on the order of 50 million.

My two dataframes look like new_df2_record1 and new_df2_record2, and the expected output dataframe I want looks like new_df2: The code I have tried is the following: If I print the top 5 rows of new_df2, it gives the output as expected but I cannot pri...

Latest Reply
TrinaDe
Databricks Partner
  • 1 kudos

The code in a more legible format:

AnandNair
by New Contributor
  • 1297 Views
  • 0 replies
  • 0 kudos

Load an explicit schema from an external metadata.csv file or a JSON file for reading CSVs into a dataframe

Hi, I have a metadata CSV file which contains column names and data types, such as Colm1: INT, Colm2: String. I can also get the same in JSON format, as shown: I can store this on ADLS. How can I convert this into a schema like: "Myschema" that I can ...
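
One possible approach to the question above, offered as a minimal sketch rather than a confirmed answer from the thread: turn the metadata rows into a DDL-style schema string, which Spark's `DataFrameReader.schema()` accepts in place of a `StructType`. The metadata layout (one `name,type` pair per row, no header), the sample values, and the helper name `metadata_to_ddl` are all assumptions made for illustration; the original post does not show the file.

```python
import csv
import io

# Hypothetical metadata layout: one "column_name,data_type" pair per row, no header.
SAMPLE_METADATA = "Colm1,INT\nColm2,String\n"


def metadata_to_ddl(metadata_text: str) -> str:
    """Turn 'name,type' rows into a DDL-formatted schema string."""
    fields = []
    for row in csv.reader(io.StringIO(metadata_text)):
        if not row:
            continue  # skip blank lines
        name, dtype = (cell.strip() for cell in row[:2])
        # Backticks keep column names with spaces or odd characters valid in DDL.
        fields.append(f"`{name}` {dtype.upper()}")
    return ", ".join(fields)


ddl = metadata_to_ddl(SAMPLE_METADATA)
print(ddl)
```

In a notebook, the resulting string could then be passed along the lines of `spark.read.schema(ddl).csv(path)`, where `path` is a placeholder for the data location. The same helper would work for a JSON metadata file once its entries are reduced to name and type pairs.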

Devaraj
by New Contributor
  • 4236 Views
  • 0 replies
  • 0 kudos

Not able to fetch data from Simba Spark JDBC Driver

We get the error below when we try to set a date in a PreparedStatement using the Simba Spark JDBC Driver. Exception: Query execution failed: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: org.apache.h...

twotwoiscute
by New Contributor
  • 2401 Views
  • 0 replies
  • 0 kudos

PySpark pandas_udf slower than single thread

I used @pandas_udf to write a function to speed up the process (parsing XML files) and then compared its speed with single-threaded code. Surprisingly, using @pandas_udf is two times slower than the single-threaded code. And the number of XML files I need to p...
