Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

SaraCorralLou
by New Contributor III
  • 6574 Views
  • 2 replies
  • 0 kudos

Resolved! Differences between lit(None) or lit(None).cast('string')

I want to define a column with null values in my dataframe using pyspark. This column will later be used for other calculations. What is the difference between creating it in these two different ways? df.withColumn("New_Column", lit(None)) df.withColumn...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sara Corral, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback ...

  • 0 kudos
1 More Replies
Aviral-Bhardwaj
by Esteemed Contributor III
  • 5130 Views
  • 2 replies
  • 13 kudos

Understanding Rename in Databricks

Now there are multiple ways to rename Spark DataFrame columns or expressions. We can rename columns or expressions using alias as part of select. We can add or rename columns or expressions using withColumn on top of t...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 13 kudos

Very informative, Thanks for sharing

  • 13 kudos
1 More Replies
noimeta
by Contributor II
  • 2029 Views
  • 4 replies
  • 1 kudos

Apply change data with delete and schema evolution

Hi, currently I'm using Structured Streaming to insert/update/delete rows in a table. A row will be deleted if the value in the 'Operation' column is 'deleted'. Everything seems to work fine until there's a new column. Since I don't need the 'Operation' column in the t...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Please go through this documentation: https://docs.delta.io/latest/api/python/index.html

  • 1 kudos
3 More Replies
Confused
by New Contributor III
  • 6549 Views
  • 7 replies
  • 2 kudos

Schema evolution issue

Hi all, I am loading some data using Auto Loader but am having trouble with schema evolution. A new column has been added to the data I am loading and I am getting the following error: StreamingQueryException: Encountered unknown field(s) during parsing:...

Latest Reply
rgrosskopf
New Contributor II
  • 2 kudos

I agree that hints are the way to go if you have the schema available, but the whole point of schema evolution is that you might not always know the schema in advance. I received a similar error with a similar streaming query configuration. The issue w...

  • 2 kudos
6 More Replies
Abeeya
by New Contributor II
  • 3527 Views
  • 2 replies
  • 3 kudos

Resolved! How to Overwrite Using pyspark's JDBC without losing constraints on table columns

Hello, my table has a primary key constraint on a particular column. I'm losing the primary key constraint on that column each time I overwrite the table. What can I do to preserve it? Any heads-up would be appreciated. Tried below: df.write.option("truncate", ...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Abeeya, how are you? Did @Hubert Dudek's answer help you in any way? Please let us know.

  • 3 kudos
1 More Replies
ahana
by New Contributor III
  • 6576 Views
  • 13 replies
  • 2 kudos
Latest Reply
jose_gonzalez
Moderator
  • 2 kudos

Hi @ahana, did any of the replies help you solve this issue? Would you be happy to mark their answer as best so that others can quickly find the solution? Thank you

  • 2 kudos
12 More Replies
omsas
by New Contributor
  • 1796 Views
  • 2 replies
  • 0 kudos

How to add Columns for Automatic Fill on Pandas Python

1. I have data x, I would like to create a new column with the condition that the values are 1, 2 or 3
2. The name of the column is SHIFT, where this SHIFT column will be filled automatically if the TIME_CREATED column meets the conditions.
3. the conditi...

Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 0 kudos

You can do something like this in pandas. Note there could be a more performant way to do this too.

import pandas as pd
import numpy as np

df = pd.DataFrame({'a':[1,2,3,4]})
df.head()
>    a
> 0  1
> 1  2
> 2  3
> 3  4

conditions = [(df['a'] <=2...
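The reply is cut off at the `conditions` list, which suggests an `np.select` approach. A minimal sketch of that idea applied to the SHIFT question might look like the following. The shift boundaries (06:00, 14:00, 22:00) and the sample timestamps are hypothetical, because the actual conditions are truncated in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the TIME_CREATED column name comes from the question.
df = pd.DataFrame({
    "TIME_CREATED": pd.to_datetime([
        "2022-01-01 07:30:00",
        "2022-01-01 15:10:00",
        "2022-01-01 23:45:00",
        "2022-01-02 02:05:00",
    ])
})

hour = df["TIME_CREATED"].dt.hour

# Assumed boundaries: shift 1 = 06:00-13:59, shift 2 = 14:00-21:59,
# shift 3 = everything else (22:00-05:59).
conditions = [
    (hour >= 6) & (hour < 14),
    (hour >= 14) & (hour < 22),
]
choices = [1, 2]

# np.select picks the choice for the first matching condition per row
# and falls back to `default` when no condition matches.
df["SHIFT"] = np.select(conditions, choices, default=3)

print(df)
```

Using `default=3` for the overnight shift avoids having to express a range that wraps around midnight as its own condition.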

  • 0 kudos
1 More Replies
Kaniz
by Community Manager
  • 1222 Views
  • 1 replies
  • 0 kudos
Latest Reply
saipujari_spark
Valued Contributor
  • 0 kudos

We can add a new column using the withColumn() method of the data frame, like below:

from pyspark.sql.functions import lit

df = sqlContext.createDataFrame( [(1, "a"), (2, "b")], ("c1", "c2"))

df_new_col = df.withColumn("c3", lit(0))
df_new_col....

  • 0 kudos