Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

SaraCorralLou
by New Contributor III
  • 16859 Views
  • 3 replies
  • 2 kudos

Resolved! Differences between lit(None) or lit(None).cast('string')

I want to define a column with null values in my dataframe using pyspark. This column will later be used for other calculations. What is the difference between creating it in these two different ways?
df.withColumn("New_Column", lit(None))
df.withColumn...

Latest Reply
shadowinc
New Contributor III
  • 2 kudos

For me, df.withColumn("New_Column", lit(None).cast(StringType())) didn't work. I used this instead: df.withColumn("New_Column", lit(null).cast(StringType))

2 More Replies
Aviral-Bhardwaj
by Esteemed Contributor III
  • 19736 Views
  • 2 replies
  • 13 kudos

Understanding Rename in Databricks

Now there are multiple ways to rename Spark DataFrame columns or expressions. We can rename columns or expressions using alias as part of select. We can add or rename columns or expressions using withColumn on top of t...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 13 kudos

Very informative, thanks for sharing.

1 More Replies
noimeta
by Contributor III
  • 4199 Views
  • 4 replies
  • 1 kudos

Apply change data with delete and schema evolution

Hi, currently I'm using Structured Streaming to insert/update/delete rows in a table. A row will be deleted if the value in the 'Operation' column is 'deleted'. Everything seems to work fine until there's a new column. Since I don't need the 'Operation' column in the t...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

Please go through this documentation: https://docs.delta.io/latest/api/python/index.html

3 More Replies
Confused
by New Contributor III
  • 9781 Views
  • 7 replies
  • 2 kudos

Schema evolution issue

Hi all, I am loading some data using Auto Loader but am having trouble with schema evolution. A new column has been added to the data I am loading and I am getting the following error: StreamingQueryException: Encountered unknown field(s) during parsing:...

Latest Reply
rgrosskopf
New Contributor II
  • 2 kudos

I agree that hints are the way to go if you have the schema available, but the whole point of schema evolution is that you might not always know the schema in advance. I received a similar error with a similar streaming query configuration. The issue w...

6 More Replies
Abeeya
by New Contributor II
  • 6582 Views
  • 1 reply
  • 5 kudos

Resolved! How to Overwrite Using pyspark's JDBC without losing constraints on table columns

Hello, my table has a primary key constraint on a particular column. I'm losing the primary key constraint on that column each time I overwrite the table. What can I do to preserve it? Any heads-up would be appreciated. Tried below: df.write.option("truncate", ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

@Abeeya, mode "truncate" is correct to preserve the table. However, when you want to add a new column (mismatched schema), it wants to drop the table anyway.

ahana
by New Contributor III
  • 15402 Views
  • 11 replies
  • 2 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Hi @ahana, did any of the replies help you solve this issue? If so, would you be happy to mark that answer as best so that others can quickly find the solution? Thank you.

10 More Replies
omsas
by New Contributor
  • 2890 Views
  • 2 replies
  • 0 kudos

How to add Columns for Automatic Fill on Pandas Python

1. I have data x. I would like to create a new column whose values are 1, 2 or 3.
2. The name of the column is SHIFT, and this SHIFT column will be filled automatically if the TIME_CREATED column meets the conditions.
3. the conditi...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

You can do something like this in pandas. Note there could be a more performant way to do this too.

import pandas as pd
import numpy as np

df = pd.DataFrame({'a':[1,2,3,4]})
df.head()
> a
> 0 1
> 1 2
> 2 3
> 3 4

conditions = [(df['a'] <=2...
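
The preview above is cut off after its first condition, so the rest of the reply is not known. As a rough sketch only: a conditions list like this is commonly paired with np.select, which fills a new column from the first condition that matches each row. The second condition, the choices list, the default value and the column name shift below are assumptions made up for illustration, not the values from the original reply or from the SHIFT / TIME_CREATED rules in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4]})

# Hypothetical conditions: only the first one mirrors the truncated reply,
# the second is invented for this sketch.
conditions = [
    df["a"] <= 2,
    df["a"] == 3,
]

# Hypothetical labels, one per condition above.
choices = [1, 2]

# For each row, np.select takes the choice belonging to the first condition
# that is True; rows matching no condition receive the default.
df["shift"] = np.select(conditions, choices, default=3)

print(df)
```

The same pattern should carry over to a time-based column by swapping in boolean masks built from that column, with the real thresholds taken from the question.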

1 More Replies