Topics with Label: Column

Forum Posts

by anonturtle • New Contributor

01-30-2023 7:05:26 PM

676 Views
1 reply
0 kudos

How does AutoML decide whether a feature is numeric or categorical?

When running AutoML from its UI, it classifies the feature "local_convenience_store" as both a numeric and a categorical column. This affects the result, because numeric columns are scaled while categorical columns are one-hot encoded. For contex...

Data Engineering

Latest Reply

Anonymous
Not applicable

04-10-2023 7:30:42 AM

0 kudos

@hr then: The approach taken by AutoML to classify features as numeric or categorical depends on the specific AutoML framework or library being used, as different implementations may use different methods or heuristics to make this determination. In ...

by bluesky • New Contributor II

02-04-2023 10:51:00 AM

1354 Views
2 replies
1 kudos

Identity error in Spark SQL: not enough data columns; target has 3 but the inserted data has 2; it's the identity column that is missing here

While inserting into the target table I am getting the error "not enough data columns; target has 3 but the inserted data has 2", but it's the identity column which is the 8th column. insert into table A(col 1,col 2,col3) select col2,col3 from table B join t...

Data Engineering

Latest Reply

Anonymous
Not applicable

04-10-2023 3:05:59 AM

1 kudos

Hi @sky blue, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies

by Gilg • Contributor II

03-30-2023 11:24:03 PM

2067 Views
1 reply
0 kudos

Adding column as StructType

Hi Team, just wondering, how can I add a column to an existing table? I tried the script below but it gives me an error: ParseException: [PARSE_SYNTAX_ERROR] Syntax error at or near '<'(line 1, pos 121) ALTER TABLE table_clone ADD COLUMNS col_name1 STRUC...

Data Engineering

Latest Reply

Anonymous
Not applicable

04-02-2023 9:19:07 AM

0 kudos

@Gil Gonong: In Databricks, you can add a column to an existing table using the ALTER TABLE statement in SQL. Here is an example: ALTER TABLE table_clone ADD COLUMN col_name1 STRUCT<type: STRING, values: ARRAY<STRING>> Note that you need to ...
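
For anyone assembling that statement from a notebook, here is a minimal sketch in plain Python that builds the same ALTER TABLE text as the reply above. The helper names `struct_type` and `add_column_sql` are made up for illustration and are not part of any Databricks API; the table, column, and field names are the ones from the reply.

```python
def struct_type(fields):
    """Render a mapping of field name -> SQL type as a STRUCT<...> type string."""
    inner = ", ".join(f"{name}: {sql_type}" for name, sql_type in fields.items())
    return f"STRUCT<{inner}>"


def add_column_sql(table, column, sql_type):
    """Build an ALTER TABLE ... ADD COLUMN statement (hypothetical helper)."""
    return f"ALTER TABLE {table} ADD COLUMN {column} {sql_type}"


# Field names and types taken from the reply above.
stmt = add_column_sql(
    "table_clone",
    "col_name1",
    struct_type({"type": "STRING", "values": "ARRAY<STRING>"}),
)
print(stmt)

# In a Databricks notebook, where `spark` is predefined, the statement
# could then be executed with: spark.sql(stmt)
```

Building the type string separately keeps the nested angle brackets in one place, which makes it easier to spot an unbalanced `<` or `>` before the statement reaches the parser.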

by MerelyPerfect • New Contributor II

03-24-2023 8:39:40 AM

1960 Views
3 replies
1 kudos

Read base64 JSON column with Autoloader and inferschema.

I have JSON files landing in our blob storage with two fields: 1. offset (integer), 2. value (base64). The value column is JSON with Unicode, so they sent it as base64. The challenge is that this JSON is very large, with 100+ fields, so we cannot define the schema. We c...

Data Engineering

Latest Reply

Anonymous
Not applicable

03-25-2023 10:56:10 PM

1 kudos

Hi @MerelyPerfect Per Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you....

2 More Replies

by ramankr48 • Contributor II

10-18-2022 4:08:43 AM

11709 Views
5 replies
8 kudos

Resolved! How to get all the table names with a specific column or columns in a database?

Let's say there is a database db containing 700 tables, and we need to find the names of all tables in which the column "project_id" is present. This is just an example for understanding the question.

Data Engineering

Latest Reply

Anonymous
Not applicable

10-18-2022 4:53:00 AM

8 kudos

databaseName = "db"
desiredColumn = "project_id"
database = spark.sql(f"show tables in {databaseName} ").collect()
tablenames = []
for row in database:
    cols = spark.table(row.tableName).columns
    if desiredColumn in cols:
        tablenames.append(row....
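
The reply above is cut off, so here is a self-contained sketch of the same idea rather than the poster's exact code. It assumes the standard `spark.sql(...).collect()` and `spark.table(...).columns` calls; the function name `tables_with_column`, the case-insensitive comparison, and the database-qualified table name are my own choices. `FakeSpark` is only a stand-in with made-up tables so the sketch runs without a cluster.

```python
from types import SimpleNamespace


def tables_with_column(spark, database, column):
    """Return the names of tables in `database` whose schema contains `column`."""
    wanted = column.lower()
    matches = []
    for row in spark.sql(f"SHOW TABLES IN {database}").collect():
        # Qualify the table with its database so the lookup does not
        # depend on the session's current database.
        cols = spark.table(f"{database}.{row.tableName}").columns
        if wanted in (c.lower() for c in cols):
            matches.append(row.tableName)
    return matches


class FakeSpark:
    """Stand-in for a SparkSession (hypothetical data, illustration only)."""

    _schemas = {"db.orders": ["id", "project_id"], "db.users": ["id", "name"]}

    def sql(self, query):
        rows = [SimpleNamespace(tableName=n.split(".")[1]) for n in self._schemas]
        return SimpleNamespace(collect=lambda: rows)

    def table(self, name):
        return SimpleNamespace(columns=self._schemas[name])


result = tables_with_column(FakeSpark(), "db", "project_id")
print(result)
```

In a notebook you would pass the real `spark` session instead of `FakeSpark()`. With hundreds of tables this opens every table's metadata once, so it can take a while.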

4 More Replies

by thushar • Contributor

01-23-2023 12:41:21 AM

1249 Views
6 replies
0 kudos

'GeneratedAlwaysAs' along with dataframe.write

Is it possible to use a calculated column definition (like generatedAlwaysAs in a Delta table) while writing the data frame as a Delta file, as in df.write.format("delta")? Are there any options with the dataframe.write method to achieve this...

Data Engineering

Latest Reply

pvignesh92
Honored Contributor

03-09-2023 6:27:58 AM

0 kudos

Hi @Thushar R, this option is not part of the DataFrame write API, because the GeneratedAlwaysAs feature applies only to the Delta format and df.write is a common API that handles writes for all formats. If you want to achieve this programmatically, you can still use...

5 More Replies

by chanansh • Contributor

01-18-2023 7:48:14 AM

840 Views
2 replies
0 kudos

How to compute the difference over time in Spark Structured Streaming?

I have a table with a timestamp column (t) and a list of columns for which I would like to compute the difference over time (v), by some key (k): v_diff(t) = v(t)-v(t-1) for each k independently. Normally I would write: lag_window = Window.partitionBy(C...

Data Engineering

Latest Reply

chanansh
Contributor

02-08-2023 5:32:54 AM

0 kudos

I found this but could not make it work: https://www.databricks.com/blog/2022/10/18/python-arbitrary-stateful-processing-structured-streaming.html
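
In case it helps with the stateful route, here is a plain-Python sketch of just the bookkeeping behind v_diff(t) = v(t)-v(t-1) per key. It is not Spark code: the `state` dict only stands in for whatever per-key state a stateful streaming operator would carry between micro-batches, and the function name `diff_by_key` and the sample batches are made up.

```python
def diff_by_key(events, state=None):
    """Compute v(t) - v(t-1) per key for one batch of (k, t, v) events.

    `state` maps each key to the last value seen in earlier batches.
    Returns (rows, state), where rows are (k, t, v_diff) and v_diff is
    None for the first value ever seen for a key.
    """
    state = {} if state is None else state
    rows = []
    # Process events in timestamp order so "previous" means previous in time.
    for k, t, v in sorted(events, key=lambda e: e[1]):
        prev = state.get(k)
        rows.append((k, t, None if prev is None else v - prev))
        state[k] = v
    return rows, state


# Two consecutive micro-batches (hypothetical data).
batch1 = [("a", 1, 10.0), ("b", 1, 5.0), ("a", 2, 13.0)]
batch2 = [("b", 3, 4.0), ("a", 3, 20.0)]

out1, state = diff_by_key(batch1)
out2, state = diff_by_key(batch2, state)
print(out1)
print(out2)
```

The point is that only the last value per key has to survive between batches. Late or out-of-order events across batches are not handled here and would need extra care.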

1 More Replies

by rocky5 • New Contributor III

11-30-2022 1:14:24 AM

1342 Views
1 reply
2 kudos

Cannot create delta live table

I created a simple Delta Live Table definition, something like: CREATE OR REFRESH STREAMING LIVE TABLE customers_silver AS SELECT * FROM STREAM(LIVE.customers_bronze) But I am getting an error when running a pipeline: com.databricks.sql.transaction.tahoe.De...

Data Engineering

Latest Reply

jose_gonzalez
Moderator

01-30-2023 4:26:13 PM

2 kudos

You might need to execute the following on your tables to avoid this error message: ALTER TABLE <table_name> SET TBLPROPERTIES ( 'delta.minReaderVersion' = '2', 'delta.minWriterVersion' = '5', 'delta.columnMapping.mode' = 'name' ) Docs https...

by sonali1996 • New Contributor

01-19-2023 8:06:40 PM

663 Views
2 replies
0 kudos

Adding a widget value as a column and populating it every time in a table.

Hi, I want the runtime date from ADF as @utcnow() (a base parameter of the notebook activity in ADF) and to read it in ADB using widgets as runtime_date. Further, I want that column to be added to my table X, populated with the value from the widget. Eve...

Data Engineering

Latest Reply

sher
Valued Contributor II

01-22-2023 4:30:27 AM

0 kudos

You can use current_timestamp() or now(). Refer to: https://docs.databricks.com/sql/language-manual/functions/current_timestamp.html

1 More Replies

by data_explorer • New Contributor II

09-29-2022 10:18:02 PM

690 Views
1 reply
2 kudos

Is there any way to mask a column of a table for a user/group without creating dynamic views?

Data Engineering

Latest Reply

User16753725469
Contributor II

11-27-2022 10:52:05 AM

2 kudos

Please refer: https://www.databricks.com/blog/2021/05/26/introducing-databricks-unity-catalog-fine-grained-governance-for-data-and-ai-on-the-lakehouse.html

by lizou • Contributor II

05-08-2022 8:41:24 AM

2020 Views
4 replies
6 kudos

Resolved! Identity column definition lost using save as table

I found an issue: for a table with an identity column defined, when the table column is renamed using this method, the identity definition will be removed. That means using an identity column in a table requires extra attention to check whether the ide...

Data Engineering

Latest Reply

lizou
Contributor II

11-10-2022 6:47:19 AM

6 kudos

To avoid reloading the table, I found we can upgrade the table version and use the rename column command: ALTER TABLE test_id2 SET TBLPROPERTIES ( 'delta.columnMapping.mode' = 'name', 'delta.minReaderVersion' = '2', 'delta.minWriterVersion' = '6') ALTER TABLE ...

3 More Replies

by ramankr48 • Contributor II

09-13-2022 12:21:45 AM

11658 Views
11 replies
2 kudos

Resolved! How to add an identity column to an existing table?

I have created a database called retail, and inside the database there is a table called sales_order. I want to create an identity column in the sales_order table, but while creating it I am getting an error.

Data Engineering

Latest Reply

PriyaAnanthram
Contributor III

09-14-2022 4:46:17 PM

2 kudos

My DBR

10 More Replies

by auser85 • New Contributor III

05-26-2022 10:46:20 AM

1339 Views
2 replies
1 kudos

How to reset the IDENTITY column count?

After accumulating many updates to a Delta table, like: keyExample bigint GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1), my identity column values are in the hundreds of millions. Is there any way that I can reset this value through vacuumi...

Data Engineering

Latest Reply

Anonymous
Not applicable

07-27-2022 10:29:19 AM

1 kudos

Hey there @Andrew Fogarty Does @Werner Stinckens's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need more help. Thanks!

1 More Replies

by joel_iemma • New Contributor III

05-12-2022 5:55:38 AM

2561 Views
5 replies
0 kudos

Resolved! A void column was created after connecting to cosmos

Hi everyone, I have connected to Cosmos using this tutorial: https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/cosmos/azure-cosmos-spark_3_2-12/Samples/DatabricksLiveContainerMigration After creating a table using a simple SQL command: CREATE TA...

Data Engineering

Latest Reply

Anonymous
Not applicable

07-07-2022 9:14:26 AM

0 kudos

Hey there @Joel iemma, hope all is well! Just wanted to check in: would you be happy to mark an answer as best for us, please? It would be really helpful for the other members too. Cheers!

4 More Replies

by cuteabhi32 • New Contributor III

06-06-2022 8:17:54 AM

26191 Views
11 replies
1 kudos

Resolved! Trying to check whether a column exists in a DataFrame: if not, I have to return NULL; if yes, I need to return the column itself, using a UDF

from pyspark import SparkContext
from pyspark import SparkConf
from pyspark.sql.types import *
from pyspark.sql.functions import *
from pyspark.sql import *
from pyspark.sql.types import StringType
from pyspark.sql.functions import udf
df1 = spark.read.form...

Data Engineering

Latest Reply

cuteabhi32
New Contributor III

06-07-2022 7:29:16 AM

1 kudos

Thanks, I modified my code as per your suggestion and it worked perfectly. Thanks again for all your inputs.
dflist = spark.createDataFrame(list(a.columns), "string").toDF("Name")
dfg = dflist.filter(col('name').isin('ref_date')).count()
if dfg == 1:
    a = a.wi...
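
For later readers, a lighter-weight variant of the same check might look like the sketch below. It is not the code from this thread: it skips the helper DataFrame and the UDF, and instead builds a list of SQL select expressions from the column list. The function name `select_exprs`, the default `string` type for the NULL placeholder, and the sample column names are assumptions.

```python
def select_exprs(existing_columns, wanted, null_type="string"):
    """For each wanted column, return either the column name itself or a
    typed NULL placeholder aliased to that name if the column is missing."""
    existing = {c.lower() for c in existing_columns}
    return [
        c if c.lower() in existing else f"CAST(NULL AS {null_type}) AS {c}"
        for c in wanted
    ]


# Hypothetical schema: `ref_date` is missing, so it becomes a NULL column.
exprs = select_exprs(["id", "name"], ["id", "ref_date"])
print(exprs)

# With a real DataFrame `a`, the expressions could be applied with:
#   a = a.selectExpr(*select_exprs(a.columns, ["id", "ref_date"]))
```

Column names containing spaces or special characters would need backtick quoting before being passed to `selectExpr`; the sketch leaves that out.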

10 More Replies