Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

KKo
by Contributor III
  • 2127 Views
  • 2 replies
  • 2 kudos

delete and append in delta path

I am deleting data from the curated path based on a date column and appending the staged data to it on each run, using the script below. My fear is that, just after the delete operation, if any network issue appeared and the job stopped before it appended the staged da...
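One way to close that failure window is to replace the delete-then-append with a single atomic Delta overwrite using replaceWhere, so the delete and the append commit together or not at all. A minimal sketch, assuming a staged DataFrame staged_df, a date column event_date, and a made-up curated path (none of these names come from the post):

# Atomically replace one date slice of the curated Delta path.
(staged_df.write
    .format("delta")
    .mode("overwrite")
    .option("replaceWhere", "event_date = '2024-01-15'")  # assumed column/value
    .save("/mnt/curated/my_table"))                       # assumed path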

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

thanks man

1 More Replies
timothy_hartant
by New Contributor
  • 1462 Views
  • 3 replies
  • 1 kudos

Resolved! Databricks Certified Machine Learning Associate Badge not Received Yet

I recently passed my Databricks Certified Machine Learning Associate exam on Tuesday (04/01) and still have not received my badge on the Accredible website. Please advise.

Latest Reply
Chaitanya_Raju
Honored Contributor
  • 1 kudos

@Timothy Hartanto First of all, congratulations on your achievement. You will receive your certificate and the badge at your registered mail address within 24-48 hours of completing your examination. Hope this helps! All the very best for your f...

2 More Replies
Manojkumar
by New Contributor II
  • 3102 Views
  • 4 replies
  • 0 kudos

Can we assign a default value to selected columns in Spark SQL when the column is not present?

I'm reading an Avro file and loading it into a table. The Avro data is nested. Now from this table I'm trying to extract the necessary elements using Spark SQL, using the explode function when there is array data. Now the challenge is there are cases like the ...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 0 kudos

Hi @manoj kumar, the easiest way would be to make use of unmanaged Delta tables and, while loading data into the path of that table, enable mergeSchema by setting it to true. This handles all the schema differences; in case a column is not present it is treated as null an...
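A minimal sketch of that approach, with made-up path and column names; the if-guard shows one way to surface a missing column as null, and mergeSchema keeps the Delta schema in sync:

from pyspark.sql import functions as F

df = spark.read.format("avro").load("/mnt/raw/events")  # assumed source path
# If an expected column is absent in this batch, add it as a null column.
if "discount_code" not in df.columns:
    df = df.withColumn("discount_code", F.lit(None).cast("string"))
(df.write
    .format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("/mnt/curated/events"))  # assumed path of the unmanaged table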

3 More Replies
julie
by New Contributor III
  • 3371 Views
  • 5 replies
  • 3 kudos

Resolved! Scope creation in Databricks or Confluent?

Hello, I am a newbie in this field and am trying to access a Confluent Kafka stream in Azure Databricks, based on a beginner's video by Databricks. I have a free-trial Databricks cluster right now. When I run the notebook below, it errors out on line 5 o...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

For testing, create it without a secret scope. It will be unsafe, but you can paste secrets as plain strings in the notebook while testing. Here is the code I used for loading data from Confluent: inputDF = (spark.readStream.format("kafka").option("kafka.b...
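For reference, a complete version of that pattern might look like the sketch below; the bootstrap server, API key and secret, and topic are placeholders, not values from the thread:

inputDF = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "pkc-xxxxx.region.azure.confluent.cloud:9092")  # placeholder
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config",
            "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
            "required username='API_KEY' password='API_SECRET';")  # placeholder credentials
    .option("subscribe", "my_topic")        # placeholder topic
    .option("startingOffsets", "earliest")
    .load())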

4 More Replies
gbradley145
by New Contributor III
  • 3891 Views
  • 3 replies
  • 4 kudos

Why does Databricks SQL drop trailing zeros in the decimal data type?

All, I have a column, RateAdj, that is defined as DECIMAL(15,5), and I can see that the value is 4.00000, but when this gets inserted into my table it shows as just 4.

%sql
SELECT LTRIM(RTRIM(IFNULL(FORMAT_NUMBER(RateADJ, '0.00000'), '0.00000')))

This i...
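The trailing zeros are most likely a display issue rather than lost data: a DECIMAL(15,5) value keeps its five-digit scale, and FORMAT_NUMBER renders it as a string with the zeros intact. A quick check using a literal instead of the poster's table:

spark.sql("SELECT CAST(4 AS DECIMAL(15,5)) AS RateAdj").show()
# +-------+
# |RateAdj|
# +-------+
# |4.00000|
# +-------+
spark.sql("SELECT FORMAT_NUMBER(CAST(4 AS DECIMAL(15,5)), '0.00000') AS s").show()
# Returns the string '4.00000', trailing zeros preserved.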

Latest Reply
silvathomas
New Contributor II
  • 4 kudos

2 More Replies
galop12
by New Contributor
  • 3224 Views
  • 3 replies
  • 0 kudos

Databricks workspace (with managed VNET) upgrade to premium failing

I am trying to upgrade our Databricks workspace from standard to premium but am running into issues. The workspace is currently deployed in a managed VNET. I tried the migration tool as well as just re-creating a premium workspace with the same parameter...

Latest Reply
lskw
New Contributor II
  • 0 kudos

Hi, I have the same situation when trying to upgrade from Standard to Premium on Azure. My error: "ConflictWithNetworkIntentPolicy", "message": "Found conflicts with NetworkIntentPolicy. Details: Subnet or Virtual Network cannot have resources or properties...

2 More Replies
CaseyTercek_
by New Contributor II
  • 880 Views
  • 2 replies
  • 1 kudos

Lineage - It would be nice if the lineage in Unity would allow for API calls that could add additional lineage information, somehow. I am not certain...

Lineage - It would be nice if the lineage in Unity would allow for API calls that could add additional lineage information, somehow. I am not certain exactly what it would look like, but some sort of feature to include source systems in it.

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Purview is quite popular as an integration to solve this issue. I think lineage in Unity Catalog is designed to be auto-generated. I know there are information tables, but I have never manually manipulated them.

1 More Replies
vanessafvg
by New Contributor III
  • 3064 Views
  • 3 replies
  • 1 kudos

linking filters from different Databricks SQL queries in a Dashboard

I am having to use a Databricks SQL dashboard for some analysis, and it seems very clunky. If I have multiple queries, is it possible to apply the same filters to all the queries in the dashboard, or do I have to duplicate the filters for each query in the ...

Latest Reply
FelixH
New Contributor II
  • 1 kudos

Same issue here. According to the docs, using query filters with the same name and values should result in a single dashboard filter. However, the filters are duplicated. I also tried using this setting, but with no success.

2 More Replies
sher
by Valued Contributor II
  • 1449 Views
  • 3 replies
  • 1 kudos

Resolved! Do we have any certificate vouchers for the Databricks sessions in the upcoming days?

Hi Team, do we have any program for certificate vouchers for the Databricks sessions in the upcoming days?

Latest Reply
sher
Valued Contributor II
  • 1 kudos

@Vidula Khanna I got this link for registering for the certificate voucher: https://docs.google.com/presentation/d/1sy5hSSnFtncrpYY1EYi0WMsDkJK0dYk9iKBAeeAha8E/edit#slide=id.g1ade45a9cd6_0_543

2 More Replies
pjp94
by Contributor
  • 6892 Views
  • 9 replies
  • 7 kudos

Calling a Python function (def) in Databricks

Not sure if I'm missing something here, but running a task outside of a Python function runs much, much quicker than executing the same task inside a function. Is there something I'm missing with how Spark handles functions? 1) def task(x): y = dostuf...

Latest Reply
sher
Valued Contributor II
  • 7 kudos

Don't use a normal Python function; use a UDF in PySpark, as that will be faster.
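For illustration, a minimal sketch of wrapping row-level logic in a pandas UDF, which runs vectorized on column batches; the doubling logic and column names are made-up stand-ins for the poster's dostuff:

import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf("double")
def task(x: pd.Series) -> pd.Series:
    # Placeholder for "dostuff": executed on whole batches, not row by row.
    return x * 2.0

df = spark.range(10).withColumn("y", task("id"))
df.show()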

8 More Replies
vr
by Contributor
  • 4696 Views
  • 3 replies
  • 2 kudos

Resolved! Is timestamp difference always INTERVAL DAY TO SECOND?

My observations show that a timestamp difference has the type INTERVAL DAY TO SECOND:

select typeof(getdate() - current_date())
-- interval day to second

But is it guaranteed? Can it be DAY TO MINUTE or, say, YEAR T...

Latest Reply
sher
Valued Contributor II
  • 2 kudos

You can check the given example here: https://docs.databricks.com/sql/language-manual/functions/minussign.html. This might help you.
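That page states that subtracting timestamps yields a day-time interval. A quick way to confirm the type in a notebook (getdate() in the question is Databricks SQL's alias for the current timestamp):

spark.sql(
    "SELECT typeof(current_timestamp() - current_date()) AS diff_type"
).show(truncate=False)
# Expected on Databricks: interval day to second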

2 More Replies
Vijaykumarj
by New Contributor III
  • 4294 Views
  • 4 replies
  • 3 kudos

Generate a SHA-2 hash key while loading files to a Delta table

I have files in Azure Data Lake, and I am using Auto Loader to read the incremental files. The files don't have a primary key to load, so I want to use some columns to generate a hash key and use it as the primary key to apply changes. In this case I want to ...
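A minimal sketch of deriving such a key with sha2 and concat_ws on top of an Auto Loader stream; the paths, file format, and key columns are assumptions, not details from the post:

from pyspark.sql import functions as F

df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")                           # assumed format
    .option("cloudFiles.schemaLocation", "/mnt/landing/_schemas")  # assumed
    .load("/mnt/landing/events"))                                  # assumed path

# Deterministic SHA-256 key built from the chosen business columns.
keyed = df.withColumn(
    "hashkey",
    F.sha2(F.concat_ws("||", F.col("customer_id"), F.col("order_date")), 256)
)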

Latest Reply
Debayan
Esteemed Contributor III
  • 3 kudos

Hi, could you please provide the error code?

3 More Replies
prasannar
by New Contributor II
  • 3087 Views
  • 3 replies
  • 3 kudos
Latest Reply
sher
Valued Contributor II
  • 3 kudos

df.write.format('jdbc').options(
    url='jdbc:oracle:thin:@192.168.11.100:1521:ORCL',
    driver='oracle.jdbc.driver.OracleDriver',
    dbtable='testschema.test',
    user='testschema',
    password='password'
).mode('overwrite').save()

Try ...

2 More Replies
dulu
by New Contributor III
  • 11220 Views
  • 5 replies
  • 6 kudos

Split a character string in a cell with SQL

I have the following input: I am looking for a way to split the characters in the item_order_detail column into 2 columns, itemID and itemName, as in the output table below, using a SQL function in Databricks with Spark SQL version 3.2.1. Can someone suggest a so...

Latest Reply
sher
Valued Contributor II
  • 6 kudos

You need to use the explode function: https://stackoverflow.com/questions/61070630/spark-explode-column-with-json-array-to-rows
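Since the sample image did not survive, here is an illustrative sketch that assumes item_order_detail holds "id:name" pairs; the delimiter and sample rows are guesses, not data from the post:

df = spark.createDataFrame(
    [("1001:Laptop",), ("1002:Mouse",)],  # made-up sample rows
    ["item_order_detail"],
)
df.createOrReplaceTempView("orders")
spark.sql("""
    SELECT split(item_order_detail, ':')[0] AS itemID,
           split(item_order_detail, ':')[1] AS itemName
    FROM orders
""").show()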

4 More Replies
RamyaN
by New Contributor II
  • 2894 Views
  • 2 replies
  • 3 kudos

How to read the enum[] (array of enum) datatype from Postgres using Spark

We are trying to read a column which has an array-of-enum datatype from Postgres as a string datatype in the target. We were able to achieve this by explicitly using the concat function while extracting, like below: val jdbcDF3 = spark.read .format("jdbc") .option(...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

You can try a custom schema for the JDBC read: .option("customSchema", "colname STRING")
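Put together, the read might look like this sketch; the connection details and column name are placeholders, and the PostgreSQL JDBC driver must already be on the cluster:

jdbcDF = (spark.read
    .format("jdbc")
    .option("url", "jdbc:postgresql://dbhost:5432/mydb")  # placeholder
    .option("dbtable", "public.orders")                   # placeholder
    .option("user", "reader")                             # placeholder
    .option("password", "secret")                         # placeholder
    # Read the enum[] column as a plain string on the Spark side.
    .option("customSchema", "status_history STRING")
    .load())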

1 More Replies
