Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16783854657
by New Contributor III
  • 2601 Views
  • 4 replies
  • 6 kudos

How do I know how much of a query/job used Photon?

I'm trying to use the native execution engine, Photon. How can I tell if a query is using Photon or is falling back to the non-native Spark engine?

Latest Reply
venkat09
New Contributor III
  • 6 kudos

Correcting a typo in the second point of my previous post: open the execution plan of your task (available under the SQL/DataFrame tab in the Spark UI). It shows which operations ran in the Photon engine and which did not.
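As a rough illustration of reading such a plan programmatically, here is a minimal Python sketch. It assumes that Photon operators appear in the plan text with names beginning with "Photon"; the sample plan, the `photon_summary` helper, and the operator names in it are hypothetical and only stand in for whatever your own plan shows.

```python
import re
from collections import Counter

# Hypothetical plan text, shaped like an execution plan copied from the
# SQL/DataFrame tab. The operator names are illustrative only.
SAMPLE_PLAN = """\
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- ColumnarToRow
   +- PhotonResultStage
      +- PhotonGroupingAgg(keys=[country#1], functions=[sum(amount#2)])
         +- PhotonShuffleExchangeSource
            +- PhotonShuffleExchangeSink hashpartitioning(country#1, 200)
               +- PhotonScan parquet [country#1,amount#2]
"""

# Skip tree-drawing characters and stage markers, then capture the node name.
OPERATOR = re.compile(r"^[\s:+\-|*()\d]*([A-Za-z]\w*)")


def photon_summary(plan_text):
    """Count plan nodes with and without the assumed 'Photon' name prefix."""
    photon, other = Counter(), Counter()
    for line in plan_text.splitlines():
        match = OPERATOR.match(line)
        if not match:
            continue  # header lines such as '== Physical Plan =='
        name = match.group(1)
        if name.startswith("Photon"):
            photon[name] += 1
        else:
            other[name] += 1
    return photon, other


photon_nodes, other_nodes = photon_summary(SAMPLE_PLAN)
print("Photon nodes:", sum(photon_nodes.values()))
print("Other nodes:", sorted(other_nodes))
```

A node without the prefix is not necessarily a fallback (wrapper nodes are counted as "other" too), so treat the counts as a starting point for inspecting the plan, not as a measurement of Photon coverage.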

3 More Replies
patdev
by New Contributor III
  • 4294 Views
  • 9 replies
  • 2 kudos

Text data type not supported and the data has a huge text field: how to bring it over?

Hello all, I have a medical data file, and one of its fields is a text field holding a huge amount of data. The big problem is that Databricks does not support the text data type, so how can I bring the data over? I tried conversion and casts in various ways, but so far not ...

Latest Reply
patdev
New Contributor III
  • 2 kudos

Setting escapeQuotes to false helped bring the huge text data into the column. Thanks.

8 More Replies
Gaurav_784295
by New Contributor III
  • 1977 Views
  • 2 replies
  • 0 kudos

pyspark.sql.utils.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets

pyspark.sql.utils.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets. I am getting this error while writing. Can anyone please tell me how to resolve it?

Latest Reply
Gaurav_784295
New Contributor III
  • 0 kudos

I'm trying to run a query on a table and then store the result in another table: query = stream.writeStream.format("delta").foreachBatch(batch_function).option('checkpointLocation', self.checkpoint_loc).trigger(processingTime...

1 More Replies
ty2
by New Contributor II
  • 1661 Views
  • 3 replies
  • 1 kudos

Resolved! How to start my cluster

I tried to stop my_cluster from Compute using the admin role. However, using the same account, I could not restart my_cluster. The details are as follows. What should I do?

20230121-my_cluster_not_start
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 1 kudos

It seems this is Community Edition, where this feature is disabled. Delete this cluster and create a new one.

2 More Replies
Sujitha
by Community Manager
  • 622 Views
  • 1 replies
  • 2 kudos

Documentation Update January 13 - 19

Documentation Update January 13 - 19. Databricks documentation provides how-to guidance and reference information for data analysts, data scientists, and data engineers working in the Databricks Data Science & Engineering, Databricks Machine Learning, ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Thanks for the details.

vk217
by Contributor
  • 1111 Views
  • 1 replies
  • 0 kudos

Resolved! Import course material into Databricks

I signed up for the data engineering course and downloaded the course material. However, I cannot access the link to import the course material into Databricks. The link below gives me access denied. https://www.databricks.training/step-by-step/importing-co...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

Use this link: https://github.com/databricks-academy/data-engineering-with-databricks-english Download the repository to your local machine and then import it; it will work.

Chris_Konsur
by New Contributor III
  • 5969 Views
  • 1 replies
  • 0 kudos

Resolved! Configuring the Databricks Jobs APIs and I get Error 403 User not authorized.

I'm configuring the Databricks Jobs APIs and I get Error 403 User not authorized. I found out the issue is that I need to apply a role and set API permissions for AzureDatabricks: Azure Portal > Azure Databricks > Azure Databricks Service > Access control (IAM...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

The user who is trying to start a particular job needs access or run permission for that job. Grant the required permission and it should work.

Mado
by Valued Contributor II
  • 2381 Views
  • 1 replies
  • 2 kudos

Resolved! How to get a snapshot of a streaming delta table as a static table?

Hi, assume that I have a streaming Delta table. Is there any way to get a snapshot of the streaming table as a static table? The reason is that I need to join this streaming table with a static table by: output = output.join(country_information, ["Country"], ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 2 kudos

Hi @Mohammad Saber, yes, you can try this approach. Create the snapshot with a timestamp: snapshot_time = "2022-10-01 00:00:00"; spark.sql(f"CREATE TABLE snapshot_table_at_time AS SELECT * FROM streaming_table TIMESTAMP AS OF '{snapshot_time}'"). Then, yo...

asif5494
by New Contributor III
  • 1526 Views
  • 3 replies
  • 0 kudos

preActions in Databricks when writing into a Google BigQuery table?

I am writing into a Google BigQuery table using append mode. I need to delete the current day's data before writing new data. Is there a preActions parameter that can be used to delete data before writing into the table? Below is the s...

Latest Reply
Cami
Contributor III
  • 0 kudos

Can you use overwrite mode instead of append?

2 More Replies
Neli
by New Contributor II
  • 2885 Views
  • 2 replies
  • 0 kudos

How to add the current date as a column in Databricks

I am trying to create a new column "Ingest_date" in a table, which should contain the current date. I am getting the error "Current date cannot be used in a generated column". Can you please review and suggest an alternative way to get the current date into a Delta table?

Latest Reply
daniel_sahal
Esteemed Contributor
  • 0 kudos

"A generation expression can use any SQL functions in Spark that always return the same result when given the same argument values." Source: https://docs.delta.io/latest/delta-batch.html#use-generated-columns This means it is intended not to work. You ca...

1 More Replies
hari
by Contributor
  • 3035 Views
  • 4 replies
  • 3 kudos

Multiple streaming sources to the same delta table

Is it possible to have two streaming sources doing a MERGE into the same Delta table, with each source setting a different set of fields? We are trying to create a single table which will be used by the service layer for queries. The table can be populat...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Zachary Higgins, we haven't heard from you since the last response from @Harikrishnan P H, and I was checking back to see if you have a resolution. If you have a solution, please share it with the community, as it can be helpful to ot...

3 More Replies
MeghashreeM
by New Contributor III
  • 2414 Views
  • 3 replies
  • 5 kudos

org.apache.spark.sql.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets

org.apache.spark.sql.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets

Latest Reply
Kaniz_Fatma
Community Manager
  • 5 kudos

Hi @MeghashreeM! My name is Kaniz, and I'm a technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers on the Forum have an answer to your question first. Otherwise, I will follow up shortly with a response.

2 More Replies
chanansh
by Contributor
  • 3945 Views
  • 9 replies
  • 9 kudos

Copy files from Azure to S3

I am trying to copy files from Azure to S3. I've created a solution that compares file lists, copies each file manually to a temp file, and uploads it. However, I just found Auto Loader and I would like to use that https://docs.databricks.com/ingestion/auto-loader/i...

Latest Reply
Falokun
New Contributor II
  • 9 kudos

Just use tools like Goodsync and Gs Richcopy 360 to copy directly from Blob storage to S3. I think you will never face problems like that.

8 More Replies
Nhan_Nguyen
by Valued Contributor
  • 1321 Views
  • 1 replies
  • 6 kudos

Resolved! Is the logic executed when we create a view?

Hi all, I am curious about how a VIEW works on Databricks. Could anyone please help me clarify this? In a normal database like Postgres or MS SQL, when we define a view, the logic is not executed at that time; it only runs when we query that VIEW. Not sure how VIEW...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 6 kudos

@Nhan Nguyen​ It works the same. CREATE VIEW constructs a virtual table that has no physical data.
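The behaviour described here can be sketched with Python's built-in sqlite3 module, used purely as a stand-in for illustration (it is not Databricks, and the `orders` table and `totals` view are hypothetical names). The view is defined while the table is still empty, and rows inserted afterwards still show up, because the view's query runs only when the view is read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (country TEXT, amount INTEGER)")

# Defining the view stores only the query text; nothing is computed yet.
conn.execute(
    "CREATE VIEW totals AS "
    "SELECT country, SUM(amount) AS total FROM orders GROUP BY country"
)

# These rows are inserted after the view was created.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("NL", 10), ("NL", 5), ("VN", 7)],
)

# The aggregation runs now, at query time, over the current table contents.
rows = conn.execute(
    "SELECT country, total FROM totals ORDER BY country"
).fetchall()
print(rows)  # [('NL', 15), ('VN', 7)]
```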
