Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

berserkersap
by Contributor
  • 7782 Views
  • 3 replies
  • 5 kudos

What is the timeout for dbutils.notebook.run when timeout = 0?

Hello everyone, I have several notebooks (around 10) and I want to run them in sequential order. At first I thought of using %run, but I have a variable that is repeatedly used in every notebook. So now I am thinking of passing that variable from one ma...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 5 kudos

Hi @pavan venkata, yes, as the documentation says, 0 means no timeout. The notebook will take its sweet time to complete execution without throwing an error due to a time limit, whether it takes 1 min, 1 hour, 1 day, or more. H...

2 More Replies
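For reference, a minimal sketch of the sequential pattern discussed in this thread; the notebook paths and the "shared_date" parameter name are hypothetical, and dbutils is the object the Databricks notebook runtime provides:

# Run notebooks one after another, passing a shared value as a widget argument.
notebooks = ["/Repos/project/notebook_01", "/Repos/project/notebook_02"]
shared_value = "2023-01-01"

for path in notebooks:
    # A timeout of 0 means no timeout: the call waits however long the child
    # notebook takes; a nonzero value raises an error once it is exceeded.
    result = dbutils.notebook.run(path, 0, {"shared_date": shared_value})
    print(f"{path} returned: {result}")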
Databrickguy
by New Contributor II
  • 3829 Views
  • 6 replies
  • 3 kudos

Resolved! How to parse/extract/format a string based on a pattern?

How do you parse, extract, or format a string based on a pattern? SQL Server has a function which will format a string based on a pattern. For example, if the string is "abcdefgh" and the pattern is XX-XX-XXXX, the result will be "ab-cd-efgh". How to achieve this wit...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 3 kudos

@Tim zhang, thanks for your code; here is your answer. I asked this question on Stack Overflow and got this answer. Here is the Stack Overflow link: https://stackoverflow.com/questions/74845760/how-to-parse-a-pattern-and-use-it-to-format-a-string-u...

5 More Replies
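Separately from the Stack Overflow answer linked above, a hedged PySpark sketch of one way to do it: a small UDF walks the pattern, consuming one input character per X and copying any other pattern character through. The string and pattern come from the example in the question; spark is the notebook-provided session:

from pyspark.sql import functions as F

def apply_pattern(s: str, pattern: str) -> str:
    # Each 'X' consumes one character of the input; literals pass through.
    out, i = [], 0
    for ch in pattern:
        if ch == "X":
            out.append(s[i])
            i += 1
        else:
            out.append(ch)
    return "".join(out)

apply_pattern_udf = F.udf(apply_pattern)

df = spark.createDataFrame([("abcdefgh",)], ["raw"])
df.select(apply_pattern_udf(F.col("raw"), F.lit("XX-XX-XXXX")).alias("formatted")).show()
# => ab-cd-efgh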
Pat
by Honored Contributor III
  • 4274 Views
  • 5 replies
  • 9 kudos

Reading data from "dbfs:/mnt/"

Hi community, I don't know what is happening, TBH. I have a use case where data is written to the location "dbfs:/mnt/..." (don't ask me why it's mounted, it's just a side project). I do believe the data is stored in ADLS2. I've been trying to read the ...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 9 kudos

This is really interesting; I've never faced this type of situation, @Pat Sienkiewicz. Can you please share the whole code so that we can test and debug this in our system? Thanks, Aviral

4 More Replies
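For anyone landing here, a minimal sketch of reading from a mount, with a hypothetical mount path and format; listing the directory first helps confirm the mount actually resolves:

# spark and dbutils are provided by the Databricks notebook runtime;
# the mount path and the delta format below are placeholders.
display(dbutils.fs.ls("dbfs:/mnt/mydata"))  # confirm the mount resolves
df = spark.read.format("delta").load("dbfs:/mnt/mydata/events")
df.printSchema()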
bakselrud
by New Contributor III
  • 6211 Views
  • 12 replies
  • 2 kudos

Resolved! DLT pipeline failure - Detected a data update... This is currently not supported

We are using a DLT pipeline in a Databricks workspace hosted on the Microsoft Azure platform, which is failing intermittently and for an unclear reason. The pipeline is as follows: spark.readStream.format("delta").option("mergeSchema", "true").option("ignoreChange...

Latest Reply
bakselrud
New Contributor III
  • 2 kudos

OK, so after doing some investigation on the way to resolving my original question, I think we're getting some clarity after all. Consider the following data frame that is ingested by the DLT streaming pipeline: dfMock = spark.sparkContext.parallelize([[1,...

11 More Replies
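For context, this error typically appears when the streaming read encounters a commit that updated or deleted rows in the Delta source. A hedged sketch of the commonly suggested mitigation, with a placeholder source path (on newer runtimes, skipChangeCommits plays a similar role):

import dlt

@dlt.table()
def bronze_events():
    # ignoreChanges makes the stream pass through, rather than fail on,
    # commits that rewrite existing rows; downstream logic must tolerate
    # the duplicates this can produce.
    return (
        spark.readStream.format("delta")
        .option("mergeSchema", "true")
        .option("ignoreChanges", "true")
        .load("dbfs:/mnt/raw/events")
    )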
SM14
by New Contributor
  • 951 Views
  • 1 reply
  • 0 kudos

Row Level Validation

I have two arrays, one for devl, the other for prod. Inside these there are many tables. How do I compare and check the count difference? I want to create an automated script to check the count difference and perform row-level validation. PySpark script...

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, you can use the EXCEPT command for this. Please refer to https://stackoverflow.com/questions/70366209/databricks-comparing-two-tables-to-see-which-records-are-missing. Please let us know if this helps.

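A minimal sketch of that EXCEPT-style check in PySpark, with placeholder schema and table names:

# spark is the notebook-provided session.
tables = ["customers", "orders"]

for t in tables:
    dev = spark.table(f"devl.{t}")
    prod = spark.table(f"prod.{t}")
    print(t, "count difference:", dev.count() - prod.count())
    # Row-level validation: rows in dev that are missing from prod
    # (exceptAll is the DataFrame analogue of SQL EXCEPT ALL).
    dev.exceptAll(prod).show(20, truncate=False)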
g96g
by New Contributor III
  • 937 Views
  • 2 replies
  • 1 kudos

Databricks SQL permission problems

We are using a catalog, and normally I have ALL PRIVILEGES user status, but I'm not able to modify the SQL scripts created by some of my colleagues. They have to give me access, and after that I'm able to modify them. How can I solve this proble...

Latest Reply
Debayan
Esteemed Contributor III
  • 1 kudos

Hi, when you are not able to modify, could you please confirm the error you are receiving? Also, you can refer to https://docs.databricks.com/_static/notebooks/set-owners-notebook.html and https://docs.databricks.com/sql/admin/transfer-ownership.html

1 More Reply
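If the scripts operate on catalog objects, a hedged sketch of the grants an admin or the owner could issue; all names are placeholders, and ownership of Databricks SQL queries themselves is transferred via the docs linked above:

# Run by an admin or the current owner; principal and names are placeholders.
spark.sql("GRANT ALL PRIVILEGES ON SCHEMA my_catalog.my_schema TO `user@example.com`")
spark.sql("ALTER TABLE my_catalog.my_schema.my_table OWNER TO `user@example.com`")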
Viren123
by Contributor
  • 2890 Views
  • 6 replies
  • 4 kudos

API to write into Databricks tables

Hello experts, is there any API in Databricks that allows writing data into Databricks tables? I would like to send small-size log information to Databricks tables from another service. What are my options? Thank you very much.

Latest Reply
Kaniz_Fatma
Community Manager
  • 4 kudos

Hi @Viren Devi, it would mean a lot if you could select the "Best Answer" to help others find the correct answer faster. This makes that answer appear right after the question, so it's easier to find within a thread. It also helps us mark the question...

5 More Replies
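One commonly cited option is the SQL Statement Execution API; a hedged sketch follows, assuming a running SQL warehouse, with the host, token, warehouse id, and logs table all placeholders:

import requests

resp = requests.post(
    "https://<workspace-host>/api/2.0/sql/statements",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={
        "warehouse_id": "<warehouse-id>",
        "statement": "INSERT INTO logs.service_events VALUES (:ts, :msg)",
        "parameters": [
            {"name": "ts", "value": "2023-01-01T00:00:00Z"},
            {"name": "msg", "value": "service started"},
        ],
    },
)
resp.raise_for_status()
print(resp.json().get("status"))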
CHANDAN_NANDY
by New Contributor III
  • 3941 Views
  • 2 replies
  • 4 kudos

Resolved! GitHub Copilot Support

Any idea why GitHub Copilot is not available in Azure Databricks, though it supports GitHub?

Latest Reply
nightcoder
New Contributor II
  • 4 kudos

That is true (this is not an answer but a comment): VS Code is supported. But VS Code does not integrate with notebooks on AWS. When will this feature be available?

1 More Reply
brickster_2018
by Esteemed Contributor
  • 1558 Views
  • 3 replies
  • 0 kudos

Resolved! For the Autoloader, cloudFiles.includeExistingFiles option, is ordering respected?

If yes, how is ordering ensured? For example, let's say a number of CDC change files are uploaded to a directory over time. If a table were created using the cloudFiles source, in what order would those files be processed?

Latest Reply
Hanish_Goel
New Contributor II
  • 0 kudos

Hi, is there any new development in terms of ensuring the ordering of files in Auto Loader?

2 More Replies
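For reference, a sketch of the option in question, with placeholder paths; note that, as far as the documentation indicates, Auto Loader does not guarantee strict per-file ordering, so CDC logic is safer keyed on a sequence column in the data than on file arrival order:

# spark is the notebook-provided session; paths and format are placeholders.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.includeExistingFiles", "true")  # backfill existing files
    .option("cloudFiles.schemaLocation", "dbfs:/mnt/checkpoints/cdc_schema")
    .load("dbfs:/mnt/raw/cdc")
)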
vk217
by Contributor
  • 9885 Views
  • 5 replies
  • 17 kudos

Resolved! Python wheel cannot be installed as a library.

When I try to install the Python .whl library, I get the below error. However, I can install it as a jar and it works fine. One difference is that I am creating my own cluster by cloning an existing cluster and copying the .whl to a folder called testin...

Latest Reply
vk217
Contributor
  • 17 kudos

The issue was that the package was renamed after it was installed to the cluster and hence it was not recognized.

4 More Replies
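For anyone hitting similar wheel issues, a notebook-scoped install sketch with a placeholder path; pip identifies the distribution from the wheel's filename and metadata, which is why renaming the file (as happened here) breaks recognition:

# The wheel filename must stay consistent with the package metadata.
%pip install /dbfs/FileStore/testing/my_package-0.1.0-py3-none-any.whl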
140015
by New Contributor III
  • 417 Views
  • 0 replies
  • 0 kudos

DLT using the result of one view in another table with collect()

Hey, do you know if there is an option to implement something like this in DLT? @dlt.view() def view_1(): # some calculations that return a small dataframe with around max 80 rows @dlt.table() def table_1(): result_df = dlt.read("view_1") resu...

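No replies yet; for concreteness, a sketch of the pattern being asked about, with placeholder logic. Whether collect() on a DLT view is supported inside another table function is exactly the open question here:

import dlt
from pyspark.sql import functions as F

@dlt.view()
def view_1():
    # Placeholder for the small calculation (at most ~80 rows).
    return spark.range(80).select(F.col("id").alias("allowed_id"))

@dlt.table()
def table_1():
    # The pattern in question: pull the view's rows to the driver and
    # reuse them, e.g. as an isin() filter on a larger table.
    allowed = [r.allowed_id for r in dlt.read("view_1").collect()]
    return spark.range(1000).where(F.col("id").isin(allowed))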
ACP
by New Contributor III
  • 1530 Views
  • 4 replies
  • 2 kudos

Accreditation, Badges, Points not received

Hi there, I have completed a few courses but didn't receive any badges or points. I also did an accreditation but didn't receive anything for that either. It's already been 3 or 4 days and still nothing. I would really appreciate it if Databricks could fix this. Ma...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Andre Paiva, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

3 More Replies
KVNARK
by Honored Contributor II
  • 2206 Views
  • 3 replies
  • 8 kudos

Resolved! Advantages of Databricks Lakehouse over Azure Synapse

What are the advantages of Databricks over Azure Synapse Analytics? It looks like most of their features, like computation, storage, etc., are almost similar in both.

Latest Reply
Geeta1
Valued Contributor
  • 8 kudos

The link below has a good comparison of both: https://hevodata.com/learn/azure-synapse-vs-databricks/

2 More Replies
raman
by New Contributor II
  • 806 Views
  • 2 replies
  • 0 kudos

Spark pushdown filter not being respected on DBFS

I have a parquet file with a column g1 with schema StructField(g1, IntegerType, true). Now I have a query with a filter on g1. What's weird in the SQL viewer is that Spark is loading all the rows from that file, even though in the physical plan I can see th...

Latest Reply
raman
New Contributor II
  • 0 kudos

Thanks @Ajay Pandey, please find attached the physical plan. Query: SELECT identityMap, segmentMembership, _repo, workEmail, person, homePhone, workPhone, workAddress, personalEmail, homeAddress FROM final_segment_index_table_v2 WHERE (g1 >= 128 AND g1 <...

1 More Reply
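A sketch of how to check what is actually pushed down; the path is a placeholder and the filter's upper bound is illustrative, since the query above is truncated:

df = spark.read.parquet("dbfs:/mnt/data/final_segment_index_table_v2")
df.where("g1 >= 128 AND g1 < 256").explain(True)
# Look for "PushedFilters: [GreaterThanOrEqual(g1,128), ...]" in the scan node.
# Pushed filters prune Parquet row groups using min/max statistics; if g1
# values span every row group, Spark can still read most rows even though
# the filter is pushed down, which matches the behavior described above.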
Kopal
by New Contributor II
  • 3637 Views
  • 3 replies
  • 3 kudos

Resolved! Data Engineering - CTAS - External Tables - Limitations of CTAS for external tables - can or cannot use options and location

Data Engineering - CTAS - External Tables. Can someone help me understand why, in chapter 3.3, we cannot directly use CTAS with OPTIONS and LOCATION to specify the delimiter and location of a CSV? Or have I misunderstood? Details: In Data Engineering with Databri...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

The second CTAS statement will not be able to parse the CSV in any manner, because it's just the FROM statement that points to a file. It's more of a traditional SQL statement with SELECT and FROM. It will create a Delta table. This just happens to b...

2 More Replies
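A sketch of the two-step approach this answer implies: first declare an external table with USING CSV plus OPTIONS and LOCATION, then CTAS from it (names, columns, and paths are placeholders):

# CTAS infers its schema from the query and takes no parsing OPTIONS itself,
# so the CSV options live on the external table it selects from.
spark.sql("""
    CREATE TABLE sales_csv (order_id INT, amount DOUBLE)
    USING CSV
    OPTIONS (header = "true", delimiter = "|")
    LOCATION 'dbfs:/mnt/raw/sales'
""")
spark.sql("CREATE TABLE sales AS SELECT * FROM sales_csv")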