Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Data_Engineeri7
by New Contributor
  • 2134 Views
  • 3 replies
  • 0 kudos

Global or environment parameters.

Hi all, I need help creating a utility file that can be used in a PySpark notebook. The utility file contains variables such as database and schema names, and I need to pass these variables to other notebooks wherever I use the database and schema. Thanks

Latest Reply
KSI
New Contributor II
  • 0 kudos

You can use ${param_catalog}.schema.tablename and pass the actual value into the notebook through a job parameter named "param_catalog", or through a text widget called "param_catalog" (see the sketch below the thread).

2 More Replies
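A minimal sketch of the widget approach described above, assuming a parameter named "param_catalog" and an illustrative sales.orders table:

```python
# Read the value supplied by a job parameter or by the widget itself.
# The default "dev" and the table below are illustrative assumptions.
dbutils.widgets.text("param_catalog", "dev")
catalog = dbutils.widgets.get("param_catalog")

# Use the parameter wherever the catalog and schema are needed.
df = spark.table(f"{catalog}.sales.orders")
spark.sql(f"SELECT COUNT(*) FROM {catalog}.sales.orders").show()
```

Defining the widget in a shared utility notebook and pulling it into other notebooks with %run is one way to keep these names in a single place.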
ilarsen
by Contributor
  • 2638 Views
  • 1 reply
  • 0 kudos

Schema inference with auto loader (non-DLT and DLT)

Hi. Another question, this time about schema inference and column types. I have dabbled with DLT and structured streaming with auto loader (as in, not DLT). My data source use case is JSON files, which contain nested structures. I noticed that in t...

Latest Reply
This widget could not be displayed.
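For context on the question, a hedged sketch of the Auto Loader options that drive schema inference for nested JSON; the paths and the schema hint are illustrative assumptions:

```python
# Auto Loader infers the schema and tracks its evolution at schemaLocation.
# Without inferColumnTypes, JSON columns are inferred as strings by default.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/tmp/schemas/events")
      .option("cloudFiles.inferColumnTypes", "true")
      .option("cloudFiles.schemaHints", "payload.amount DECIMAL(18,2)")  # pin a nested type
      .load("/tmp/landing/events"))
```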
MarthinusBosma1
by New Contributor II
  • 1275 Views
  • 3 replies
  • 0 kudos

Unable to DROP TABLE: "Lock wait timeout exceeded"

We have a table where the underlying data has been dropped, and seemingly something else must have gone wrong as well. We want to just get rid of the whole table and schema, but running "DROP TABLE schema.table" is throwing the following error:or...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

The table needs to be dropped from the backend. If you can raise a ticket, the support team can do it for you.

2 More Replies
Mystagon
by New Contributor II
  • 1424 Views
  • 1 reply
  • 0 kudos

Performance Issues with Unity Catalog

Hey, I need some help / suggestions troubleshooting this. I have two Databricks workspaces, Common and Lakehouse. The major differences between them are: (1) Lakehouse is using Unity Catalog; (2) Lakehouse is using External Locations, whereas cre...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

This needs a detailed analysis to understand the root cause. But a good place to start is to compare the Spark UI for both runs and identify which part of the execution is taking time. Then we need to look at the logs.

Data_Engineer3
by Contributor III
  • 4144 Views
  • 5 replies
  • 0 kudos

Resolved! Need to define struct and array-of-struct columns in a Delta Live Tables (DLT) table in Databricks.

I want to create columns with struct and array-of-struct datatypes in DLT live tables. Is that possible, and if so, could you share a sample? Thanks.

Latest Reply
Data_Engineer3
Contributor III
  • 0 kudos

I have created a DLT pipeline. In the job UI, I can see only the steps, and if any failure happens it shows only the error at that stage. But if I log anything using print, the logs don't show up in the console or anywhere else. How can I see the lo...

4 More Replies
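A hedged sketch of one way to answer the original question in Python, passing an explicit schema to the dlt.table decorator; all table and column names are illustrative:

```python
import dlt
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

# A schema with one struct column and one array-of-struct column.
schema = StructType([
    StructField("id", StringType()),
    StructField("address", StructType([                  # struct column
        StructField("city", StringType()),
        StructField("zip", StringType()),
    ])),
    StructField("orders", ArrayType(StructType([         # array-of-struct column
        StructField("order_id", StringType()),
    ]))),
])

@dlt.table(schema=schema)
def customers():
    return spark.readStream.table("raw.customers")       # illustrative source
```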
kiko_roy
by Contributor
  • 2259 Views
  • 3 replies
  • 1 kudos

Resolved! IsBlindAppend config changes

Hello all, can someone please suggest how I can change the config IsBlindAppend from false to true? I need to do this not for a data table but for a custom log table. Also, is there any concern if I toggle the value, as a standard practice? Please suggest.

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi, IsBlindAppend is not a config but an operation metric used in the Delta Lake history. Its value changes based on the type of operation performed on the Delta table (see the sketch below). https://docs.databricks.com/en/delta/history.html

2 More Replies
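A small sketch of where the flag shows up; the table name is an assumption:

```python
# DESCRIBE HISTORY returns one row per operation on the table, with
# isBlindAppend as one of its output columns.
(spark.sql("DESCRIBE HISTORY my_schema.my_log_table")
      .select("version", "operation", "isBlindAppend")
      .show(truncate=False))
```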
aa_204
by New Contributor II
  • 2745 Views
  • 3 replies
  • 0 kudos

Reading Excel file using pandas-on-Spark API not rendering #N/A values correctly

I am trying to read a .xlsx file using ps.read_excel() that has #N/A as a value for string-type columns. But in the dataframe, I am getting "null" in place of #N/A. Is there any option with which we can read #N/A as a string from the .xlsx file?

Latest Reply
vishwanath_1
New Contributor III
  • 0 kudos

I am currently facing the same issue: even after setting keep_default_na = False, #N/A is still being converted to null. Does anyone know the solution here?

2 More Replies
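A hedged sketch of options worth trying; the path is an assumption, and if the workbook stores #N/A as a genuine Excel error value rather than text, the reading engine may still surface it as missing:

```python
import pyspark.pandas as ps

# keep_default_na=False drops pandas' built-in NA strings (which include "#N/A");
# na_filter=False disables NA detection entirely; dtype=str keeps values as text.
psdf = ps.read_excel(
    "/Volumes/main/raw/files/data.xlsx",  # illustrative path
    keep_default_na=False,
    na_filter=False,
    dtype=str,
)
```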
francly
by New Contributor II
  • 3552 Views
  • 5 replies
  • 3 kudos

Resolved! terraform create multiple db user

Hi, I followed the example to create one user, and it's working; however, I want to create multiple users. I have tried many ways but still cannot get it to work. Please share some ideas. https://registry.terraform.io/providers/databricks/databricks/latest/docs/res...

Latest Reply
Natlab
New Contributor II
  • 3 kudos

What if I want to give a user name along with the email ID? I used the code below, but it's not helping (the code is not failing, but it's not adding the user name). It seems this line, "display_name = each.key", is not working. Please suggest. terraform {required_provider...

4 More Replies
364488
by New Contributor
  • 1589 Views
  • 2 replies
  • 0 kudos

java.io.IOException: Invalid PKCS8 data error when reading data from Google Storage

The Databricks workspace is hosted in AWS, and I am trying to access data in Google Cloud Platform. I have followed the instructions here: https://docs.databricks.com/en/connect/storage/gcs.html. I get the error "java.io.IOException: Invalid PKCS8 data." when trying t...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, could you also please share the whole error stack?

1 More Reply
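For reference, the cluster Spark configuration from that docs page has this shape; the secret scope and key names here are assumptions. "Invalid PKCS8 data" often indicates the private-key value reached the cluster malformed, for example pasted directly with broken newlines instead of referenced from a secret:

```
spark.hadoop.google.cloud.auth.service.account.enable true
spark.hadoop.fs.gs.auth.service.account.email <client-email>
spark.hadoop.fs.gs.project.id <project-id>
spark.hadoop.fs.gs.auth.service.account.private.key {{secrets/gcs/gsa_private_key}}
spark.hadoop.fs.gs.auth.service.account.private.key.id {{secrets/gcs/gsa_private_key_id}}
```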
Faisal
by Contributor
  • 9319 Views
  • 1 reply
  • 0 kudos

DLT quarantine records

How can I capture bad records that violate expectations into quarantine tables? Can someone provide the DLT SQL code syntax for this?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

I would like to share the following docs, which have examples: https://docs.databricks.com/en/delta-live-tables/expectations.html. A sketch of the quarantine pattern follows below.

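A hedged Python rendering of the quarantine pattern from those docs (the question asked for SQL, so treat this as the shape of the solution rather than exact syntax); rule and table names are illustrative:

```python
import dlt

rules = {"valid_id": "id IS NOT NULL"}                    # illustrative expectation
quarantine_rule = " AND ".join(f"NOT ({r})" for r in rules.values())

@dlt.table
@dlt.expect_all_or_drop(rules)                            # keeps only good rows
def clean_data():
    return dlt.read_stream("raw_data")

@dlt.table                                                # rows violating any rule
def quarantine_data():
    return dlt.read_stream("raw_data").where(quarantine_rule)
```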
Alva
by New Contributor
  • 892 Views
  • 1 reply
  • 0 kudos

Error while performing async I/O for file

We're running dbt Cloud on DBSQL, and a frequent error we keep getting in our dbt jobs is "Error while performing async I/O for file [S3 URI path]". Since we don't have access to the full logs, it's very difficult to know what's actually going on her...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Do you have access to create a support ticket? If you do, we can retrieve the logs for you and provide the details. If you don't, then you will need access to your driver's logs to identify the root cause of this issue.

rt-slowth
by Contributor
  • 1311 Views
  • 2 replies
  • 1 kudos

How to writeStream with redshift

I have already checked the documentation below, but it does not describe how to write with streaming. Is there a way to write the gold table (its type is a streaming table), which is the output of the Delta Live Tables streaming pipeline, in...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Only batch processing is supported.

1 More Reply
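Since the Redshift connector is batch-only, a common workaround (not from this thread) is foreachBatch, which applies a batch write to every streaming micro-batch; a hedged sketch with illustrative connection details:

```python
# Write each streaming micro-batch to Redshift using the batch connector.
# The URL, table names, and paths are illustrative assumptions.
def write_to_redshift(batch_df, batch_id):
    (batch_df.write
        .format("redshift")
        .option("url", "jdbc:redshift://host:5439/db?user=u&password=p")
        .option("dbtable", "public.gold_events")
        .option("tempdir", "s3a://my-bucket/redshift-tmp/")
        .option("forward_spark_s3_credentials", "true")
        .mode("append")
        .save())

(spark.readStream.table("main.gold.events")               # the DLT output table
      .writeStream
      .foreachBatch(write_to_redshift)
      .option("checkpointLocation", "/tmp/checkpoints/redshift_gold")
      .start())
```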
umarkhan
by New Contributor II
  • 1022 Views
  • 1 reply
  • 0 kudos

Module not found when using applyInPandasWithState in Repos

I should start by saying that everything works fine if I copy and paste it all into a notebook and run it. The problem starts if we try to have any structure in our application repository. Also, so far we have only run into this problem with applyInP...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Which DBR version are you using? Does it work on non-DLT jobs?

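One commonly suggested workaround for executors failing to import modules that live in a Repo is to ship the module explicitly; a hedged sketch, with the repo path and module name as assumptions:

```python
import sys

# Make the package importable on the driver, then ship the module so
# executors can unpickle functions defined in it. Paths are illustrative.
repo_root = "/Workspace/Repos/me@example.com/my_app/src"
sys.path.append(repo_root)
spark.sparkContext.addPyFile(f"{repo_root}/state_logic.py")

from state_logic import update_state   # the function passed to applyInPandasWithState
```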
sher
by Valued Contributor II
  • 803 Views
  • 1 reply
  • 0 kudos

Did anyone face this issue with a Delta table while generating a manifest file?

Error message: "Manifest generation is not supported for tables that leverage column mapping, as external readers cannot read these Delta tables." Why did I get this issue? Not sure whether we need to do anything.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you please share the full stack trace and the repro steps?

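The message appears when column mapping is enabled on the table (typically set via delta.columnMapping.mode = 'name' to allow column renames or drops). A quick check, with an illustrative table name:

```python
# Inspect the table properties for column mapping.
(spark.sql("SHOW TBLPROPERTIES my_schema.my_table")
      .where("key LIKE 'delta.columnMapping%'")
      .show(truncate=False))
# If delta.columnMapping.mode is 'name' or 'id', symlink manifest generation
# for external readers is not supported on that table.
```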
VishalD
by New Contributor
  • 706 Views
  • 1 reply
  • 0 kudos

Not able to load nested XML file with struct type

Hello experts, I am trying to load XML with struct type, having an XSI type attribute. Below is a sample XML format: <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="htt...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can try the from_xml() function; here is the link to the docs: https://docs.databricks.com/en/sql/language-manual/functions/from_xml.html (a short sketch follows below).

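A minimal hedged sketch of from_xml() on a recent DBR; the sample XML and schema are illustrative, not the poster's SOAP payload:

```python
# Parse an XML string column into a struct with the from_xml() SQL function.
df = spark.createDataFrame(
    [("<person><name>Alice</name><age>7</age></person>",)], ["xml_str"])

parsed = df.selectExpr("from_xml(xml_str, 'name STRING, age INT') AS rec")
parsed.select("rec.name", "rec.age").show()
```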
