Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

159312
by New Contributor III
  • 3012 Views
  • 1 replies
  • 0 kudos

How to set pipelines.incompatibleViewCheck.enabled = false

I tried to load a static table as the source for a streaming DLT pipeline. I understand this is not optimal, but it provides the best path toward eventually having a full streaming pipeline. When I do, I get the following error: pyspark.sql.utils.Analysis...

Latest Reply
kfoster
Contributor
  • 0 kudos

When you declare a table or view, you can pass something like this: @dlt.table(spark_conf={"pipelines.incompatibleViewCheck.enabled": "false"})

  • 0 kudos
PrebenOlsen
by New Contributor III
  • 2299 Views
  • 1 replies
  • 1 kudos

Resolved! Why does @dlt.table from a table give different results than from a view?

I have some data in silver that I read in as a view and use the __apply_changes function on. I create a table based on this, and I then want to create my gold table after doing a .groupBy() and .pivot(). The transformations I do in the gold-table aren...

Latest Reply
PrebenOlsen
New Contributor III
  • 1 kudos

I have found a temporary workaround for this. The .pivot("columnName") should automatically grab all the values it can find, but for some reason it does not. I need to specify the values, using .pivot("group_name", "group0", "group1", "group2"...) ...
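For anyone trying the same workaround, here is a minimal sketch of pivoting on an explicit set of values. Note that in PySpark, GroupedData.pivot takes the values as a single list in its second argument. The helper name and its parameters are made up for illustration, and summing the value column is just an assumption about the aggregation.

```python
def pivot_with_explicit_values(df, group_col, pivot_col, pivot_values, value_col):
    """Group by group_col, pivot pivot_col on a fixed list of values, sum value_col.

    df is expected to be a pyspark.sql.DataFrame. All column names and the
    list of pivot values are supplied by the caller (hypothetical here).
    """
    return (
        df.groupBy(group_col)
        # Passing the values as a list fixes the output columns up front,
        # so Spark does not have to discover the distinct values itself.
        .pivot(pivot_col, list(pivot_values))
        .sum(value_col)
    )
```

In a notebook this could be called as pivot_with_explicit_values(silver_df, "id", "group_name", ["group0", "group1", "group2"], "amount"), where silver_df, "id" and "amount" are placeholders.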

  • 1 kudos
SatishGunjal
by New Contributor
  • 2881 Views
  • 1 replies
  • 0 kudos

Data frame takes long time to print count of rows

We have a PySpark data frame with 50 million records. We can display records from it, but it takes around 10 minutes to print the shape of the dataframe. We aim to use this data for modelling that will take some numerical features based on the final data fra...

Latest Reply
Hanna08
New Contributor II
  • 0 kudos

Thanks for the detailed explanation. For those who want to have constant technical support for their work processes, I recommend JD Young. Here is only the latest information about the update in the world of information technology solutions and cyber...

  • 0 kudos
venkad
by Contributor
  • 11311 Views
  • 4 replies
  • 7 kudos

Passing proxy configurations with databricks-sql-connector python?

Hi, I am trying to connect to a Databricks workspace that has IP access restriction enabled, using databricks-sql-connector. Only my proxy server IPs are added to the allow list. from databricks import sql connection = sql.connect( server_hostname ='...

Latest Reply
susodapop
Contributor
  • 7 kudos

`databricks-sql-connector` doesn't support HTTP proxies yet, but work is underway to implement it. It should be available in the next month or so. You can follow this issue on the open source repository for updates.

  • 7 kudos
3 More Replies
Cano
by New Contributor III
  • 1521 Views
  • 1 replies
  • 2 kudos

How to add notebook to my Databricks jdbc url?

Please, how do I add a notebook to the JDBC URL in order to run queries externally? jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/1234-567890-reef123;AuthMech=3;...

Latest Reply
ranged_coop
Valued Contributor II
  • 2 kudos

Not sure if it is possible. Alternatively, you could try adding your notebook to a job and then triggering that job via the Jobs API. Please refer to the link below: Jobs API 2.1 | Databricks on AWS
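As a rough sketch of the job-trigger route, the snippet below builds a run-now request for the Jobs API 2.1 using only the Python standard library. The helper name is made up, and the host, token and job ID are placeholders you would replace with your own workspace URL, a personal access token, and the ID of the job that wraps the notebook.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id):
    """Build (but do not send) a Jobs API 2.1 run-now request.

    host, token and job_id are placeholders supplied by the caller.
    """
    url = f"https://{host}/api/2.1/jobs/run-now"
    body = json.dumps({"job_id": job_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# To actually trigger the job, send the request, for example:
# with urllib.request.urlopen(build_run_now_request(host, token, job_id)) as resp:
#     run_id = json.load(resp)["run_id"]
```

Building the request separately from sending it makes it easy to inspect the URL and payload before anything is run against the workspace.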

  • 2 kudos
Anonymous
by Not applicable
  • 4538 Views
  • 6 replies
  • 5 kudos

COPY INTO command can not recognise MAP type value from JSON file

I have a Delta table in Databricks with a single column of type map<string, string>, and I have a data file in JSON format created by Hive 3 for the table with the column of the same type. I want to load data from the file into the Databricks table using COPY IN...

Latest Reply
jose_gonzalez
Databricks Employee
  • 5 kudos

Hi Alexey, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

  • 5 kudos
5 More Replies
tomnguyen_195
by New Contributor III
  • 2318 Views
  • 2 replies
  • 3 kudos

DLT maintenance job got stuck

Hi all, we recently noticed a huge cost associated with our Databricks account, and the main culprit is DLT's pipeline maintenance job, which was auto-scheduled to run but got stuck and cost us thousands of DBUs. Do you know what would be the r...

Latest Reply
tinai_long
New Contributor III
  • 3 kudos

Same question. These maintenance jobs run for the maximum timeout (168 hours) and do not terminate. Example below:

  • 3 kudos
1 More Replies
Sha_1890
by New Contributor III
  • 5262 Views
  • 8 replies
  • 0 kudos

How to execute a series of stored procedures using scala in databricks

I am working on a migration project where the lift-and-shift method is used to migrate a SQL Server DB from on-prem to Azure Cloud. There are a lot of stored procedures used for integration on-prem. Now, on-prem, to process the XML file and exec...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @shafana Roohi Jahubar, I hope that your queries are answered. Please let me know if you have more doubts.

  • 0 kudos
7 More Replies
TMNGB
by New Contributor II
  • 2411 Views
  • 2 replies
  • 2 kudos

Resolved! Does MERGE statement preserve order? (Slowly Changing Dimensions)

In the case of processing multiple source files - with potentially, one or multiple entity versions per source - being able to use the MERGE statement whilst preserving the order is key to ensure the correct versioning of entity versions (aka, versio...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 2 kudos

Hi @Guilherme Banhudo, I hope that werners' answer helped you. Please let me know if you still have doubts or queries.

  • 2 kudos
1 More Replies
SaiN
by New Contributor II
  • 2418 Views
  • 2 replies
  • 4 kudos

How to get Cost Per Job on a Single Cluster?

How can I get granular cost-per-job information for a single cluster in Azure Databricks? I know we can set tags for jobs as well as for the one cluster we have. But I can only see the cluster tag, not the job tags, in Cost Analysis on the Azure Portal. ...

Latest Reply
Prabakar
Databricks Employee
  • 4 kudos

Hi @Sainath Nagare, job tags are propagated to job clusters. If you are using an interactive cluster for your job, then you won't be able to see the job tag.

  • 4 kudos
1 More Replies
77796
by New Contributor II
  • 4744 Views
  • 4 replies
  • 0 kudos

Databricks S3A error - java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory not found

We are getting the error below on runtime 10.x and 11.x when writing to S3 via the saveAsNewAPIHadoopFile function. The same jobs run fine on runtime 9.x and 7.x. The difference between 9.x and 10.x is the former has hadoop 2.7 bindings with sp...

Latest Reply
77796
New Contributor II
  • 0 kudos

We have resolved this issue by using the s3 scheme instead of s3a, i.e. pairRDD.saveAsNewAPIHadoopFile("s3://bucket/testout.dat",
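If many output paths are built in code, the scheme swap described above could be applied in one place. This is only an illustrative sketch with a made-up helper name. It rewrites the URI string and says nothing about whether the s3 scheme is the right choice for a given runtime.

```python
from urllib.parse import urlsplit, urlunsplit


def use_s3_scheme(uri):
    """Return uri with an s3a:// scheme replaced by s3://.

    URIs with any other scheme are returned unchanged.
    """
    parts = urlsplit(uri)
    if parts.scheme != "s3a":
        return uri
    return urlunsplit(parts._replace(scheme="s3"))
```

The result can then be passed as the path argument to the write call, for example pairRDD.saveAsNewAPIHadoopFile(use_s3_scheme(path), ...), with the remaining arguments left as in the original job.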

  • 0 kudos
3 More Replies
zyang
by Contributor
  • 3773 Views
  • 5 replies
  • 2 kudos

azure databricks notebook cannot load the difference

I am trying to commit and push my change to the branch, but I cannot load the difference. I haven't changed many cells, and no cell exceeds 500 lines in the notebook file. Why does this happen, and how can I solve it?

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hey there @z yang, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Th...

  • 2 kudos
4 More Replies
OldDogNewTrix
by New Contributor
  • 1089 Views
  • 3 replies
  • 0 kudos
Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hey there @Jim Carlson, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

  • 0 kudos
2 More Replies
Yagao
by New Contributor
  • 1156 Views
  • 2 replies
  • 0 kudos

How to do python within sql query in Databricks ?

How to do python within sql query in Databricks ?

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Ya Gao, hope you are well. Just wanted to see if you were able to find an answer to your question, and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

  • 0 kudos
1 More Replies
Mapajr
by New Contributor III
  • 2953 Views
  • 2 replies
  • 3 kudos

Issues pushing repos on Gitlab with Databricks

Our company uses GitLab Enterprise Edition, and we link our repos to Databricks through it. At random, we get errors when trying to push the repo, and we have to spend hours debugging to figure out what is causing the push error on datab...

Latest Reply
Vidula
Honored Contributor
  • 3 kudos

Hey there @Mark Patrick, hope you are well. Just wanted to see if you were able to find an answer to your question, and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

  • 3 kudos
1 More Replies
