Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

krsimons
by New Contributor
  • 1181 Views
  • 3 replies
  • 0 kudos

How do I automate my Databricks script?

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hey there @Kayla Simons, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell...

2 More Replies
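One common route to automating a notebook or script is to wrap it in a scheduled job; a minimal sketch using the Jobs API 2.1, where the workspace URL, token, notebook path, and cluster settings are all placeholders:

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"   # placeholder workspace URL
TOKEN = "<personal-access-token>"                        # placeholder PAT

payload = {
    "name": "nightly-script",
    "tasks": [
        {
            "task_key": "run_notebook",
            "notebook_task": {"notebook_path": "/Repos/me/my_script"},  # placeholder path
            "new_cluster": {
                "spark_version": "11.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 1,
            },
        }
    ],
    # Quartz cron syntax: run every day at 02:00 UTC.
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
}

resp = requests.post(
    f"{HOST}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())  # returns the new job_id

The same job can also be created from the Workflows UI; the API form is shown here only because it is easy to script.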
fshimamoto
by New Contributor III
  • 2419 Views
  • 3 replies
  • 2 kudos

What are the best practices for schema drift using Delta Live Tables, in a scenario where the main source is a NoSQL database and we have a lot of changes in the schema?

What are the best practices for schema drift using Delta Live Tables, in a scenario where the main source is a NoSQL database and we have a lot of changes in the schema?

Latest Reply
Vartika
Databricks Employee
  • 2 kudos

Hey there @Fernando Martin, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...

2 More Replies
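Not part of the thread, but one hedged sketch of a common schema-drift pattern: land the NoSQL export as JSON files and ingest them with Auto Loader inside DLT, letting schema evolution add new columns as they appear (the path and table name are hypothetical):

import dlt

@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        # Infer proper column types instead of reading everything as strings.
        .option("cloudFiles.inferColumnTypes", "true")
        # Add newly observed columns to the table schema rather than failing the update.
        .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
        .load("/mnt/raw/events/")  # placeholder landing path
    )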
RajeshRK
by Contributor II
  • 5942 Views
  • 10 replies
  • 4 kudos

Databricks job fails while creating table.

Hi Team, the Databricks job fails with the below error while creating an EXTERNAL table: com.simba.spark.jdbc41.internal.apache.http.wire - Error running query: MetaException(message:Got exception: org.apache.hadoop.fs.azure.AzureException com.microsoft.a...

Latest Reply
Vartika
Databricks Employee
  • 4 kudos

Hey there @Rajesh Kannan R, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from...

9 More Replies
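The error text is truncated above, but an AzureException during external table creation often comes down to storage access. A hedged sketch of one thing to verify - that the cluster can reach the storage account before the CREATE TABLE runs (account, container, and secret names are placeholders, not the poster's setup):

# Placeholder credentials; a service principal or SAS token can be used instead of an account key.
spark.conf.set(
    "fs.azure.account.key.mystorageacct.blob.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-key"),
)

spark.sql("""
    CREATE TABLE IF NOT EXISTS my_db.my_table (id INT, name STRING)
    LOCATION 'wasbs://my-container@mystorageacct.blob.core.windows.net/tables/my_table'
""")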
159312
by New Contributor III
  • 2725 Views
  • 1 reply
  • 0 kudos

How to set pipelines.incompatibleViewCheck.enabled = false

I tried to load a static table as a source to a streaming DLT pipeline. I understand this is not optimal, but it provides the best path toward eventually having a full streaming pipeline. When I do, I get the following error: pyspark.sql.utils.Analysis...

Latest Reply
kfoster
Contributor
  • 0 kudos

When you declare a table or view, you can pass something like this: @dlt.table(spark_conf={"pipelines.incompatibleViewCheck.enabled": "false"})

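Expanded into a complete declaration, that looks like the sketch below (the table name and source are placeholders, not the original poster's code):

import dlt

@dlt.table(
    name="my_table",
    # Per-table Spark conf; disables the static-view compatibility check mentioned above.
    spark_conf={"pipelines.incompatibleViewCheck.enabled": "false"},
)
def my_table():
    return spark.read.table("my_catalog.my_schema.static_source")  # placeholder static source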
PrebenOlsen
by New Contributor III
  • 2071 Views
  • 1 reply
  • 1 kudos

Resolved! Why does @dlt.table from a table give different results than from a view?

I have some data in silver that I read in as a view using the __apply_changes function. I create a table based on this, and I then want to create my gold table after doing a .groupBy() and .pivot(). The transformations I do in the gold table aren...

Latest Reply
PrebenOlsen
New Contributor III
  • 1 kudos

I have found a temporary solution to solve this. The .pivot("columnName") should automatically grab all the values it can find, but for some reason it does not. I need to specify the values, using .pivot("group_name", "group0", "group1", "group2"...) ...

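A short sketch of that workaround in PySpark, with hypothetical column and group names (in Python the explicit values are passed as a list):

from pyspark.sql import functions as F

gold_df = (
    silver_df.groupBy("customer_id")
    # Listing the pivot values explicitly avoids the discovery pass and the missing-value issue described above.
    .pivot("group_name", ["group0", "group1", "group2"])
    .agg(F.sum("amount"))
)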
SatishGunjal
by New Contributor
  • 2653 Views
  • 1 reply
  • 0 kudos

Data frame takes long time to print count of rows

We have a PySpark DataFrame with 50 million records. We can display records from it, but it takes around 10 minutes to print the shape of the DataFrame. We aim to use this data for modelling that will take some numerical features based on the final data fra...

Latest Reply
Hanna08
New Contributor II
  • 0 kudos

Thanks for the detailed explanation. For those who want to have constant technical support for their work processes, I recommend JD Young. Here is only the latest information about the update in the world of information technology solutions and cyber...

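Not from the thread, but one general mitigation when the row count is needed repeatedly during modelling work: cache the DataFrame so the count and later feature computations reuse the same materialized data instead of re-reading the source each time. A sketch with a hypothetical table name:

df = spark.read.table("my_db.training_data")  # hypothetical 50M-row source

df = df.cache()          # the first action below pays the cost once
rows = df.count()
cols = len(df.columns)   # column count is metadata and is cheap
print((rows, cols))      # the DataFrame's "shape"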
venkad
by Contributor
  • 10259 Views
  • 4 replies
  • 7 kudos

Passing proxy configurations with databricks-sql-connector python?

Hi, I am trying to connect to a Databricks workspace which has IP access restriction enabled using databricks-sql-connector. Only my proxy server IPs are added in the allow list: from databricks import sql   connection = sql.connect( server_hostname ='...

Latest Reply
susodapop
Contributor
  • 7 kudos

`databricks-sql-connector` doesn't support HTTP proxies yet but the work is underway to implement it. Should be available in the next month or so. You can follow this issue on the open source repository for updates.

3 More Replies
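For reference, the snippet in the question (cut off by the preview) follows the connector's standard connection pattern; a minimal version with placeholder credentials is below. Note that, per the reply, routing this connection through an HTTP proxy was not yet supported at the time:

from databricks import sql

connection = sql.connect(
    server_hostname="<workspace-host>.cloud.databricks.com",  # placeholder
    http_path="/sql/1.0/warehouses/<warehouse-id>",           # placeholder
    access_token="<personal-access-token>",                   # placeholder
)

cursor = connection.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
cursor.close()
connection.close()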
Cano
by New Contributor III
  • 1385 Views
  • 1 reply
  • 2 kudos

How to add notebook to my Databricks jdbc url?

Please, how do I add a notebook to the JDBC URL in order to run queries externally? jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/1234-567890-reef123;AuthMech=3;...

Latest Reply
ranged_coop
Valued Contributor II
  • 2 kudos

Not sure if it is possible. Alternatively, you could try adding your notebook to a job and then triggering that job via the Jobs API. Please refer to the link below: Jobs API 2.1 | Databricks on AWS

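A small sketch of that suggestion: trigger an existing job with the Jobs API 2.1 run-now endpoint (workspace URL, token, job ID, and parameters are placeholders):

import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                       # placeholder

resp = requests.post(
    f"{HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": 123, "notebook_params": {"run_date": "2022-08-01"}},  # placeholder job_id/params
)
resp.raise_for_status()
print(resp.json()["run_id"])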
Anonymous
by Not applicable
  • 4135 Views
  • 6 replies
  • 5 kudos

COPY INTO command can not recognise MAP type value from JSON file

I have a Delta table in Databricks with a single column of type map<string, string>, and I have a data file in JSON format created by Hive 3 for the table with the column of the same type. I want to load data from the file into the Databricks table using COPY IN...

Latest Reply
jose_gonzalez
Databricks Employee
  • 5 kudos

Hi Alexey, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

5 More Replies
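For reference, a bare-bones COPY INTO of JSON files into a Delta table looks like the sketch below (names and paths are placeholders). It does not address the map<string,string> mismatch raised in the question, which may need an explicit cast or a different loading path:

spark.sql("""
    COPY INTO my_db.my_map_table
    FROM 'abfss://my-container@mystorageacct.dfs.core.windows.net/hive-export/'
    FILEFORMAT = JSON
    COPY_OPTIONS ('mergeSchema' = 'true')
""")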
tomnguyen_195
by New Contributor III
  • 2079 Views
  • 2 replies
  • 3 kudos

DLT maintenance job got stuck

Hi all, we recently realized a huge cost associated with our Databricks account, and the main culprit is the DLT pipeline maintenance job that got auto-scheduled to run but got stuck and cost us thousands of DBUs. Do you know what would be the r...

Latest Reply
tinai_long
New Contributor III
  • 3 kudos

Same question. These maintenance jobs run for the maximum timeout (168 hours) and do not terminate. Example below:

1 More Replies
Sha_1890
by New Contributor III
  • 4896 Views
  • 8 replies
  • 0 kudos

How to execute a series of stored procedures using scala in databricks

I am working on a migration project, where a lift-and-shift method is used to migrate a SQL Server DB from on-prem to Azure Cloud. There are a lot of stored procedures used for integration on-prem. Now, on-prem, to process the XML file and exec...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @shafana Roohi Jahubar, I hope that your queries are answered. Please let me know if you have more doubts.

7 More Replies
TMNGB
by New Contributor II
  • 2102 Views
  • 2 replies
  • 2 kudos

Resolved! Does MERGE statement preserve order? (Slowly Changing Dimensions)

In the case of processing multiple source files - with potentially one or multiple entity versions per source - being able to use the MERGE statement whilst preserving the order is key to ensuring the correct versioning of entities (aka versio...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 2 kudos

Hi @Guilherme Banhudo, I hope that werners' answer helped you. Please let me know if you still have doubts or queries.

1 More Replies
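Not from the thread, but a common way to make MERGE independent of source ordering is to collapse the source to a single row per key (the latest version) before merging; a sketch with hypothetical table and column names:

from pyspark.sql import functions as F, Window

# Keep only the most recent version of each entity so the MERGE result
# does not depend on file or row ordering.
w = Window.partitionBy("entity_id").orderBy(F.col("version_ts").desc())
latest = (
    spark.read.table("staging.entity_updates")
    .withColumn("rn", F.row_number().over(w))
    .filter("rn = 1")
    .drop("rn")
)
latest.createOrReplaceTempView("latest_updates")

spark.sql("""
    MERGE INTO dim.entity AS t
    USING latest_updates AS s
    ON t.entity_id = s.entity_id
    WHEN MATCHED AND s.version_ts > t.version_ts THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")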
SaiN
by New Contributor II
  • 2185 Views
  • 2 replies
  • 4 kudos

How to get Cost Per Job on a Single Cluster?

How do you get granular cost-per-job information for a single cluster in Azure Databricks? I know we can set tags for jobs as well as for the cluster, but I can only see the cluster tags, not the job tags, in Cost Analysis on the Azure Portal. ...

Latest Reply
Prabakar
Databricks Employee
  • 4 kudos

Hi @Sainath Nagare, job tags will be propagated to the job clusters. If you are using an interactive cluster for your job, then you won't be able to see the job tags.

1 More Replies
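A small sketch of that point: tags set on the job's new_cluster definition are the ones that propagate to the underlying Azure resources, so a hypothetical Jobs API cluster block might carry them like this (an interactive cluster would only expose its own cluster tags):

new_cluster = {
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    # Propagated to the job cluster's Azure resources and visible in Cost Analysis.
    "custom_tags": {"cost-center": "data-eng", "job-name": "nightly-etl"},
}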
77796
by New Contributor II
  • 4336 Views
  • 4 replies
  • 0 kudos

Databricks S3A error - java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory not found

We are getting the below error on runtimes 10.x and 11.x when writing to S3 via the saveAsNewAPIHadoopFile function. The same jobs run fine on runtimes 9.x and 7.x. The difference between 9.x and 10.x is that the former has Hadoop 2.7 bindings with sp...

Latest Reply
77796
New Contributor II
  • 0 kudos

We have resolved this issue by using the s3 scheme instead of s3a, i.e. pairRDD.saveAsNewAPIHadoopFile("s3://bucket/testout.dat",

3 More Replies
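The call in the reply is cut off by the preview; for reference, a generic PySpark saveAsNewAPIHadoopFile call using the s3 scheme looks roughly like the sketch below (the bucket, output format, and Writable classes are placeholders, not the poster's actual arguments):

pair_rdd = sc.parallelize([(1, "a"), (2, "b")])  # placeholder pair RDD

pair_rdd.saveAsNewAPIHadoopFile(
    "s3://my-bucket/testout.dat",  # s3:// rather than s3a://
    "org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat",
    keyClass="org.apache.hadoop.io.IntWritable",
    valueClass="org.apache.hadoop.io.Text",
)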
zyang
by Contributor
  • 3496 Views
  • 5 replies
  • 2 kudos

azure databricks notebook cannot load the difference

I am trying to commit and push my change to the branch, but I cannot load the diff. I haven't changed many cells, and each cell doesn't exceed 500 lines in the notebook file. I am wondering why this happens and how to solve it.

Latest Reply
Vidula
Honored Contributor
  • 2 kudos

Hey there @z yang, hope all is well! Just wanted to check in if you were able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

4 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group