Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

yatharthmahesh
by New Contributor III
  • 3403 Views
  • 3 replies
  • 6 kudos

ENABLE CHANGE DATA FEED FOR EXISTING DELTA-TABLE

I have a delta table already created, and now I want to enable the change data feed. I read that I have to set the delta.enableChangeDataFeed property to true. However, this cannot be done using the Scala API. I tried using this but it didn't work. I am ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

'delta.enableChangeDataFeed' has to be without quotes: spark.sql("ALTER TABLE delta_training.onaudience_dpm SET TBLPROPERTIES (delta.enableChangeDataFeed = true)").show()

2 More Replies
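For reference, a minimal PySpark sketch of the approach from the reply above; the table name comes from the thread, and the change-feed read assumes a Databricks Runtime that supports CDF.

```python
# Enable the change data feed on an existing Delta table (table name from the thread above).
spark.sql("""
    ALTER TABLE delta_training.onaudience_dpm
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

# Changes made after enabling can then be read back as a change feed.
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 1)  # adjust to the first version you need
    .table("delta_training.onaudience_dpm")
)
changes.show()
```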
KumarShiv
by New Contributor III
  • 2286 Views
  • 2 replies
  • 2 kudos

Resolved! Databricks Spark SQL function "PERCENTILE_DISC()" output not accurate.

I am trying to get percentile values on different splits, but the result of the Databricks PERCENTILE_DISC() function is not accurate. I have run the same query on MS SQL but get a different result set. Here are both result sets for Pyspark ...

Latest Reply
artsheiko
Databricks Employee
  • 2 kudos

The reason might be that in SQL Server, PERCENTILE_DISC is nondeterministic.

1 More Replies
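For context, a small hedged example of percentile_disc in Databricks Spark SQL (assuming a recent runtime that supports the WITHIN GROUP syntax); results on tied values may legitimately differ from SQL Server, which documents PERCENTILE_DISC as nondeterministic. The table and column names are made up for this sketch.

```python
# Toy data to illustrate percentile_disc.
df = spark.createDataFrame(
    [("a", 10.0), ("a", 20.0), ("a", 30.0), ("b", 40.0), ("b", 50.0)],
    ["split", "value"],
)
df.createOrReplaceTempView("samples")

spark.sql("""
    SELECT split,
           percentile_disc(0.5) WITHIN GROUP (ORDER BY value) AS p50
    FROM samples
    GROUP BY split
""").show()
```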
Trung
by Contributor
  • 3699 Views
  • 5 replies
  • 5 kudos

Job fails due to Access Denied

Please help me solve this problem: my Databricks account cannot start the job by triggering it manually or on a schedule, although I can run the script itself without error.

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 5 kudos

Hi @trung nguyen, please check if you have the necessary instance profile attached to the job cluster. You are definitely missing something related to IAM.

4 More Replies
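As a hedged illustration of the reply above: on AWS, a job cluster typically gets its S3/IAM permissions from an instance profile set in the cluster spec. The ARN, node type, and runtime version below are placeholders, not values from the thread.

```python
# Sketch of a Jobs API 2.1 job-cluster spec with an instance profile attached (AWS).
new_cluster = {
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    "aws_attributes": {
        # Hypothetical ARN; use the instance profile registered in your workspace.
        "instance_profile_arn": "arn:aws:iam::123456789012:instance-profile/my-job-profile"
    },
}
```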
Anonymous
by Not applicable
  • 1724 Views
  • 4 replies
  • 4 kudos

Invalid shard address

I'm running pyspark through databricks-connect and getting an error saying ```ERROR SparkClientManager: Fail to get the SparkClient java.util.concurrent.ExecutionException: com.databricks.service.SparkServiceConnectionException: Invalid shard address:`...

Latest Reply
Prabakar
Databricks Employee
  • 4 kudos

Hi @Marco Wong, was this working before and failing now? Are you behind a VPN or firewall? If so, can you check by disabling it? Enable traces in Wireshark and collect a dump to check if there is traffic going to the workspace. Check if you can get curl wor...

3 More Replies
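Along the lines of the curl suggestion above, a quick hedged way to confirm the workspace (shard) address is reachable from your machine is to call any authenticated REST endpoint; the host and token below are placeholders.

```python
# Simple reachability check against the workspace REST API (placeholder values).
import requests

host = "https://dbc-xxxxxxxx-xxxx.cloud.databricks.com"  # the shard/workspace URL
token = "dapiXXXXXXXXXXXXXXXX"                           # a personal access token

resp = requests.get(
    f"{host}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
print(resp.status_code)  # 200 means the shard address and token are reachable/valid
```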
krsimons
by New Contributor
  • 1257 Views
  • 3 replies
  • 0 kudos

How do I automate my Databricks script?

How do I automate my Databricks script?

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hey there @Kayla Simons, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell...

2 More Replies
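One common way to automate a notebook, sketched here with hedged placeholder names and IDs, is to wrap it in a scheduled job via the Jobs API 2.1 (the same API referenced elsewhere on this board).

```python
# Hedged sketch: create a scheduled job that runs a notebook (all names/IDs are placeholders).
import requests

host = "https://<your-workspace>.cloud.databricks.com"
token = "dapiXXXXXXXXXXXXXXXX"

job_spec = {
    "name": "nightly-notebook-run",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/Repos/me/my_script"},
            "existing_cluster_id": "0123-456789-abcde123",
        }
    ],
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # every day at 02:00
        "timezone_id": "UTC",
    },
}

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=job_spec,
)
print(resp.json())  # returns the new job_id on success
```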
fshimamoto
by New Contributor III
  • 2670 Views
  • 3 replies
  • 2 kudos

What are the best practices for schema drift using Delta Live Tables, in a scenario where the main source is a NoSQL database and we have a lot of ch...

What are the best practices for schema drift using Delta Live Tables, in a scenario where the main source is a NoSQL database and we have a lot of changes in the schema?

Latest Reply
Vartika
Databricks Employee
  • 2 kudos

Hey there @Fernando Martin, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

2 More Replies
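The thread above doesn't settle on an answer, but one commonly used pattern (a hedged sketch, not necessarily the poster's setup) is to land the NoSQL export with Auto Loader and let its schema evolution absorb drift inside a DLT bronze table; the landing path and table name below are hypothetical.

```python
import dlt

# Hedged sketch: Auto Loader with schema evolution feeding a DLT bronze table.
@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaEvolutionMode", "addNewColumns")  # new fields widen the schema
        .option("cloudFiles.inferColumnTypes", "true")
        .load("/mnt/landing/events/")  # placeholder landing path
    )
```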
RajeshRK
by Contributor II
  • 6296 Views
  • 10 replies
  • 4 kudos

Databricks job fails while creating table.

Hi Team, the Databricks job fails with the below error while creating an EXTERNAL table: com.simba.spark.jdbc41.internal.apache.http.wire - Error running query: MetaException(message:Got exception: org.apache.hadoop.fs.azure.AzureException com.microsoft.a...

Latest Reply
Vartika
Databricks Employee
  • 4 kudos

Hey there @Rajesh Kannan R, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from...

9 More Replies
159312
by New Contributor III
  • 2861 Views
  • 1 reply
  • 0 kudos

How to set pipelines.incompatibleViewCheck.enabled = false

I tried to load a static table as a source to a streaming DLT pipeline. I understand this is not optimal, but it provides the best path toward eventually having a fully streaming pipeline. When I do, I get the following error: pyspark.sql.utils.Analysis...

Latest Reply
kfoster
Contributor
  • 0 kudos

When you declare a table or view, you can use something like this: @dlt.table( spark_conf={ "pipelines.incompatibleViewCheck.enabled": "false" } )

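A runnable shape of the reply above, with a hypothetical source table name filled in just for illustration:

```python
import dlt

# Pass the flag through spark_conf on the table declaration, as suggested in the reply.
@dlt.table(
    spark_conf={"pipelines.incompatibleViewCheck.enabled": "false"}
)
def static_lookup():
    # Hypothetical static source table; replace with your own.
    return spark.read.table("my_schema.static_lookup")
```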
PrebenOlsen
by New Contributor III
  • 2208 Views
  • 1 reply
  • 1 kudos

Resolved! Why does @dlt.table from a table give different results than from a view?

I have some data in silver that I read in as a view created by the __apply_changes function. I create a table based on this, and I then want to create my gold table after doing a .groupBy() and .pivot(). The transformations I do in the gold table aren...

Latest Reply
PrebenOlsen
New Contributor III
  • 1 kudos

I have found a temporary solution for this. The .pivot("columnName") should automatically grab all the values it can find, but for some reason it does not. I need to specify the values, using .pivot("group_name", "group0", "group1", "group2"...) ...

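A hedged sketch of the workaround described above, with made-up dataframe and column names; .pivot() accepts an explicit list of values as its second argument, which also tends to be faster than letting Spark discover them.

```python
# Explicitly listing pivot values instead of relying on automatic discovery.
# silver_df, "id", and the group values are placeholders for this sketch.
gold_df = (
    silver_df
    .groupBy("id")
    .pivot("group_name", ["group0", "group1", "group2"])
    .count()
)
```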
SatishGunjal
by New Contributor
  • 2796 Views
  • 1 reply
  • 0 kudos

Data frame takes long time to print count of rows

We have a pyspark data frame with 50 million records. We can display records from it, but it takes around 10 minutes to print the shape of the dataframe. We aim to use this data for modelling that will take some numerical features based on the final data fra...

Latest Reply
Hanna08
New Contributor II
  • 0 kudos

Thanks for the detailed explanation. For those who want to have constant technical support for their work processes, I recommend JD Young. Here is only the latest information about the update in the world of information technology solutions and cyber...

venkad
by Contributor
  • 10895 Views
  • 4 replies
  • 7 kudos

Passing proxy configurations with databricks-sql-connector python?

Hi, I am trying to connect to a Databricks workspace which has IP access restriction enabled, using databricks-sql-connector. Only my proxy server IPs are added to the allow list. from databricks import sql   connection = sql.connect( server_hostname ='...

Latest Reply
susodapop
Contributor
  • 7 kudos

`databricks-sql-connector` doesn't support HTTP proxies yet, but work is underway to implement it. It should be available in the next month or so. You can follow this issue on the open source repository for updates.

3 More Replies
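For completeness, the basic (non-proxied) connection shape the question starts from, with placeholder credentials; per the reply above, HTTP proxy support was not yet available in databricks-sql-connector at the time.

```python
# Hedged sketch of a plain databricks-sql-connector connection (placeholder values).
from databricks import sql

connection = sql.connect(
    server_hostname="dbc-xxxxxxxx-xxxx.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/0123456789abcdef",
    access_token="dapiXXXXXXXXXXXXXXXX",
)

with connection.cursor() as cursor:
    cursor.execute("SELECT 1")
    print(cursor.fetchall())

connection.close()
```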
Cano
by New Contributor III
  • 1462 Views
  • 1 reply
  • 2 kudos

How to add notebook to my Databricks jdbc url?

Please, how do I add a notebook to the JDBC URL in order to run queries externally? jdbc:databricks://dbc-a1b2345c-d6e7.cloud.databricks.com:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/1234567890123456/1234-567890-reef123;AuthMech=3;...

Latest Reply
ranged_coop
Valued Contributor II
  • 2 kudos

Not sure if it is possible. Alternatively, you could try adding your notebook to a job and then triggering that job via the Jobs API. Please refer to the link below: Jobs API 2.1 | Databricks on AWS

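To make the suggestion above concrete, a hedged sketch of triggering an existing job (which can wrap the notebook) from outside Databricks via the Jobs API 2.1 run-now endpoint; the token and job_id are placeholders, and the host is the example workspace from the JDBC string.

```python
# Trigger an existing job via the Jobs API 2.1 (placeholder token and job id).
import requests

host = "https://dbc-a1b2345c-d6e7.cloud.databricks.com"
token = "dapiXXXXXXXXXXXXXXXX"   # personal access token
job_id = 123                     # placeholder job id

resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={"job_id": job_id},
)
print(resp.json())  # contains run_id on success
```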
Anonymous
by Not applicable
  • 4359 Views
  • 6 replies
  • 5 kudos

COPY INTO command can not recognise MAP type value from JSON file

I have a delta table in Databricks with a single column of type map<string,string>, and I have a data file in JSON format created by Hive 3 for the table with the column of the same type. I want to load data from the file into the Databricks table using COPY IN...

Latest Reply
jose_gonzalez
Databricks Employee
  • 5 kudos

Hi Alexey, just a friendly follow-up. Did any of the responses help you to resolve your question? If it did, please mark it as best. Otherwise, please let us know if you still need help.

5 More Replies
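For reference, the general COPY INTO shape being discussed, as a hedged sketch with a placeholder table and path; whether Hive-written JSON maps load cleanly into a map<string,string> column is exactly the open question in the thread.

```python
# General COPY INTO form for JSON files (placeholder table name and landing path).
spark.sql("""
    COPY INTO my_db.map_table
    FROM '/mnt/landing/hive_json/'
    FILEFORMAT = JSON
    FORMAT_OPTIONS ('multiLine' = 'false')
""")
```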
tomnguyen_195
by New Contributor III
  • 2209 Views
  • 2 replies
  • 3 kudos

DLT maintenance job got stuck

Hi all, we recently realized a huge cost associated with our Databricks account, and the main culprit is DLT's pipeline maintenance job, which got auto-scheduled to run but got stuck and cost us thousands of DBUs. Do you know what would be the r...

Latest Reply
tinai_long
New Contributor III
  • 3 kudos

Same question. These maintenance jobs run for the maximum timeout (168 hours) and do not terminate. Example below:

1 More Replies
Sha_1890
by New Contributor III
  • 5091 Views
  • 8 replies
  • 0 kudos

How to execute a series of stored procedures using scala in databricks

I am working on a migration project where the lift-and-shift method is used to migrate a SQL Server DB from on-prem to Azure Cloud. There are a lot of stored procedures used for integration on-prem. Now here, on-prem, to process the XML file and exec...

Latest Reply
Noopur_Nigam
Databricks Employee
  • 0 kudos

Hi @shafana Roohi Jahubar, I hope that your queries are answered. Please let me know if you have more doubts.

7 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group