Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

rt-slowth
by Contributor
  • 2693 Views
  • 2 replies
  • 1 kudos

How to writeStream with redshift

I have already checked the documentation below, but it does not describe how to write a stream. Is there a way to write the gold table (a streaming table), which is the output of the streaming pipeline of Delta Live Tables in...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Only batch processing is supported.
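
A common way to pair a streaming source with a batch-only sink is to let Structured Streaming hand each micro-batch to the ordinary batch writer through `foreachBatch`. The sketch below is an illustration under assumptions, not something confirmed in this thread: the table name `gold_table`, the JDBC URL, the S3 staging directory, the IAM role, and the checkpoint path are hypothetical placeholders, and it assumes the job runs outside the DLT pipeline on a cluster where the Redshift connector (`format("redshift")`) is available.

```python
def redshift_options(jdbc_url, table, temp_dir, iam_role):
    """Collect the connector options for one batch write (all values are placeholders)."""
    return {
        "url": jdbc_url,           # JDBC URL of the Redshift cluster
        "dbtable": table,          # target table in Redshift
        "tempdir": temp_dir,       # S3 staging directory used by the connector
        "aws_iam_role": iam_role,  # role Redshift assumes to read the staged files
    }


def write_batch_to_redshift(batch_df, batch_id, options):
    """Append one micro-batch using the batch DataFrame writer."""
    batch_df.write.format("redshift").options(**options).mode("append").save()


def start_stream(spark, options, checkpoint_path):
    """Read the gold table as a stream and route every micro-batch to Redshift."""
    return (
        spark.readStream.table("gold_table")  # hypothetical table name
        .writeStream
        .foreachBatch(
            lambda df, batch_id: write_batch_to_redshift(df, batch_id, options)
        )
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Each micro-batch is written with the batch API, so the sink itself never sees a stream; the checkpoint location is what lets the query resume where it left off.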

  • 1 kudos
1 More Replies
umarkhan
by New Contributor II
  • 2082 Views
  • 1 replies
  • 0 kudos

Module not found when using applyInPandasWithState in Repos

I should start by saying that everything works fine if I copy and paste it all into a notebook and run it. The problem starts if we try to have any structure in our application repository. Also, so far we have only run into this problem with applyInP...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Which DBR version are you using? Does it work on non-DLT jobs?

  • 0 kudos
sher
by Valued Contributor II
  • 1697 Views
  • 1 replies
  • 0 kudos

Did anyone face this issue in a Delta table while generating a manifest file?

Error message: Manifest generation is not supported for tables that leverage column mapping, as external readers cannot read these Delta tables. Why did I get this error? I am not sure whether we need to do any additional processing.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you please share the full stack trace and the repro steps?

  • 0 kudos
VishalD
by New Contributor
  • 1572 Views
  • 1 replies
  • 0 kudos

Not able to load nested XML file with struct type

Hello Experts, I am trying to load an XML file with a struct type and an XSI type attribute. Below is a sample of the XML format: <SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="htt...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can try the from_xml() function. Here is the link to the docs: https://docs.databricks.com/en/sql/language-manual/functions/from_xml.html

  • 0 kudos
SimDarmapuri
by New Contributor II
  • 2440 Views
  • 1 replies
  • 1 kudos

Databricks Deployment using Data Thirst

Hi, I am trying to deploy Databricks notebooks to different environments using Azure DevOps and the third-party extension Data Thirst (Databricks Script Deployment Task by Data Thirst). The pipeline is able to generate/download artifacts but not able to...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

The extension is quite old and does not know about Unity Catalog, so that is probably why it fails. But why do you use the extension for notebook propagation from dev to prd? You can do this using Repos, feature branches and pull requests...

  • 1 kudos
Michael_Appiah
by Databricks Partner
  • 2792 Views
  • 1 replies
  • 1 kudos

Resolved! Display Limits Catalog Explorer

It seems as if the Catalog Explorer can only display a maximum of 1000 folders within a UC Volume. I just ran into this issue when I added new folders to a volume which were not displayed in the Catalog Explorer (only folders 1-1000). I was able to r...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi @Michael_Appiah, this is a known limitation: https://docs.databricks.com/en/connect/unity-catalog/volumes.html#limitations

  • 1 kudos
jonathan-dufaul
by Valued Contributor
  • 5079 Views
  • 2 replies
  • 0 kudos

Is there a command in sql cell to ignore formatting for some lines like `# fmt: off` in Python cells

In Python cells I can add the comment `# fmt: off` before a block of code that I want black/the autoformatter to ignore and `# fmt: on` afterwards. Is there anything similar I can put in SQL cells to accomplish the same effect? Some of the recommendation...

Data Engineering
autoformatter
formatter
sql
bayerb
by New Contributor
  • 2002 Views
  • 1 replies
  • 0 kudos

Sink is not written into delta table in Spark structured streaming

I want to create a streaming job that reads messages from TXT files in a folder, does the parsing and some processing, and appends the result to one of 3 possible Delta tables depending on the parse result. There is a parse_failed table, an unknw...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

There doesn't seem to be any issue with the code, but the logs need to be analyzed to get a clue about what the issue is. Could you please create a support ticket?

  • 0 kudos
vishwanath_1
by New Contributor III
  • 1927 Views
  • 1 replies
  • 0 kudos

Resolved! Need Suggestion for better caching strategy

I have the following steps to perform: 1. Read a CSV file (a very large file, ~100 GB). 2. Add an index using the zipWithIndex function. 3. Repartition the DataFrame. 4. Pass it on to another function. Can you suggest the best optimized caching strategy to execute these c...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Hi @vishwanath_1, caching only comes into the picture when there are multiple references to the data source in your code. As per the flow you described, I don't see that being the case. You are only reading the data from the source once and also there...

  • 0 kudos
sudhakargen
by New Contributor II
  • 17813 Views
  • 2 replies
  • 0 kudos

Intermittently unavailable: Maven library com.crealytics:spark-excel_2.12:3.5.0_0.20.3

The issue is that the package com.crealytics:spark-excel_2.12:3.5.0_0.20.3 is intermittently unavailable, i.e. most of the time the Excel import works, but occasionally it fails with an exception (org.apache.spark.SparkClassNotFoundException). I have installed m...

Latest Reply
sudhakargen
New Contributor II
  • 0 kudos

"Looks like the issue is source is not able to reach" - Can you please let me know what you mean by this? The libraries installed on the Databricks cluster are as below. I have a cluster with version 14.2 on which I have installed the Maven library (com.crealyt...

  • 0 kudos
1 More Replies
BartoszBiskupsk
by Databricks Partner
  • 3209 Views
  • 2 replies
  • 0 kudos

"Last Access" information for external delta tables (no UC)

Hi, is there a way to audit all tables in hive_metastore (no UC), all of which are external, to check when each was last used (queried, updated, etc.)?

Data Engineering
access logs
Latest Reply
CharlesReily
New Contributor III
  • 0 kudos

Apache Ranger or Apache Sentry can be used for auditing Hive activities. If you have set up auditing in one of these tools, you can review the audit logs to see when tables were accessed. Audit logs are typically stored in a separate location, and yo...

  • 0 kudos
1 More Replies
hbs59
by New Contributor III
  • 10613 Views
  • 5 replies
  • 2 kudos

Resolved! Rest API Error 404

I am trying to export a notebook or directory using /api/2.0/workspace/export. When I run /api/2.0/workspace/list with a particular URL and path, I get the results that I expect: a list of objects (notebooks and folders) at that location. But when I ru...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi, could you please remove the parameters (format and direct_download) and confirm?
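
To make the suggestion concrete, here is a minimal sketch of an export call that sends only the `path` query parameter. The host, token, and notebook path are placeholders, the helper names `build_export_url` and `export_object` are hypothetical, and it assumes authentication with a personal access token sent as a bearer token.

```python
import json
import urllib.parse
import urllib.request


def build_export_url(host, path):
    """Build the export URL with only the required `path` query parameter."""
    query = urllib.parse.urlencode({"path": path})
    return f"{host.rstrip('/')}/api/2.0/workspace/export?{query}"


def export_object(host, token, path):
    """Call the endpoint and return the decoded JSON response body."""
    request = urllib.request.Request(
        build_export_url(host, path),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Comparing the URL produced here with the one that returns 404 can help show whether the path encoding or the extra parameters are the difference.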

  • 2 kudos
4 More Replies
drii_cavalcanti
by New Contributor III
  • 1710 Views
  • 1 replies
  • 0 kudos

Shared Mode Cluster Permission Issue: Editing Folders Across Users

Hi everyone, currently I save logs to a specific folder at the root level in Databricks. However, I need to use a Shared Mode cluster, and it seems I no longer have permission to save to the folder or even open its terminal to access the underlying i...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, if workspace access control is enabled, objects in this folder are private to that user by default. You can refer to https://docs.databricks.com/en/workspace/workspace-objects.html and let us know if this helps.

  • 0 kudos
therealDE
by New Contributor II
  • 3279 Views
  • 3 replies
  • 1 kudos

databricks cli error : command >> databricks fs ls # getting error Error: accepts 1 arg(s), received

Hi team, I installed the Databricks CLI on my Mac using Homebrew, following this link: https://docs.databricks.com/en/dev-tools/cli/install.html#homebrew-install Step 1: ran >> databricks configure, which configured successfully. However, when I ran I am getting belo...

Latest Reply
therealDE
New Contributor II
  • 1 kudos

Thanks for the reply. When I installed the Databricks CLI on Windows, it actually returned some directories even with databricks fs ls. I installed it on Windows with pip. Do you think pip installs something different from brew install on Mac?
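
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the pip package installs the legacy Python-based CLI, while the Homebrew formula installs the newer CLI, and the newer one expects an explicit path argument for `fs ls`. A small sketch of invoking the command with a path follows; `dbfs:/` is only an example root, and the helper names `fs_ls_command` and `list_path` are hypothetical.

```python
import subprocess


def fs_ls_command(path="dbfs:/"):
    """Return the argument list for `databricks fs ls` with an explicit path."""
    return ["databricks", "fs", "ls", path]


def list_path(path="dbfs:/"):
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        fs_ls_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Running `databricks --version` on both machines would show whether the two installs are in fact different CLI generations.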

  • 1 kudos
2 More Replies
Phani1
by Databricks MVP
  • 2267 Views
  • 1 replies
  • 0 kudos

cloud fetch Qlik Sense

Hi Team, Cloud Fetch improves data transfer efficiency from Databricks to Power BI. Is it compatible with Qlik Sense as well?

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Cloud Fetch is a feature introduced in Databricks Runtime 8.3 and the Simba ODBC 2.6.17 driver that significantly improves data transfer efficiency from Databricks to BI tools like Power BI. It achieves this by fetching data in parallel via cloud storage...

  • 0 kudos