Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

Prashanth24
by New Contributor III
  • 3556 Views
  • 3 replies
  • 1 kudos

Error connecting Databricks Notebook using managed identity from Azure Data Factory

I am trying to connect to a Databricks notebook using the managed identity authentication type from Azure Data Factory. The settings I used are below, and the error message is appended at the bottom of this message. With the same settings but with different authenticat...

Latest Reply
Hesareal
New Contributor II
  • 1 kudos

Did you manage to solve it? I am getting the same error calling the DBX REST API from ADF with a system-assigned managed identity. The Databricks workspace is added as a linked service, and there is no problem running notebooks.

2 More Replies
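For readers hitting the same wall: a common way to debug this outside ADF is to acquire a managed-identity token yourself and call the Databricks REST API directly. The sketch below assumes the `azure-identity` and `requests` packages and a placeholder workspace URL; `2ff814a6-3304-4ab8-85cb-cd0e6f879c1d` is the well-known Azure AD application ID of the Azure Databricks resource, used as the token scope.

```python
# Sketch: calling the Databricks REST API with an Azure managed-identity token.
# Workspace URL and cluster values are placeholders, not from the thread.

import requests

DATABRICKS_RESOURCE_ID = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"

def databricks_headers(access_token: str) -> dict:
    """Build the Authorization header the Databricks REST API expects."""
    return {"Authorization": f"Bearer {access_token}"}

def list_jobs(workspace_url: str, access_token: str) -> dict:
    """List jobs via the 2.1 Jobs API using the managed-identity token."""
    resp = requests.get(
        f"{workspace_url}/api/2.1/jobs/list",
        headers=databricks_headers(access_token),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# On an Azure-hosted runner with a system-assigned managed identity:
# from azure.identity import ManagedIdentityCredential
# token = ManagedIdentityCredential().get_token(
#     f"{DATABRICKS_RESOURCE_ID}/.default").token
# jobs = list_jobs("https://adb-1234567890123456.7.azuredatabricks.net", token)
```

If this direct call succeeds but the ADF linked service still fails, the problem is likely in the linked-service configuration rather than the identity's permissions.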
subhas_hati
by New Contributor
  • 2251 Views
  • 1 reply
  • 1 kudos

Change Data Capture (CDC)

May I know what Change Data Capture is?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @subhas_hati, Change data capture (CDC) is a data integration pattern that captures changes made to data in a source system, such as inserts, updates, and deletes. These changes, represented as a list, are commonly referred to as a CDC feed. You c...

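To make the "changes represented as a list" idea concrete, here is a minimal pure-Python sketch of applying a CDC feed to a keyed table. On Databricks you would typically use `MERGE INTO` or DLT's `APPLY CHANGES INTO` instead; this only illustrates the concept, and the field names are hypothetical.

```python
# Minimal pure-Python sketch of applying a CDC feed (a list of change events)
# to a table keyed by "id". Not Databricks API code, just the concept.

def apply_cdc_feed(table: dict, feed: list) -> dict:
    """Apply insert/update/delete events, in order, keyed by 'id'."""
    for event in feed:
        op, row = event["op"], event["row"]
        if op in ("insert", "update"):
            table[row["id"]] = row        # upsert the new image of the row
        elif op == "delete":
            table.pop(row["id"], None)    # remove the row if present
    return table

target = {1: {"id": 1, "name": "alice"}}
feed = [
    {"op": "insert", "row": {"id": 2, "name": "bob"}},
    {"op": "update", "row": {"id": 1, "name": "alicia"}},
    {"op": "delete", "row": {"id": 2}},
]
result = apply_cdc_feed(target, feed)
# result: {1: {"id": 1, "name": "alicia"}}
```

The key property of a CDC feed is that event order matters: replaying the same list against the same starting state always yields the same table.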
self-employed
by Contributor
  • 14753 Views
  • 9 replies
  • 8 kudos

The log in function and password reset function in the community edition do not work

I want to register a Databricks account. I already set up my account and received the email to set my password. However, I cannot use my password to log in to the Community Edition account, although I can use it to log in to my standard account. I also clicked the reset th...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Hello, @lawrance Zhang - I wanted you to know that this isn't the first time we've heard of this recently. Thank you for opening a ticket. We've also escalated this to the team. We'll get there.

8 More Replies
210573
by New Contributor
  • 4055 Views
  • 4 replies
  • 2 kudos

Unable to stream from google pub/sub

I am trying to run the code below to subscribe to a Pub/Sub topic, but it throws this exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2. I have tried using all versions of https://mvnrepository.com/artifact/com.google...

Latest Reply
davidkhala-ms
New Contributor II
  • 2 kudos

I see some issues using Pub/Sub as a source: in the writeStream, neither .foreach nor .foreachBatch gets called when stream data arrives.

3 More Replies
sujitmk77
by New Contributor II
  • 1387 Views
  • 2 replies
  • 0 kudos

PySpark JSON read with strict schema check and mark the valid and invalid records based on the non-n

Hi, I have a use case where I have to read JSON files from the "/data/json_files/" location with the schema enforced. For completeness we want to mark the invalid records. The invalid records may be ones where mandatory field/s are null, data t...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @sujitmk77, To ensure that valid records are processed while invalid records are marked appropriately, you can use the following PySpark code. It reads the JSON files with schema enforcement and handles invalid records by marking t...

1 More Replies
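Since the reply's code is truncated, here is a hedged pure-Python sketch of the marking logic: tag each record invalid when a mandatory field is null/missing or the line isn't valid JSON. In PySpark the same check would be expressed with `when()`/`col().isNull()` after reading with an enforced schema; the mandatory field names below are assumptions, not from the thread.

```python
# Sketch: marking JSON records valid/invalid based on mandatory (non-null) fields.
# Pure Python for illustration; field names are hypothetical.

import json

MANDATORY_FIELDS = ["id", "name"]  # assumption: the non-nullable fields

def mark_records(lines):
    """Tag each parsed JSON record with an is_valid flag."""
    out = []
    for line in lines:
        try:
            rec = json.loads(line)
        except json.JSONDecodeError:
            # Unparseable line: keep the raw text, mark invalid.
            out.append({"raw": line, "is_valid": False})
            continue
        rec["is_valid"] = all(rec.get(f) is not None for f in MANDATORY_FIELDS)
        out.append(rec)
    return out

records = mark_records([
    '{"id": 1, "name": "a"}',
    '{"id": null, "name": "b"}',
    'not-json',
])
# records[0] is valid; the other two are marked invalid
```

Keeping the invalid rows (rather than dropping them) makes it easy to route them to a quarantine table for later inspection.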
Rita
by New Contributor III
  • 11355 Views
  • 7 replies
  • 6 kudos

How to connect Cognos 11.1.7 to Azure Databricks

We are trying to connect Cognos 11.1.7 to Azure Databricks, but with no success. Can you please help or guide us on how to connect Cognos 11.1.7 to Azure Databricks? This is very critical to our user community. Can you please help or guide us how to connect Co...

Latest Reply
Hans2
New Contributor II
  • 6 kudos

Has anyone got the Simba JDBC driver going with CA 11.1.7? The ODBC driver works fine, but I can't get the JDBC one running. Regards

6 More Replies
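For anyone stuck on the JDBC side: the usual sticking point is the connection URL. The sketch below assembles a Databricks JDBC URL of the kind a Cognos data-source connection would need; the host and HTTP path are placeholders, and `AuthMech=3` means token authentication (user `token`, password = a personal access token, supplied separately).

```python
# Sketch: assembling a Databricks JDBC connection URL for the
# Simba/Databricks JDBC driver. Host and httpPath are placeholders.

def databricks_jdbc_url(host: str, http_path: str) -> str:
    """Build a token-auth JDBC URL (PAT is supplied as the password)."""
    return (
        f"jdbc:databricks://{host}:443/default;"
        f"transportMode=http;ssl=1;"
        f"httpPath={http_path};"
        f"AuthMech=3;UID=token"
    )

url = databricks_jdbc_url(
    "adb-1234567890123456.7.azuredatabricks.net",
    "sql/protocolv1/o/1234567890123456/0123-456789-abcde",
)
```

Older Simba driver versions use the `jdbc:spark://` prefix instead of `jdbc:databricks://`; check which driver JAR Cognos is actually loading.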
prathameshJoshi
by New Contributor III
  • 8964 Views
  • 10 replies
  • 7 kudos

Resolved! How to obtain the server url for using spark's REST API

Hi, I want to access the stage and job information (usually available through the Spark UI) through the REST API provided by Spark: http://<server-url>:18080/api/v1/applications/[app-id]/stages. More information can be found at the following link: https://spa...

Latest Reply
prathameshJoshi
New Contributor III
  • 7 kudos

Hi @Retired_mod and @menotron, Thanks a lot; your solutions are working. I apologise for the delay, as I had an issue logging in.

9 More Replies
jeremy98
by Honored Contributor
  • 2046 Views
  • 1 reply
  • 0 kudos

Resolved! How to read a particular data type from Postgres into Databricks through JDBC

Hi Community, I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[] column in PostgreSQL, which should remain an array of strings in Databricks. Previously, I used psycopg2, and it worked, but ...

Latest Reply
jeremy98
Honored Contributor
  • 0 kudos

Hi community, Yesterday I found a solution: query through JDBC from Postgres, creating two columns that are manageable in Databricks. Here is the code: query = f"""(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_strin...

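Since the reply's query is cut off, here is a hedged sketch of the same pattern: push a subquery down over JDBC that serializes the Postgres VARCHAR[] column to a string, then split it back into an array on the Databricks side. Table and column names are placeholders.

```python
# Sketch of the pattern from the reply. The subquery runs on Postgres;
# array_to_string() flattens the VARCHAR[] column so JDBC can carry it.

def pushdown_query(table: str, array_col: str) -> str:
    """Subquery usable as the `dbtable` option of a Spark JDBC read."""
    return (
        f"(SELECT *, array_to_string({array_col}, ',') "
        f"AS {array_col}_str FROM {table}) AS src"
    )

query = pushdown_query("public.exports", "columns_to_export")

# On Databricks (assumed options; requires the Postgres JDBC driver):
# df = (spark.read.format("jdbc")
#       .option("url", "jdbc:postgresql://host:5432/db")
#       .option("dbtable", query)
#       .option("user", "...").option("password", "...")
#       .load())
# from pyspark.sql import functions as F
# df = df.withColumn("columns_to_export",
#                    F.split("columns_to_export_str", ","))
```

Note the comma delimiter is an assumption; if array elements can contain commas, pick a delimiter that cannot appear in the data.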
asurendran
by New Contributor III
  • 1979 Views
  • 7 replies
  • 2 kudos

Some records are missing after window function

While loading data from one layer to another using a PySpark window function, I noticed that some data is missing. This happens when the data is huge; it doesn't happen for small quantities. Has anyone come across this issue before?

Latest Reply
asurendran
New Contributor III
  • 2 kudos

Would caching the DataFrame help fix this issue?

6 More Replies
busuu
by New Contributor II
  • 1936 Views
  • 3 replies
  • 1 kudos

Failed to checkout Git repository: RESOURCE_DOES_NOT_EXIST: Attempted to move non-existing node

I'm having issues checking out a Git repo in Workflows. Databricks can access files from commit `a` but fails to check out the branch when attempting to access commit `b`. The error occurs specifically when trying to check out commit `b`, and Databr...

Latest Reply
Augustus
New Contributor II
  • 1 kudos

I didn't do anything to fix it. Databricks support did something to my workspace to fix the issue. 

2 More Replies
ohnomydata
by New Contributor
  • 3008 Views
  • 1 reply
  • 0 kudos

Accidentally deleted files via API

Hello, I'm hoping you might be able to help me. I have accidentally deleted some Workspace files via the API (an Azure DevOps code deployment pipeline). I can't see the files in my Trash folder – are they gone forever, or is it possible to recover them on ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hello @ohnomydata, Unfortunately files deleted via APIs or the Databricks CLI are permanently deleted and do not move to the Trash folder. The Trash folder is a UI-only feature, and items deleted through the UI can be recovered from the Trash within ...

pradeepvatsvk
by New Contributor III
  • 1627 Views
  • 2 replies
  • 0 kudos

Polars natively reading and writing through ADLS

Hi everyone, is there a way Polars can directly read files from ADLS through the abfss protocol?

Latest Reply
jennifer986bloc
New Contributor II
  • 0 kudos

@pradeepvatsvk wrote: "Hi everyone, is there a way Polars can directly read files from ADLS through the abfss protocol?" Hello @pradeepvatsvk, Yes, Polars can directly read files from Azure Data Lake Storage (ADLS) using the ABFS (Azure Blob Filesystem) prot...

1 More Replies
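A hedged sketch of what this can look like: Polars' cloud reads accept an `abfss://` URI plus `storage_options` credentials. The account, container, path, and option key names below follow the object_store conventions Polars uses, but treat them as assumptions and check them against your Polars version.

```python
# Sketch: reading from ADLS Gen2 with Polars via the abfss:// scheme.
# Account/container/path are placeholders.

def abfss_uri(container: str, account: str, path: str) -> str:
    """Build an abfss:// URI for an ADLS Gen2 file."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"

uri = abfss_uri("raw", "mystorageacct", "events/2024/data.parquet")

# With polars installed and valid credentials (assumed option keys):
# import polars as pl
# df = pl.read_parquet(
#     uri,
#     storage_options={"account_name": "mystorageacct",
#                      "account_key": "<key>"},
# )
```

A SAS token or service-principal credentials can be passed through `storage_options` as well, which avoids embedding an account key.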
Rafael-Sousa
by Contributor II
  • 1282 Views
  • 3 replies
  • 0 kudos

Managed Delta Table corrupted

Hey guys, Recently we added some properties to our Delta table, and after that the table shows an error and we cannot do anything. The error is: (java.util.NoSuchElementException) key not found: spark.sql.statistics.totalSize. I think maybe this i...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Rafael-Sousa, Could you please raise a support case for this, to investigate this further? help@databricks.com

2 More Replies
samtech
by New Contributor
  • 708 Views
  • 1 reply
  • 1 kudos

DAB multiple workspaces

Hi, We have 3 regional workspaces. Assume that we keep separate folders for notebooks, say amer/xx, apac/xx, emea/xx, and separate job/pipeline configurations for each region in git. How do we make sure that during deploy the appropriate jobs/pipelines are deployed in r...

Latest Reply
Alberto_Umana
Databricks Employee
  • 1 kudos

Hi @samtech, Define separate bundle configuration files for each region. These configuration files will specify the resources (notebooks, jobs, pipelines) and their respective paths. For example, you can have amer_bundle.yml, apac_bundle.yml, and eme...

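An alternative to per-region bundle files, sketched below: a single `databricks.yml` with one target per regional workspace, selected at deploy time with `databricks bundle deploy -t amer` (and likewise for apac/emea). The hostnames are placeholders; region-specific resources can then be attached under each target.

```yaml
# databricks.yml -- hypothetical sketch: one Databricks Asset Bundle
# target per regional workspace, chosen with `-t <target>` at deploy time.
bundle:
  name: regional-jobs

targets:
  amer:
    workspace:
      host: https://adb-amer.azuredatabricks.net
  apac:
    workspace:
      host: https://adb-apac.azuredatabricks.net
  emea:
    workspace:
      host: https://adb-emea.azuredatabricks.net
```

The CI pipeline then only needs to map each region's git folder to the matching `-t` flag.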
BriGuy
by New Contributor II
  • 1698 Views
  • 2 replies
  • 0 kudos

Create a one-off job run using the Databricks SDK

I'm trying to build the job spec using objects. When I try to execute the job I get the following error. I'm somewhat new to Python and not sure what I'm doing wrong here. Is anyone able to help? Traceback (most recent call last): File "y:\My ...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @BriGuy, Can you try importing this module first? from databricks.sdk.service.jobs import PermissionLevel

1 More Replies
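When the SDK's typed objects fail in opaque ways, it can help to first write out the raw one-time-run ("runs submit") payload as plain dicts, then map it onto the SDK's classes. The sketch below does that; the run name, notebook path, and cluster ID are placeholders, and the SDK calls are shown commented since they need a live workspace.

```python
# Sketch: the shape of a one-off run ("runs submit") payload. The Databricks
# SDK's WorkspaceClient().jobs.submit(...) wraps this REST body with typed
# objects; building the raw dict first makes spec mistakes easier to spot.

def one_off_run_spec(run_name: str, notebook_path: str, cluster_id: str) -> dict:
    """Minimal single-task one-time-run payload (placeholder values)."""
    return {
        "run_name": run_name,
        "tasks": [
            {
                "task_key": "main",
                "existing_cluster_id": cluster_id,
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
    }

spec = one_off_run_spec("adhoc-run", "/Workspace/Users/me/nb", "0123-456789-abcde")

# The same run via the SDK (assumes the databricks-sdk package and a
# configured workspace; types as of recent SDK versions):
# from databricks.sdk import WorkspaceClient
# from databricks.sdk.service import jobs
# w = WorkspaceClient()
# run = w.jobs.submit(
#     run_name=spec["run_name"],
#     tasks=[jobs.SubmitTask(
#         task_key="main",
#         existing_cluster_id="0123-456789-abcde",
#         notebook_task=jobs.NotebookTask(notebook_path="/Workspace/Users/me/nb"),
#     )],
# )
```

`jobs.submit` creates a run without registering a job, which matches the "one-off" requirement in the question.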