Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Johannes_E (New Contributor III)
  • 4742 Views
  • 1 reply
  • 1 kudo

Resolved! How to develop with Databricks Connect smoothly?

We are working with Databricks Connect and Visual Studio Code in our project. We mainly want to program in the IDE (VS Code) so that we can use the advantages of the IDE compared to notebooks. Therefore, we write most of the code in .py files and actu...

Latest Reply: ChrisChieu (Databricks Employee)

You can set breakpoints and debug within notebook cells. There's an example in this DAIS talk at 15:27. I recommend the entire talk as a demo. To complete the point, here is additional documentation about notebook cell debugging with Databricks ...

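For readers looking at the Databricks Connect side of this workflow, here is a minimal sketch of a local .py file that runs against a remote cluster, assuming databricks-connect (the v2 client for DBR 13+) is installed and a Databricks CLI profile or DATABRICKS_* environment variables are configured; VS Code breakpoints can then be set on these lines as discussed in the reply.

from databricks.connect import DatabricksSession

# Build a Spark session backed by the remote cluster; connection details are
# resolved from the local Databricks configuration (profile or environment).
spark = DatabricksSession.builder.getOrCreate()

df = spark.range(10)   # the query itself executes remotely on the cluster
print(df.count())      # a VS Code breakpoint can be placed here and stepped through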
dbuserng (New Contributor II)
  • 2912 Views
  • 1 reply
  • 0 kudos

Trigger Databricks Workflow when other workflows succeeded

Hi, I have 3 separate workflows with 3 different triggers, and what I would like to achieve is this: after all of these 3 jobs have completed and succeeded, I would like to trigger another job. Is it possible? These 3 jobs have to stay separate (I cannot combi...

Latest Reply: Alberto_Umana (Databricks Employee)

Hi @dbuserng, It is possible, but it requires a custom code setup based on your use case. You can use the Jobs REST API: https://docs.databricks.com/api/workspace/jobs. Create a monitoring job: set up a job that will monitor the completion status of the thre...

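As a rough illustration of the approach in the reply, the sketch below polls the Jobs REST API for three upstream jobs and triggers a downstream job once all of them have a successful latest run. The host, token handling, and job IDs are placeholders, not values from the thread.

import os
import requests

HOST = os.environ["DATABRICKS_HOST"]          # e.g. https://<workspace>.azuredatabricks.net
HEADERS = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}
UPSTREAM_JOB_IDS = [111, 222, 333]            # the three independent jobs (hypothetical IDs)
DOWNSTREAM_JOB_ID = 444                       # the job to trigger afterwards (hypothetical ID)

def latest_run_succeeded(job_id: int) -> bool:
    """Return True if the most recent completed run of job_id ended in SUCCESS."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers=HEADERS,
        params={"job_id": job_id, "limit": 1, "completed_only": "true"},
    )
    resp.raise_for_status()
    runs = resp.json().get("runs", [])
    return bool(runs) and runs[0]["state"].get("result_state") == "SUCCESS"

# Trigger the downstream job only when all three upstream jobs have succeeded.
if all(latest_run_succeeded(j) for j in UPSTREAM_JOB_IDS):
    requests.post(
        f"{HOST}/api/2.1/jobs/run-now",
        headers=HEADERS,
        json={"job_id": DOWNSTREAM_JOB_ID},
    ).raise_for_status()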
priyansh (Databricks Partner)
  • 8903 Views
  • 10 replies
  • 0 kudos

Error in migration with UCX tool

Hey folks! I am facing an issue while migrating the tables from Hive to UC using the UCX tool; after completely running the setup and getting the assessment overview, we ran the following command, i.e. "databricks labs ucx create-table-mapping", but after...

Latest Reply: Akash_Wadhankar (Databricks Partner)

While running the UCX tool in the workspace, we are not able to access the tables that were created in the Hive metastore in the default schema. When we run the migrate-table workflow, we get the error that the tables are not accessible. We get the following error...

9 More Replies
zmsoft (Contributor)
  • 1863 Views
  • 4 replies
  • 0 kudos

Resolved! How to load a Power BI dataset into Databricks

Hi there, I would like to know how to load a Power BI dataset into Databricks. Thanks & Regards, zmsoft

Latest Reply: jack533 (New Contributor III)

I don't think it's possible. While loading a table from Databricks into a Power BI dataset is possible, the opposite is not.

3 More Replies
Prashanth24 (New Contributor III)
  • 3914 Views
  • 3 replies
  • 1 kudo

Error connecting Databricks Notebook using managed identity from Azure Data Factory

I am trying to connect to a Databricks notebook using the managed identity authentication type from Azure Data Factory. Below are the settings I used. The error message is appended at the bottom of this message. With the same settings but with a different authenticat...

Latest Reply: Hesareal (New Contributor II)

Did you manage to solve it? I am getting the same error calling the Databricks REST API from ADF with a system-assigned managed identity. The Databricks workspace is added as a linked service, and there is no problem running notebooks.

2 More Replies
subhas_hati (New Contributor)
  • 2509 Views
  • 1 reply
  • 2 kudos

Change Data Capture (CDC)

May I know what Change Data Capture is?

Latest Reply: szymon_dybczak (Esteemed Contributor III)

Hi @subhas_hati, Change data capture (CDC) is a data integration pattern that captures changes made to data in a source system, such as inserts, updates, and deletes. These changes, represented as a list, are commonly referred to as a CDC feed. You c...

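To make the pattern concrete, here is a minimal sketch of consuming a CDC feed in Databricks using Delta Lake's change data feed; the table name is hypothetical, and the table property must be enabled before changes start being recorded.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Enable the change data feed on a (hypothetical) source table.
spark.sql("""
    ALTER TABLE main.sales.orders
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")

# Read the inserts, updates, and deletes captured since table version 1.
changes = (
    spark.read.format("delta")
    .option("readChangeFeed", "true")
    .option("startingVersion", 1)
    .table("main.sales.orders")
)

# Each row carries _change_type (insert, update_preimage, update_postimage, delete).
changes.select("_change_type", "_commit_version", "_commit_timestamp").show()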
self-employed (Contributor)
  • 15265 Views
  • 9 replies
  • 8 kudos

The login and password reset functions in the Community Edition do not work

I want to register a Databricks account. I already set up my account and received the email to set my password. However, I cannot use my password to log in to the Community Edition account; I can use it to log in to my standard account. I also clicked the reset th...

Latest Reply: Anonymous

Hello, @lawrance Zhang - I wanted you to know that this isn't the first time we've heard of this recently. Thank you for opening a ticket. We've also escalated this to the team. We'll get there.

8 More Replies
210573 (New Contributor)
  • 4166 Views
  • 4 replies
  • 2 kudos

Unable to stream from Google Pub/Sub

I am trying to run the code below to subscribe to a Pub/Sub topic, but it throws this exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2. I have tried using all versions of https://mvnrepository.com/artifact/com.google...

Latest Reply: davidkhala-ms (New Contributor II)

I see some issues when using Pub/Sub as a source: in the writeStream, neither .foreach nor .foreachBatch gets called when stream data arrives.

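The foreachBatch pattern mentioned in the reply looks roughly like the sketch below; the built-in rate source is used here only as a stand-in, since the Pub/Sub connector setup (and whether it works at all) depends on the Databricks Runtime version.

from pyspark.sql import SparkSession, DataFrame

spark = SparkSession.builder.getOrCreate()

# Stand-in streaming source; replace with the actual Pub/Sub source for your runtime.
stream = spark.readStream.format("rate").option("rowsPerSecond", 5).load()

def process_batch(batch_df: DataFrame, batch_id: int) -> None:
    # Called once per micro-batch with a static DataFrame of the new records.
    print(f"batch {batch_id}: {batch_df.count()} rows")

query = stream.writeStream.foreachBatch(process_batch).start()
query.awaitTermination(timeout=30)   # run briefly for this demo
query.stop()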
3 More Replies
sujitmk77 (New Contributor II)
  • 1640 Views
  • 2 replies
  • 0 kudos

PySpark JSON read with strict schema check and mark the valid and invalid records based on the non-n

Hi, I have a use case where I have to read JSON files from the "/data/json_files/" location with the schema enforced. For completeness, we want to mark the invalid records. The invalid records may be the ones where the mandatory field(s) are null, data t...

Latest Reply: Alberto_Umana (Databricks Employee)

Hi @sujitmk77, To ensure that valid records are processed while invalid records are marked appropriately, you can use the following PySpark code. It reads the JSON files with schema enforcement and handles invalid records by marking t...

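Since the reply's code is cut off, here is a hedged sketch of the general approach it describes: read the JSON with an enforced schema in PERMISSIVE mode, keep unparseable rows in a corrupt-record column, and flag rows whose mandatory fields are null. The schema and field names below are assumptions, not the original poster's.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("id", IntegerType(), nullable=False),       # mandatory (assumed field)
    StructField("name", StringType(), nullable=False),      # mandatory (assumed field)
    StructField("comment", StringType(), nullable=True),    # optional (assumed field)
    StructField("_corrupt_record", StringType(), True),     # holds rows that fail to parse
])

df = (
    spark.read
    .schema(schema)
    .option("mode", "PERMISSIVE")                      # keep bad rows instead of failing
    .option("columnNameOfCorruptRecord", "_corrupt_record")
    .json("/data/json_files/")
)

# A record is valid only if it parsed cleanly and every mandatory field is present.
flagged = df.withColumn(
    "is_valid",
    F.col("_corrupt_record").isNull()
    & F.col("id").isNotNull()
    & F.col("name").isNotNull(),
)

valid = flagged.filter("is_valid")
invalid = flagged.filter("NOT is_valid")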
1 More Reply
Rita (New Contributor III)
  • 11720 Views
  • 7 replies
  • 6 kudos

How to connect Cognos 11.1.7 to Azure Databricks

We are trying to connect Cognos 11.1.7 to Azure Databricks, but with no success. Can you please help or guide us on how to connect Cognos 11.1.7 to Azure Databricks? This is very critical to our user community. Can you please help or guide us how to connect Co...

Latest Reply: Hans2 (New Contributor II)

Has anyone got the Simba JDBC driver going with CA 11.1.7? The ODBC driver works fine, but I can't get the JDBC driver running. Regards

6 More Replies
prathameshJoshi (Databricks Partner)
  • 9941 Views
  • 10 replies
  • 7 kudos

Resolved! How to obtain the server URL for using Spark's REST API

Hi, I want to access the stage and job information (usually available through the Spark UI) through the REST API provided by Spark: http://<server-url>:18080/api/v1/applications/[app-id]/stages. More information can be found at the following link: https://spa...

Latest Reply: prathameshJoshi (Databricks Partner)

Hi @Retired_mod and @menotron, Thanks a lot; your solutions are working. I apologise for the delay, as I had some issues logging in.

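For reference, calling the monitoring endpoint mentioned in the question looks roughly like the sketch below; server_url and app_id are placeholders, and on Databricks the Spark history server is typically not exposed directly on port 18080, so the reachable URL depends on the deployment.

import requests

server_url = "your-spark-history-server"   # placeholder
app_id = "app-20240101000000-0000"         # placeholder application id

# Spark monitoring REST API: list the stages of one application.
resp = requests.get(f"http://{server_url}:18080/api/v1/applications/{app_id}/stages")
resp.raise_for_status()

for stage in resp.json():
    print(stage["stageId"], stage["status"], stage["name"])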
9 More Replies
jeremy98 (Honored Contributor)
  • 2446 Views
  • 1 reply
  • 0 kudos

Resolved! How to read a particular data type from Postgres into Databricks through JDBC

Hi Community, I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[] column in PostgreSQL, which should remain an array of strings in Databricks. Previously, I used psycopg2, and it worked, but ...

Latest Reply: jeremy98 (Honored Contributor)

Hi community, yesterday I found a solution: query through JDBC from Postgres, creating two columns that are manageable in Databricks. Here is the code: query = f"""(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_strin...

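Completing the truncated pattern from the reply, a sketch of the full round trip might look like this: push a query down to Postgres that serializes the VARCHAR[] column with array_to_string, then split it back into an array on the Databricks side. The connection URL, credentials, and table name are placeholders; the column name comes from the reply.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Pushdown query executed inside Postgres: serialize the array to a delimited string.
pushdown_query = """
    (SELECT *,
            array_to_string(columns_to_export, ',') AS columns_to_export_string
     FROM public.my_table) AS src
"""

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/mydb")   # placeholder connection
    .option("dbtable", pushdown_query)
    .option("user", "username")
    .option("password", "password")
    .option("driver", "org.postgresql.Driver")
    .load()
)

# Rebuild an array<string> column on the Databricks side and drop the helper column.
df = (
    df.withColumn("columns_to_export", F.split("columns_to_export_string", ","))
      .drop("columns_to_export_string")
)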
asurendran (New Contributor III)
  • 2379 Views
  • 7 replies
  • 2 kudos

Some records are missing after window function

While loading data from one layer to another using a PySpark window function, I noticed that some data is missing. This happens when the data is huge; it does not happen for small quantities. Has anyone come across this issue before?

Latest Reply: asurendran (New Contributor III)

Is there a way that caching the DataFrame helps to fix this issue?

6 More Replies
busuu (New Contributor II)
  • 2152 Views
  • 3 replies
  • 1 kudo

Failed to checkout Git repository: RESOURCE_DOES_NOT_EXIST: Attempted to move non-existing node

I'm having issues checking out a Git repo in Workflows. Databricks can access files from commit `a` but fails to check out the branch when attempting to access commit `b`. The error occurs specifically when trying to check out commit `b`, and Databr...

[screenshot attached: busuu_0-1738776211583.png]
Latest Reply: Augustus (Databricks Partner)

I didn't do anything to fix it. Databricks support did something to my workspace to fix the issue. 

2 More Replies
ohnomydata (New Contributor)
  • 3450 Views
  • 1 reply
  • 0 kudos

Accidentally deleted files via API

Hello, I'm hoping you might be able to help me. I have accidentally deleted some workspace files via the API (an Azure DevOps code deployment pipeline). I can't see the files in my Trash folder. Are they gone forever, or is it possible to recover them on ...

Latest Reply: Alberto_Umana (Databricks Employee)

Hello @ohnomydata, Unfortunately files deleted via APIs or the Databricks CLI are permanently deleted and do not move to the Trash folder. The Trash folder is a UI-only feature, and items deleted through the UI can be recovered from the Trash within ...
