Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

self-employed
by Contributor
  • 10094 Views
  • 9 replies
  • 7 kudos

The login and password reset functions in Community Edition do not work

I want to register a Databricks account. I already set up my account, and I received the email to set my password. However, I cannot use my password to log in to the Community Edition account, though I can use it to log in to my standard account. I also clicked the reset th...

Latest Reply
Anonymous
Not applicable

Hello, @lawrance Zhang - I wanted you to know that you're not the first to report this recently. Thank you for opening a ticket. We've also escalated this to the team. We'll get there.

8 More Replies
210573
by New Contributor
  • 2798 Views
  • 4 replies
  • 2 kudos

Unable to stream from Google Pub/Sub

I am trying to run the code below to subscribe to a Pub/Sub topic, but it throws this exception: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/DataSourceV2. I have tried using all versions of https://mvnrepository.com/artifact/com.google...

Latest Reply
davidkhala-ms
New Contributor II

I see some issues when using Pub/Sub as a source: in writeStream, neither .foreach nor .foreachBatch gets invoked when stream data arrives.

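The NoClassDefFoundError above usually means the connector was built against the Spark 2.x DataSource V2 API, which was removed in Spark 3. Recent Databricks runtimes (13.1+) ship a built-in Pub/Sub source; below is a minimal sketch of pairing it with foreachBatch, assuming hypothetical project, topic, and subscription names and omitting the credential options:

# Minimal sketch: built-in Pub/Sub source (Databricks Runtime 13.1+) with foreachBatch.
# "my-project", "my-topic", "my-subscription", and the paths are hypothetical.
raw = (spark.readStream
       .format("pubsub")
       .option("projectId", "my-project")
       .option("topicId", "my-topic")
       .option("subscriptionId", "my-subscription")
       .load())

def handle_batch(batch_df, batch_id):
    # Runs once per micro-batch with a static DataFrame.
    batch_df.write.mode("append").saveAsTable("bronze.pubsub_events")

(raw.writeStream
    .foreachBatch(handle_batch)
    .option("checkpointLocation", "/tmp/checkpoints/pubsub")
    .start())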
3 More Replies
mbravo-nextport
by New Contributor
  • 35 Views
  • 1 reply
  • 0 kudos

Unity Catalog for medallion architecture

Hello community. I need help defining the most suitable approach for Unity Catalog. I have the following storage architecture in Azure Data Lake Storage. I have data from different clients. I work with 3 different environments for each client: dev, pr...

Latest Reply
Alberto_Umana
Databricks Employee

Hello @mbravo-nextport, Create a separate catalog for each client to logically isolate their data. This helps in managing permissions and organizing data efficiently. Within each catalog, create schemas for each environment (dev, pre, pro). This will...

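A minimal sketch of that layout, assuming a hypothetical client named acme and the dev/pre/pro environments described above:

# One catalog per client, one schema per environment (all names hypothetical).
spark.sql("CREATE CATALOG IF NOT EXISTS acme")
for env in ("dev", "pre", "pro"):
    spark.sql(f"CREATE SCHEMA IF NOT EXISTS acme.{env}")

# Scope permissions to the client's catalog; the group name is hypothetical.
spark.sql("GRANT USE CATALOG ON CATALOG acme TO `acme-engineers`")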
sujitmk77
by New Contributor
  • 39 Views
  • 2 replies
  • 0 kudos

PySpark JSON read with strict schema check, marking valid and invalid records based on the non-n

Hi, I have a use case where I have to read JSON files from the "/data/json_files/" location with a schema enforced. For completeness, we want to mark the invalid records. The invalid records may be the ones where the mandatory field(s) are null, data t...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @sujitmk77, To ensure that valid records are processed while invalid records are marked appropriately, you can use the following PySpark code. It reads the JSON files with schema enforcement and handles invalid records by marking t...

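A minimal sketch of that approach, assuming a hypothetical two-field schema in which id is mandatory: PERMISSIVE mode routes rows that fail parsing into a declared corrupt-record column, and a null check flags the rest.

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.getOrCreate()

# Hypothetical schema; the corrupt-record column must be declared in it.
schema = StructType([
    StructField("id", IntegerType(), True),
    StructField("name", StringType(), True),
    StructField("_corrupt_record", StringType(), True),
])

df = (spark.read
      .schema(schema)
      .option("mode", "PERMISSIVE")
      .option("columnNameOfCorruptRecord", "_corrupt_record")
      .json("/data/json_files/"))

# A record is valid if it parsed cleanly and its mandatory field is present.
flagged = df.withColumn(
    "is_valid",
    F.col("_corrupt_record").isNull() & F.col("id").isNotNull()
)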
1 More Replies
Rita
by New Contributor III
  • 8019 Views
  • 7 replies
  • 6 kudos

How to connect Cognos 11.1.7 to Azure Databricks

We are trying to connect Cognos 11.1.7 to Azure Databricks, but with no success. Can you please help or guide us on how to connect Cognos 11.1.7 to Azure Databricks? This is very critical to our user community. Can you please help or guide us how to connect Co...

Latest Reply
Hans2
Visitor

Has anyone got the Simba JDBC driver going with CA 11.1.7? The ODBC driver works fine, but I can't get the JDBC driver running. Regards

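For anyone else wiring up a JDBC client: the Databricks JDBC driver expects a URL of roughly this shape (the hostname, HTTP path, and token are placeholders; take the real values from the cluster's JDBC/ODBC tab and verify the settings against your driver version's docs):

# Placeholders throughout; AuthMech=3 authenticates with a personal access token.
jdbc_url = (
    "jdbc:databricks://<server-hostname>:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=<http-path>;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)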
6 More Replies
prathameshJoshi
by New Contributor III
  • 3005 Views
  • 8 replies
  • 6 kudos

Resolved! How to obtain the server URL for using Spark's REST API

Hi, I want to access the stage and job information (usually available through the Spark UI) through the REST API provided by Spark: http://<server-url>:18080/api/v1/applications/[app-id]/stages. More information can be found at the following link: https://spa...

Latest Reply
prathameshJoshi
New Contributor III

Hi @Retired_mod and @menotron, thanks a lot; your solutions are working. I apologise for the delay, as I had some issues logging in.

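For anyone landing here later, a minimal sketch of calling that endpoint, assuming a history server on its default port 18080 (a live application serves the same API from the driver, on port 4040 by default):

import requests

# <server-url> is a placeholder for the history server host.
base = "http://<server-url>:18080/api/v1"

apps = requests.get(f"{base}/applications").json()
app_id = apps[0]["id"]  # most recent application

stages = requests.get(f"{base}/applications/{app_id}/stages").json()
for stage in stages:
    print(stage["stageId"], stage["status"], stage["name"])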
7 More Replies
jeremy98
by Contributor
  • 36 Views
  • 1 reply
  • 0 kudos

Resolved! How to read a particular data type from Postgres into Databricks through JDBC

Hi Community, I need to load data from PostgreSQL into Databricks through JDBC without changing the data type of a VARCHAR[] column in PostgreSQL, which should remain an array of strings in Databricks. Previously, I used psycopg2, and it worked, but ...

Latest Reply
jeremy98
Contributor

Hi community, Yesterday I found a solution: query through JDBC from Postgres, creating two columns that are manageable in Databricks. Here is the code: query = f"""(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_strin...

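A self-contained sketch of that workaround (the connection details and the table name my_table are hypothetical): cast the array to a delimited string inside the pushdown query, then split it back into an array on the Databricks side.

from pyspark.sql import functions as F

# Hypothetical table and connection details; the subquery mirrors the reply above.
query = """(SELECT *, array_to_string(columns_to_export, ',') AS columns_to_export_str
            FROM my_table) AS t"""

df = (spark.read.format("jdbc")
      .option("url", "jdbc:postgresql://<host>:5432/<database>")
      .option("dbtable", query)
      .option("user", "<user>")
      .option("password", "<password>")
      .option("driver", "org.postgresql.Driver")
      .load()
      # Rebuild the delimited string as an array of strings.
      .withColumn("columns_to_export", F.split("columns_to_export_str", ","))
      .drop("columns_to_export_str"))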
hiryucodes
by New Contributor
  • 113 Views
  • 2 replies
  • 1 kudos

ModuleNotFound when running DLT pipeline

My new DLT pipeline gives me a ModuleNotFound error when I try to request data from an API. For some more context, I develop in my local IDE and then deploy to Databricks using asset bundles. The pipeline runs fine if I try to write a static datafram...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @hiryucodes, Ensure that the directory structure of your project is correctly set up. The module 'src' should be in a directory that is part of the Python path. For example, if your module is in a directory named 'src', the directory structure sho...

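When the module still isn't found at runtime, a common workaround is to put the bundle's deployed root on the Python path before importing; a sketch, assuming the default asset-bundle workspace layout and a hypothetical module name:

import sys

# Hypothetical path: asset bundles deploy sources under
# /Workspace/Users/<user>/.bundle/<bundle-name>/<target>/files by default.
sys.path.append("/Workspace/Users/<user>/.bundle/<bundle-name>/dev/files")

from src.api_client import fetch_data  # hypothetical project module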
1 More Replies
sachin_kanchan
by New Contributor
  • 94 Views
  • 4 replies
  • 0 kudos

Unable to log in to Community Edition

So I just registered for Databricks Community Edition and received an email for verification. When I click the link, I'm redirected to this website (image attached), where I am asked to input my email. And when I do that, it sends me a verification c...

[Attachment: db_fail.png]
Latest Reply
Walter_C
Databricks Employee

Let's open a case with databricks-community@databricks.com.

3 More Replies
SteveC527
by New Contributor
  • 459 Views
  • 3 replies
  • 0 kudos

Medallion Architecture and Databricks Assistant

I am in the process of rebuilding the data lake at my current company with Databricks, and I'm struggling to find comprehensive best practices for naming conventions and structuring medallion architecture to work optimally with the Databricks assistan...

Latest Reply
dataBuilder
New Contributor

Hello! I am in a similar position, and the medallion architecture makes a lot of sense to me (indeed, I believe we've been following a version of that ourselves for a long time). It seems to me having separate catalogs for each layer (bronze/silver/gold...

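As a sketch of what that separation looks like from the query side (the catalog and schema names are hypothetical), the three-level namespace keeps layer, domain, and table explicit:

# Hypothetical names: catalog = medallion layer, schema = business domain.
spark.sql("SELECT * FROM bronze.sales.orders_raw")
spark.sql("SELECT * FROM silver.sales.orders_cleaned")
spark.sql("SELECT * FROM gold.sales.orders_daily")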
2 More Replies
busuu
by New Contributor
  • 144 Views
  • 3 replies
  • 1 kudos

Failed to checkout Git repository: RESOURCE_DOES_NOT_EXIST: Attempted to move non-existing node

I'm having issues checking out a Git repo in Workflows. Databricks can access files from commit `a` but fails to check out the branch when attempting to access commit `b`. The error occurs specifically when trying to check out commit `b`, and Databr...

[Attachment: busuu_0-1738776211583.png]
Latest Reply
Augustus
New Contributor II

I didn't do anything to fix it. Databricks support did something to my workspace to fix the issue. 

2 More Replies
MarkV
by New Contributor III
  • 496 Views
  • 5 replies
  • 0 kudos

DLT, Automatic Schema Evolution and Type Widening

I'm attempting to run a DLT pipeline that uses automatic schema evolution against tables that have type widening enabled. I have code in this notebook that is a list of tables to create/update along with the schema for those tables. This list and spar...

Latest Reply
Sidhant07
Databricks Employee

Alternatively, you can try using the INSERT INTO statement directly: def load_snapshot_tables(source_system_name, source_schema_name, table_name, spark_schema, select_expression): snapshot_load_df = ( spark.readStream .format("clou...

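A self-contained sketch of that pattern under stated assumptions (Auto Loader over parquet snapshot files, placeholder paths), rather than the poster's exact code:

def load_snapshot_table(source_path, target_table, spark_schema):
    # Read snapshot files incrementally with Auto Loader.
    df = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "parquet")
          .schema(spark_schema)
          .load(source_path))
    # Append to the target table, allowing the schema to evolve on write.
    return (df.writeStream
            .option("checkpointLocation", f"/tmp/checkpoints/{target_table}")
            .option("mergeSchema", "true")
            .trigger(availableNow=True)
            .toTable(target_table))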
4 More Replies
ohnomydata
by New Contributor
  • 48 Views
  • 1 reply
  • 0 kudos

Accidentally deleted files via API

Hello, I'm hoping you might be able to help me. I have accidentally deleted some Workspace files via the API (an Azure DevOps code deployment pipeline). I can't see the files in my Trash folder – are they gone forever, or is it possible to recover them on ...

Latest Reply
Alberto_Umana
Databricks Employee

Hello @ohnomydata, Unfortunately files deleted via APIs or the Databricks CLI are permanently deleted and do not move to the Trash folder. The Trash folder is a UI-only feature, and items deleted through the UI can be recovered from the Trash within ...

