Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

KristiLogos
by Visitor
  • 80 Views
  • 8 replies
  • 3 kudos

Resolved! Load parent columns and not unnest using pyspark? Found invalid character(s) ' ,;{}()\n' in schema

I'm not sure I'm doing this correctly, but I'm having some issues with the column names when I try to load to a table in our Databricks catalog. I have multiple .json.gz files in our blob container that I want to load to a table: df = spark.read.opti...

Latest Reply
szymon_dybczak
Contributor
  • 3 kudos

Hi @KristiLogos, check whether any of the keys in your JSON contain the characters listed in the error message.
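The fix suggested above can be sketched as a small rename pass before writing to Delta. The character set below is taken straight from the error message, and the commented PySpark lines are illustrative only (they assume a `spark` session and an example blob path):

```python
import re

# Characters Delta column names may not contain, per the error
# message in the post title: ' ,;{}()\n'
INVALID_CHARS = re.compile(r"[ ,;{}()\n]")

def sanitize_column_name(name: str) -> str:
    """Replace characters that are invalid in Delta column names with '_'."""
    return INVALID_CHARS.sub("_", name)

# In a real PySpark session you might apply this to every top-level
# column without unnesting anything (path is a placeholder):
# df = spark.read.option("multiline", "true").json("/mnt/blob/*.json.gz")
# df = df.toDF(*[sanitize_column_name(c) for c in df.columns])

print(sanitize_column_name("order id (raw)"))  # order_id__raw_
```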

7 More Replies
valjas
by New Contributor III
  • 1540 Views
  • 2 replies
  • 0 kudos

Warehouse Name in System Tables

Hello. I am creating a table to monitor the usage of All-Purpose Compute and SQL Warehouses. From the tables in the 'system' catalog, I can get cluster_name and cluster_id. However, only warehouse_id is available, not the warehouse name. Is there a way to g...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @valjas, To monitor and manage SQL warehouses in your Databricks workspace, you can utilize the warehouse events system table. This table records events related to warehouse activity, including when a warehouse starts, stops, scales up, or scales ...
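One way to get from warehouse_id to a human-readable name, sketched below, is to pull the id-to-name mapping from the SQL Warehouses REST API and join it against the events system table. The host, token, and exact table names here are assumptions to check against your workspace:

```python
# Assumed: GET /api/2.0/sql/warehouses returns the id -> name mapping,
# since system tables expose only warehouse_id.
# import requests
# resp = requests.get(f"{host}/api/2.0/sql/warehouses",
#                     headers={"Authorization": f"Bearer {token}"})
# id_to_name = {w["id"]: w["name"] for w in resp.json().get("warehouses", [])}

def name_lookup_sql(id_to_name: dict) -> str:
    """Build a VALUES clause to join against system.compute.warehouse_events."""
    rows = ", ".join(f"('{wid}', '{name}')" for wid, name in id_to_name.items())
    return f"SELECT * FROM (VALUES {rows}) AS w(warehouse_id, warehouse_name)"

print(name_lookup_sql({"abc123": "etl-warehouse"}))
```

The generated subquery can then be joined to the events table on warehouse_id in a regular SQL query.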

1 More Replies
wendyl
by Visitor
  • 67 Views
  • 3 replies
  • 0 kudos

Connection Refused: [Databricks][JDBC](11640) Required Connection Key(s): PWD;

Hey, I'm trying to connect to Databricks using a client ID and secret. I'm using JDBC 2.6.38. I'm using the following connection URL: jdbc:databricks://<server-hostname>:443;httpPath=<http-path>;AuthMech=11;Auth_Flow=1;OAuth2ClientId=<service-principal-...

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @wendyl, could you answer the following questions? Does your workspace have Private Link? Do you use a Microsoft Entra ID managed service principal? If you used an Entra ID managed SP, did you use a secret from Entra ID or Azure Da...
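For reference, an OAuth M2M connection URL (AuthMech=11, Auth_Flow=1, as in the post) can be assembled like the sketch below. `OAuth2Secret` as the secret parameter name is an assumption to verify against the docs for your driver version; a "Required Connection Key(s): PWD" error can indicate the driver did not recognize the OAuth properties and fell back to password auth:

```python
def build_jdbc_url(host: str, http_path: str,
                   client_id: str, client_secret: str) -> str:
    """Assemble a Databricks JDBC URL for OAuth machine-to-machine auth.

    AuthMech=11 / Auth_Flow=1 come from the original post; OAuth2Secret
    is an assumed parameter name -- check your driver version's docs.
    """
    return (
        f"jdbc:databricks://{host}:443;httpPath={http_path};"
        f"AuthMech=11;Auth_Flow=1;"
        f"OAuth2ClientId={client_id};OAuth2Secret={client_secret}"
    )

print(build_jdbc_url("adb-123.azuredatabricks.net",
                     "/sql/1.0/warehouses/abc", "<client-id>", "<secret>"))
```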

2 More Replies
abhinandan084
by New Contributor III
  • 19556 Views
  • 21 replies
  • 12 kudos

Resolved! Community Edition signup issues

I am trying to sign up for the Community Edition (https://databricks.com/try-databricks) for use with a Databricks Academy course. However, I am unable to sign up and I receive the following error (image attached). On going to the login page (link in ora...

Latest Reply
brokeTechBro
  • 12 kudos

Hello, I get "An error occurred, try again." I am exhausted from trying... and from solving the puzzle to prove I'm not a robot.

20 More Replies
Himanshu4
by New Contributor II
  • 1376 Views
  • 5 replies
  • 2 kudos

Inquiry Regarding Enabling Unity Catalog in Databricks Cluster Configuration via API

Dear Databricks Community, I hope this message finds you well. I am currently working on automating cluster configuration updates in Databricks using the API. As part of this automation, I am looking to ensure that Unity Catalog is enabled within ...

Latest Reply
Himanshu4
New Contributor II
  • 2 kudos

Hi Raphael, can we fetch job details from one workspace and create a new job in another workspace with the same "job id" and configuration?
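On the original question of enabling Unity Catalog through the API: a minimal sketch of a Clusters API payload is below, where `data_security_mode` is the field that governs Unity Catalog access. The DBR version and node type are placeholder assumptions, as is the endpoint in the comment:

```python
def uc_cluster_spec(cluster_name: str, single_user: bool = False) -> dict:
    """Sketch of a cluster spec with Unity Catalog access enabled.

    "SINGLE_USER" and "USER_ISOLATION" are the UC-capable access modes;
    spark_version and node_type_id below are example placeholders.
    """
    return {
        "cluster_name": cluster_name,
        "spark_version": "14.3.x-scala2.12",   # assumed UC-capable DBR
        "node_type_id": "Standard_DS3_v2",     # assumed Azure node type
        "num_workers": 2,
        "data_security_mode": "SINGLE_USER" if single_user else "USER_ISOLATION",
    }

# The payload would then be posted to the Clusters API, e.g.:
# requests.post(f"{host}/api/2.1/clusters/create",
#               headers={"Authorization": f"Bearer {token}"},
#               json=uc_cluster_spec("uc-etl"))
```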

4 More Replies
mayur_05
by New Contributor II
  • 46 Views
  • 3 replies
  • 0 kudos

access cluster executor logs

Hi team, I want to get real-time logs for the cluster executor and driver stderr/stdout while performing data operations, and save those logs in a catalog's volume.

Latest Reply
gchandra
Esteemed Contributor III
  • 0 kudos

You can configure it for job cluster compute too. The specific cluster's log folder will be under /dbfs/cluster-logs (or whatever path you change it to).
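A sketch of how the log-delivery destination is set in a cluster spec, assuming a DBFS destination as in the reply (newer API versions also accept a Unity Catalog `volumes` destination, which would address the original ask about catalog volumes):

```python
def with_log_delivery(spec: dict,
                      destination: str = "dbfs:/cluster-logs") -> dict:
    """Return a copy of a cluster spec with `cluster_log_conf` set.

    Driver and executor stdout/stderr are then delivered to
    <destination>/<cluster-id>/ on a periodic basis.
    """
    spec = dict(spec)  # don't mutate the caller's dict
    spec["cluster_log_conf"] = {"dbfs": {"destination": destination}}
    return spec

print(with_log_delivery({"cluster_name": "etl"})["cluster_log_conf"])
```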

2 More Replies
Direo
by Contributor
  • 276 Views
  • 1 replies
  • 0 kudos

Migrating to Unity Catalog: Read-Only Connections to SQL Server and Snowflake

We are in the process of migrating to Unity Catalog, establishing connections to SQL Server and Snowflake, and creating foreign catalogs that mirror our SQL Server and Snowflake databases. This allows us to leverage Unity Catalog's query syntax and ...

Data Engineering
UnityCatalog SQLServer Snowflake Governance Permissions
Latest Reply
brian999
Contributor
  • 0 kudos

We just use SQLAlchemy to connect to Snowflake, which, you're right, does not enable Databricks governance.

LasseL
by New Contributor II
  • 500 Views
  • 5 replies
  • 3 kudos

Best practice for removing old data from tables created by a DLT pipeline

Hi, I didn't find any "reasonable" way to clean old data from DLT pipeline tables. In DLT we have used materialized views and streaming tables (SCD1, append-only). What is the best way to delete old data from the tables (storage size increases linearly...

Latest Reply
edman
Visitor
  • 3 kudos

If you do a full refresh on that streaming table source, that should remove old data. I am assuming you are feeding this into an SCD type 1, which overwrites the data.
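A retention-style cleanup can also be sketched as a generated DELETE statement run outside the pipeline. The table and column names below are hypothetical, and on DLT-managed tables a later full refresh may re-ingest deleted rows unless the source is pruned as well:

```python
from datetime import date, timedelta

def retention_delete_sql(table: str, date_col: str, keep_days: int) -> str:
    """Build a DELETE for rows older than the retention window.

    A sketch only: table/column names are placeholders, and the statement
    would be run via spark.sql() outside the DLT pipeline itself.
    """
    cutoff = date.today() - timedelta(days=keep_days)
    return f"DELETE FROM {table} WHERE {date_col} < '{cutoff.isoformat()}'"

# e.g. spark.sql(retention_delete_sql("catalog.schema.events", "event_date", 90))
print(retention_delete_sql("catalog.schema.events", "event_date", 90))
```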

4 More Replies
TinasheChinyati
by New Contributor
  • 10138 Views
  • 6 replies
  • 3 kudos

Is Databricks capable of housing OLTP and OLAP?

Hi data experts. I currently have an OLTP database (Azure SQL DB) that keeps data only for the past 14 days. We use partition switching to achieve that and have an ETL (Azure Data Factory) process that feeds the data warehouse (Azure Synapse Analytics). My requ...

Latest Reply
bsanoop
New Contributor II
  • 3 kudos

@szymon_dybczak Thanks for your explanation. While I understand the limitations of Databricks as an OLTP read system, is there any solution at all that is read-optimized? Like an OLAP layer which optimizes for both aggregation and reads with low laten...

5 More Replies
pesky_chris
by New Contributor
  • 122 Views
  • 4 replies
  • 0 kudos

Problem with SQL Warehouse (Serverless)

I get the following error message when attempting to use SQL Warehouse (Serverless) compute with materialized views (a simple interaction, e.g. DML, UI sample lookup). The MVs are created off the back of federated tables (PostgreSQL); MVs are created ...

Latest Reply
pesky_chris
New Contributor
  • 0 kudos

Hey, to clarify, as I think I'm potentially hitting unintended Databricks "functionality": the materialized views are managed by a DLT pipeline, which was deployed with DABs off a CI/CD pipeline, and the DLT pipeline runs a notebook with Python code creating MVs dynami...

3 More Replies
TheManOfSteele
by New Contributor III
  • 41 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks Connect: "Configure a connection to serverless compute" not working

Following the instructions at https://docs.databricks.com/en/dev-tools/databricks-connect/python/install.html#configure-a-connection-to-serverless-compute, there seems to be an issue with the example code: from databricks.connect import DatabricksSe...

Latest Reply
TheManOfSteele
New Contributor III
  • 0 kudos

Worked! Thank you!

1 More Replies
Dave_Nithio
by Contributor
  • 46 Views
  • 1 replies
  • 0 kudos

Delta Table Log History not Updating

I am running into an issue related to my Delta Log and an old version. I currently have default delta settings for delta.checkpointInterval (10 commits as this table was created prior to DBR 11.1), delta.deletedFileRetentionDuration (7 days), and del...

Latest Reply
jennie258fitz
New Contributor
  • 0 kudos

@Dave_Nithio wrote:I am running into an issue related to my Delta Log and an old version. I currently have default delta settings for delta.checkpointInterval (10 commits as this table was created prior to DBR 11.1), delta.deletedFileRetentionDuratio...

hpant
by New Contributor III
  • 92 Views
  • 1 replies
  • 0 kudos

"ResourceNotFound" error when connecting a DevOps repo to a Databricks workflow (job)

I have a .py file in a repo in Azure DevOps. I want to add it to a workflow in Databricks, and these are the values I have provided. And the source is this: I have provided all the values correctly but am getting this error: "ResourceNotFound". Can someon...

Latest Reply
nicole_lu_PM
Esteemed Contributor III
  • 0 kudos

Can you try cloning the DevOps repo as a Git folder? The git folder clone interface should ask you to set up a Git credential if it's not already there.

Kibour
by Contributor
  • 5186 Views
  • 3 replies
  • 3 kudos

Resolved! Import from repo

Hi all, I am trying the new "Git folder" feature with a repo that works fine from "Repos". In the new folder location, my imports from my own repo don't work anymore. Has anyone faced something similar? Thanks in advance for sharing your experience.

Latest Reply
nicole_lu_PM
Esteemed Contributor III
  • 3 kudos

Hi from the Git folder product manager -  If you use DBR 14.3+, the Git folder root is automatically added to the Python `sys.path`. This is documented here.  Unfortunately we could not backport this behavior to earlier DBR versions. Hope this helps!...
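For earlier DBR versions, a common workaround is to add the repo root to `sys.path` manually at the top of the notebook. The path below is an assumed example of a workspace Git folder location:

```python
import sys

# Assumed example path to a workspace Git folder; on DBR 14.3+ this
# happens automatically and the manual insert is unnecessary.
repo_root = "/Workspace/Users/someone@example.com/my-repo"

if repo_root not in sys.path:
    sys.path.insert(0, repo_root)  # make `import my_module` resolve from the repo

print(repo_root in sys.path)  # True
```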

2 More Replies
