Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

nkrom456
by New Contributor III
  • 320 Views
  • 1 reply
  • 0 kudos

Materialized View to External Delta Table using the sink API

Hi Team, while executing the below code I am able to create the sink, and my data is getting written into Delta tables from the materialized view. import dlt @dlt.table(name = "employee_bronze3") def create_table(): df = spark.read.table("dev.default.employee...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi nkrom456, how are you doing today? As per my understanding, when you use dlt.read_stream() inside the same DLT pipeline, Databricks allows it to stream from that materialized view because everything is being managed within one pipeline — it underst...
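For reference, a minimal sketch of the pattern being discussed, assuming the DLT sinks API (dlt.create_sink / @dlt.append_flow) is enabled in your workspace; the catalog, schema, and table names below are placeholders, not the poster's actual objects:

```python
import dlt

# Bronze table managed inside the pipeline (mirrors the snippet above)
@dlt.table(name="employee_bronze3")
def create_table():
    return spark.read.table("dev.default.employee")

# Declare a Delta sink; the target table name here is a placeholder
dlt.create_sink(
    name="employee_delta_sink",
    format="delta",
    options={"tableName": "dev.default.employee_external"},
)

# Stream from the dataset defined above into the sink; reading it with
# dlt.read_stream() works because it lives in the same pipeline
@dlt.append_flow(name="employee_to_sink", target="employee_delta_sink")
def write_to_sink():
    return dlt.read_stream("employee_bronze3")
```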

saicharandeepb
by New Contributor II
  • 61 Views
  • 1 reply
  • 0 kudos

Accessing Spark Runtime Metrics Using PySpark – Seeking Best Practices

Hi everyone, I'm currently working on a solution to access Spark runtime metrics for better monitoring and analysis of our workloads. From my research, I understand that this can be implemented using SparkListener, which is a JVM interface available in...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 0 kudos

Hi saicharandeepb, how are you doing today? As per my understanding, since SparkListener is native to Scala/Java, getting detailed runtime metrics in PySpark can be tricky, but there are some workarounds. If you need deep metrics (like stage-level and...
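As a lightweight alternative to a full SparkListener, PySpark exposes a StatusTracker on the SparkContext; a minimal sketch below, assuming the ambient Databricks spark session. It only surfaces coarse job/stage progress counters, not the full listener event stream:

```python
# Poll stage progress from PySpark without writing a JVM SparkListener.
# For task-level or executor-level metrics you would still need a
# Scala/Java listener or the Spark REST API.
tracker = spark.sparkContext.statusTracker()

for stage_id in tracker.getActiveStageIds():
    info = tracker.getStageInfo(stage_id)
    if info is not None:
        print(
            f"stage {info.stageId}: "
            f"{info.numCompletedTasks}/{info.numTasks} tasks done, "
            f"{info.numActiveTasks} active, {info.numFailedTasks} failed"
        )
```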

pop_smoke
by Visitor
  • 126 Views
  • 8 replies
  • 6 kudos

Resolved! Write file in CSV format

Is there any simple PySpark syntax to write data in CSV format into a file, or anywhere, in the Free Edition of Databricks? In Community Edition it was so easy.
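For what it's worth, a minimal sketch of the usual syntax, assuming you write to a Unity Catalog volume in Free Edition; the /Volumes path below is a placeholder:

```python
# Small example DataFrame, then write it out as CSV with a header row.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

(df.coalesce(1)                      # optional: produce a single output file
   .write.mode("overwrite")
   .option("header", True)
   .csv("/Volumes/my_catalog/my_schema/my_volume/output_csv"))
```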

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 6 kudos

@pop_smoke no worries! My background is with Alteryx (an ETL tool). I too am learning Databricks. I look forward to seeing you in the forum ☺️. Please share any cool things you find or any projects you do. All the best, BS

7 More Replies
pop_smoke
by Visitor
  • 72 Views
  • 3 replies
  • 4 kudos

Resolved! Switching to Databricks from Ab Initio (an old ETL tool) - NEED ADVICE

All courses on the market and on YouTube, as far as I know, are outdated for Databricks because they cover the Community Edition; there is no new course for the Free Edition of Databricks. I am a working professional and do not get much time. Do you guys kno...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 4 kudos

@pop_smoke keep your eyes out for this as well. I just saw this on LinkedIn: https://www.linkedin.com/posts/databricks_join-the-databricks-virtual-learning-festival-activity-7370143251149996032-PmjH?utm_source=share&utm_medium=member_desktop&rcm=ACoAAB_...

2 More Replies
PabloCSD
by Valued Contributor II
  • 77 Views
  • 1 reply
  • 0 kudos

How to configure a Job-Compute for Unity Catalog Access? (Q/A)

If you need to access tables that are in a volume of Unity Catalog (UC), the following configuration will work: targets: dev: mode: development default: true workspace: host: https://<workspace>.azuredatabricks.net/ run_as...

Latest Reply
Khaja_Zaffer
Contributor
  • 0 kudos

Hello @PabloCSD, good day! Are you asking a question, or what are your expectations? In addition to this: you cannot create or register tables (managed or external) with locations pointing to volumes, as this is explicitly not supported—tables must use tabular st...

Espenol1
by New Contributor II
  • 10850 Views
  • 5 replies
  • 2 kudos

Resolved! Using managed identities to access SQL server - how?

Hello! My company wants us to only use managed identities for authentication. We have set up Databricks using Terraform, got Unity Catalog and everything, but we're a very small team and I'm struggling to control permissions outside of Unity Catalog....

Latest Reply
vr
Contributor III
  • 2 kudos

As of today, you can use https://learn.microsoft.com/en-us/azure/databricks/connect/unity-catalog/cloud-services/service-credentials

4 More Replies
Khaja_Zaffer
by Contributor
  • 197 Views
  • 9 replies
  • 3 kudos

CONTAINER_LAUNCH_FAILURE

Hello everyone! I need some help; I am unable to get a cluster up and running. I did try creating classic compute but it fails. Is there any limit to using Databricks Community Edition? Error here: { "reason": { "code": "CONTAINER_LAUNCH_FAILURE", "type...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 3 kudos

@Khaja_Zaffer, if your current e-mail address is preventing you from leveraging the Free Edition, can't you just use a different e-mail address for the Free Edition? @Advika's advice is the better route to go through ☺️. All the best, BS

8 More Replies
santhiya
by New Contributor
  • 896 Views
  • 2 replies
  • 0 kudos

CPU usage and idle time metrics from system tables

I need to get my compute metrics, not from the UI... The system tables don't have much information, and node_timeline has per-minute record metrics, so it's difficult to calculate each compute's CPU usage per day. Any way we can get the CPU usage, CPU idle time, M...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

To calculate CPU usage, CPU idle time, and memory usage per cluster per day, you can use the system.compute.node_timeline system table. However, since the data in this table is recorded at per-minute granularity, it’s necessary to aggregate the data ...
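A rough sketch of that aggregation, run from a notebook. The column names (cpu_user_percent, cpu_system_percent, cpu_wait_percent, mem_used_percent) are assumptions based on the node_timeline schema as I understand it, and "idle" is approximated as the remainder, so verify them against your workspace before relying on this:

```python
# Roll per-minute node metrics up to a per-cluster, per-day summary.
# Column names below are assumptions; check system.compute.node_timeline
# in your workspace for the exact schema.
daily_cpu = spark.sql("""
    SELECT
        cluster_id,
        DATE(start_time)                                  AS usage_date,
        AVG(cpu_user_percent + cpu_system_percent)        AS avg_cpu_busy_pct,
        AVG(100 - cpu_user_percent - cpu_system_percent
                - cpu_wait_percent)                       AS approx_cpu_idle_pct,
        AVG(mem_used_percent)                             AS avg_mem_used_pct
    FROM system.compute.node_timeline
    WHERE start_time >= current_timestamp() - INTERVAL 7 DAYS
    GROUP BY cluster_id, DATE(start_time)
    ORDER BY usage_date, cluster_id
""")
display(daily_cpu)
```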

1 More Replies
fly_high_five
by New Contributor
  • 257 Views
  • 5 replies
  • 1 kudos

Resolved! Unable to retrieve all rows of delta table using SQL endpoint of Interactive Cluster

Hi, I am trying to query a table using the JDBC endpoint of an interactive cluster. I am connected to the JDBC endpoint using DBeaver. When I export a small subset of data (2000-8000 rows), it works fine and exports the data. However, when I try to export all rows ...

Latest Reply
WiliamRosa
New Contributor II
  • 1 kudos

Hi @fly_high_five, I found these references about this situation; see if they help you: increase the SocketTimeout in JDBC (Databricks KB "Best practices when using JDBC with Databricks SQL" – https://kb.databricks.com/dbsql/job-timeout-when-connectin...

4 More Replies
fly_high_five
by New Contributor
  • 286 Views
  • 4 replies
  • 1 kudos

Resolved! Exposing Data for Consumers in non-UC ADB

Hi, I want to expose data to consumers from our non-UC ADB. Consumers would be consuming data mainly using a SQL client like DBeaver. I tried the SQL endpoint of an interactive cluster and connected via DBeaver; however, when I try to fetch/export all rows of t...

Latest Reply
fly_high_five
New Contributor
  • 1 kudos

Hi @szymon_dybczak, I am using the latest JDBC driver, 2.7.3 (https://www.databricks.com/spark/jdbc-drivers-archive), and my JDBC URL comes from the JDBC endpoint of the interactive cluster: jdbc:databricks://adb-{workspace_id}.azuredatabricks.net:443/default;transport...

3 More Replies
help_needed_445
by Contributor
  • 443 Views
  • 2 replies
  • 2 kudos

Table Fields Have a Different Value and Data Type in SQL Editor vs a SQL Notebook Cell

When I query a numeric field in the SQL Editor it returns a value of 0.02875 and the data type is decimal but when I run the same query in a SQL notebook cell it returns 0.0287500 and decimal(7,7). I'm assuming this is expected behavior but is there ...

Latest Reply
Khaja_Zaffer
Contributor
  • 2 kudos

Hello @help_needed_445, good day! It is indeed a very interesting case! I found the below from LLM models. Yes, this difference in decimal display between the Databricks SQL Editor (which uses the Photon engine in Databricks SQL) and notebooks (which use ...
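A quick way to see that the stored value is the same and only the rendering differs: a DECIMAL(7,7) value is shown from a notebook cell with all seven fractional digits, while the SQL Editor trims trailing zeros in its display (as described in the question above).

```python
# Casting to DECIMAL(7,7) keeps scale 7, so the notebook renders 0.0287500.
spark.sql("SELECT CAST(0.02875 AS DECIMAL(7,7)) AS v").show()
# +---------+
# |        v|
# +---------+
# |0.0287500|
# +---------+
```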

1 More Replies
liu
by New Contributor III
  • 58 Views
  • 1 reply
  • 1 kudos

Can the default serverless cluster of Databricks install Scala packages?

Can the default serverless cluster of Databricks install Scala packages? I need to use the spark-sftp package, but it seems that serverless is different from all-purpose compute, and I can only install Python packages? There is another question: I can use p...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

No Scala; you can't even run Scala notebooks. About the SFTP: serverless compute is way more limited than general-purpose clusters. What folder can't be found? DBFS or S3?

kmodelew
by New Contributor III
  • 400 Views
  • 10 replies
  • 21 kudos

Unable to read Excel file from Volume

Hi, I'm trying to read an Excel file directly from a Volume (not workspace or FileStore) -> all examples on the internet use workspace or FileStore. The Volume is an external location, so I can read from there, but I would like to read directly from the Volume. I hav...
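For context, a minimal sketch of one common workaround, assuming the pandas/openpyxl route and a placeholder volume path (the spark-excel Maven library is the other option if your cluster allows installing JVM packages):

```python
# Read an Excel file directly from a Unity Catalog volume via pandas,
# then convert to a Spark DataFrame. Requires openpyxl, e.g.:
#   %pip install openpyxl
import pandas as pd

# Placeholder path: /Volumes/<catalog>/<schema>/<volume>/<file>.xlsx
pdf = pd.read_excel(
    "/Volumes/my_catalog/my_schema/my_volume/employees.xlsx",
    sheet_name=0,
    engine="openpyxl",
)

df = spark.createDataFrame(pdf)
display(df)
```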

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 21 kudos

@ck7007 thanks for the update. Absolutely love that you've tested the solution too! Big props. As you mention, if we keep the community accurate, it'll mean that when someone else searches for the thread, they don't end up using an incorrect solutio...

9 More Replies
jfvizoso
by New Contributor II
  • 11720 Views
  • 5 replies
  • 0 kudos

Can I pass parameters to a Delta Live Table pipeline at running time?

I need to execute a DLT pipeline from a Job, and I would like to know if there is any way of passing a parameter. I know you can have settings in the pipeline that you use in the DLT notebook, but it seems you can only assign values to them when crea...
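For reference, the workaround usually suggested is to put key/value pairs in the pipeline's configuration (the Configuration section of the pipeline settings, or the configuration block in the pipeline JSON) and read them inside the DLT notebook with spark.conf.get(); values still have to be changed by updating the pipeline settings (for example via the Pipelines API) rather than per job run. The key names below are placeholders:

```python
# Inside the DLT notebook: read values from the pipeline configuration.
# "source_path" and "table_name" are placeholder keys you would define
# under the pipeline's configuration settings.
import dlt

source_path = spark.conf.get("source_path")              # e.g. a landing path
table_name = spark.conf.get("table_name", "default_table")

@dlt.table(name=table_name)
def ingest():
    return (spark.readStream
                 .format("cloudFiles")
                 .option("cloudFiles.format", "json")
                 .load(source_path))
```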

Latest Reply
DeepakAI
New Contributor
  • 0 kudos

Team - is any workaround possible? I have 100+ tables which need to be ingested incrementally. I created a single DLT notebook which I am using inside a pipeline as a task; this pipeline is triggered via a job on a file-arrival event. I want to utilize the same...

4 More Replies
Worrachon
by New Contributor
  • 80 Views
  • 1 reply
  • 0 kudos

Databricks cannot run pipeline

I found that when I run the pipeline, it shows the message "'Cannot run pipeline', 'PL_TRNF_CRM_SALESFORCE_TO_BLOB', "HTTPSConnectionPool(host='management.azure.com', port=443)". It doesn't happen on every instance, but I encounter this case often.

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

What exactly does the pipeline do? Fetch data from a source system? I also see Data Factory as a component?

