Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

JeroenD
by New Contributor
  • 1038 Views
  • 1 replies
  • 0 kudos

Waiting list

I would like to do the Platform Administrator learning plan, but every component in the plan is marked "in waiting list". What does this mean?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 0 kudos
asif5494
by New Contributor III
  • 1343 Views
  • 1 replies
  • 3 kudos

Study material for Databricks Certified Data Engineer Professional Certification?

I want to go for the Databricks Certified Data Engineer Professional certification. Is there any predefined study material for it?

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 3 kudos
prasadvaze
by Valued Contributor II
  • 19889 Views
  • 1 replies
  • 1 kudos

How to start a local/city Databricks user group?

Hello Lindsey, I would like to start a Richmond, VA Databricks user group (chapter). How do I go about doing this?

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

adding @Vidula Khanna​ and @Kaniz Fatma​ for visibility

  • 1 kudos
Ogi
by New Contributor II
  • 1383 Views
  • 4 replies
  • 1 kudos

Setting right processingTime

How do I set just the right processingTime for readStream to maximize performance? Which factors does it depend on, and is there a way to measure this?

Latest Reply
Ogi
New Contributor II
  • 1 kudos

Thanks @Ajay Pandey and @Nandini N for your answers. I wanted to know more about what I should do to get this right. Should I change processing times (1, 5, 10, 30, 60 seconds) and see how it affects the running job in terms of time and CPU/me...

  • 1 kudos
3 More Replies
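
One way to put numbers on the experiment described above is to compare how long each micro-batch takes against the trigger interval being tested. The sketch below is plain Python and only assumes that progress events are dicts with `numInputRows` and `durationMs.triggerExecution`, as entries of PySpark's `query.recentProgress` usually are; the helper name `summarize_progress` and the sample values are made up for illustration. In PySpark the interval itself is set on the write side, for example `df.writeStream.trigger(processingTime="10 seconds")`.

```python
from statistics import mean


def summarize_progress(progress_events, trigger_interval_ms):
    """Summarize micro-batch timings from StreamingQueryProgress-style dicts."""
    durations = [
        event["durationMs"]["triggerExecution"]
        for event in progress_events
        if event.get("numInputRows", 0) > 0  # ignore batches that read no rows
    ]
    if not durations:
        return None
    avg_ms = mean(durations)
    max_ms = max(durations)
    return {
        "batches": len(durations),
        "avg_batch_ms": avg_ms,
        "max_batch_ms": max_ms,
        # A ratio above 1.0 means batches take longer than the interval on average
        "utilization": avg_ms / trigger_interval_ms,
        "falling_behind": max_ms > trigger_interval_ms,
    }


# Hypothetical samples shaped like entries of query.recentProgress
sample = [
    {"numInputRows": 1000, "durationMs": {"triggerExecution": 4000}},
    {"numInputRows": 0, "durationMs": {"triggerExecution": 50}},
    {"numInputRows": 1200, "durationMs": {"triggerExecution": 6000}},
]
summary = summarize_progress(sample, trigger_interval_ms=10_000)
print(summary)
```

Running the same job with each candidate interval and comparing these summaries (together with cluster CPU and memory metrics) gives a rough basis for choosing; it is not a substitute for testing under real load.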
AdamRink
by New Contributor III
  • 1856 Views
  • 1 replies
  • 0 kudos

Apply Avro defaults when writing to Confluent Kafka

I have an Avro schema for my Kafka topic, and the schema defines defaults. I would like to leave the defaulted columns out on the Databricks side and just let them default to an empty array. Sample Avro, trying not to provide the UserFields because I can't...

Anonymous
by Not applicable
  • 1109 Views
  • 1 replies
  • 4 kudos

Hello Everyone, I am thrilled to announce that we have our first winner for the raffle contest - @Uma Maheswara Rao Desula​ Please join me in congratu...

Hello Everyone, I am thrilled to announce that we have our first winner for the raffle contest - @Uma Maheswara Rao Desula. Please join me in congratulating him on this remarkable achievement! UmaMahesh, your dedication and hard work have paid off, and...

Latest Reply
Sujitha
Databricks Employee
  • 4 kudos

@Uma Maheswara Rao Desula Congratulations on this well-deserved win!! Can't wait for you to meet our Community peers at the Data + AI Summit 2023 in SFO.

  • 4 kudos
rohit8491
by New Contributor III
  • 4788 Views
  • 3 replies
  • 8 kudos

Azure Databricks Connectivity with Power BI Cloud - Firewall Whitelisting

Hi Support Team, we want to connect to tables in Azure Databricks via Power BI. We are able to connect via Power BI Desktop, but when we try to publish the same, the associated dataset does not refresh and throws an error from Powerbi.com. It...

Latest Reply
rohit8491
New Contributor III
  • 8 kudos

Hi Noor, thank you so much for your response. Please see the details below for the error message. I just learned that Power BI and Azure Databricks are in different tenants. Do you think that causes any issues? Do we need VNet peering to be configur...

  • 8 kudos
2 More Replies
keenan_jones7
by New Contributor II
  • 10819 Views
  • 2 replies
  • 5 kudos

Cannot create job through Jobs API

import requests
import json

instance_id = 'abcd.azuredatabricks.net'
api_version = '/api/2.0'
api_command = '/jobs/create'
url = f"https://{instance_id}{api_version}{api_command}"
headers = {'Authorization': 'Bearer myToken'}
params = { "settings...

Latest Reply
rAlex
New Contributor III
  • 5 kudos

@keenan_jones7 I had the same problem today. It looks like you've copied and pasted the JSON that Databricks displays in the GUI when you select View JSON from the dropdown menu when viewing a job. In order to use that JSON in a request to the Jobs ...

  • 5 kudos
1 More Replies
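
The reply above is cut off, so the following is only one reading of it: the JSON copied from View JSON wraps the job definition in a `settings` key, while `jobs/create` is understood to expect those fields at the top level of the request body. A minimal sketch under that assumption (the helper name `to_create_payload` and the sample job are hypothetical):

```python
import json


def to_create_payload(view_json):
    """Return a body shaped for jobs/create from JSON copied out of the UI."""
    # Assumption: the UI export nests the definition under "settings",
    # next to read-only fields such as "job_id".
    if isinstance(view_json, dict) and "settings" in view_json:
        return dict(view_json["settings"])
    return dict(view_json)


# Hypothetical example of what View JSON might show
copied = {
    "job_id": 123,
    "settings": {"name": "example-job", "max_concurrent_runs": 1},
}
payload = to_create_payload(copied)
print(json.dumps(payload, indent=2))

# Sending it, with url and headers as defined in the question:
# requests.post(url, headers=headers, json=payload)
```

Passing the body with `json=` sends it as the JSON request body, which is what a POST endpoint normally expects.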
adrianlwn
by New Contributor III
  • 12019 Views
  • 14 replies
  • 16 kudos

How to activate ignoreChanges in Delta Live Table read_stream?

Hello everyone, I'm using DLT (Delta Live Tables) and I've implemented some Change Data Capture for deduplication purposes. Now I am creating a downstream table that will read the DLT as a stream (dlt.read_stream("<tablename>")). I keep receiving thi...

Latest Reply
gopínath
New Contributor II
  • 16 kudos

In DLT read_stream, we can't use ignoreChanges / ignoreDeletes. These configs help avoid the failures, but they actually ignore the operations done upstream, so you need to manually perform the deletes or updates in the downstrea...

  • 16 kudos
13 More Replies
Colter
by New Contributor II
  • 2068 Views
  • 3 replies
  • 0 kudos

Is there a way to use cluster policies within the Jobs API to define cluster configuration rather than in the Jobs API itself?

I want to create a cluster policy that is referenced by most of our repos/jobs so we have one place to update whenever there is a spark version change or when we need to add additional spark configurations. I figured cluster policies might be a good ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Colter Nattrass, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

  • 0 kudos
2 More Replies
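
For reference, a rough sketch of what pointing a job cluster at a policy could look like. It assumes the Jobs API 2.1 task layout and the `policy_id` and `apply_policy_default_values` fields of the cluster spec; the function name, task key, notebook path, and policy ID are placeholders. Whether omitted fields such as the Spark version really get filled in depends on how the policy is defined, so this should be checked against your workspace.

```python
def job_with_policy(name, notebook_path, policy_id, num_workers=2):
    """Build a jobs/create body whose cluster defers to a cluster policy."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "main",  # placeholder task key
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    # The policy is expected to supply spark_version, spark_conf, etc.
                    "policy_id": policy_id,
                    "apply_policy_default_values": True,
                    "num_workers": num_workers,
                },
            }
        ],
    }


# Hypothetical values for illustration only
body = job_with_policy("nightly-etl", "/Repos/team/etl/main", "ABC123")
print(body["tasks"][0]["new_cluster"])
```

The idea is that each repo only carries the policy ID, and a Spark version bump happens once in the policy.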
tototox
by New Contributor III
  • 2707 Views
  • 3 replies
  • 2 kudos

dbutils.fs.ls overlaps with managed storage error

I created a schema with this path as its managed location (abfss://~~@~~.dfs.core.windows.net/dejeong/). However, I dropped the schema with the CASCADE option, and also went into the Azure portal and deleted the path directly, and then created it again (abfss://~~@~~....

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @jin park, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your...

  • 2 kudos
2 More Replies
Dean_Lovelace
by New Contributor III
  • 2508 Views
  • 3 replies
  • 4 kudos

What is the PySpark equivalent of FSCK REPAIR TABLE?

I am using the delta format and occasionally get the following error: "xx.parquet referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement". FS...

Latest Reply
shan_chandra
Databricks Employee
  • 4 kudos

## Delta check when a file was added
%scala
(oldest-version-available to newest-version-available).map { version =>
  var df = spark.read.json(f"<delta-table-location>/_delta_log/$version%020d.json").where("add is not null").select("add.path")
  var ...

  • 4 kudos
2 More Replies
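
The same idea as the Scala snippet above, scanning the commit files for `add` actions, can be sketched in plain Python for a log directory that is reachable as a local path. This is not the truncated original translated line by line; the function name `versions_adding_file` and the demo data are illustrative, and cloud storage paths would need a different file-listing mechanism.

```python
import json
import os
import tempfile


def versions_adding_file(log_dir, target_path):
    """Return commit versions whose JSON log has an add action for target_path."""
    versions = []
    for name in sorted(os.listdir(log_dir)):
        stem, ext = os.path.splitext(name)
        # Commit files are named with a zero-padded 20-digit version number
        if ext != ".json" or not stem.isdigit():
            continue
        with open(os.path.join(log_dir, name)) as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                add = json.loads(line).get("add")
                if add and add.get("path") == target_path:
                    versions.append(int(stem))
                    break
    return versions


# Self-contained demo on a fake _delta_log directory
with tempfile.TemporaryDirectory() as log_dir:
    commits = {
        0: [{"add": {"path": "part-0000.parquet"}}],
        1: [
            {"add": {"path": "part-0001.parquet"}},
            {"remove": {"path": "part-0000.parquet"}},
        ],
    }
    for version, actions in commits.items():
        file_name = f"{version:020d}.json"
        with open(os.path.join(log_dir, file_name), "w") as handle:
            handle.write("\n".join(json.dumps(action) for action in actions))
    found = versions_adding_file(log_dir, "part-0001.parquet")
    print(found)
```

Knowing which version added a missing file helps decide whether a repair or a restore to an earlier version is the safer route.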
Dean_Lovelace
by New Contributor III
  • 4309 Views
  • 3 replies
  • 0 kudos

Delta Table Optimize Error

I have started getting an error message when running the following optimize command: deltaTable.optimize().executeCompaction() Error: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Number of records changed after Optimi...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Dean Lovelace: The error message suggests that the number of records in the Delta table changed after the optimize() command was run. The optimize() command is used to improve the performance of Delta tables by removing small files and compacting l...

  • 0 kudos
2 More Replies
haraldh
by New Contributor II
  • 1486 Views
  • 1 replies
  • 2 kudos

Databricks JDBC driver connection pooling support

When using Camel JDBC with Databricks JDBC driver I get an error: Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature. Is there any means to work around this limitation?

Latest Reply
swethaNandan
Databricks Employee
  • 2 kudos

Tools like SDI can connect to a generic JDBC source such as Databricks SQL Warehouse via the SDI Camel JDBC adapter. Can you see if these will help you: https://help.sap.com/docs/HANA_SMART_DATA_INTEGRATION/7952ef28a6914997abc01745fef1b607/1247c9518...

  • 2 kudos
System1999
by New Contributor III
  • 5086 Views
  • 7 replies
  • 0 kudos

My 'Data' menu item shows 'No Options' for Databases. How can I fix?

Hi, I'm new to Databricks and I've signed up for the Community edition. First, I've noticed that I cannot return to a previously created cluster, as I get the message telling me that restarting a cluster is not available to me. Ok, inconvenient, but I...

Latest Reply
System1999
New Contributor III
  • 0 kudos

Hi @Suteja Kanuri, I get the error message under Data before I've created a cluster. Then I still get it when I've created a cluster and a notebook (having attached the notebook to the cluster). Thanks.

  • 0 kudos
6 More Replies
