Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Nickje56
by New Contributor
  • 4851 Views
  • 1 replies
  • 1 kudos

Resolved! _sqldf not defined

The May 2022 release notes say that we can now explore SQL results from Python in a Python notebook. (See also the documentation: Use notebooks - Azure Databricks | Microsoft Docs.) So I created a simple query (select * from ...

Latest Reply
User16753725469
Contributor II
  • 1 kudos

This feature was delayed and will be rolled out over Databricks platform releases 3.74 through 3.76. You can check the release notes for more info: https://docs.databricks.com/release-notes/product/2022/may.html

Confused
by New Contributor III
  • 8923 Views
  • 7 replies
  • 2 kudos

Schema evolution issue

Hi All, I am loading some data using Auto Loader but am having trouble with schema evolution. A new column has been added to the data I am loading and I am getting the following error: StreamingQueryException: Encountered unknown field(s) during parsing:...

Latest Reply
rgrosskopf
New Contributor II
  • 2 kudos

I agree that hints are the way to go if you have the schema available, but the whole point of schema evolution is that you might not always know the schema in advance. I received a similar error with a similar streaming query configuration. The issue w...
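For readers hitting the same error, here is a minimal sketch of an Auto Loader configuration with schema evolution enabled. It is not the configuration from this thread (which is not shown); the function names and all paths are hypothetical. As far as I understand the documented behavior, in `addNewColumns` mode the stream is expected to stop with an unknown-field error the first time a new column appears, and it picks up the evolved schema when it is restarted.

```python
def autoloader_options(schema_location, file_format="json", schema_hints=None):
    """Build the Auto Loader reader options used in this sketch."""
    options = {
        "cloudFiles.format": file_format,
        # Auto Loader stores the inferred schema and its later versions here.
        "cloudFiles.schemaLocation": schema_location,
        # New columns stop the stream once, then are added to the schema.
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }
    if schema_hints:
        # Optional: pin the types of columns you already know about.
        options["cloudFiles.schemaHints"] = schema_hints
    return options


def start_stream(spark, source_path, target_path, checkpoint_path, options):
    """Wire the options into a stream; needs a Databricks Spark session."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in options.items():
        reader = reader.option(key, value)
    return (
        reader.load(source_path)
        .writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        .option("mergeSchema", "true")  # let the Delta sink accept new columns
        .start(target_path)
    )
```

Running the stream as a job with automatic retries is one way to handle the restart without manual intervention.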

6 More Replies
vk217
by Contributor
  • 2111 Views
  • 2 replies
  • 3 kudos

Resolved! Generic user account and personal access token to Azure Databricks

Is there a way to create a generic user account and personal access token to connect to Databricks? I have an Azure build pipeline and a VS Code test that use my personal access token for running builds and tests.

Latest Reply
Gabriel0007
New Contributor III
  • 3 kudos

You can create a service principal (the equivalent of a service account) for jobs, applications, etc. Here's a link to the docs: https://docs.databricks.com/administration-guide/users-groups/service-principals.html

1 More Replies
Tahseen0354
by Valued Contributor
  • 2289 Views
  • 4 replies
  • 2 kudos

Why does setting up audit log delivery in Databricks on GCP fail?

I am trying to set up audit log delivery in Google Cloud. I have followed this page https://docs.gcp.databricks.com/administration-guide/account-settings-gcp/log-delivery.html and have added log-delivery@databricks-prod-master.iam.gserviceaccount.co...

Latest Reply
Prabakar
Databricks Employee
  • 2 kudos

I would suggest contacting your Databricks account representative about this. They can check whether something went wrong with your workspace subscription.

3 More Replies
Gabriel0007
by New Contributor III
  • 1515 Views
  • 2 replies
  • 2 kudos

How do I process each new record when using Auto Loader?

For instance, I'm ingesting webhook data into a Delta table with Auto Loader and need to run a process for each new record as it arrives.

Latest Reply
AmanSehgal
Honored Contributor III
  • 2 kudos

With Auto Loader, you can keep something like a changelog that records the operations performed in each micro-batch, such as the affected id, the operation type (I/U/D), and a timestamp. Then you can make use of this changelog table, and run subsequent processes for each row aff...
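A related option, different from the changelog table described above, is Structured Streaming's `foreachBatch`, which hands each micro-batch to a Python function where every new record can be processed. The sketch below assumes JSON input and small batches; `handle_record` and all paths are hypothetical placeholders.

```python
def handle_record(record):
    """Hypothetical per-record step; replace with the real process."""
    return {"id": record.get("id"), "status": "processed"}


def process_batch(batch_df, batch_id):
    """Called by Structured Streaming once per micro-batch."""
    results = []
    # collect() pulls the batch to the driver, so this suits small batches only.
    for row in batch_df.collect():
        results.append(handle_record(row.asDict()))
    return results


def start_stream(spark, source_path, schema_path, checkpoint_path):
    """Attach process_batch to an Auto Loader stream (needs Databricks)."""
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", schema_path)
        .load(source_path)
        .writeStream.foreachBatch(process_batch)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Spark ignores the value returned by `process_batch`; it is returned here only so the function is easy to check outside a stream. For larger batches, a DataFrame operation inside `process_batch` would scale better than `collect()`.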

1 More Replies
ishantjain194
by New Contributor II
  • 1787 Views
  • 2 replies
  • 3 kudos

AWS OR AZURE OR GCLOUD??

I want to know which cloud is better to learn and which cloud's services offer more career opportunities.

Latest Reply
Cedric
Databricks Employee
  • 3 kudos

In addition to @Kaniz Fatma's great comparison article: cloud skills are generally transferable across providers. The concepts are the same, just with different names (e.g. EC2 / Azure VM / Google Compute Engine). Learning cloud in general is a good ...

1 More Replies
Cassio
by New Contributor II
  • 3348 Views
  • 4 replies
  • 3 kudos

Resolved! "SparkSecurityException: Cannot read sensitive key" error when reading key from Spark config

In Databricks 10.1 it is possible to define in the "Spark Config" of the cluster something like: spark.fernet {{secrets/myscope/encryption-key}}. In my case my scopes are tied to Azure Key Vault. With that I can make a query as follows: %sql SELECT d...

Latest Reply
Soma
Valued Contributor
  • 3 kudos

This solution exposes the entire secret if I use a command like the one below:
sql("""explain select upper("${spark.fernet.email}") as data """).display()
Please don't use this.

3 More Replies
754424
by New Contributor
  • 1453 Views
  • 3 replies
  • 2 kudos

Firefox only - copying from notebook table output copies cell contents instead

Copying from notebook table output copies the cell contents instead. This happens only in Firefox and Firefox-based browsers.

Latest Reply
User16741082858
Contributor III
  • 2 kudos

Hi @Jim Kutter​, I have gone ahead and put in a ticket for you regarding this. Your Databricks representative will be in touch with you regarding the status. Thank you for your patience!

2 More Replies
aschiff
by Contributor II
  • 29788 Views
  • 24 replies
  • 4 kudos

Resolved! Extracting data from a multi-layered JSON object

I have a table in Databricks called owner_final_delta with a column called contacts that holds data with this structure: array<struct<address:struct<apartment:string,city:string,house:string,poBox:string,sources:array<string>,state:string,street:strin...

Latest Reply
Dooley
Valued Contributor II
  • 4 kudos

Have you tried using the explode function on the column that holds the array?
df.select(explode(df.emailId).alias("email")).show()
Also, if you are a SQL lover, you can instead use the Databricks syntax for querying JSON seen here.
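To make the effect of explode concrete, here is a plain-Python sketch (not Spark code) of the row-level behavior: one output row per array element, and no output for rows whose array is empty or missing. The field names `owner` and `emails` are made up for illustration and do not come from the table in the question.

```python
def explode_rows(rows, array_key, alias):
    """Return one output row per element of each row's array_key list."""
    exploded = []
    for row in rows:
        # Like Spark's explode, a missing or empty array yields no rows.
        for element in row.get(array_key) or []:
            new_row = {k: v for k, v in row.items() if k != array_key}
            new_row[alias] = element
            exploded.append(new_row)
    return exploded


owners = [
    {"owner": "a", "emails": ["a1@example.com", "a2@example.com"]},
    {"owner": "b", "emails": []},
]
flat = explode_rows(owners, "emails", "email")
```

This mirrors what `df.select("owner", explode("emails").alias("email"))` would return in PySpark. If rows with empty arrays must be kept, Spark's `explode_outer` is the variant to look at.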

23 More Replies
StackP
by New Contributor
  • 2101 Views
  • 1 replies
  • 0 kudos

How to add a unique consecutive ID to a Delta Lake table

In Databricks I have an existing Delta table to which I want to add one more column, Id, so that each row has a unique, consecutive ID number (like a primary key in SQL). So far I have tried converting the Delta table to a PySpark DataFrame and...

Latest Reply
Sandeep
Contributor III
  • 0 kudos

How about defining an identity column as below?
GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( [ START WITH start ] [ INCREMENT BY step ] ) ]
https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html#parameters
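As a rough sketch of how that clause fits into a full statement, the helper below builds a CREATE TABLE statement with an identity column. The helper, the table name, and the column names are hypothetical. One caveat, as far as I know from the documentation: identity values are unique and increasing, but they are not guaranteed to be consecutive.

```python
def identity_table_ddl(table_name, columns, id_column="id", start=1, step=1):
    """Return CREATE TABLE DDL with a generated identity column."""
    column_defs = [
        f"{id_column} BIGINT GENERATED ALWAYS AS IDENTITY "
        f"(START WITH {start} INCREMENT BY {step})"
    ]
    # columns is a list of (name, SQL type) pairs for the remaining columns.
    column_defs += [f"{name} {sql_type}" for name, sql_type in columns]
    return f"CREATE TABLE {table_name} ({', '.join(column_defs)}) USING DELTA"


ddl = identity_table_ddl("events_with_id", [("payload", "STRING")])
# In a notebook: spark.sql(ddl), then copy rows over with
# INSERT INTO events_with_id (payload) SELECT payload FROM <existing table>.
```

Because the column is GENERATED ALWAYS, the insert lists only the other columns and the ID is filled in automatically.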

BradSheridan
by Valued Contributor
  • 1565 Views
  • 2 replies
  • 1 kudos

Resolved! Add an Instance Profile to a DLT job cluster

@Tomasz Bacewicz​ I've got another, related question for you about the job cluster that is spun up for DLT jobs. Adding the JSON strings for our required E2 tags worked like a charm, but now I need to attach an existing Instance Profile since I'm tr...

Latest Reply
tomasz
Databricks Employee
  • 1 kudos

@Brad Sheridan To do that you have to add the aws_attributes field within a cluster configuration, and there you can add an instance_profile_arn like so: "clusters": [ { "label": "default", "aws_attributes": { ...
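Since the snippet above is cut off, here is a minimal sketch of what the complete fragment might look like, built as a Python dict and printed as JSON. Only the `clusters`, `label`, `aws_attributes`, and `instance_profile_arn` keys come from the reply; the helper name and the ARN are placeholders, and any other cluster settings (such as the tags mentioned in the question) are left out.

```python
import json


def dlt_cluster_settings(instance_profile_arn, label="default"):
    """Return the 'clusters' fragment of DLT pipeline settings as a dict."""
    return {
        "clusters": [
            {
                "label": label,
                "aws_attributes": {"instance_profile_arn": instance_profile_arn},
            }
        ]
    }


# Placeholder ARN for illustration only; substitute your own.
settings = dlt_cluster_settings(
    "arn:aws:iam::123456789012:instance-profile/example-profile"
)
print(json.dumps(settings, indent=2))
```

The printed JSON can be merged into the pipeline's settings alongside the existing cluster entries.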

1 More Replies
junaid
by New Contributor II
  • 7730 Views
  • 0 replies
  • 1 kudos

We are seeing "BOOTSTRAP_TIMEOUT" issue in a new workspace.

When attempting to deploy/start a Databricks cluster on AWS through the UI, the following error consistently occurs: Bootstrap Timeout: [id: InstanceId(i-093caac78cdbfa7e1), status: INSTANCE_INITIALIZING, workerEnvId:WorkerEnvId(workerenv-335698072713...

mehdi1
by New Contributor III
  • 9294 Views
  • 8 replies
  • 12 kudos

Resolved! How to programmatically create a widget?

I know the dbutils.widgets.text call to create a widget in a notebook. So for me the workflow is: 1. Have a notebook. 2. Use dbutils.widgets.text (or another type of widget) once in a notebook cell to create a widget. 3. Remove the cell containing dbutils.widget...

Latest Reply
RachelGomez123
New Contributor II
  • 12 kudos

@Mehdi BEN ABDESSELEM, Steps for Creating a Basic Widget. Step 1: Create a New Project. To create a new project in Android Studio, please refer to How to Create/Start a New Project in Android Studio. We are implementing it for both Java and Kotlin lang...

7 More Replies
Sam
by New Contributor III
  • 4311 Views
  • 3 replies
  • 6 kudos

Resolved! QuantileDiscretizer not respecting NumBuckets

I have set numBuckets and numBucketsArray for a group of columns to bin them into 5 buckets. Unfortunately the number of buckets does not seem to be respected across all columns, even though there is variation within them. I have tried setting the relat...

Latest Reply
Sam
New Contributor III
  • 6 kudos

Thank you. What I did was: apply QuantileDiscretizer to the non-zero values and specify a very small value (bottom 1%) to capture the lower bucket, including zeroes. That fixed the issue! You can define your own splits, which would work as well, but the splits thems...
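The likely reason the zeros mattered can be shown with a small plain-Python sketch. It is an illustration only, not Spark's implementation (QuantileDiscretizer uses approximate quantiles), and the sample data is made up. When most values in a column are identical, several quantile cut points land on the same value, and once repeated cut points are merged there are fewer buckets than requested.

```python
import statistics


def quantile_splits(values, num_buckets):
    """Return distinct interior split points for num_buckets quantile bins."""
    cuts = statistics.quantiles(values, n=num_buckets, method="inclusive")
    # Repeated cut points describe empty buckets, so keep each value once.
    return sorted(set(cuts))


skewed = [0] * 80 + list(range(1, 21))  # 80% zeros, then 1..20
spread = list(range(100))               # no repeated values

skewed_buckets = len(quantile_splits(skewed, 5)) + 1
spread_buckets = len(quantile_splits(spread, 5)) + 1
```

With this sample data the evenly spread column keeps all 5 buckets, while the zero-heavy column collapses to 3. Binning the non-zero values separately, as described above, avoids the repeated cut points.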

2 More Replies
