Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

lawrence009
by Contributor
  • 1153 Views
  • 2 replies
  • 3 kudos

Advice on efficiently cleansing and transforming delta table

I have a delta table that is being updated nightly using Auto Loader. After the merge, the job kicks off a second notebook to clean and rewrite certain values using a series of UPDATE statements, e.g., UPDATE TABLE foo SET field1 = some_value WHER...

Latest Reply
Jfoxyyc
Valued Contributor
  • 3 kudos

I would partition the table by some sort of date that autoloader can use. You could then filter your update further and it'll automatically use partition pruning and only scan related files.
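
A rough sketch of that approach, with hypothetical names (an ingest_date partition column and a foo_partitioned table) that are not from the original post:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical one-time rewrite: partition the table by an ingest date
# so targeted UPDATEs only scan the affected partitions.
(spark.read.table("foo")
    .write.format("delta")
    .partitionBy("ingest_date")
    .mode("overwrite")
    .saveAsTable("foo_partitioned"))

# Scope the cleanup to yesterday's partition; Delta prunes all other files.
spark.sql("""
    UPDATE foo_partitioned
    SET field1 = 'some_value'
    WHERE ingest_date = date_sub(current_date(), 1)
""")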

Jennifer_Lu
by New Contributor III
  • 1263 Views
  • 1 reply
  • 3 kudos

How do I programmatically get the database name in a DLT notebook?

I have configured a database in the settings of my DLT pipeline. Is there a way to retrieve that value programmatically from within a notebook? I want to do something like spark.read.table(f"{database}.table")

Latest Reply
Jfoxyyc
Valued Contributor
  • 3 kudos

You could also set it as a pipeline config value (e.g., database: value) and then retrieve it in the notebook using spark.conf.get(). I'm hoping they update DLT to support UC, and then allow us to set database/schema at the notebook level in @dlt.table(schema_name,...
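
For example, a minimal sketch of that config-value workaround, assuming a hypothetical key "mypipeline.database" added under the pipeline's configuration settings:

import dlt
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# DLT exposes pipeline configuration values through spark.conf;
# "mypipeline.database" is a hypothetical key set in the pipeline settings.
database = spark.conf.get("mypipeline.database")

@dlt.table
def my_table():
    # Read from the configured database without hard-coding its name.
    return spark.read.table(f"{database}.source_table")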

Jennifer_Lu
by New Contributor III
  • 1219 Views
  • 1 reply
  • 3 kudos

Why does DLT CDC sometimes manifest the results table as a table and other times as a view?

I have a simple DLT pipeline that reads from an existing table, does some transformations, saves to a view, and then uses dlt.apply_changes() to insert the view into a results table. My question is: why is my results table a view and not a table like I ...

Latest Reply
Jfoxyyc
Valued Contributor
  • 3 kudos

I find most of my apply_changes tables are being created as materialized views as well. They do recalculate at runtime, so they're up to date and behave a lot like a table, but they aren't tables in the same sense.
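
For reference, a minimal sketch of the pattern the question describes, with hypothetical source, key, and column names:

import dlt
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# Hypothetical staging view: read the changes as a stream and transform them.
@dlt.view
def updates():
    return (spark.readStream.table("source_db.changes")
                 .withColumn("amount", col("amount").cast("double")))

# The target must be declared before apply_changes populates it;
# this is the object that may surface as a materialized view.
dlt.create_streaming_table("results")

dlt.apply_changes(
    target="results",
    source="updates",
    keys=["id"],
    sequence_by=col("event_ts"),
)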

jayallenmn
by New Contributor III
  • 2430 Views
  • 2 replies
  • 3 kudos

Giving new user workspace access

Hey all, We have a new user we'd like to give access to our Spark workspace. We invited the user to the workspace as an account admin. They click the invite link, create a password, and log in. Once logged in they can see the workspace and can ...

Latest Reply
User16255483290
Contributor
  • 3 kudos

The new feature in Databricks is identity federation. If identity federation is enabled, the users are part of the Databricks account, and the account admin can assign the users to the workspace. The account admins can add the users from the account cons...

monicaborges
by New Contributor III
  • 2023 Views
  • 3 replies
  • 6 kudos
Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Mônica Borges Silva, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

User16835756816
by Valued Contributor
  • 15731 Views
  • 1 reply
  • 7 kudos

Resolved! How do I resolve problems when deploying a workspace with the AWS Quickstart CloudFormation template?

I am unable to deploy a workspace on AWS using Quickstart from my account console. Short description: you might receive one of the following common errors: wrong credentials; Elastic IP and VPC limit reached; region unavailable. Resolution: Wrong cr...

Latest Reply
qasimhassan
Contributor
  • 7 kudos

Really great explanation. The errors I was encountering since yesterday were failures to create CreateStorageConfiguration and CreateCredentialConfiguration. The first step, entering the password manually, helped me solve the issue.

IG1
by New Contributor II
  • 1697 Views
  • 3 replies
  • 2 kudos

Why is there no "New Union" option with the Databricks connection?

I'm trying to use the Databricks connection with Tableau, but it doesn't give me the "New Union" option. Is this normal, or is it particular to me? My Tableau Desktop version is 2021.3.

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

There is an option in Tableau for connecting via Spark SQL; find that connector, and it should work after adding the proper connection string.

johnb1
by Contributor
  • 23830 Views
  • 13 replies
  • 13 kudos

Certified Data Engineer Associate - v2 vs. v3 (Databricks Academy)

Which version of the Data Engineering with Databricks learning plan should I do? v2 or v3? Is there a Certified Data Engineer Associate V3 Exam already? Where can I find practice exams for Certified Data Engineer Associate V3?

Latest Reply
Frank_Tao
New Contributor II
  • 13 kudos

I would suggest choosing v3 - it is the latest version and covers more topics.

VVill_T
by Contributor
  • 3566 Views
  • 4 replies
  • 7 kudos

How to write a Delta Live Table (DLT) pipeline output to Databricks SQL directly

Hi, I am trying to see if it is possible to set up a direct connection from a DLT pipeline to a table in Databricks SQL by configuring the Target Schema, with poc being a schema location like "dbfs:/***/***/***/poc.db". The error message was just a...

Latest Reply
youssefmrini
Databricks Employee
  • 7 kudos

Whenever you store a Delta table in the Hive metastore, the table will be available in the Databricks SQL workspace (Data Explorer) under the hive_metastore catalog.
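
A minimal illustration, assuming a hypothetical schema poc already exists in the Hive metastore (note that the DLT Target setting expects a schema name like poc rather than a dbfs:/ path):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical example: saving a Delta table into the Hive metastore
# makes it visible in Data Explorer under the hive_metastore catalog.
df = spark.read.table("some_source_table")
(df.write.format("delta")
   .mode("overwrite")
   .saveAsTable("poc.my_output_table"))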

alexgv12
by New Contributor III
  • 1412 Views
  • 1 reply
  • 3 kudos

Creation of tables with CDC

I am using CDC to create different tables. These tables can have one or more dependencies. What is the best practice for creating these tables without losing records or changes in both the base table and the joined tables? For example: select * from ( ...

Latest Reply
alexgv12
New Contributor III
  • 3 kudos

more detail

Prototype998
by New Contributor III
  • 2432 Views
  • 1 reply
  • 5 kudos

Resolved! Where can we use broadcast variables?

What are the best situations in which we can use broadcast variables?

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 5 kudos

Hey @Punit Chauhan, broadcast variables are used in the same way for RDD, DataFrame, and Dataset. When you run a Spark RDD or DataFrame job that has broadcast variables defined and used, Spark does the following: Spark breaks the job into stages that have distribute...
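
A minimal sketch of the classic use case: shipping a small read-only lookup to every executor once instead of serializing it with each task (names are illustrative):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# Small lookup table, broadcast once per executor rather than per task.
country_names = sc.broadcast({"US": "United States", "DE": "Germany"})

codes = sc.parallelize(["US", "DE", "US"])
# Executors read the local read-only copy via .value.
resolved = codes.map(lambda c: country_names.value.get(c, "Unknown"))
print(resolved.collect())  # ['United States', 'Germany', 'United States']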

SudiptaBiswas
by New Contributor III
  • 2565 Views
  • 3 replies
  • 3 kudos

Databricks Auto Loader getting stuck flattening JSON files in two different but similar scenarios

I have a Databricks Auto Loader notebook that reads JSON files from an input location and writes the flattened version of the JSON files to an output location. However, the notebook is behaving differently for two different but similar scenarios as descri...

Latest Reply
jose_gonzalez
Databricks Employee
  • 3 kudos

Could you provide a code snippet? Also, do you see any errors in the driver logs?

Rishabh-Pandey
by Esteemed Contributor
  • 1074 Views
  • 1 reply
  • 5 kudos

Privileges - SELECT: gives read access to an object. CREATE: gives ability to create an object (for example, a table in a schema). MODIFY: gives ability to...

Privileges:
  • SELECT: gives read access to an object.
  • CREATE: gives ability to create an object (for example, a table in a schema).
  • MODIFY: gives ability to add, delete, and modify data to or from an object.
  • USAGE: does not give any abilities, but is an add...
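
For reference, a minimal sketch of granting these privileges with SQL, assuming a hypothetical schema my_schema and group data_readers:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# USAGE on the schema is a prerequisite for acting on objects inside it.
spark.sql("GRANT USAGE ON SCHEMA my_schema TO `data_readers`")
# Read-only access to one table.
spark.sql("GRANT SELECT ON TABLE my_schema.my_table TO `data_readers`")
# Allow creating new objects in the schema.
spark.sql("GRANT CREATE ON SCHEMA my_schema TO `data_readers`")
# Allow inserts, updates, and deletes on the table.
spark.sql("GRANT MODIFY ON TABLE my_schema.my_table TO `data_readers`")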

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

thanks sir

Yatoom
by New Contributor II
  • 2072 Views
  • 2 replies
  • 2 kudos

Disable access to mount point for client code

We are building a platform where we automatically execute Databricks jobs using Python packages delivered by our end-users. We want to create a mount point so that we can deliver the cluster's driver logs to external storage. However, we don't wan...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Check with your cloud provider.

Aviral-Bhardwaj
by Esteemed Contributor III
  • 3601 Views
  • 1 reply
  • 35 kudos

Understand Trigger Intervals in Streaming Pipelines in Databricks - When defining a streaming write, the trigger method specifies when the system sh...

Understand Trigger Intervals in Streaming Pipelines in Databricks. When defining a streaming write, the trigger method specifies when the system should process the next set of data. Triggers are specified when defining how data will be written to a...
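
A minimal sketch of common trigger settings, assuming a hypothetical Delta source table named events:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

stream = spark.readStream.table("events")

# Fixed-interval micro-batches: check for new data every 10 seconds.
(stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/events_copy")
    .trigger(processingTime="10 seconds")
    .toTable("events_copy"))

# Other options: .trigger(availableNow=True) processes everything available
# and then stops; omitting .trigger() runs micro-batches as fast as possible.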

Latest Reply
jose_gonzalez
Databricks Employee
  • 35 kudos

Thank you for sharing

