Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

KuldeepChitraka
by New Contributor III
  • 1480 Views
  • 1 replies
  • 5 kudos

Lakehouse table structure design

We are in the process of implementing a lakehouse using Azure Databricks. We already have a data lake in place: Azure Storage Data Lake, which contains containers holding data in its native format. How we are planning: build the Bronze layer by creating bronze tables by...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 5 kudos

Hey @Kuldeep Chitrakar, as you say you do not have partitioning on the bronze tables, and according to that statement that is okay. But since you are going to implement partitioning in silver, what I will recommend to you for bett...
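The partitioning advice above can be sketched in Delta Lake SQL. The schema and column names (`silver.events`, `event_date`) are hypothetical, chosen only to illustrate the pattern:

```sql
-- Hypothetical silver table partitioned by a low-cardinality date column.
-- Partition columns should have a limited number of distinct values;
-- partitioning on a high-cardinality column produces many small files.
CREATE TABLE IF NOT EXISTS silver.events (
  event_id   BIGINT,
  event_date DATE,
  payload    STRING
)
USING DELTA
PARTITIONED BY (event_date);
```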

Rishabh-Pandey
by Esteemed Contributor
  • 3906 Views
  • 0 replies
  • 6 kudos

Connecting Delta Lake to Microsoft Excel with Power Query

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows you to connect to various data sources, including Delta Lake. Here's how to do it: Install the Micros...

hello_world
by New Contributor III
  • 3728 Views
  • 1 replies
  • 4 kudos

What is the purpose of the USAGE privilege?

I watched a couple of courses on Databricks Academy, none of which clearly explains or demonstrates the purpose of the USAGE privilege. USAGE: does not give any abilities, but is an additional requirement to perform any action on a schema object. I hav...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 4 kudos

Hey @S L, I also had these questions. What I learned is that USAGE is the minimum, mandatory requirement a user must have before performing any action; it does not mean that you can perform actions with only the USAGE permission. USAGE is...
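A minimal sketch of the rule described above, with a hypothetical schema, table, and principal: the object-level privilege alone is not enough, the user also needs USAGE on the enclosing schema.

```sql
-- SELECT by itself does not grant access; USAGE on the schema is
-- also required before any action on objects inside it succeeds.
GRANT USAGE ON SCHEMA my_schema TO `user@example.com`;
GRANT SELECT ON TABLE my_schema.my_table TO `user@example.com`;
```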

supremefist
by New Contributor III
  • 7370 Views
  • 3 replies
  • 2 kudos

Resolved! New spark cluster being configured in local mode

Hi, we have two workspaces on Databricks, prod and dev. On prod, if we create a new all-purpose cluster through the web interface and go to Environment in the Spark UI, the spark.master setting is correctly set to the host IP. This results in a...

Latest Reply
scottb
New Contributor II
  • 2 kudos

I found the same issue when choosing the default cluster setup on first setup: when I went to edit the cluster to add an instance profile, I was not able to save without fixing this. Thanks for the tip.

2 More Replies
SIRIGIRI
by Contributor
  • 1146 Views
  • 2 replies
  • 2 kudos

sharikrishna26.medium.com

Spark Dataframe Metadata: A Spark Dataframe is structurally the same as a table. However, it does not store any schema information in the metadata store. Instead, there is a runtime metadata catalog to store the Dataframe schema information. It is simil...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

this is awesome thanks

1 More Reply
SIRIGIRI
by Contributor
  • 2881 Views
  • 3 replies
  • 5 kudos

Availability Zone in Azure

Please find the content here: https://medium.com/@sharikrishna26/availability-zone-in-azure-52e7764357b6

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 5 kudos

Yeah, this is awesome, thanks. But have you ever thought about what will happen if no instances are left in that region? How will our jobs start? Any idea?

2 More Replies
CBull
by New Contributor III
  • 2400 Views
  • 3 replies
  • 2 kudos

Spark Notebook to import data into Excel

Is there a way to create a notebook that will take the SQL that I want to put into the Notebook and populate Excel daily and send it to a particular person?
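One way to approach the question above can be sketched in pure Python. Everything here is an assumption, not a confirmed Databricks recipe: the sample `rows` stand in for results you would get from `spark.sql(...).collect()` in a notebook, the output is CSV (which Excel opens directly; a true `.xlsx` would need a library such as openpyxl), and actually sending the message would use `smtplib` with an SMTP host you control. Scheduling it daily would be a Databricks Job.

```python
import csv
import io
from email.message import EmailMessage

# Stand-in for query results; in a notebook these would come from a
# (hypothetical) spark.sql("...").collect() call.
rows = [("2024-01-01", 42), ("2024-01-02", 17)]

# Write the rows as CSV, which Excel opens directly.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["date", "count"])
writer.writerows(rows)
report = buf.getvalue()

# Build the email with the report attached; sending it (not shown)
# would use smtplib against your own SMTP server.
msg = EmailMessage()
msg["Subject"] = "Daily report"
msg["To"] = "someone@example.com"
msg.set_content("Daily report attached.")
msg.add_attachment(report.encode(), maintype="text", subtype="csv",
                   filename="report.csv")
```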

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

@Aviral Bhardwaj, thanks for this, I needed this info.

2 More Replies
bradm0
by New Contributor III
  • 3947 Views
  • 3 replies
  • 3 kudos

Resolved! Use of badRecordsPath in COPY INTO SQL command

I'm trying to use badRecordsPath to catch improperly formed records in a CSV file and continue loading the remainder of the file. I can get the option to work using Python like this: df = spark.read.format("csv").option("header","true").op...

Latest Reply
bradm0
New Contributor III
  • 3 kudos

Thanks. It was the inferSchema setting. I tried it with and without the SELECT, and it worked both ways once I added inferSchema. Both of these worked: drop table my_db.t2; create table my_db.t2 (col1 int, col2 int); copy into my_db.t2 from (SELECT cast(...
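Reconstructing the working version described above as a sketch: the file path is hypothetical, and the key detail from the thread was supplying `inferSchema` alongside `header` in `FORMAT_OPTIONS`.

```sql
DROP TABLE IF EXISTS my_db.t2;
CREATE TABLE my_db.t2 (col1 INT, col2 INT);

COPY INTO my_db.t2
FROM (SELECT CAST(col1 AS INT) col1, CAST(col2 AS INT) col2
      FROM '/path/to/input.csv')          -- hypothetical source path
FILEFORMAT = CSV
FORMAT_OPTIONS (
  'header' = 'true',
  'inferSchema' = 'true',                 -- the missing piece in the thread
  'badRecordsPath' = '/path/to/bad'      -- malformed rows are written here
);
```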

2 More Replies
Meghala
by Valued Contributor II
  • 1985 Views
  • 2 replies
  • 2 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

Hi @S Meghala, please go through this GitHub link; you will get a good amount of data here, and this way you can learn more: https://github.com/AlexIoannides/pyspark-example-project. Please select my answer as best answer if your query is fulfilled. Thanks, Avira...

1 More Reply
hello_world
by New Contributor III
  • 5146 Views
  • 7 replies
  • 3 kudos

What happens if I have both DLTs and normal tables in a single notebook?

I've just learned Delta Live Tables on Databricks Academy and have no environment to try it out. I'm wondering what happens to the pipeline if the notebook consists of both normal tables and DLTs. For example: Table A; DLT A that reads and cleans Table A; T...

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 3 kudos

Hey @S L, according to your example you have a normal table (Table A) and a DLT table (Table B). It will throw an error that your upstream table is not a streaming live table; you need to create Table A as a streaming live table if you want to use the ou...
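The pattern the reply describes can be sketched in DLT SQL. The table names and the `cloud_files` source path are hypothetical; the point is that the upstream table must itself be declared as a streaming live table before a downstream streaming live table can read it with `STREAM(LIVE....)`.

```sql
-- Upstream must be a streaming live table, not a normal table.
CREATE OR REFRESH STREAMING LIVE TABLE table_a
AS SELECT * FROM cloud_files('/data/raw', 'json');  -- hypothetical source

-- Downstream streaming live table reading the upstream one.
CREATE OR REFRESH STREAMING LIVE TABLE table_b
AS SELECT * FROM STREAM(LIVE.table_a);
```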

6 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 2957 Views
  • 2 replies
  • 2 kudos

Subquery does not work in Databricks Community version?

I am testing some SQL code based on the book SQL Cookbook, Second Edition, available from https://downloads.yugabyte.com/marketing-assets/O-Reilly-SQL-Cookbook-2nd-Edition-Final.pdf. Based on page 43, I am OK with the left join, as shown here. However, w...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

It must have some GitHub link; check there. Or you can share your code and data and we can help you.

1 More Reply
MC006
by New Contributor III
  • 10706 Views
  • 4 replies
  • 2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi, I am using Databricks and want to upgrade to Databricks Runtime 11.3 LTS, which uses Spark 3.3. Current system environment: Operating System: Ubuntu 20.04.4 LTS; Java: Zulu 8.56.0.21-CA-linux64; Python: 3.8.10; Delta Lake: 1.1.0. Target system ...

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

Hi everyone, this helped me, thanks.

3 More Replies
uzadude
by New Contributor III
  • 12974 Views
  • 5 replies
  • 3 kudos

Adding to PYTHONPATH in interactive Notebooks

I'm trying to set the PYTHONPATH env variable in the cluster configuration: `PYTHONPATH=/dbfs/user/blah`. But in the driver and executor envs it is probably getting overridden, and I don't see it. `%sh echo $PYTHONPATH` outputs: `PYTHONPATH=/databricks/spar...

Latest Reply
uzadude
New Contributor III
  • 3 kudos

Update: at last I found a (hacky) solution! In the driver I can dynamically set the sys.path on the workers with `spark._sc._python_includes.append("/dbfs/user/blah/")`. Combine that with, in the driver: `%load_ext autoreload`, `%autoreload 2`, and setting `...
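The workaround above can be sketched as follows; `/dbfs/user/blah` is the thread's example path. The `sys.path` insert covers the driver, while the `spark._sc._python_includes` line from the reply (a private, unsupported attribute that may change across runtime versions) is what ships the path to the workers, so it is left as a comment here.

```python
import sys

extra_path = "/dbfs/user/blah"

# Driver side: make modules under extra_path importable in the notebook.
if extra_path not in sys.path:
    sys.path.insert(0, extra_path)

# Worker side (Databricks notebook only; relies on a private attribute,
# per the reply above, so it may break across runtime versions):
# spark._sc._python_includes.append(extra_path + "/")
```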

4 More Replies
rocky5
by New Contributor III
  • 2474 Views
  • 2 replies
  • 2 kudos

DLT UDF and c#

Hello, can I create a Spark function in .NET and use it in a DLT table? I would like to encrypt some data; in the documentation Scala code is used as an example, but would it be possible to write a decryption/encryption function in C# and use it withi...

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

It's not possible; Databricks does not run .NET UDFs. (SQL Server 2008, by contrast, contains a SQL CLR runtime that runs .NET languages.)

1 More Reply
Aravind_P04
by New Contributor II
  • 4407 Views
  • 3 replies
  • 4 kudos

Clarification on merging multiple notebooks and other

1. Do we have any feature to merge cells from one or more notebooks into another notebook? 2. Do we have any feature where multiple cells from Excel can be copied into multiple cells in a notebook? Generally all Excel data is copied into one cel...

Latest Reply
youssefmrini
Databricks Employee
  • 4 kudos

1) We can't merge cells right now. 2) We don't have this feature either. 3) We don't have multiple editing right now. 4) You will know only if you face an error; a notification will pop up. 5) You can't keep running the execution because the cells can be linke...

2 More Replies
