Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

js54123875
by New Contributor III
  • 2774 Views
  • 3 replies
  • 3 kudos

Setup for Unity Catalog, autoloader, three-level namespace, SCD2

I am trying to set up Delta Live Tables pipelines to ingest data into bronze and silver tables. Bronze and silver are separate schemas. This will be triggered by a daily job. It appears to run fine when set as continuous, but fails when triggered. Table...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jennette Shepard Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

2 More Replies
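For anyone landing on this thread later: a minimal sketch of a triggered DLT pipeline that ingests with Auto Loader into bronze and builds an SCD2 silver table with apply_changes. The paths, table names, and key/sequence columns are hypothetical placeholders, not the original poster's setup.

```python
import dlt
from pyspark.sql import functions as F

# Bronze: incremental ingest with Auto Loader; the landing path and format
# are placeholders.
@dlt.table(name="bronze_orders", comment="Raw files ingested with Auto Loader")
def bronze_orders():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.inferColumnTypes", "true")
        .load("/Volumes/main/landing/orders/")
        .withColumn("_ingested_at", F.current_timestamp())
    )

# Silver: SCD Type 2 history, keyed on an assumed order_id column and
# sequenced by the ingestion timestamp.
dlt.create_streaming_table("silver_orders_scd2")

dlt.apply_changes(
    target="silver_orders_scd2",
    source="bronze_orders",
    keys=["order_id"],
    sequence_by=F.col("_ingested_at"),
    stored_as_scd_type=2,
)
```

The same definitions serve both continuous and triggered runs; the trigger or schedule lives in the pipeline settings rather than in the notebook.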
akshay_patni228
by New Contributor II
  • 9024 Views
  • 2 replies
  • 3 kudos

Missing Credential Scope - Unable to call Databricks (Scala) notebook from ADF

Hi Team, I am using a job cluster when setting up the Linked Service in ADF to call a Databricks Notebook activity. Cluster details: Policy - Unrestricted; Access Mode - Single user; Unity Catalog enabled; Databricks runtime - 12.2 LTS (includes Apache Spark 3.3.2...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Akshay Patni, we haven't heard from you since the last response from @Debayan Mukherjee. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

1 More Replies
Michael42
by New Contributor III
  • 16945 Views
  • 4 replies
  • 7 kudos

Resolved! Want to load a high volume of CSV rows in the fastest way possible (in excess of 5 billion rows). I want the best approach, in terms of speed, for loading into the bronze table.

My source can only deliver CSV format (pipe delimited). My source has the ability to generate multiple CSV files and transfer them to a single upload folder. All rows must go to the same target bronze Delta table. I do not care about the order in which ...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Michael Popp Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

3 More Replies
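A hedged sketch of the bulk-load pattern discussed in this thread, assuming a pipe-delimited upload folder and a bronze Delta table; all paths and names are placeholders.

```python
# One-shot batch load of pipe-delimited CSVs into a bronze Delta table.
source_path = "s3://my-bucket/upload/"
target_table = "main.bronze.orders_raw"

(
    spark.read.format("csv")
    .option("sep", "|")
    .option("header", "true")
    .load(source_path)
    .write.format("delta")
    .mode("append")
    .saveAsTable(target_table)
)

# Alternative: COPY INTO keeps track of files it has already loaded, which is
# handy when the same folder keeps receiving new CSVs.
spark.sql(f"""
    COPY INTO {target_table}
    FROM '{source_path}'
    FILEFORMAT = CSV
    FORMAT_OPTIONS ('sep' = '|', 'header' = 'true')
""")
```

Either route parallelizes across the cluster, so throughput mostly comes down to how many files the source splits the data into and the cluster size.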
vicks
by New Contributor III
  • 9246 Views
  • 5 replies
  • 8 kudos

Resolved! Converting the mon-yy format to date, but showing null for output

I have a date column that comes in month-year format and I am trying to convert it into dd-mm-yyyy format in PySpark. For example, I have a date column with the values Jan-2019, Feb-2020, Mar-2020, and the output I am expecting is 01/01/2019, 01/02/2020, 01/03/2020. Here...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Hi @vikram sinhha, we haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

4 More Replies
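A minimal sketch of the conversion asked about above: parse the Jan-2019 style strings with the MMM-yyyy pattern, then render them as dd/MM/yyyy. The column name is assumed.

```python
from pyspark.sql import functions as F

df = spark.createDataFrame(
    [("Jan-2019",), ("Feb-2020",), ("Mar-2020",)], ["month_year"]  # assumed column name
)

result = (
    df.withColumn("as_date", F.to_date("month_year", "MMM-yyyy"))       # nulls usually mean a pattern mismatch
      .withColumn("formatted", F.date_format("as_date", "dd/MM/yyyy"))  # 01/01/2019, 01/02/2020, ...
)
result.show()
```

If this still returns null on Spark 3.x, checking the spark.sql.legacy.timeParserPolicy setting is a reasonable next step.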
ankris
by New Contributor III
  • 14798 Views
  • 5 replies
  • 6 kudos

Resolved! Databricks and web app connectivity to build an interactive application

Hi All, could you please guide me and provide some material on "Databricks and web app connectivity to build an interactive application"? The objective is to build a web app so that users can interact with it to get different data/visuals from Databricks (PySpark ...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @ananthakrishna raikar, we haven't heard from you since the last response from @Werner Stinckens. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

4 More Replies
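One common way to wire a web app to Databricks, as discussed above, is to have the app's backend query a SQL warehouse with the databricks-sql-connector package. Hostname, HTTP path, token, and table below are placeholders.

```python
# pip install databricks-sql-connector
from databricks import sql

# Connection details come from the SQL warehouse's "Connection details" tab;
# the values shown here are placeholders.
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123",
    access_token="dapi-xxxx",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM main.silver.daily_kpis LIMIT 100")
        rows = cursor.fetchall()

# `rows` can then be rendered by the web framework of your choice
# (Flask, Dash, Streamlit, ...).
```

The token should come from a secret store rather than being hard-coded, and concurrency scales with the warehouse size.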
Ajay-Pandey
by Esteemed Contributor III
  • 11652 Views
  • 7 replies
  • 11 kudos

Resolved! Unzip Files

Hi all, I am trying to unzip a file in Databricks but am facing an issue. Please help me if you have any docs or code to share.

Latest Reply
vivek_rawat
New Contributor III
  • 11 kudos

Hey Ajay, you can follow this module to unzip your zip file. To give you a brief idea, it will unzip your file directly into your driver node's storage. So if your compressed data is inside DBFS, you first have to move it to the driver node and...

6 More Replies
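Expanding on the reply above, a sketch that copies a zip from DBFS to the driver's local disk, extracts it with Python's zipfile module, and copies the result back. It assumes the cluster exposes DBFS through the /dbfs FUSE mount, and all paths are made up.

```python
import zipfile
import shutil

src_zip   = "/dbfs/FileStore/raw/archive.zip"   # zip sitting in DBFS (placeholder)
local_zip = "/tmp/archive.zip"                  # driver-local copy
local_out = "/tmp/archive_extracted"            # driver-local extract dir
dbfs_out  = "/dbfs/FileStore/raw/extracted"     # final DBFS location (placeholder)

shutil.copy(src_zip, local_zip)                 # 1. bring the zip to the driver
with zipfile.ZipFile(local_zip, "r") as zf:     # 2. extract on local disk
    zf.extractall(local_out)
shutil.copytree(local_out, dbfs_out, dirs_exist_ok=True)  # 3. push results back to DBFS
```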
Firedubdub
by New Contributor II
  • 3119 Views
  • 2 replies
  • 4 kudos

Resolved! Azure Databricks REST API 2.0 - Deactivate user by ID. I can't seem to find any details on this.

Hello team, I am looking to run Azure Logic Apps to deactivate a user by ID. Right now there is only a REST API to delete a user, but I only want to deactivate the user. Can anyone help with the operation's op/path/schema that I am missing? Thank you!

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Daren Liew Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers yo...

1 More Replies
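For reference on the deactivation question, the workspace SCIM Users API accepts a PATCH that flips the user's active flag instead of deleting the account. A sketch with placeholder host, token, and user id; the exact Operations payload shape is worth double-checking against the SCIM docs.

```python
import requests

host    = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
token   = "dapi-xxxx"                                            # placeholder PAT
user_id = "1234567890"                                           # SCIM user id, not the email

resp = requests.patch(
    f"{host}/api/2.0/preview/scim/v2/Users/{user_id}",
    headers={"Authorization": f"Bearer {token}",
             "Content-Type": "application/scim+json"},
    json={
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    },
)
resp.raise_for_status()
```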
PragyaS
by New Contributor
  • 2238 Views
  • 2 replies
  • 1 kudos

Resolved! Decimal Automatic Rounding on sum

I am facing an issue: I have implemented code in which I perform a sum on decimal values. I have set the precision as Decimal(19,2). When I calculate over big data I get a different value than I get from my .NET utility application. Example: From...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Pragya Sharma, we haven't heard from you since the last response from @Werner Stinckens. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

1 More Replies
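One thing worth ruling out in threads like this is a silent decimal-vs-double conversion before the aggregate. A small, illustrative sketch that makes the decimal type explicit before summing; column names and values are made up.

```python
from pyspark.sql import functions as F
from pyspark.sql.types import DecimalType

# Made-up sample: amounts arrive as strings (or doubles) from the source.
df = spark.createDataFrame(
    [("a", "123456789012345.67"), ("a", "0.04"), ("a", "98765.21")],
    ["grp", "amount_str"],
)

# Cast to an explicit DecimalType before aggregating so the sum runs in
# decimal arithmetic; summing doubles and rounding afterwards is a common
# source of last-digit differences against other tools.
df = df.withColumn("amount", F.col("amount_str").cast(DecimalType(38, 2)))

df.groupBy("grp").agg(F.sum("amount").alias("total")).show(truncate=False)
```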
lasribeiro_univ
by New Contributor III
  • 6620 Views
  • 3 replies
  • 3 kudos

Resolved! SQL Warehouse cluster fails to start.

I'm a new user of Databricks and I'm taking the Academy course, but I'm having difficulty starting a SQL Warehouse cluster. I've tried several different configurations, but I always get the same error: Clusters are failing to launch. Cluster launch wi...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Luiz Ribeiro Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers ...

2 More Replies
Manimeghala
by New Contributor III
  • 3842 Views
  • 4 replies
  • 5 kudos

Resolved! Can we use %run or any magic commands in a Delta Live Table pipeline?

When we try, it says magic commands are not supported. Is there a way to import other notebooks first so that their functions can be referenced/used while we build a DLT pipeline?

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Manimeghala Vemula Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

3 More Replies
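Since %run is not supported inside DLT, a common workaround is to keep shared functions in a plain Python module and import it from the pipeline notebook. The repo path and helper module below are hypothetical.

```python
import sys

# If the pipeline notebook lives in a Repo, the repo root usually has to be on
# sys.path before importing; adjust for your layout (path below is made up).
sys.path.append("/Workspace/Repos/me@example.com/my_repo")

import dlt
from shared.transforms import clean_orders  # hypothetical helper module

@dlt.table
def orders_clean():
    # `clean_orders` stands in for whatever shared logic used to live in the
    # %run'd notebook; the source table name is a placeholder.
    return clean_orders(spark.read.table("main.bronze.orders"))
```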
Sudhir1
by New Contributor II
  • 6791 Views
  • 5 replies
  • 1 kudos

Connecting to AWS MSK

How do we connect to AWS MSK when it uses IAM-based authentication?

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sudhir Jaiswal Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

4 More Replies
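A sketch of the Structured Streaming reader options typically used for MSK with IAM auth. It assumes the aws-msk-iam-auth library is available on the cluster and that the cluster's instance profile (or an assumed role) has the required MSK permissions; brokers and topic are placeholders.

```python
# Broker endpoints (IAM SASL port 9098) and topic are placeholders.
df = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers",
            "b-1.mycluster.abc123.kafka.us-east-1.amazonaws.com:9098")
    .option("subscribe", "my_topic")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "AWS_MSK_IAM")
    .option("kafka.sasl.jaas.config",
            "software.amazon.msk.auth.iam.IAMLoginModule required;")
    .option("kafka.sasl.client.callback.handler.class",
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler")
    .load()
)
```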
f2008700
by New Contributor III
  • 15203 Views
  • 6 replies
  • 7 kudos

Configuring average parquet file size

I have S3 as a data source containing a sample TPC dataset (10G, 100G). I want to convert that into Parquet files with an average size of about ~256 MiB. What configuration parameter can I use to set that? I also need the data to be partitioned, and withi...

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hi @Vikas Goel, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if their suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to o...

5 More Replies
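Plain Parquet writes don't take a target file size in bytes directly; one approximation, sketched below, is to cap rows per file from an estimated row size (or simply repartition to a chosen file count). All numbers, paths, and the partition column are illustrative.

```python
# Aim for roughly 256 MiB files by capping rows per file from an estimated
# row size; the estimate and locations are illustrative only.
est_bytes_per_row = 200
rows_per_file = (256 * 1024 * 1024) // est_bytes_per_row

df = spark.read.parquet("s3://my-bucket/tpc-raw/")

(
    df.write
    .option("maxRecordsPerFile", rows_per_file)  # upper bound on rows per output file
    .partitionBy("sale_date")                    # hypothetical partition column
    .mode("overwrite")
    .parquet("s3://my-bucket/tpc-parquet/")
)
```

Since the cap is in rows rather than bytes, inspecting one written file and adjusting the estimate is usually needed to land near the 256 MiB target.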
adrianhernandez
by New Contributor III
  • 3339 Views
  • 3 replies
  • 1 kudos

Resolved! Databricks Academy login issues

I tried using the same credentials as my Databricks login (provided by my employer) and it doesn't work. I then tried the forgot password option but never got an email. I finally tried registering a new account, but I get an error showing the email i...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Adrian Hernandez Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answ...

2 More Replies
Anonymous
by Not applicable
  • 5465 Views
  • 2 replies
  • 1 kudos
Latest Reply
wmespi
New Contributor II
  • 1 kudos

Is this random number not possible to extract from the notebook context? It is available in the browser_hash, but that is not populated when running a job. Is this random number static or does it change over time? If it is static, it can then be hardco...

1 More Replies
Mado
by Valued Contributor II
  • 5546 Views
  • 1 replies
  • 0 kudos

Resolved! Error when querying a table created by a DLT pipeline: "Couldn't find value of a column"

Hi, I create a table using a DLT pipeline (triggered once). In the ETL process, I add a new column to the table with null values by: output = output.withColumn('Indicator_Latest_Value_Date', F.lit(None)). The pipeline works and I don't get any error. But, whe...

Latest Reply
josruiz22
New Contributor III
  • 0 kudos

Hi, try casting the None in that output line, like this: output = output.withColumn('Indicator_Latest_Value_Date', F.lit(None).cast("String"))

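To expand on the fix above: F.lit(None) on its own produces a NullType column, which some readers cannot resolve, while the same literal cast to a concrete type keeps the schema queryable. A self-contained illustration:

```python
from pyspark.sql import functions as F

df = spark.range(3)  # stand-in for the pipeline's `output` DataFrame

# F.lit(None) alone yields a NullType column; casting the literal to a
# concrete type (string here) gives downstream readers a usable schema.
df = df.withColumn("Indicator_Latest_Value_Date", F.lit(None).cast("string"))
df.printSchema()
```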

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group