Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Oliver_Angelil
by Valued Contributor II
  • 6080 Views
  • 8 replies
  • 6 kudos

Resolved! Confusion about Data storage: Data Asset within Databricks vs Hive Metastore vs Delta Lake vs Lakehouse vs DBFS vs Unity Catalogue vs Azure Blob

Hi there. It seems there are many different ways to store / manage data in Databricks. This is the Data asset in Databricks: However, data can also be stored (hyperlinks included to relevant pages): in a Lakehouse, in Delta Lake, on Azure Blob storage, in the D...

[Attachment: Screenshot 2023-05-09 at 17.02.04]
Latest Reply
karthik_p
Esteemed Contributor
  • 6 kudos

@Oliver Angelil​ There is no concept called an internal table; there are two types: 1. managed and 2. external. In the legacy setup, if you provided an external mount location, it used to be an external table. From now on, when you use Unity Catalog, the table type will be external ...

7 More Replies
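For readers skimming this thread, a minimal sketch of the distinction karthik_p describes, assuming a Unity Catalog metastore; the catalog, schema, table names and the external storage path are hypothetical placeholders:

# Managed table: Databricks manages the data files; DROP TABLE removes data and metadata.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.demo.sales_managed (
        id BIGINT,
        amount DOUBLE
    )
""")

# External table: the data lives in a location you control and have registered as an
# external location; DROP TABLE removes only the metadata. The abfss:// path is a placeholder.
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.demo.sales_external (
        id BIGINT,
        amount DOUBLE
    )
    LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/sales_external'
""")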
AJK1
by New Contributor II
  • 1378 Views
  • 0 replies
  • 0 kudos

Identity Column Issues

Hello. I'm experiencing what I believe are pretty severe (current) shortcomings regarding identity columns in Databricks. I'm defining a SQL table using Spark SQL - the table creates as expected; I've tried using both column definitions for this iden...

[Attachments: CreateTable, Insert-GeneratedAlways, Command success image]
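For context on the syntax the post refers to, a minimal sketch of a Delta identity column using GENERATED ALWAYS AS IDENTITY; the table name is hypothetical, and identity columns require a reasonably recent Databricks Runtime:

# Identity values are assigned by Delta on insert; do not supply them explicitly.
spark.sql("""
    CREATE TABLE IF NOT EXISTS demo_identity (
        id BIGINT GENERATED ALWAYS AS IDENTITY,
        name STRING
    )
    USING DELTA
""")

spark.sql("INSERT INTO demo_identity (name) VALUES ('alice'), ('bob')")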
Moriondo
by New Contributor III
  • 1661 Views
  • 3 replies
  • 2 kudos

Resolved! How to filter a dashboard by the current user email?

Hello, I would like to know if it is possible to filter a dashboard by the current user's email. For example, I have a table result of a group of people with the following columns: user_id, user_email, date, productivity. So with this table I create som...

Latest Reply
Moriondo
New Contributor III
  • 2 kudos

Hey guys, after some research in the documentation, I found out that if I filter the query using the current_user() function, I get the result that I was looking for. If anyone needs to look at this: https://docs.databricks.com/sql/language-manual/fun...

2 More Replies
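A minimal sketch of the current_user() approach described in the accepted reply, assuming a hypothetical productivity table with a user_email column; in a dashboard, the same predicate goes directly into the underlying query:

# Only return rows belonging to whoever runs the query (table and columns are placeholders).
df = spark.sql("""
    SELECT user_id, user_email, date, productivity
    FROM productivity_table
    WHERE user_email = current_user()
""")
display(df)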
Swaroop
by New Contributor
  • 594 Views
  • 0 replies
  • 0 kudos

How to receive data from azure event hub in parquet ?

import asyncio
import os
from azure.eventhub.aio import EventHubConsumerClient

CONNECTION_STR = "Connection_string"
EVENTHUB_NAME = "event_hub"

async def on_event(partition_context, event):
    # Put your code here.
    # If the operation is i/o intensive, ...

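The excerpt above uses the azure-eventhub SDK directly; an alternative sketch, under the assumption that the Event Hubs namespace exposes its Kafka-compatible endpoint, reads the stream with Structured Streaming and lands it as Parquet. The connection string, namespace, hub name and paths are placeholders:

# Read Azure Event Hubs through its Kafka-compatible endpoint and write raw events to Parquet.
connection_string = "<event-hubs-connection-string>"   # placeholder
eh_sasl = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required "
    f'username="$ConnectionString" password="{connection_string}";'
)

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
    .option("subscribe", "event_hub")                   # Event Hub name, placeholder
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", eh_sasl)
    .load()
)

(
    events.selectExpr("CAST(value AS STRING) AS body", "timestamp")
    .writeStream.format("parquet")
    .option("path", "/mnt/raw/eventhub_parquet")                        # placeholder output path
    .option("checkpointLocation", "/mnt/raw/_checkpoints/eventhub_parquet")
    .start()
)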
Teja07
by New Contributor II
  • 3111 Views
  • 0 replies
  • 0 kudos

Ingesting data from oracle to databricks through IICS

While ingesting data from Oracle to Databricks through IICS, the target table was created; however, data is not getting inserted. Below is the error. Could someone please help me? Exception occurred when initializing data session. Root cause: java.lang....

jannemanson
by New Contributor III
  • 1399 Views
  • 3 replies
  • 0 kudos

Improve iteration time on implementing jobs

Hey there, I am using dbx to create Databricks tasks and deploy the job. I find it not ideal since the iteration cycles are sometimes a bit long when I have to wait for a job with several tasks to complete and see where it failed. I am already tryin...

Latest Reply
jannemanson
New Contributor III
  • 0 kudos

Hello, thanks for the answer. Unfortunately, this did not help me, since it only covers general best practices. @Debayan Mukherjee​ 

2 More Replies
Paul_Poco
by New Contributor II
  • 33708 Views
  • 4 replies
  • 5 kudos

Asynchronous API calls from Databricks

Hi, I have to send thousands of API calls from a Databricks notebook to an API to retrieve some data. Right now, I am using a sequential approach with the Python requests package. As the performance is not acceptable anymore, I need to send my API c...

Latest Reply
Anonymous
Not applicable
  • 5 kudos

Hi @Paul Poco​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
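For readers with the same problem, a minimal sketch of a concurrent approach with asyncio and aiohttp; the URL list and concurrency cap are placeholders, and aiohttp may need to be pip-installed on the cluster:

import asyncio
import aiohttp

URLS = [f"https://api.example.com/items/{i}" for i in range(1000)]   # placeholder URLs
MAX_CONCURRENT = 20                                                   # tune to the API's limits

async def fetch(session, sem, url):
    # Cap in-flight requests with a semaphore so the API is not overwhelmed.
    async with sem:
        async with session.get(url) as resp:
            resp.raise_for_status()
            return await resp.json()

async def fetch_all(urls):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, sem, u) for u in urls))

# In a notebook, where an event loop may already be running, you can usually just use
#     results = await fetch_all(URLS)
# In a plain Python script, use
#     results = asyncio.run(fetch_all(URLS))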
de-hru
by New Contributor III
  • 13818 Views
  • 4 replies
  • 1 kudos

Resolved! How to add pre-commit hook to the Git Client on Databricks Cluster?

I'd like to add a Git pre-commit hook to the Databricks Cluster. This pre-commit hook should be executed when pushing to GitHub. Why would I need a pre-commit hook on a Databricks Cluster? My goal is to run blackbricks and format all notebooks automatic...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Dejan Hrubenja​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

3 More Replies
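Whether the Repos git client on a cluster runs local hooks is really what this thread is asking, but for reference, a minimal sketch of a pre-commit hook written in Python (saved as .git/hooks/pre-commit and made executable) that runs blackbricks over the working tree; it assumes blackbricks is installed in whatever environment executes the hook:

#!/usr/bin/env python3
# Run blackbricks on the repository before every commit; abort the commit on a
# non-zero exit so changed files can be reviewed and re-staged.
import subprocess
import sys

result = subprocess.run(["blackbricks", "."])
if result.returncode != 0:
    print("blackbricks reported changes or failed; review and re-stage the files.")
    sys.exit(1)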
sanjay
by Valued Contributor II
  • 3535 Views
  • 0 replies
  • 0 kudos

autoloader with real time and batch processing concurrently

Hi, I have a data pipeline which runs continuously, processes the micro-batch data, and stores it in Delta Lake. This takes care of any new data. But at times, I need to process historical data without disturbing real-time data processing. Is th...

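One common pattern for the question above, sketched under assumptions (the source format, paths, checkpoints and target table below are placeholders): leave the continuous stream untouched and run a separate one-off backfill against the historical files, with its own checkpoint, writing to the same Delta table.

# One-off historical backfill alongside a continuously running pipeline.
historical = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")                              # placeholder format
    .option("cloudFiles.schemaLocation", "/mnt/chk/backfill_schema")  # placeholder path
    .load("/mnt/landing/historical/")                                 # placeholder path
)

(
    historical.writeStream
    .option("checkpointLocation", "/mnt/chk/backfill")   # separate from the live stream's checkpoint
    .trigger(availableNow=True)                          # drain the backlog once, then stop
    .toTable("target_delta_table")                       # placeholder table name
)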
ahs
by New Contributor
  • 1100 Views
  • 2 replies
  • 1 kudos

DNS resolution on E2 workspaces?

I am trying to find documents/flows that show Databricks' network setup for E2 workspaces. More specifically, I'm interested in how DNS is resolved on AWS. All the pages I could find were regarding using Route 53 and PrivateLink for custom DNS. But pl...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @A H​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
Aj2
by New Contributor III
  • 5106 Views
  • 4 replies
  • 1 kudos

Resolved! How to connect to DB2-AS400?

What are the steps needed to connect to a DB2-AS400 source to pull data into the lake using Databricks? I believe it requires establishing a JDBC connection, but I could not find many details online

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Ajay Menon​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
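For anyone else looking for a starting point, a minimal sketch of a JDBC read, assuming the open-source jt400 (JTOpen) driver jar has been attached to the cluster as a library; the host, credentials, library and table names are placeholders:

# Read a DB2 for i (AS/400) table over JDBC with the jt400 driver.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:as400://as400-host")
    .option("driver", "com.ibm.as400.access.AS400JDBCDriver")
    .option("dbtable", "MYLIB.MYTABLE")
    .option("user", "db_user")
    .option("password", "db_password")
    .load()
)
display(df)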
vichus1995
by New Contributor
  • 2642 Views
  • 2 replies
  • 0 kudos

Mounted Azure Storage shows mount.err inside folder while reading from Azure Databricks

I'm using an Azure Databricks notebook to read an Excel file from a folder inside a mounted Azure Blob storage. The mounted Excel location is like: "/mnt/2023-project/dashboard/ext/Marks.xlsx". 2023-project is the mount point and dashboard is the name o...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @vichus1995​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
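This does not explain the mount.err file itself, but for reference, a minimal sketch of reading an .xlsx file from a mount point through the /dbfs FUSE path with pandas; the path mirrors the one in the question, and openpyxl may need to be installed first:

# Read an Excel file from a mounted container via the /dbfs FUSE path.
# Requires openpyxl, e.g. %pip install openpyxl.
import pandas as pd

pdf = pd.read_excel("/dbfs/mnt/2023-project/dashboard/ext/Marks.xlsx")
df = spark.createDataFrame(pdf)
display(df)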
Jits
by New Contributor II
  • 855 Views
  • 2 replies
  • 3 kudos

Getting Error when Inserting data into table with the column as bigint

Hi All, I am creating a table using the Databricks SQL editor. The table definition is:
DROP TABLE IF EXISTS [database].***_test;
CREATE TABLE [database].***_jitu_test (
  id bigint
)
USING delta
LOCATION 'test/raw/***_jitu_test'
TBLPROPERTIES ('delta.minReaderVersi...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @jitendra goswami​ We haven't heard from you since the last response from @Werner Stinckens​, and I was checking back to see if their suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpf...

1 More Replies
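For reference, a minimal sketch of creating a Delta table with a bigint column and inserting into it; the database, table name and location are placeholders patterned on the post:

spark.sql("""
    CREATE TABLE IF NOT EXISTS my_db.jitu_test (
        id BIGINT
    )
    USING DELTA
    LOCATION '/mnt/test/raw/jitu_test'
""")

# Insert values cast explicitly to BIGINT.
spark.sql("INSERT INTO my_db.jitu_test VALUES (CAST(1 AS BIGINT)), (CAST(2 AS BIGINT))")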