Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Devsql
by New Contributor III
  • 2555 Views
  • 3 replies
  • 2 kudos

How to verify that a given Parquet file was imported into the Bronze layer?

Hi team, we recently created a new Databricks project/solution (based on the Medallion architecture) with Bronze-Silver-Gold layer tables, and built a Delta Live Tables pipeline for the Bronze layer implementation. Source files are Parqu...

Data Engineering
Azure Databricks
Bronze Job
Delta Live Table
Delta Live Table Pipeline
Latest Reply
raphaelblg
Databricks Employee
  • 2 kudos

Hello @Devsql, it appears that you are creating DLT bronze tables using a standard spark.read operation. This may explain why the DLT table doesn't include "new files" during a REFRESH operation. For incremental ingestion of bronze layer data into y...
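The reply above points to Auto Loader for incremental bronze ingestion. A minimal sketch of what that could look like inside a Delta Live Tables pipeline, assuming a hypothetical source path and table name (this only runs inside a Databricks DLT pipeline, where `spark` and `dlt` are provided):

```python
# Hypothetical sketch: incremental bronze ingestion with Auto Loader
# inside a Delta Live Tables pipeline (not runnable outside Databricks).
import dlt
from pyspark.sql.functions import current_timestamp

@dlt.table(name="bronze_events", comment="Parquet files ingested incrementally")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")             # Auto Loader source
        .option("cloudFiles.format", "parquet")           # source file format
        .load("/mnt/landing/events/")                     # hypothetical landing path
        .withColumn("_ingested_at", current_timestamp())  # ingestion audit column
    )
```

Because this is a streaming read, a pipeline REFRESH picks up only files that arrived since the last run, which is the behavior the original question was after.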

2 More Replies
youssefmrini
by Databricks Employee
  • 1770 Views
  • 0 replies
  • 2 kudos

Delta Lake Liquid Clustering

Support for liquid clustering is now generally available in Databricks Runtime 15.2+. Getting started with Delta Lake liquid clustering: https://lnkd.in/eaCZyhbF #DeltaLake #Databricks
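As a hedged sketch of what enabling liquid clustering might look like (the table and column names here are made up, and this requires a Databricks Runtime 15.2+ cluster to run):

```python
# Hypothetical sketch: creating a Delta table with liquid clustering.
# Requires Databricks Runtime 15.2+; table/columns are illustrative only.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales (
        order_id    BIGINT,
        customer_id BIGINT,
        order_date  DATE
    )
    CLUSTER BY (customer_id, order_date)  -- liquid clustering keys
""")

spark.sql("OPTIMIZE sales")  # incrementally clusters newly written data
```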

pjv
by New Contributor III
  • 3193 Views
  • 2 replies
  • 0 kudos

Asynchronous API calls from Databricks Workflow job

Hi all, I have many API calls to run in a Python Databricks notebook, which I then run regularly as a Databricks Workflow job. When I test the following code on an all-purpose cluster interactively (i.e., not via a job), it runs perfectly fine. However, when I ...

Latest Reply
pjv
New Contributor III
  • 0 kudos

I actually got it to work, though I do see that if I run two jobs of the same code in parallel, the async execution time slows down. Does the number of workers of the cluster on which the parallel jobs are run affect the execution time of async calls of...
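Worth noting: async I/O in a notebook runs on the driver, not the workers, so worker count generally shouldn't affect it; contention on the driver (or on the remote API) is the more likely cause of the slowdown. A minimal, self-contained sketch of the pattern, with a semaphore to cap in-flight calls (the `call_api` body is a stand-in for a real HTTP request):

```python
import asyncio

async def call_api(i: int) -> int:
    """Stand-in for one API call; replace with a real aiohttp/httpx request."""
    await asyncio.sleep(0.01)  # simulated network latency
    return i * 2

async def run_all(n: int, max_concurrency: int = 10) -> list[int]:
    sem = asyncio.Semaphore(max_concurrency)  # cap concurrent in-flight calls

    async def bounded(i: int) -> int:
        async with sem:
            return await call_api(i)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(i) for i in range(n)))

results = asyncio.run(run_all(5))
print(results)  # [0, 2, 4, 6, 8]
```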

1 More Replies
Olly
by New Contributor II
  • 2246 Views
  • 2 replies
  • 1 kudos

Resolved! DBR14.3 Shared Access cluster delta.DeltaTable.toDF() issues

Having issues with the pyspark DataFrames returned by delta.DeltaTable.toDF(), in what I believe is specific to shared access clusters on DBR14.3. Recently created a near identical workflow with the only major difference being that one of the source ...

Latest Reply
Olly
New Contributor II
  • 1 kudos

That works, and as mentioned it is easy to work around; so does replacing it with df = spark.table("test") followed by df.select(df.col).

1 More Replies
standup1
by Contributor
  • 1496 Views
  • 2 replies
  • 0 kudos

How to exclude/skip a file temporarily in DLT

Hi, is there any way to temporarily exclude a file from the DLT pipeline (Auto Loader) run? What I mean is that I want to be able to exclude a specific file until I decide to include it in the load. I can't control the files or the location where they a...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Hi, I'm not aware of default Autoloader functionality that does what you're looking to do given that Autoloader is designed to incrementally ingest data as it arrives in cloud storage. Can you describe more about: "...exclude a specific file until I ...
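One workaround sometimes used for this (not a dedicated "exclude" feature) is Auto Loader's `pathGlobFilter` option, which restricts which file names are considered at all. A hedged sketch, with hypothetical paths and patterns:

```python
# Hypothetical sketch: restricting which files Auto Loader picks up via
# pathGlobFilter (requires a Spark/Databricks runtime; paths are made up).
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("pathGlobFilter", "[!_]*.csv")  # skip files prefixed with "_"
    .load("/mnt/landing/")
)
```

The caveat is that Auto Loader tracks processed files, so a file skipped by the filter may need a pattern change (or a rename) before it is later picked up.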

1 More Replies
VGS777
by New Contributor III
  • 1419 Views
  • 1 reply
  • 0 kudos

Regarding Databricks Terraform to new user

Hey folks, I am new to Terraform and Databricks. My use case: I want to create a new user (or add them to a Databricks workspace), assign a role to this user, and also assign a cluster to this new user. After 12 hours I want to delete this new user and also these...

Latest Reply
VGS777
New Contributor III
  • 0 kudos

Thanks for this information 

wyzer
by Contributor II
  • 9710 Views
  • 8 replies
  • 4 kudos

Resolved! How to pass parameters in SSRS/Power BI (Report Builder)?

Hello, in SSRS/Power BI (Report Builder), how do I query a table in Databricks with parameters? Because this code doesn't work: SELECT * FROM TempBase.Customers WHERE Name = {{ @P_Name }} Thanks.

Latest Reply
Nj11
New Contributor II
  • 4 kudos

Hi, I am not able to see the data in SSRS while I am using date parameters, but with manual dates the data populates fine. The database is pointing to Databricks. I am not sure what I am missing here. Please help me with this. Thanks. I am trying with que...
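A common cause of "manual dates work, date parameters don't" is the parameter being rendered in a format the target SQL dialect doesn't parse. As a hedged illustration (not the confirmed fix for this thread), Databricks SQL accepts ANSI date literals, which can be rendered explicitly:

```python
from datetime import date

def to_databricks_date_literal(d: date) -> str:
    """Render a Python date as an ANSI SQL date literal (DATE 'YYYY-MM-DD')."""
    return f"DATE '{d.isoformat()}'"

clause = f"WHERE order_date >= {to_databricks_date_literal(date(2024, 1, 31))}"
print(clause)  # WHERE order_date >= DATE '2024-01-31'
```

Comparing the literal the report tool actually sends against this form is a quick way to spot a format mismatch.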

7 More Replies
mh_db
by New Contributor III
  • 1824 Views
  • 2 replies
  • 0 kudos

Unable to connect to oracle server from databricks notebook in AWS

I'm trying to connect to an Oracle server hosted in Azure from an AWS Databricks notebook, but the connection keeps timing out. I tested the connection using the telnet <hostIP> 1521 command from another EC2 instance, and that seems to reach the Oracle ...
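Since telnet worked from a separate EC2 instance but not from the cluster, it helps to test the TCP path from the cluster itself. A small sketch that can be pasted into a notebook cell (the Oracle hostname below is hypothetical):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from the Databricks cluster to test its own network path
# (hypothetical hostname):
# can_reach("oracle.example.azure.com", 1521)
```

If this returns False from the cluster but the same check passes elsewhere, the issue is in the cluster's network path (security groups, NACLs, VPC peering) rather than Oracle itself.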

Data Engineering
AWS
oracle
TCP
Latest Reply
Yeshwanth
Databricks Employee
  • 0 kudos

@mh_db good day! Could you please confirm the Cluster type you used for testing? Was it a Shared Cluster, an Assigned/Single-User Cluster, or a No-Isolation cluster? Could you please try the same on the Assigned/Single User Cluster and No Isolation c...

1 More Replies
dbengineer516
by New Contributor III
  • 2004 Views
  • 3 replies
  • 1 kudos

/api/2.0/preview/sql/queries API only returning certain queries

Hello, when using /api/2.0/preview/sql/queries to list out all available queries, I noticed that certain queries were being shown while others were not. I did a small test in my home workspace, and it was able to recognize certain queries when I defin...

Latest Reply
brockb
Databricks Employee
  • 1 kudos

Hi, how many queries were returned by the API call in question? The List Queries documentation describes this endpoint as supporting pagination with a default page size of 25; is that how many you saw returned? Query parameters: page_size integer <= 10...
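The fix for a paginated endpoint is to keep requesting pages until a short page comes back. A self-contained sketch of that loop, with `fetch_page` standing in for one GET to the queries endpoint (the real call would pass `page` and `page_size` as query parameters):

```python
def list_all(fetch_page, page_size=25):
    """Collect results across all pages of a page-numbered endpoint.

    fetch_page(page, page_size) stands in for one HTTP GET to
    /api/2.0/preview/sql/queries with page and page_size query params.
    """
    results, page = [], 1
    while True:
        batch = fetch_page(page, page_size)
        results.extend(batch)
        if len(batch) < page_size:  # short (or empty) page => last page
            break
        page += 1
    return results

# Demo against a fake endpoint serving 60 items:
data = list(range(60))
fake = lambda page, size: data[(page - 1) * size : (page - 1) * size + size]
print(len(list_all(fake)))  # 60
```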

2 More Replies
shadowinc
by New Contributor III
  • 2556 Views
  • 1 reply
  • 1 kudos

spark/databricks temporary views and uuid

Hi all, we have a table with an id column generated by uuid(). For ETL we use Databricks/Spark SQL temporary views. We observed strange behavior between a Databricks SQL temp view (create or replace temporary view) and a Spark SQL temp view (df.creat...
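The likely root cause, as a hedged explanation: uuid() is non-deterministic, and a temporary view is a lazy query plan, so the function can be re-evaluated (producing different ids) on every query of the view; materializing the result freezes the values. A pure-Python analogy of that distinction:

```python
import uuid

def make_ids():
    """Lazy 'view' over uuid generation: re-evaluated on every call."""
    return [str(uuid.uuid4()) for _ in range(3)]

# Querying the lazy 'view' twice yields two different sets of ids.
assert make_ids() != make_ids()

# 'Materializing' (like writing the DataFrame out to a table) freezes them.
materialized = make_ids()
assert materialized == materialized  # stable across subsequent reads
```

In Spark terms, writing the generated ids to a table (or checkpointing the DataFrame) before downstream use avoids the regeneration.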

Data Engineering
Databricks SQL
spark sql
temporary views
uuid
Maatari
by New Contributor III
  • 1638 Views
  • 1 reply
  • 1 kudos

Resolved! DataBricks Auto loader vs input source files deletion detection

Hi, while ingesting files from a source folder continuously, I would like to be able to detect when files are deleted. As far as I can tell, Auto Loader cannot handle the detection of files deleted in the source folder. Hence the c...

Latest Reply
Yeshwanth
Databricks Employee
  • 1 kudos

@Maatari Yes, it is true that Autoloader in Databricks cannot detect the deletion of files in the source folder during continuous ingestion. The Autoloader is designed to process files exactly once unless the option "cloudFiles.allowOverwrites" is en...
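For the overwrite case the reply mentions, a sketch of where that option goes (this handles re-ingesting *modified* files only; Auto Loader still emits no events for deletions, and the path is hypothetical):

```python
# Hypothetical sketch: re-ingesting overwritten source files with Auto Loader
# (requires a Spark/Databricks runtime; does NOT detect deleted files).
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.allowOverwrites", "true")  # reprocess modified files
    .load("/mnt/landing/")
)
```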

AnkithP
by New Contributor
  • 734 Views
  • 0 replies
  • 0 kudos

Datatype changed while writing in delta format

Hello team, I'm encountering an issue with my batch processing job. Initially, I write the job in overwrite mode with overwriteSchema set to true. However, when I attempt to write the next batch in append mode, it fails due to a change in the datatyp...
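A hedged sketch of one common way to handle this (the column name and path are made up): Delta's mergeSchema option covers *added* columns on append, but a changed *datatype* generally still fails, so casting the column to the table's type before the write is the usual workaround.

```python
# Hypothetical sketch: aligning a drifted column type before appending to Delta
# (requires a Spark/Databricks runtime; names and path are illustrative).
(
    df.withColumn("amount", df["amount"].cast("double"))  # match table's dtype
    .write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")  # tolerates newly added columns
    .save("/mnt/delta/batch_table")
)
```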

chardv
by New Contributor II
  • 2035 Views
  • 2 replies
  • 0 kudos

Lakehouse Federation Multi-User Authorization

Since Lakehouse Federation uses only one credential per connection to the foreign database, all queries using the connection will see all the data that credential has access to. Would anyone know if Lakehouse Fed will support authorization using the cred...

Latest Reply
Yeshwanth
Databricks Employee
  • 0 kudos

@chardv, good day! Could you please share more details and the documentation [if you have referred any]?

1 More Replies
