Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

YOUKE
by New Contributor II
  • 261 Views
  • 4 replies
  • 1 kudos

Resolved! Managed Tables on Azure Databricks

Hi everyone, I was trying to understand: when a managed table is created, Databricks stores the metadata in the Hive metastore and the data in cloud storage that it manages, which in the case of Azure Databricks will be an Azure Storage Account. But...
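For anyone who wants to verify this directly, here is a minimal sketch (the schema and table names are placeholders, not from the post): create a managed table and ask the metastore where its data lives.

# Minimal sketch: create a managed table and inspect where its data is stored.
# Names below are placeholders, not from the original question.
spark.sql("CREATE TABLE IF NOT EXISTS demo_db.managed_example (id INT, name STRING)")

# The 'Location' row of DESCRIBE EXTENDED points at storage the metastore manages;
# on Azure Databricks this resolves to a path in an Azure Storage Account.
spark.sql("DESCRIBE TABLE EXTENDED demo_db.managed_example").show(truncate=False)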

Latest Reply
BraydenJordan
New Contributor II
  • 1 kudos

Thank you so much for the solution.

3 More Replies
Gusman
by New Contributor II
  • 141 Views
  • 1 reply
  • 1 kudos

Resolved! How to send BINARY parameters using the REST SQL API?

We are trying to send a SQL query to the REST API including a BINARY parameter, e.g. "INSERT INTO MyTable (BinaryField) VALUES(:binaryData)". We tried to encode the parameter as base64 and specify that it is a BINARY type, but it throws a mapping error; if w...

Latest Reply
cgrant
Databricks Employee
  • 1 kudos

Trying to serialize as binary can be pretty challenging; here's a way to do this with base64. The trick is to serialize as a base64 string and insert as binary with unbase64: databricks api post /api/2.0/sql/statements --json '{ "warehouse_id": "wa...
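Spelled out as a fuller request for illustration (a sketch only; the workspace host, token, warehouse ID, and table/column names are placeholder assumptions, not taken from the thread):

# Sketch: send the bytes as a base64 STRING parameter and let unbase64() turn it
# into BINARY on the server side. All identifiers below are placeholders.
import base64
import requests

payload = {
    "warehouse_id": "<warehouse-id>",
    "statement": "INSERT INTO MyTable (BinaryField) VALUES (unbase64(:binaryData))",
    "parameters": [
        {
            "name": "binaryData",
            "type": "STRING",
            "value": base64.b64encode(b"\x01\x02\x03").decode("ascii"),
        }
    ],
}

resp = requests.post(
    "https://<workspace-host>/api/2.0/sql/statements",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
)
print(resp.json().get("status"))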

AcrobaticMonkey
by New Contributor II
  • 213 Views
  • 2 replies
  • 0 kudos

Alerts for Failed Queries in Databricks

How can we set up automated alerts to notify us when queries executed by a specific service principal fail in Databricks?

Latest Reply
AcrobaticMonkey
New Contributor II
  • 0 kudos

@Alberto_Umana Our service principal uses the SQL Statement API to execute queries. We want to receive notifications for each query failure. While SQL Alerts are an option, they do not provide immediate responses. Is there a better solution to achieve...
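One possible pattern, offered only as a sketch (the host, token, and webhook URL are placeholders): since the queries are submitted through the Statement Execution API, the submitting code can inspect the returned status and push an immediate notification when a statement fails.

# Sketch: submit a statement, inspect its status, and alert on failure.
import requests

def run_and_alert(statement, warehouse_id, host, token, webhook_url):
    resp = requests.post(
        f"https://{host}/api/2.0/sql/statements",
        headers={"Authorization": f"Bearer {token}"},
        json={"warehouse_id": warehouse_id, "statement": statement, "wait_timeout": "30s"},
    ).json()
    status = resp.get("status", {})
    if status.get("state") == "FAILED":
        # Forward the failure immediately to any alerting endpoint (Slack, Teams, PagerDuty, ...).
        message = status.get("error", {}).get("message", "unknown error")
        requests.post(webhook_url, json={"text": f"Databricks query failed: {message}"})
    return resp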

1 More Reply
seanstachff
by New Contributor II
  • 242 Views
  • 5 replies
  • 0 kudos

Using FROM_CSV giving unexpected results

Hello, I am trying to use from_csv in the SQL warehouse, but I am getting unexpected results. As a small example I am running: WITH your_table AS ( SELECT 'a,b,c\n1,"hello, world",3.14\n2,"goodbye, world",2.71' AS csv_column ) SELECT from_csv(csv_c...

Latest Reply
TakuyaOmi
Contributor III
  • 0 kudos

@seanstachff Here is the code I used to produce the results shown in the image I shared earlier. It's a bit verbose, so I’m not entirely satisfied with it, but I hope it might provide some helpful insights for you. %sql WITH your_table AS ( -- Examp...
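For context, a sketch of the general pattern (the schema string and sample values are illustrative): from_csv parses one CSV record per input string, so a multi-line string has to be split into rows first and then parsed against an explicit schema.

# Sketch: split the multi-line string into rows, drop the header, then parse
# each row with from_csv against an explicit schema.
df = spark.sql("""
    WITH your_table AS (
      SELECT 'a,b,c\\n1,"hello, world",3.14\\n2,"goodbye, world",2.71' AS csv_column
    ),
    lines AS (
      -- split on newlines and drop the header row; 100 is just an upper bound for this small example
      SELECT explode(slice(split(csv_column, '\\n'), 2, 100)) AS line
      FROM your_table
    )
    SELECT t.parsed.a, t.parsed.b, t.parsed.c
    FROM (SELECT from_csv(line, 'a INT, b STRING, c DOUBLE') AS parsed FROM lines) t
""")
df.show(truncate=False)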

4 More Replies
LorenRD
by Contributor
  • 8908 Views
  • 10 replies
  • 9 kudos
Latest Reply
miranda_luna_db
Databricks Employee
  • 9 kudos

Hi Mike - We're working on a capability that will allow auth to be delegated to the app. Happy to set up some time to share plans/get feedback if of interest. If you reach out to your account team they can help make it happen!

9 More Replies
martkev
by New Contributor
  • 260 Views
  • 1 reply
  • 0 kudos

Networking Setup in Standard Tier – VNet Integration and Proxy Issues

Hi everyone, we are working on an order forecasting model using Azure Databricks and an ML model from Hugging Face, and we are running into an issue where the connection over SSL (port 443) fails during the handshake (EOF Error SSL 992). We suspect that a...

Latest Reply
arjun_kr
Databricks Employee
  • 0 kudos

It may depend on your UDR setup. If you have a UDR rule routing the traffic to a firewall appliance, the issue may be that the traffic is not being allowed through the firewall. If there is no UDR, or the UDR rule routes this traffic to the Internet, it wou...

Anonymous
by Not applicable
  • 13879 Views
  • 8 replies
  • 13 kudos

Resolved! MetadataChangedException

A Delta Lake table is created with an identity column, and I'm not able to load the data in parallel from four processes; I'm getting the metadata exception error. I don't want to load the data into a temp table. I need to load directly and in parallel into the Delta...

Latest Reply
cpc0707
New Contributor II
  • 13 kudos

I'm having the same issue. I need to load a large amount of data from separate files into a Delta table, and I want to do it with a for-each loop so I don't have to run it sequentially, which would take days. There should be a way to handle this.
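For what it's worth, a sketch of one commonly suggested workaround (the function, table name, and backoff values are illustrative assumptions, not from the thread): because concurrent appends to a table with an identity column can conflict on the table metadata, each writer retries with a short backoff instead of failing outright.

# Sketch: retry an append when a Delta concurrency/metadata conflict is reported.
import random
import time

def append_with_retry(df, table_name, max_attempts=5):
    for attempt in range(1, max_attempts + 1):
        try:
            df.write.format("delta").mode("append").saveAsTable(table_name)
            return
        except Exception as e:
            msg = str(e)
            retryable = "MetadataChangedException" in msg or "Concurrent" in msg
            if not retryable or attempt == max_attempts:
                raise
            # Back off a little longer on each attempt before retrying the append.
            time.sleep(random.uniform(1, 5) * attempt)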

7 More Replies
Ulman
by New Contributor II
  • 2843 Views
  • 9 replies
  • 1 kudos

Switching to File Notification Mode with ADLS Gen2 - Encountering StorageException

Hello, we are currently using Auto Loader in file listing mode for a stream, which is experiencing significant latency due to the non-incremental naming of files in the directory—a condition that cannot be altered. In an effort to mitigate this...

Data Engineering
ADLS gen2
autoloader
file notification mode
Latest Reply
Rah_Cencora
New Contributor II
  • 1 kudos

You should also reevaluate your use of premium storage for your landing area files. Typically, storage for raw files does not need to be the fastest, most resilient, and most expensive tier. Unless you have a compelling reason for premium storage for la...
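For reference, switching Auto Loader from directory listing to file notification mode comes down to a reader option; a minimal sketch follows (the paths, file format, checkpoint location, and table name are placeholders, and the workspace needs permission to create the Event Grid subscription and queue that notification mode relies on).

# Sketch: Auto Loader in file notification mode instead of directory listing.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")  # file notification mode
    .load("abfss://<container>@<account>.dfs.core.windows.net/landing/")
)

(stream.writeStream
    .option("checkpointLocation", "abfss://<container>@<account>.dfs.core.windows.net/_checkpoints/landing")
    .toTable("bronze.landing_events"))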

8 More Replies
vanverne
by New Contributor II
  • 234 Views
  • 2 replies
  • 1 kudos

Assistance with Capturing Auto-Generated IDs in Databricks SQL

Hello, I am currently working on a project where I need to insert multiple rows into a table and capture the auto-generated IDs for each row. I am using the Databricks SQL Connector. Here is a simplified version of my current workflow: I create a temporary...
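As a point of comparison, one workaround that is sometimes used, offered only as a sketch (the table, columns, and connection details are placeholders and not from this thread): tag each inserted batch with a client-generated key, then select the auto-generated identity values back for that key.

# Sketch: insert rows tagged with a batch key, then read the generated IDs back.
import uuid
from databricks import sql

batch_key = str(uuid.uuid4())
names = ["alice", "bob"]

with sql.connect(server_hostname="<host>", http_path="<http-path>", access_token="<token>") as conn:
    with conn.cursor() as cur:
        for name in names:
            cur.execute(
                "INSERT INTO demo.people (name, batch_key) VALUES (:name, :batch_key)",
                {"name": name, "batch_key": batch_key},
            )
        cur.execute(
            "SELECT id, name FROM demo.people WHERE batch_key = :batch_key",
            {"batch_key": batch_key},
        )
        generated_ids = cur.fetchall()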

Latest Reply
vanverne
New Contributor II
  • 1 kudos

Thanks for the reply, Alfonso. I noticed you mentioned "Below are a few alternatives...", however, I am not seeing those. Please let me know if I am missing something. Also, do you know if Databricks is working on supporting the RETURNING clause soon...

1 More Reply
jeremy98
by New Contributor III
  • 211 Views
  • 5 replies
  • 0 kudos

Move Databricks service to another resource group

Hello, is it possible to move the Databricks service to another resource group without any problems? I have a resource group where there are two workspaces, the prod and staging environments; I created another resource group to maintain only the Databrick...

Latest Reply
jeremy98
New Contributor III
  • 0 kudos

Hi @szymon_dybczak, Yes, the settings and creation are handled in Terraform. We are using UC while leaving Databricks to manage the UC metastore, so the tables are managed. (Is this something we need to handle on our side?) For table creation, I’ve set ...

4 More Replies
angelop
by New Contributor
  • 84 Views
  • 1 reply
  • 0 kudos

Databricks Clean Rooms creation

I am trying to create a Databricks Clean Rooms instance, and I have been following the video from the Databricks YouTube channel. As I only have one workspace, to create a clean room I have added my own Clean Room sharing identifier; when I do that I get the ...

Latest Reply
TakuyaOmi
Contributor III
  • 0 kudos

@angelop I tried it as well and encountered the same error. A new collaborator needs to be set up. If that’s not feasible, it would be advisable to reach out to Databricks support. By the way, the following video provides a more detailed explanation a...

The_Demigorgan
by New Contributor
  • 1365 Views
  • 1 reply
  • 0 kudos

Autoloader issue

I'm trying to ingest data from Parquet files using Auto Loader. I have my own custom schema, and I don't want to infer the schema from the Parquet files. During readStream everything is fine, but during writeStream it is somehow inferring the schema from...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

In this case, please make sure you specify the schema explicitly when reading the Parquet files and do not specify any inference options. Something like spark.readStream.format("cloudFiles").schema(schema)... If you want to more easily grab the schem...
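Filling that out a little (a sketch only; the schema, paths, checkpoint location, and table name are placeholders, not from the thread):

# Sketch: Auto Loader over Parquet with an explicit schema and no inference options.
from pyspark.sql.types import IntegerType, StringType, StructField, StructType, TimestampType

schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
    StructField("updated_at", TimestampType()),
])

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .schema(schema)  # explicit schema; no cloudFiles.inferColumnTypes / schemaHints
    .load("abfss://<container>@<account>.dfs.core.windows.net/raw/parquet/")
)

(df.writeStream
    .option("checkpointLocation", "abfss://<container>@<account>.dfs.core.windows.net/_checkpoints/parquet_ingest")
    .toTable("bronze.parquet_ingest"))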

vmpmreistad
by New Contributor II
  • 4148 Views
  • 4 replies
  • 0 kudos

How to make structured streaming with autoloader efficiently and incrementally read files

TL;DR: How do I make a structured streaming job using Auto Loader read files using InMemoryFileIndex instead of DeltaFileOperations? I'm running a structured streaming job from an external storage account (ADLS Gen2, abfss://) which has Avro fil...

Latest Reply
cgrant
Databricks Employee
  • 0 kudos

@Wundermobility The best way to debug this is to look at the Spark UI to see if a job has been launched. One thing to call out is that trigger.Once is deprecated; we recommend using trigger.availableNow instead to avoid overwhelming the cluster.
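For reference, a minimal sketch of the recommended trigger (the source path, file format, checkpoint location, and table name are placeholders):

# Sketch: process the backlog with availableNow instead of the deprecated once trigger.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "avro")
    .load("abfss://<container>@<account>.dfs.core.windows.net/events/")
)

(df.writeStream
    .trigger(availableNow=True)  # replaces .trigger(once=True)
    .option("checkpointLocation", "abfss://<container>@<account>.dfs.core.windows.net/_checkpoints/events")
    .toTable("bronze.events"))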

3 More Replies
jonathan-dufaul
by Valued Contributor
  • 104 Views
  • 1 reply
  • 0 kudos

Where to report a bug in the SQL formatter?

I was wondering where I go to report a bug in the SQL formatter. I tried sending an email to the helpdesk, but they think I'm asking for support. I'm not. I just want to report a bug in the application because I think they should know about it. I don'...

Latest Reply
TakuyaOmi
Contributor III
  • 0 kudos

Hi @jonathan-dufaul, for such reports, I think it would be appropriate to click on your profile icon in the top-right corner of the workspace and use the "Send Feedback" option.

jonathan-dufaul
by Valued Contributor
  • 1609 Views
  • 2 replies
  • 1 kudos

How do I specify column types when writing to an MSSQL server using the JDBC driver (

I have a PySpark DataFrame that I'm writing to an on-prem MSSQL server--it's a stopgap while we convert data warehousing jobs over to Databricks. The processes that use those tables on the on-prem server rely on the tables maintaining the identical s...
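One option worth knowing about, offered as a sketch rather than the thread's answer (the connection details, table, and column types below are placeholders): Spark's JDBC writer has a createTableColumnTypes option that controls the database column types used when Spark creates the target table.

# Sketch: pin the SQL Server column types used when Spark (re)creates the table.
# df is assumed to be the PySpark DataFrame being written.
(df.write
    .format("jdbc")
    .option("url", "jdbc:sqlserver://<host>:1433;databaseName=<database>")
    .option("dbtable", "dbo.target_table")
    .option("user", "<user>")
    .option("password", "<password>")
    .option("createTableColumnTypes", "customer_id BIGINT, customer_name VARCHAR(128)")
    .mode("overwrite")
    .save())

Note that this option only takes effect when Spark creates the table (for example, in overwrite mode); it does not alter an existing table.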

Latest Reply
dasanro
New Contributor II
  • 1 kudos

It's happening to me too! Did you find any solution, @jonathan-dufaul? Thanks!!

1 More Reply
