Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

tingtingchan
by New Contributor
  • 467 Views
  • 0 replies
  • 0 kudos

Greetings for My 1st Data+AI Summit 2023!

Kudos to the amazing instructors and TAs for my first in-person Data Engineer Associate Training, and I've passed my exam! Having a fantastic time so far, and I can't wait for the content to unfold over the next two days!

Orianh
by Valued Contributor II
  • 4983 Views
  • 4 replies
  • 3 kudos

function does not exist in JVM ERROR

Hello guys, I'm building a Python package that returns one row from a DataFrame at a time inside the Databricks environment. To improve the performance of this package I used Python's multiprocessing library. I have a background process whose whole purpose is to p...

Latest Reply
dineshreddy
New Contributor II
  • 3 kudos

Using threads instead of processes solved the issue for me.

3 More Replies
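
To illustrate the reply about threads: a minimal sketch of a background thread that prefetches rows into a bounded queue while the caller consumes them one at a time. One plausible reason this helps is that a thread stays inside the driver's Python process and reuses its existing connection to the JVM, whereas a separate process does not, but that is an assumption about the original error. The names `prefetch_rows` and `iter_rows` and the sample data are hypothetical; in a notebook the row source might be something like `df.toLocalIterator()`.

```python
import queue
import threading

# Unique sentinel marking the end of the stream (rows themselves may be None).
_DONE = object()


def prefetch_rows(rows, buffer):
    """Producer: push every row into the shared buffer, then the sentinel."""
    for row in rows:
        buffer.put(row)
    buffer.put(_DONE)


def iter_rows(rows, max_buffered=100):
    """Yield rows one at a time while a background thread fills the buffer."""
    buffer = queue.Queue(maxsize=max_buffered)
    worker = threading.Thread(
        target=prefetch_rows, args=(rows, buffer), daemon=True
    )
    worker.start()
    while True:
        row = buffer.get()
        if row is _DONE:
            break
        yield row
    worker.join()


# Hypothetical stand-in for rows coming from a DataFrame.
sample_rows = [{"id": i} for i in range(5)]
result = list(iter_rows(sample_rows))
```

The bounded queue keeps memory use predictable: the producer blocks once `max_buffered` rows are waiting.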
Anonymous
by Not applicable
  • 1287 Views
  • 2 replies
  • 1 kudos

Delta Tables copying

Hello, I’m trying to copy a table with all its versions to Unity Catalog. I know I can use deep cloning, but I want the table with the full history. Is that possible?

Latest Reply
bikash84
New Contributor III
  • 1 kudos

To copy the history, you would have to copy the data files along with the Delta log folder (_delta_log) and then create a Delta table on that location.

1 More Replies
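
The approach in the reply can be sketched roughly as follows, assuming a table directory reachable through ordinary filesystem paths. The helpers `copy_delta_table_dir` and `create_table_sql`, the table name, and all paths are hypothetical. For cloud storage on Databricks, the copy step would more likely be `dbutils.fs.cp(src, dst, recurse=True)`, and the generated statement would be run with `spark.sql(...)`.

```python
import shutil
import tempfile
from pathlib import Path


def copy_delta_table_dir(src: str, dst: str) -> None:
    """Copy the data files and the _delta_log folder to a new location."""
    src_path = Path(src)
    if not (src_path / "_delta_log").is_dir():
        raise ValueError(f"{src} has no _delta_log folder")
    shutil.copytree(src_path, dst)


def create_table_sql(table_name: str, location: str) -> str:
    """Build the statement that registers a Delta table on an existing path."""
    return f"CREATE TABLE {table_name} USING DELTA LOCATION '{location}'"


# Demo with a fake table directory (placeholder files, not real Delta data).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src_table"
    (src / "_delta_log").mkdir(parents=True)
    (src / "_delta_log" / "00000000000000000000.json").write_text("{}")
    (src / "part-00000.parquet").write_bytes(b"")

    dst = Path(tmp) / "dst_table"
    copy_delta_table_dir(str(src), str(dst))
    copied = sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*"))

statement = create_table_sql("main.default.my_table", "s3://my-bucket/dst_table")
```

Because the transaction log travels with the data files, the table registered at the new location keeps the versions recorded in that log.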
Kunda
by New Contributor III
  • 773 Views
  • 1 reply
  • 0 kudos

Resolved! Welcome

Welcome!

Latest Reply
Kunda
New Contributor III
  • 0 kudos

Welcome!

Anonymous
by Not applicable
  • 919 Views
  • 2 replies
  • 0 kudos

databricks view question!

I found this phrase in the documentation: "A view stores the text for a query typically against one or more data sources or tables in the metastore." Does a "view" in Databricks store data in a physical location?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

CREATE VIEW | Databricks on AWS - Constructs a virtual table that has no physical data based on the result-set of a SQL query.

1 More Replies
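
The idea that a view stores only query text can be shown with a small analogy. This sketch uses SQLite from the Python standard library, not Databricks, so the catalog table `sqlite_master` and the sample table and view names are specific to this illustration. The general behaviour it shows (the view holds no rows of its own and reflects changes in the underlying table) matches the description quoted in the reply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 10.0)")

# The view is defined once; only its query text is stored.
conn.execute(
    "CREATE VIEW big_orders AS SELECT id, amount FROM orders WHERE amount >= 10"
)

# The stored definition is plain SQL text in SQLite's catalog.
view_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'big_orders'"
).fetchone()[0]

# A row added to the base table after the view was created...
conn.execute("INSERT INTO orders VALUES (2, 25.0)")

# ...shows up in the view, because the query runs at read time.
rows = conn.execute("SELECT id, amount FROM big_orders ORDER BY id").fetchall()
conn.close()
```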
PraveenKarnam
by New Contributor II
  • 524 Views
  • 0 replies
  • 0 kudos

Set up RBACs with hive catalog

Hello, we are not on Unity Catalog yet due to limitations in the multi-cloud implementation of UC. We still want to implement role-based access control with the Hive metastore. We are using DBR 11.3. Any pointers will be helpful.

Serhii
by Contributor
  • 1554 Views
  • 3 replies
  • 1 kudos

Could not launch jobs due to node_type_id (instance) unavailability

I am running an hourly job on a cluster using a p3.2xlarge GPU instance, but sometimes the cluster can't start due to instance unavailability. I wonder if there is any fallback mechanism to, for example, try a different instance type if one is not availabl...

Latest Reply
abagshaw
New Contributor III
  • 1 kudos

(AWS only) For anyone experiencing capacity-related cluster launch failures on non-GPU instance types, AWS Fleet instance types are now GA and available for clusters and instance pools. They help improve the chance of a successful cluster launch by allowi...

2 More Replies
Anonymous
by Not applicable
  • 951 Views
  • 1 reply
  • 0 kudos

Instance type in Photon

Can Photon run on all instance/VM types?

Latest Reply
abagshaw
New Contributor III
  • 0 kudos

No, Photon is only supported on a limited set of instance types where it's been benchmarked and tested by Databricks to have optimal performance.

JPKC
by New Contributor
  • 1505 Views
  • 3 replies
  • 1 kudos

Support for multiple EC2 instance types in a worker pool

As per this thread, Databricks now integrates with the EC2 CreateFleet API, which allows customers to create Databricks pools and get EC2 instances from multiple AZs and multiple instance families & sizes. However, in the Databricks UI you cannot select mo...

Latest Reply
abagshaw
New Contributor III
  • 1 kudos

Fleet instances on Databricks are now GA and available in all AWS workspaces. You can find more details here: https://docs.databricks.com/compute/aws-fleet-instances.html

2 More Replies
umair_hanif
by New Contributor II
  • 1844 Views
  • 2 replies
  • 1 kudos

Ingesting more than 7 million rows into a SQL Server Table

Hi All, I hope you're super well. I need your recommendations and a solution for my problem. I am using a Databricks instance DS12_v2 which has 28GB RAM and 4 cores. I am ingesting 7.2 million rows into a SQL Server table and it is taking 57 min - 1 hou...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

You can try to use BULK INSERT: https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver16
Using Data Factory instead of Databricks for the copy can also be helpful.

1 More Replies
verargulla
by New Contributor III
  • 2784 Views
  • 5 replies
  • 8 kudos

Databricks Academy content for Azure Databricks Customers

Hi! We've recently provisioned an Azure Databricks workspace and started building our pipelines. Do we qualify as Databricks 'customers' who have free access to all self-paced content on Databricks Academy? If so, how do we access it? We don't have a...

Latest Reply
fpasid
New Contributor II
  • 8 kudos

They changed the registration process and added an 'Additional Fields' section, where you can provide the company email address that you use in Azure Databricks. This worked automatically for me, and I can now access the self-paced trainings for free.

4 More Replies
chandan_a_v
by Valued Contributor
  • 6820 Views
  • 3 replies
  • 6 kudos

How to restart the Spark session within the notebook without reattaching the notebook?

Hi All, I want to run an ETL pipeline sequentially in my DB notebook. If I run it without resetting the Spark session or restarting the cluster, I get a data frame key error. I think this might be because of the Spark cache, because if I r...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Is there a solution to the above problem? I would also like to restart the SparkSession to free my cluster's resources, but when calling spark.stop() the notebook automatically detaches and the following error occurs: The spark context has stopped and the dri...

2 More Replies
Anonymous
by Not applicable
  • 1083 Views
  • 0 replies
  • 0 kudos

Optimal Azure VM type for EventHub streaming

Hello, our Spark jobs stream messages from Event Hub, transform them, and finally persist them in storage. We plan to exercise cluster configurations for these jobs in order to find the optimal one and procure Azure reservations. Furthermore, ...

Data Engineering
azure
cluster
eventhub
streaming
vm
