Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Kanna1706
by New Contributor III
  • 2982 Views
  • 3 replies
  • 4 kudos
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Machireddy Nikitha, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

2 More Replies
kll
by New Contributor III
  • 2755 Views
  • 1 reply
  • 0 kudos

Fatal error: The Python kernel is unresponsive when attempting to query data from AWS Redshift within Jupyter notebook

I am running a Jupyter notebook on a cluster with this configuration: 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12). Worker type: i3.xlarge, 30.5 GB memory, 4 cores. Min 2 and max 8 workers. cursor = conn.cursor()   cursor.execute( """ ...

Latest Reply
Debayan
Databricks Employee
  • 0 kudos

Hi, could you please confirm the usage of your cluster while running this job? You can monitor the performance with different metrics here: https://docs.databricks.com/clusters/clusters-manage.html#monitor-performance. Also, please tag @Debayan with...

sensanjoy
by Contributor
  • 13052 Views
  • 5 replies
  • 1 kudos

Resolved! Performance issue with pyspark udf function calling rest api

Hi all, I am facing a performance issue with a PySpark UDF that posts data to a REST API (which uses a Cosmos DB backend to store the data). Please find the details below: # The Spark DataFrame (df) contains about 30-40k records. # I am using pyt...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Sanjoy Sen, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback w...

4 More Replies
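
For the UDF-calling-REST question in this thread, one commonly suggested pattern is to send rows in batches, once per group of rows, instead of issuing one HTTP request per row. The excerpt here is truncated, so this is not necessarily the answer the thread settled on. The sketch below is plain Python and only shows the batching step; `post_in_batches`, `send_batch`, and the batch size are hypothetical names and values, and it assumes the endpoint accepts a list of records in one request.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def post_in_batches(
    rows: Iterable[dict],
    send_batch: Callable[[List[dict]], int],
    batch_size: int = 100,
) -> Iterator[int]:
    """Group rows into lists of at most batch_size and send each list once.

    send_batch is a hypothetical callable (for example a wrapper around a
    single HTTP POST on a reused session) that returns a status code.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        # Pull the next chunk of rows; an empty chunk means the input is done.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        yield send_batch(batch)
```

In Spark this could be applied per partition, for example `df.rdd.mapPartitions(lambda part: post_in_batches((r.asDict() for r in part), send_batch))`, so that each executor makes a handful of larger requests rather than tens of thousands of small ones.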
MaheshDR
by New Contributor II
  • 8614 Views
  • 6 replies
  • 1 kudos

Open firewall to Azure Databricks workspace from AWS RDS machine/EC2 machine

Hi all, as part of our solution approach, we need to connect to one of our AWS RDS Oracle databases from an Azure Databricks notebook. We need your help to understand which Azure Databricks IP range to whitelist on the AWS RDS security gro...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Mahesh D, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

5 More Replies
jakubk
by Contributor
  • 9261 Views
  • 13 replies
  • 9 kudos

dbt workflow job limitations - naming the target? where do docs go?

I'm on Unity Catalog. I'm trying to do a dbt run on a project that works locally, but the Databricks dbt workflow task seems to be ignoring the project.yml settings for schemas and catalogs, as well as those defined in the config block of individual model...

Latest Reply
Anonymous
Not applicable
  • 9 kudos

Hi @Jakub K, I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest provid...

12 More Replies
SS2
by Valued Contributor
  • 1846 Views
  • 2 replies
  • 0 kudos

How can we read data from ADLS Gen2 using a bash (%sh) command (without mounting)?

Hi @Ananth Arunachalam/Team, can we read a file from ADLS Gen2 using a shell script (%%bash or %%sh) without mounting? Please let me know. Thank you.

Latest Reply
karthik_p
Esteemed Contributor
  • 0 kudos

@S S You can access data in ADLS Gen2 in multiple ways; please check the article below. An easy way is the storage account access key method: https://learn.microsoft.com/en-us/azure/databricks/storage/azure-storage

1 More Replies
blockee
by New Contributor II
  • 6422 Views
  • 3 replies
  • 0 kudos

DE 4.1 - DLT UI Walkthrough Error in Classroom Setup

Trying to follow along with the DLT videos in the academy. I get an error when running the setup script. Error trace below. It stems from running Classroom-Setup-04.1
DA = DBAcademyHelper(course_config=course_config, lesson_config=...

Latest Reply
blockee
New Contributor II
  • 0 kudos

I tried with Py4J versions 0.10.9.5, .3, and .1. None of those versions worked. I also tried upgrading the runtime to 13.0 and 12.1 and saw the same issue. The 13.0 runtime upgraded Py4J to 0.10.9.7 and that didn't resolve the issue. The error stayed...

2 More Replies
adrin
by New Contributor III
  • 36010 Views
  • 9 replies
  • 6 kudos

Resolved! How to access the result of a %sql cell from python

I see the way to move from python to sql is to create a temp view, and then access that dataframe from sql, and in a sql cell. Now the question is, how can I have a %sql cell with a select statement in it, and assign the result of that statement to ...

Latest Reply
dogwoodlx
New Contributor II
  • 6 kudos

Results from an SQL cell are available as a Python DataFrame. The Python DataFrame name is _sqldf. To save the DataFrame, run this code in a Python cell:
df = _sqldf
Keep in mind that the value in _sqldf is held in memory and will be replaced with the m...

8 More Replies
shamly
by New Contributor III
  • 4077 Views
  • 4 replies
  • 4 kudos

Urgent - Use Python Variable in shell command in databricks notebook

I am trying to read a csv and do an activity from azure storage account using databricks shell script. I wanted to add this shell script into my big python code for other sources as well. I have created widgets for file path in python. I have created...

Latest Reply
SS2
Valued Contributor
  • 4 kudos

You can mount the storage account, then set an environment variable and perform the operation you want.

3 More Replies
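
The environment-variable idea from the reply in this thread can be sketched in plain Python: copy the current environment, add the value, and run the shell command with it. This is a minimal sketch rather than the thread's confirmed solution; the helper name `run_shell_with_var`, the variable name `FILE_PATH`, and the example path are all placeholders. In a notebook the value would presumably come from the widget, and using `subprocess` rather than a `%sh` cell is a choice made here so the example is self-contained.

```python
import os
import subprocess


def run_shell_with_var(command: str, name: str, value: str) -> str:
    """Run a shell command with one extra environment variable; return stdout."""
    env = dict(os.environ)  # copy, so the parent environment is not modified
    env[name] = value
    result = subprocess.run(
        ["sh", "-c", command],
        env=env,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout.strip()


# FILE_PATH and the path below are placeholders, not values from the thread.
output = run_shell_with_var('echo "reading $FILE_PATH"', "FILE_PATH", "/tmp/example.csv")
print(output)
```

Passing the value through the environment, rather than formatting it into the command string, avoids shell-quoting problems when the path contains spaces or special characters.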
KVNARK
by Honored Contributor II
  • 3082 Views
  • 9 replies
  • 5 kudos

It would be great if Databricks starts increasing the number of rewards, as the number of users in the community is increasing. When we want to redeem somethi...

It would be great if Databricks starts increasing the number of rewards, as the number of users in the community is increasing. When we want to redeem something, the limited goodies available in the community rewards portal are out of stock. So it's better to incr...

Latest Reply
pvignesh92
Honored Contributor
  • 5 kudos

@Kaniz Fatma @Vidula Khanna Hi. I only see the rewards below available to redeem. Does this differ by location?

8 More Replies
fuselessmatt
by Contributor
  • 5783 Views
  • 2 replies
  • 1 kudos

Can I assign a default value for a job parameter from the widget?

The Databricks widget (dbutils) provides the get function for accessing the job parameters of a job.
dbutils.widgets.get('my_param')
Unlike Python dict, where get returns None or an optional argument if the dict doesn't contain the parameter, the widg...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Mattias P, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

1 More Replies
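
One way to get dict-style default behaviour for the job-parameter question in this thread is a small wrapper around the lookup. This is a sketch under the assumption that a missing parameter makes `dbutils.widgets.get` raise; the excerpt is truncated before it says so, and the exact exception type is not given, so the handler is deliberately broad. `get_widget_or_default` is a hypothetical helper, not part of dbutils.

```python
from typing import Any


def get_widget_or_default(widgets: Any, name: str, default: str) -> str:
    """Return widgets.get(name), or default if the lookup raises."""
    try:
        return widgets.get(name)
    except Exception:
        # Assumption: an undefined widget or job parameter surfaces as an
        # exception. The concrete type is unknown here, hence the broad catch.
        return default
```

In a notebook this might be called as `get_widget_or_default(dbutils.widgets, "my_param", "fallback")`. Taking the widgets object as an argument keeps the helper testable outside Databricks.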
DBJmet
by New Contributor
  • 2038 Views
  • 2 replies
  • 0 kudos

Databricks-Connect Error occurred while running *** java.io.StreamCorruptedException: invalid type code: 00

I am using databricks-connect to access a remote cluster. Everything works as expected and I can set breakpoints and interrogate the results, same for when it tries to execute the following code: val testDF = spark.createDataFrame(spark.sparkContext .e...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @James Metcalf, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

1 More Replies
MetaRossiVinli
by Contributor
  • 6739 Views
  • 2 replies
  • 2 kudos

Resolved! DLT data quality UI was present last week. Now absent. Did I change a setting?

Last week, I started running a DLT pipeline with expectations that dropped rows on streaming live tables. In the side bar for a table, I saw a nice circular chart with Written/Dropped rows and Failed records stats. Today, I ran a similar DLT pipeline ...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Kevin Rossi: The circular chart with Written/Dropped rows and Failed records stats that you saw in the sidebar of a table in Delta Live Tables (DLT) is a built-in feature called "Data Quality Metrics" that provides a visual representation of the da...

1 More Replies
karthik_p
by Esteemed Contributor
  • 1095 Views
  • 1 reply
  • 4 kudos

Resolved! while creating serverless warehouse we are receiving below message <workspaceId> is no longer eligible for Serverless Compute. Please reach out to your administrator

Hi team, we have met all the limitations and prerequisites and are able to create warehouses in other workspaces that are part of the same account, but for one of the workspaces we are seeing the above issue. We don't see any clear error log other than <workspaceId >i...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

@karthik p Your workspace might have been blocked from the serverless feature if there are unpaid bills. If that's not the case, please file a support case with us.

