Data Engineering

Forum Posts

by syedmuhammedmeh • New Contributor III

09-17-2022 3:05:36 AM

4325 Views
2 replies
6 kudos

Resolved! Databricks Kafka Read Not connecting

I'm trying to read data from GCP Kafka through Azure Databricks, but I'm getting the warning below and the notebook simply does not complete. Any suggestions, please? WARN NetworkClient: Consumer groupId Bootstrap broker rack disconnected. Please note I've properly c...

Latest Reply

jose_gonzalez
Databricks Employee

09-19-2022 2:45:43 PM

6 kudos

Could you share the full error stack trace from your driver's logs? This is a warning message; we need to look at the error-level messages.

1 More Replies

by antoniok • New Contributor II

09-08-2022 5:36:57 AM

4875 Views
1 replies
3 kudos

dbutils.fs.ls is giving "null uri host This can be caused by unencoded / in the password string"

I'm trying to list the number of files in an S3 bucket. I initially used "aws s3 ls <s3://>" to list the files and it worked. However, when trying to do the same using dbutils.fs.ls, I'm getting java.lang.NullPointerException: null uri host. This can be ...

Latest Reply

marcus1
New Contributor III

09-27-2022 8:53:32 AM

3 kudos

You might be encountering an issue with bucket naming, which I'm also getting with a bucket named something.[0-9]: https://issues.apache.org/jira/browse/HADOOP-17241
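
A rough way to check a bucket name up front, assuming the linked issue concerns bucket names that Java's URI parser cannot read as a hostname (in which case the host comes back null). The helper name looks_like_hostname is hypothetical, and the regular expressions only approximate the hostname grammar; they ignore the IPv4-literal case and are not a reimplementation of the Java parser.

```python
import re

# Heuristic approximation of the hostname grammar used by strict URI parsers:
# each label is alphanumeric with optional interior hyphens, and the last
# label must start with a letter. Underscores are not allowed.
_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")
_TOP_LABEL = re.compile(r"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")


def looks_like_hostname(bucket: str) -> bool:
    """Return True if the bucket name would plausibly parse as a URI host."""
    labels = bucket.split(".")
    # Every dot-separated label must be a valid domain label.
    if not all(_LABEL.match(label) for label in labels):
        return False
    # The final label additionally has to begin with a letter.
    return bool(_TOP_LABEL.match(labels[-1]))


if __name__ == "__main__":
    for name in ("my-bucket", "something.1", "my_bucket"):
        print(name, looks_like_hostname(name))
```

If the check returns False for your bucket, the naming issue described in the reply is a likely suspect; renaming or copying to a bucket with a hostname-safe name is one way to confirm.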

by Lizzz • New Contributor II

09-26-2022 3:34:44 PM

4823 Views
1 replies
3 kudos

Resolved! Forward Spark structured streaming metrics to Datadog

We have a Spark streaming application written in PySpark that we'd like to monitor with Datadog. By default, Datadog collects a couple of streaming metrics like 'spark.structured_streaming.processing_rate' and 'spark.structured_streaming.latency'. Ho...

Latest Reply

shan_chandra
Databricks Employee

09-27-2022 8:48:52 AM

3 kudos

@Liz Zhang, please refer to the documentation below, which contains a PySpark implementation of StreamingQueryListener: https://www.databricks.com/blog/2022/05/27/how-to-monitor-streaming-queries-in-pyspark.html
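
As a sketch of the forwarding half only: the function below turns one streaming progress record (a dict such as the one PySpark exposes as query.lastProgress) into DogStatsD gauge lines. The metric names and the query_name tag are hypothetical, and wiring this into a listener's progress callback, as the linked post describes, is left out so the example runs without Spark.

```python
from typing import Dict, List

# Progress fields to forward, mapped to hypothetical Datadog metric names.
FIELD_TO_METRIC = {
    "numInputRows": "custom.streaming.num_input_rows",
    "inputRowsPerSecond": "custom.streaming.input_rows_per_second",
    "processedRowsPerSecond": "custom.streaming.processed_rows_per_second",
}


def progress_to_dogstatsd(progress: Dict) -> List[str]:
    """Render one progress dict as DogStatsD gauge datagrams."""
    query_name = progress.get("name") or "unnamed"
    lines = []
    for field, metric in FIELD_TO_METRIC.items():
        value = progress.get(field)
        if value is None:
            # Skip fields that this progress record does not carry.
            continue
        # DogStatsD datagram shape: <metric>:<value>|g|#<tag>:<tag value>
        lines.append(f"{metric}:{value}|g|#query_name:{query_name}")
    return lines


if __name__ == "__main__":
    sample = {"name": "orders", "numInputRows": 120, "inputRowsPerSecond": 40.0}
    for line in progress_to_dogstatsd(sample):
        print(line)
```

Each returned line could then be sent over UDP to a local Datadog agent; the agent address and port depend on your setup and are not assumed here.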

by fhte • New Contributor

09-14-2022 12:00:58 AM

2789 Views
2 replies
0 kudos

How to install R GeoLift library on Databricks

Hi, I am having problems installing the GeoLift library. I am proceeding as per the official instructions: https://facebookincubator.github.io/GeoLift/docs/GettingStarted/InstallingR This is what I run in the notebook: 1) I install this particular vers...

Latest Reply

jose_gonzalez
Databricks Employee

09-27-2022 8:44:48 AM

0 kudos

Hi @Ludmila Kuncarova, I would like to share the following link to our docs, where you will find more details on how to install R libraries: https://docs.databricks.com/libraries/notebooks-r-libraries.html

1 More Replies

by Yuliya • New Contributor II

09-13-2022 7:29:42 PM

2948 Views
2 replies
3 kudos

Azure Databricks SQL Warehouse connection issue

When trying to start a SQL Warehouse from my Azure pay-as-you-go subscription, I'm getting an error about not enough vCPUs being provisioned. The documentation says to increase the quota in the Azure portal, but that requires knowing the type of vCPUs to provision. What type of...

Latest Reply

jose_gonzalez
Databricks Employee

09-27-2022 8:40:43 AM

3 kudos

Hi @Yuliya Quintela, just a friendly follow-up. Did Rostislaw's response help you to resolve your question? If it did, please mark it as best. Otherwise, please let us know if you still need help.

1 More Replies

by Frank • New Contributor III

09-21-2022 12:55:29 PM

14224 Views
9 replies
2 kudos

SQLAlchemy ORM Connection String Error

We tried to insert records into a Delta table using an ORM. It looks like only SQLAlchemy has an option to connect to a Delta table. We tried the following code: from sqlalchemy import Column, String, DateTime, Integer, create_engine engine = create_engine("data...

Latest Reply

Ryan_Chynoweth
Databricks Employee

09-26-2022 8:42:14 AM

2 kudos

Hi @Frank Zhang, please disregard the driver comment. The Python SQL Connector requires no driver. Just a pip install and you are good to go. The links you provided don't actually show a working example of using SQLAlchemy's ORM to connect to Data...

8 More Replies

by KrishZ • Contributor

09-13-2022 12:09:45 PM

2057 Views
2 replies
0 kudos

Where to report a bug with Databricks?

I have an issue in pyspark.pandas to report. Is there a GitHub repository or some forum where I can register my issue? Here's the issue

Latest Reply

Anonymous
Not applicable

09-27-2022 5:14:48 AM

0 kudos

Hi @Krishna Zanwar, does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

1 More Replies

by PriyaTech • New Contributor

09-26-2022 11:47:08 PM

4869 Views
1 replies
2 kudos

Resolved! Converting Dataframe into Nested xml

E.g. the dataframe has firstname, lastname, middlename, id, salary. I need to convert the dataframe into an XML file, but in nested format. Output as nested XML: <Name> <firstname> <middlename> <lastname> </Name> <id></id> <salary></salary> Anyone has ideas ho...

Latest Reply

-werners-
Esteemed Contributor III

09-27-2022 2:42:38 AM

2 kudos

Databricks has an XML connector: https://docs.databricks.com/data/data-sources/xml.html Basically you just define a df with the correct structure and write it to XML. To create a nested df, here you can find some info.
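
To make the target shape concrete, here is a minimal standard-library sketch that nests the three name fields under a Name element for a single row. It is not the Spark connector itself: in Spark the same idea would be a struct column called Name grouping those fields before writing. The row tag Employee and the helper name row_to_nested_xml are hypothetical.

```python
import xml.etree.ElementTree as ET


def row_to_nested_xml(row: dict, row_tag: str = "Employee") -> str:
    """Render one flat row as XML with the name fields nested under <Name>."""
    root = ET.Element(row_tag)
    # Group the three name columns under a single parent element.
    name = ET.SubElement(root, "Name")
    for field in ("firstname", "middlename", "lastname"):
        ET.SubElement(name, field).text = str(row[field])
    # The remaining columns stay at the top level of the row.
    for field in ("id", "salary"):
        ET.SubElement(root, field).text = str(row[field])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = {
        "firstname": "Ada",
        "middlename": "K",
        "lastname": "Lovelace",
        "id": 1,
        "salary": 5000,
    }
    print(row_to_nested_xml(sample))
```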

by LearningDatabri • Contributor II

09-22-2022 4:14:55 PM

8978 Views
8 replies
9 kudos

repos issue

Why does Repos work on one workspace and not on another? Both have Repos enabled.

Latest Reply

Prabakar
Databricks Employee

09-27-2022 1:32:54 AM

9 kudos

Do you see any errors? What is the issue that you are facing? Could you please describe the problem in more detail?

7 More Replies

by Abhijeet • New Contributor III

09-26-2022 11:29:12 PM

3336 Views
3 replies
6 kudos

Resolved! Streaming pipeline orchestration

For a batch job I can use ADF and a Databricks notebook activity to create a pipeline. Similarly, what Azure stack should I use to run a Structured Streaming Databricks notebook as a production-ready pipeline?

Latest Reply

Abhijeet
New Contributor III

09-27-2022 1:44:54 AM

6 kudos

ok Sure

2 More Replies

by Frank • New Contributor III

09-26-2022 11:15:48 PM

6477 Views
1 replies
2 kudos

Resolved! Serverless or Managed

We have about 12k writes/s and 1.5 TB/month of compressed S3 data. How can we choose between serverless and managed? And what would be a good way to project the cost? In serverless, how are the machines and hours scaled or scheduled based on the load? If there is a l...

Latest Reply

Prabakar
Databricks Employee

09-27-2022 1:21:59 AM

2 kudos

Hi @Frank Zhang, "How can we choose between Serverless vs managed? And what will be good way to project the cost?" -- Once you enable the serverless feature on your workspace, by default the new warehouse will be created with a serverless option. If yo...

by Monika8991 • New Contributor II

09-13-2022 12:18:06 AM

3823 Views
2 replies
1 kudos

Getting spark/scala versioning issues while running the spark jobs through Jar

We tried moving our Scala script from a standalone cluster to the Databricks platform. Our script is compatible with the following versions: Spark: 2.4.8, Scala: 2.11.12. The Databricks cluster has the following Spark/Scala versions: Spark: 3.2.1, Scala: 2.12. 1: we ...

Latest Reply

Anonymous
Not applicable

09-27-2022 12:29:27 AM

1 kudos

Hi @Monika Samant, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Than...

1 More Replies

by j_afanador • Contributor II

09-26-2022 12:27:50 PM

2379 Views
1 replies
2 kudos

Resolved! Badge not received for Databricks Lakehouse Fundamentals Accreditation

Hello! I cleared the assessment for the Databricks Lakehouse Fundamentals Accreditation but have not received a badge. Kindly assist me with this.

Latest Reply

Anonymous
Not applicable

09-26-2022 11:39:30 PM

2 kudos

Hi @Juan Afanador Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

by Maho • New Contributor

09-26-2022 5:12:58 AM

1949 Views
1 replies
1 kudos

Resolved! Lakehouse Fundamentals badge not received

Hi, I have finished the Lakehouse Fundamentals assessment and received my completion certificate, but so far I have not received a badge for it. Would you be able to assist, please?

Latest Reply

Anonymous
Not applicable

09-26-2022 10:01:52 PM

1 kudos

Hi @Maciej Oleksy Thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

by Trushna • New Contributor II

09-22-2022 9:43:59 AM

4546 Views
3 replies
0 kudos

How to restart Databricks Cluster at specific time?

A command is available to restart a cluster, but not at a specific time: databricks clusters restart --cluster-id <>

Latest Reply

karthik_p
Databricks Partner

09-26-2022 1:07:59 PM

0 kudos

@Trushna Khatri, adding some more information to Prabakar's reply. Can you please let me know what the actual need is for starting the cluster at a specific time? Usually, if your criterion is to use it for jobs, go with a job cluster. Here the cluster starts whenever your job ...
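
If a timed restart is still needed, one possible approach is to wrap the restart command from the question in a small script that waits until the chosen time. The sketch below only computes the delay and builds the argument list; the helper names seconds_until and restart_command are hypothetical, and actually running the command assumes the Databricks CLI is installed and configured.

```python
from datetime import datetime, timedelta
from typing import List


def seconds_until(hour: int, minute: int, now: datetime) -> float:
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def restart_command(cluster_id: str) -> List[str]:
    """Argument list for the restart command quoted in the question."""
    return ["databricks", "clusters", "restart", "--cluster-id", cluster_id]


if __name__ == "__main__":
    delay = seconds_until(2, 0, datetime.now())
    print(f"Would wait {delay:.0f} s, then run: {' '.join(restart_command('<cluster-id>'))}")
```

The delay can be passed to time.sleep and the argument list to subprocess.run; a cron entry or another scheduler that calls the same command is often the simpler choice.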

2 More Replies
