Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826992666
by Valued Contributor
  • 1889 Views
  • 1 reply
  • 0 kudos

Can you use external job scheduling tools to start and schedule Databricks jobs?

I am wondering if I have to use the Databricks jobs scheduler to kick off Databricks jobs. My company already uses another job scheduler for our workflows and it would be useful to add our Databricks jobs to that flow.

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

You could use external tools to schedule jobs in Databricks. Here is a blog post explaining how Databricks can be used along with Azure Data Factory, and this blog explains how to use Airflow with Databricks. It is worth noting that a lot of Databricks's f...

  • 0 kudos
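For illustration, here is a minimal sketch of what "an external scheduler kicks off a Databricks job" can look like, using the Jobs REST API's run-now endpoint. The host, token, and job ID are placeholders, and the job is assumed to already exist in the workspace.

import requests  # third-party HTTP client

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

def run_job(job_id: int) -> int:
    """Trigger an existing Databricks job and return the new run_id."""
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"job_id": job_id},
    )
    resp.raise_for_status()
    return resp.json()["run_id"]

An external scheduler (Airflow, Azure Data Factory, or anything that can make an HTTP call) would invoke run_job and then poll /api/2.1/jobs/runs/get with the returned run_id to track completion.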
sajith_appukutt
by Honored Contributor II
  • 5218 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

You could set up dnsmasq to configure routing between your Databricks workspace and your on-premises network. More details here.

  • 1 kudos
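For concreteness, a hypothetical dnsmasq.conf fragment for the setup described above might forward lookups for an internal domain to the on-premises DNS server and send everything else to the default resolver. The domain and IP addresses below are placeholders.

# Forward internal hostnames to the on-prem DNS server
server=/corp.example.com/10.10.0.2
# Everything else goes to the cloud provider's default resolver
server=168.63.129.16

The linked write-up covers where such configuration would live in a Databricks deployment (for example, applied via a cluster init script).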
User16826992666
by Valued Contributor
  • 12354 Views
  • 1 reply
  • 1 kudos

Resolved! Can you import a Jupyter notebook to a Databricks workspace?

I'm also curious whether you can export a notebook created in Databricks as a Jupyter notebook.

Latest Reply
User16826992666
Valued Contributor
  • 1 kudos

Yes, the .ipynb format is a supported file type which can be imported to a Databricks workspace. Note that some special configurations may need to be adjusted to work in the Databricks environment. Additional accepted file formats which can be import...

  • 1 kudos
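As a sketch of the import path, the Workspace REST API accepts base64-encoded .ipynb content with format JUPYTER; the host, token, and paths below are placeholders.

import base64
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

# Read the local Jupyter notebook and base64-encode it for the API.
with open("analysis.ipynb", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "path": "/Users/me@example.com/analysis",  # target workspace path
        "format": "JUPYTER",                       # .ipynb import format
        "content": content,
    },
)
resp.raise_for_status()

Export works through the matching /api/2.0/workspace/export endpoint with the same format value.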
User16783853032
by Databricks Employee
  • 2721 Views
  • 1 reply
  • 0 kudos

Databricks notebook command gets cancelled

A notebook command generally gets cancelled when the cluster has init script or library issues while starting. The exact error can be found in the driver logs.
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Awesome knowledge!

  • 0 kudos
User16783855534
by New Contributor III
  • 6791 Views
  • 3 replies
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

The answer varies depending on the cloud provider (as of June 2021). In GCP, since the architecture is based on GKE, there are additional IP requirements. For more details see ...

  • 1 kudos
2 More Replies
Anonymous
by Not applicable
  • 1519 Views
  • 1 reply
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Full support for Databricks Runtime versions lasts for six months, with the exception of Long Term Support (LTS) versions, which Databricks supports for two years. https://docs.databricks.com/release-notes/runtime/databricks-runtime-ver.html

  • 0 kudos
User16790091296
by Contributor II
  • 1561 Views
  • 0 replies
  • 0 kudos

What is a Databricks Database? (docs.databricks.com)

A Databricks database is a collection of tables. A Databricks table is a collection of structured data. You can cache, filter, and perform any operations supported by Apache Spark DataFrames on Databricks tables. You can q...

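A minimal sketch of the operations described above, assuming a hypothetical table named events in a database named demo already exists:

from pyspark.sql import functions as F

# `spark` is predefined in Databricks notebooks.
df = spark.table("demo.events")              # a Databricks table as a DataFrame
recent = df.filter(F.col("year") >= 2021)    # ordinary DataFrame operations
recent.cache()                               # cache the filtered data
recent.groupBy("year").count().show()        # query it like any DataFrame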
saninanda
by New Contributor II
  • 16513 Views
  • 7 replies
  • 0 kudos

How to read a schema from a text file stored in cloud storage

I have a file, a.csv or a.parquet. When creating a DataFrame while reading, we can explicitly define a schema with StructType. Instead of writing the schema in the notebook, I want to create it once; say, for all my CSV files I have one schema, csv_schema, stored ...

Latest Reply
Nakeman
New Contributor II
  • 0 kudos

@shyampsr big thanks, was searching for the solution for almost 3 hours.

  • 0 kudos
6 More Replies
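One hedged sketch of what the thread is after: persist a DataFrame schema as JSON in storage once, then rebuild a StructType from it in any notebook. The paths are placeholders, and dbutils is the utility object available in Databricks notebooks.

import json
from pyspark.sql.types import StructType

# One-time step: infer a schema from a sample file and save it as JSON.
sample = (spark.read.option("header", "true")
          .option("inferSchema", "true")
          .csv("/tmp/a.csv"))                    # placeholder path
dbutils.fs.put("/schemas/csv_schema.json", sample.schema.json(), True)  # overwrite

# In any later notebook: load the JSON and reuse it for every matching CSV.
schema_json = dbutils.fs.head("/schemas/csv_schema.json")
csv_schema = StructType.fromJson(json.loads(schema_json))
df = spark.read.schema(csv_schema).csv("/tmp/another.csv")   # placeholder path

Reading with an explicit schema also skips the extra pass over the data that inferSchema requires.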
User16873043212
by New Contributor III
  • 1163 Views
  • 0 replies
  • 0 kudos

We can now launch pools on Databricks with different instance types

We can now launch pools on Databricks with different instance types. Hybrid Pools allow customers to create clusters and select different Databricks pools for the driver and the workers. This provides a way to support driver vs. worker heterogeneity, and ther...

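As an illustration, a hypothetical cluster spec using separate pools for the driver and the workers might look like the following; the pool IDs and runtime version are placeholders, and the spec would be POSTed to the Clusters API (/api/2.0/clusters/create).

cluster_spec = {
    "cluster_name": "hybrid-pool-demo",
    "spark_version": "8.3.x-scala2.12",             # placeholder runtime
    "num_workers": 4,
    "instance_pool_id": "<worker-pool-id>",         # pool the workers draw from
    "driver_instance_pool_id": "<driver-pool-id>",  # separate pool for the driver
}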
PraveenKumarB
by New Contributor
  • 9304 Views
  • 5 replies
  • 0 kudos

java.io.IOException: No FileSystem for scheme: null

Getting the error when trying to load the uploaded file in a Python notebook.

# File location and type
file_location = "//FileStore/tables/data/d1.csv"
file_type = "csv"

# CSV options
infer_schema = "true"
first_row_is_header = "false"
delimiter = ","

# The app...

Latest Reply
DivyanshuBhatia
New Contributor II
  • 0 kudos

@naughtonelad if your issue is solved, please let me know, as I am facing the same problem.

  • 0 kudos
4 More Replies
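A hedged sketch of the same load with an explicit scheme. One common cause of "No FileSystem for scheme: null" is a path like "//FileStore/..." whose leading double slash leaves the scheme empty; prefixing dbfs:/ (or using a single leading slash) avoids that. This assumes the file really is in DBFS.

file_location = "dbfs:/FileStore/tables/data/d1.csv"   # explicit dbfs scheme

df = (spark.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "false")
      .option("sep", ",")
      .load(file_location))
df.show(5)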
SatheeshSathees
by New Contributor
  • 8374 Views
  • 1 reply
  • 0 kudos

how to dynamically explode array type column in pyspark or scala

Hi, I have a parquet file with complex column types, with nested structs and arrays. I am using the script from the link below to flatten my parquet file: https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema. I am able ...

Latest Reply
shyam_9
Databricks Employee
  • 0 kudos

Hello, please check out the docs and notebook below, which have similar examples: https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema and https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/transform-comple...

  • 0 kudos
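For reference, a generic flattening sketch (not necessarily the script from the linked article): expand struct columns into top-level columns and explode array columns, repeating until no complex types remain. Column-name collisions are not handled here.

from pyspark.sql import DataFrame
from pyspark.sql.functions import col, explode_outer
from pyspark.sql.types import ArrayType, StructType

def flatten(df: DataFrame) -> DataFrame:
    """Repeatedly expand structs and explode arrays until the schema is flat."""
    complex_fields = [f for f in df.schema.fields
                      if isinstance(f.dataType, (StructType, ArrayType))]
    while complex_fields:
        field = complex_fields[0]
        if isinstance(field.dataType, StructType):
            # Promote each struct member to a top-level column.
            expanded = [col(f"{field.name}.{c}").alias(f"{field.name}_{c}")
                        for c in field.dataType.names]
            df = df.select("*", *expanded).drop(field.name)
        else:
            # One output row per array element; explode_outer keeps rows
            # whose arrays are empty or null.
            df = df.withColumn(field.name, explode_outer(col(field.name)))
        complex_fields = [f for f in df.schema.fields
                          if isinstance(f.dataType, (StructType, ArrayType))]
    return df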
DimitrisMpizos
by New Contributor
  • 46513 Views
  • 16 replies
  • 0 kudos

Exporting data from databricks

I couldn't find in the documentation a way to export an RDD as a text file to a local folder using Python. Is it possible?

Latest Reply
Manu1
New Contributor II
  • 0 kudos

To export a file to the local desktop, a workaround is to go through "Create a table in notebook" with DBFS. The steps are:
  • Click the "Data" icon
  • Click the "Add Data" button
  • Click the "DBFS" button
  • Click the "FileStore" folder icon in the first pane
  • "Sele...

  • 0 kudos
15 More Replies
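A hedged DataFrame-era sketch of the same idea: write a single CSV under /FileStore, which the workspace exposes for browser download. The table name and workspace URL are placeholders.

df = spark.table("demo.events")        # hypothetical source data
(df.coalesce(1)                        # collapse to a single output file
   .write.mode("overwrite")
   .option("header", "true")
   .csv("dbfs:/FileStore/exports/events"))

# Files under /FileStore are downloadable in a browser at a URL like:
#   https://<your-workspace>/files/exports/events/part-<...>.csv

Note that coalesce(1) pulls everything through one task, so this is only sensible for small result sets.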
SergeyIvanchuk
by New Contributor
  • 12815 Views
  • 4 replies
  • 0 kudos

Seaborn plot display in Databricks

I am using Seaborn version 0.7.1 and matplotlib version 1.5.3. The following code does not display a graph at the end. Any idea how to resolve this? (It works in the Python CLI on my local computer.)

import seaborn as sns
sns.set(style="darkgrid")
tips = sns.lo...

Latest Reply
AbbyLemon
New Contributor II
  • 0 kudos

I found that you can create a comparison plot similar to what you get from seaborn by using display(sparkdf) and adding multiple columns to the 'Values' section while creating a 'Scatter plot'. You get to 'Customize Plot' by clicking on the icon ...

  • 0 kudos
3 More Replies
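Another option, sketched below for a recent runtime: render the matplotlib figure that seaborn draws on by passing it to the notebook's display() function. The dataset is seaborn's bundled "tips" sample.

import seaborn as sns

sns.set(style="darkgrid")
tips = sns.load_dataset("tips")
ax = sns.regplot(x="total_bill", y="tip", data=tips)  # seaborn draws on matplotlib axes
display(ax.get_figure())   # display() is provided by Databricks notebooks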
bhaumikg
by New Contributor II
  • 18816 Views
  • 7 replies
  • 2 kudos

Databricks throws "SQL DW failed to execute the JDBC query produced by the connector." when pushing a column with a string longer than 255 characters

I am using Databricks to transform the data and then pushing it into the data lake. The data gets pushed in if the string field is 255 characters or shorter, but it throws the following error beyond that: "SQL DW failed to execute the JDB...

Latest Reply
bhaumikg
New Contributor II
  • 2 kudos

As suggested by ZAIvR, use append mode and provide the max string length while pushing the data. Overwrite may not work with this unless the Databricks team has fixed the issue.

  • 2 kudos
6 More Replies
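A hedged sketch of that suggestion with the Azure Synapse (SQL DW) connector, whose maxStrLength option controls how wide string columns are created on the warehouse side (the default maps strings to a 256-character type). Connection values are placeholders, and df is assumed to be the DataFrame being pushed.

(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("tempDir", "abfss://<container>@<account>.dfs.core.windows.net/tmp")
   .option("dbTable", "dbo.my_table")
   .option("maxStrLength", "4000")   # widen the string-column limit past 255
   .mode("append")                   # append, per the advice above
   .save())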