Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826992666
by Valued Contributor
  • 1889 Views
  • 1 reply
  • 0 kudos

Can you use external job scheduling tools to start and schedule Databricks jobs?

I am wondering if I have to use the Databricks jobs scheduler to kick off Databricks jobs. My company already uses another job scheduler for our workflows and it would be useful to add our Databricks jobs to that flow.

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

You could use external tools to schedule jobs in Databricks. Here is a blog post explaining how Databricks can be used along with Azure Data Factory, and this blog explains how to use Airflow with Databricks. It is worth noting that a lot of Databricks's f...

  • 0 kudos
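For illustration, here is a minimal sketch of what "an external scheduler kicks off a Databricks job" can look like, using the Jobs REST API's run-now endpoint. The host, token, and job ID are placeholders, and the job is assumed to already exist in the workspace.

import requests  # third-party HTTP client

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

def run_job(job_id: int) -> int:
    """Trigger an existing Databricks job and return the new run_id."""
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"job_id": job_id},
    )
    resp.raise_for_status()
    return resp.json()["run_id"]

An external scheduler (Airflow, Azure Data Factory, or anything that can make an HTTP call) would invoke run_job and then poll /api/2.1/jobs/runs/get with the returned run_id to track completion.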
sajith_appukutt
by Honored Contributor II
  • 5218 Views
  • 1 reply
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

You could set up dnsmasq to configure routing between your Databricks workspace and your on-premises network. More details here.

  • 1 kudos
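For concreteness, a hypothetical dnsmasq.conf fragment for the setup described above might forward lookups for an internal domain to the on-premises DNS server and send everything else to the default resolver. The domain and IP addresses below are placeholders.

# Forward internal hostnames to the on-prem DNS server
server=/corp.example.com/10.10.0.2
# Everything else goes to the cloud provider's default resolver
server=168.63.129.16

The linked write-up covers where such configuration would live in a Databricks deployment (for example, applied via a cluster init script).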
User16826992666
by Valued Contributor
  • 12354 Views
  • 1 reply
  • 1 kudos

Resolved! Can you import a Jupyter notebook to a Databricks workspace?

I'm also curious whether you can export a notebook created in Databricks as a Jupyter notebook.

Latest Reply
User16826992666
Valued Contributor
  • 1 kudos

Yes, the .ipynb format is a supported file type which can be imported to a Databricks workspace. Note that some special configurations may need to be adjusted to work in the Databricks environment. Additional accepted file formats which can be import...

  • 1 kudos
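As a sketch of the import path, the Workspace REST API accepts base64-encoded .ipynb content with format JUPYTER; the host, token, and paths below are placeholders.

import base64
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

# Read the local Jupyter notebook and base64-encode it for the API.
with open("analysis.ipynb", "rb") as f:
    content = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "path": "/Users/me@example.com/analysis",  # target workspace path
        "format": "JUPYTER",                       # .ipynb import format
        "content": content,
    },
)
resp.raise_for_status()

Export works through the matching /api/2.0/workspace/export endpoint with the same format value.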
User16783853032
by Databricks Employee
  • 2721 Views
  • 1 reply
  • 0 kudos

Databricks notebook command gets cancelled

A notebook command generally gets cancelled when the cluster has init script or library issues while starting. The exact error can be found in the driver logs.
Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Awesome knowledge!

  • 0 kudos
User16783855534
by New Contributor III
  • 6791 Views
  • 3 replies
  • 1 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 1 kudos

The answer varies depending on the cloud provider (as of June 2021). In GCP, since the architecture is based on GKE, there are additional IP requirements. For more details see ...

  • 1 kudos
2 More Replies
Anonymous
by Not applicable
  • 1519 Views
  • 1 reply
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Full support for Databricks Runtime versions lasts for six months, with the exception of Long Term Support (LTS) versions, which Databricks supports for two years. https://docs.databricks.com/release-notes/runtime/databricks-runtime-ver.html

  • 0 kudos
User16790091296
by Contributor II
  • 1561 Views
  • 0 replies
  • 0 kudos

What is a Databricks Database? (docs.databricks.com)

A Databricks database is a collection of tables. A Databricks table is a collection of structured data. You can cache, filter, and perform any operations supported by Apache Spark DataFrames on Databricks tables. You can q...

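A minimal sketch of the operations described above, assuming a hypothetical table named events in a database named demo already exists:

from pyspark.sql import functions as F

# `spark` is predefined in Databricks notebooks.
df = spark.table("demo.events")              # a Databricks table as a DataFrame
recent = df.filter(F.col("year") >= 2021)    # ordinary DataFrame operations
recent.cache()                               # cache the filtered data
recent.groupBy("year").count().show()        # query it like any DataFrame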
saninanda
by New Contributor II
  • 16513 Views
  • 7 replies
  • 0 kudos

How to read a schema from a text file stored in cloud storage

I have a file, a.csv or a.parquet. When creating a DataFrame while reading, we can explicitly define a schema with StructType. Instead of writing the schema in the notebook, I want to create it once; say, for all my CSV files I have one schema, csv_schema, stored ...

Latest Reply
Nakeman
New Contributor II
  • 0 kudos

@shyampsr big thanks, was searching for the solution for almost 3 hours.

  • 0 kudos
6 More Replies
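One hedged sketch of what the thread is after: persist a DataFrame schema as JSON in storage once, then rebuild a StructType from it in any notebook. The paths are placeholders, and dbutils is the utility object available in Databricks notebooks.

import json
from pyspark.sql.types import StructType

# One-time step: infer a schema from a sample file and save it as JSON.
sample = (spark.read.option("header", "true")
          .option("inferSchema", "true")
          .csv("/tmp/a.csv"))                    # placeholder path
dbutils.fs.put("/schemas/csv_schema.json", sample.schema.json(), True)  # overwrite

# In any later notebook: load the JSON and reuse it for every matching CSV.
schema_json = dbutils.fs.head("/schemas/csv_schema.json")
csv_schema = StructType.fromJson(json.loads(schema_json))
df = spark.read.schema(csv_schema).csv("/tmp/another.csv")   # placeholder path

Reading with an explicit schema also skips the extra pass over the data that inferSchema requires.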
User16873043212
by New Contributor III
  • 1163 Views
  • 0 replies
  • 0 kudos

We can now launch pools on Databricks with different instance types

We can now launch pools on Databricks with different instance types. Hybrid Pools allow customers to create clusters and select different Databricks pools for the driver and the workers. This provides a way to support driver vs. worker heterogeneity, and ther...

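As an illustration, a hypothetical cluster spec using separate pools for the driver and the workers might look like the following; the pool IDs and runtime version are placeholders, and the spec would be POSTed to the Clusters API (/api/2.0/clusters/create).

cluster_spec = {
    "cluster_name": "hybrid-pool-demo",
    "spark_version": "8.3.x-scala2.12",             # placeholder runtime
    "num_workers": 4,
    "instance_pool_id": "<worker-pool-id>",         # pool the workers draw from
    "driver_instance_pool_id": "<driver-pool-id>",  # separate pool for the driver
}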
PraveenKumarB
by New Contributor
  • 9304 Views
  • 5 replies
  • 0 kudos

java.io.IOException: No FileSystem for scheme: null

Getting the error when trying to load the uploaded file in a Python notebook.

# File location and type
file_location = "//FileStore/tables/data/d1.csv"
file_type = "csv"

# CSV options
infer_schema = "true"
first_row_is_header = "false"
delimiter = ","

# The app...

Latest Reply
DivyanshuBhatia
New Contributor II
  • 0 kudos

@naughtonelad if your issue is solved, please let me know, as I am facing the same problem.

  • 0 kudos
4 More Replies
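A hedged sketch of the same load with an explicit scheme. One common cause of "No FileSystem for scheme: null" is a path like "//FileStore/..." whose leading double slash leaves the scheme empty; prefixing dbfs:/ (or using a single leading slash) avoids that. This assumes the file really is in DBFS.

file_location = "dbfs:/FileStore/tables/data/d1.csv"   # explicit dbfs scheme

df = (spark.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "false")
      .option("sep", ",")
      .load(file_location))
df.show(5)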
SatheeshSathees
by New Contributor
  • 8374 Views
  • 1 reply
  • 0 kudos

how to dynamically explode array type column in pyspark or scala

Hi, I have a parquet file with complex column types, with nested structs and arrays. I am using the script from the link below to flatten my parquet file: https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema. I am able ...

Latest Reply
shyam_9
Databricks Employee
  • 0 kudos

Hello, please check out the docs and notebook below, which have similar examples: https://docs.microsoft.com/en-us/azure/synapse-analytics/how-to-analyze-complex-schema and https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/transform-comple...

  • 0 kudos
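For reference, a generic flattening sketch (not necessarily the script from the linked article): expand struct columns into top-level columns and explode array columns, repeating until no complex types remain. Column-name collisions are not handled here.

from pyspark.sql import DataFrame
from pyspark.sql.functions import col, explode_outer
from pyspark.sql.types import ArrayType, StructType

def flatten(df: DataFrame) -> DataFrame:
    """Repeatedly expand structs and explode arrays until the schema is flat."""
    complex_fields = [f for f in df.schema.fields
                      if isinstance(f.dataType, (StructType, ArrayType))]
    while complex_fields:
        field = complex_fields[0]
        if isinstance(field.dataType, StructType):
            # Promote each struct member to a top-level column.
            expanded = [col(f"{field.name}.{c}").alias(f"{field.name}_{c}")
                        for c in field.dataType.names]
            df = df.select("*", *expanded).drop(field.name)
        else:
            # One output row per array element; explode_outer keeps rows
            # whose arrays are empty or null.
            df = df.withColumn(field.name, explode_outer(col(field.name)))
        complex_fields = [f for f in df.schema.fields
                          if isinstance(f.dataType, (StructType, ArrayType))]
    return df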
DimitrisMpizos
by New Contributor
  • 46513 Views
  • 16 replies
  • 0 kudos

Exporting data from databricks

I couldn't find in the documentation a way to export an RDD as a text file to a local folder using Python. Is it possible?

Latest Reply
Manu1
New Contributor II
  • 0 kudos

To export a file to the local desktop, a workaround is to go through "Create a table in notebook" with DBFS. The steps are:
  • Click the "Data" icon
  • Click the "Add Data" button
  • Click the "DBFS" button
  • Click the "FileStore" folder icon in the first pane
  • "Sele...

  • 0 kudos
15 More Replies
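A hedged DataFrame-era sketch of the same idea: write a single CSV under /FileStore, which the workspace exposes for browser download. The table name and workspace URL are placeholders.

df = spark.table("demo.events")        # hypothetical source data
(df.coalesce(1)                        # collapse to a single output file
   .write.mode("overwrite")
   .option("header", "true")
   .csv("dbfs:/FileStore/exports/events"))

# Files under /FileStore are downloadable in a browser at a URL like:
#   https://<your-workspace>/files/exports/events/part-<...>.csv

Note that coalesce(1) pulls everything through one task, so this is only sensible for small result sets.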
SergeyIvanchuk
by New Contributor
  • 12815 Views
  • 4 replies
  • 0 kudos

Seaborn plot display in Databricks

I am using Seaborn version 0.7.1 and matplotlib version 1.5.3. The following code does not display a graph at the end. Any idea how to resolve this? (It works in the Python CLI on my local computer.)

import seaborn as sns
sns.set(style="darkgrid")
tips = sns.lo...

Latest Reply
AbbyLemon
New Contributor II
  • 0 kudos

I found that you can create a comparison plot similar to what you get from seaborn by using display(sparkdf) and adding multiple columns to the 'Values' section while creating a 'Scatter plot'. You get to 'Customize Plot' by clicking on the icon ...

  • 0 kudos
3 More Replies
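Another option, sketched below for a recent runtime: render the matplotlib figure that seaborn draws on by passing it to the notebook's display() function. The dataset is seaborn's bundled "tips" sample.

import seaborn as sns

sns.set(style="darkgrid")
tips = sns.load_dataset("tips")
ax = sns.regplot(x="total_bill", y="tip", data=tips)  # seaborn draws on matplotlib axes
display(ax.get_figure())   # display() is provided by Databricks notebooks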
bhaumikg
by New Contributor II
  • 18816 Views
  • 7 replies
  • 2 kudos

Databricks throws "SQL DW failed to execute the JDBC query produced by the connector." when pushing a column with a string longer than 255 characters

I am using Databricks to transform the data and then pushing it into the data lake. The data gets pushed in if the string field is 255 characters or shorter, but it throws the following error beyond that: "SQL DW failed to execute the JDB...

Latest Reply
bhaumikg
New Contributor II
  • 2 kudos

As suggested by ZAIvR, use append mode and provide the max string length while pushing the data. Overwrite may not work with this unless the Databricks team has fixed the issue.

  • 2 kudos
6 More Replies
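A hedged sketch of that suggestion with the Azure Synapse (SQL DW) connector, whose maxStrLength option controls how wide string columns are created on the warehouse side (the default maps strings to a 256-character type). Connection values are placeholders, and df is assumed to be the DataFrame being pushed.

(df.write
   .format("com.databricks.spark.sqldw")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
   .option("forwardSparkAzureStorageCredentials", "true")
   .option("tempDir", "abfss://<container>@<account>.dfs.core.windows.net/tmp")
   .option("dbTable", "dbo.my_table")
   .option("maxStrLength", "4000")   # widen the string-column limit past 255
   .mode("append")                   # append, per the advice above
   .save())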