Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16765131552
by Contributor III
  • 3017 Views
  • 0 replies
  • 0 kudos

DataFrame.write to a table containing always-generated and auto-generated columns fails (SQL Server + sql-spark-connector)

A DataFrame write to a SQL Server table containing an always-generated column fails. I am using the Apache Spark Connector for SQL Server and Azure SQL. When the auto-generated fields are not included in the DataFrame, I encountered a "No key found" error. If auto-gene...

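This thread has no replies; purely for reproduction context, a write with this connector typically looks like the hedged sketch below. The DataFrame, URL, table, and credential values are placeholders, not details from the post.

    # Hypothetical reproduction of the failing write using the
    # Apache Spark Connector for SQL Server ("com.microsoft.sqlserver.jdbc.spark").
    # 'df', the URL, table name, and credentials are placeholders.
    (df.write
        .format("com.microsoft.sqlserver.jdbc.spark")
        .mode("append")
        .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<db>")
        .option("dbtable", "dbo.target_table")  # table with an always-generated column
        .option("user", "<user>")
        .option("password", "<password>")
        .save())
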
User16765131552
by Contributor III
  • 2455 Views
  • 1 reply
  • 1 kudos

Resolved! Create a new cluster in Databricks using databricks-cli

I'm trying to create a new cluster in Databricks on Azure using databricks-cli. I'm using the following command: databricks clusters create --json '{ "cluster_name": "template2", "spark_version": "4.1.x-scala2.11" }' and getting back this error: Error: ...

Latest Reply
User16765131552
Contributor III
  • 1 kudos

I found the right answer here. The correct format to run this command on Azure is: databricks clusters create --json '{ "cluster_name": "my-cluster", "spark_version": "4.1.x-scala2.11", "node_type_id": "Standard_DS3_v2", "autoscale" : { "min_workers": ...

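The reply above is truncated; a plausible completed form of the command is sketched below. The autoscale bounds are illustrative assumptions, not values recovered from the original reply.

    # Hedged completion of the truncated command; min/max worker counts are assumed.
    databricks clusters create --json '{
      "cluster_name": "my-cluster",
      "spark_version": "4.1.x-scala2.11",
      "node_type_id": "Standard_DS3_v2",
      "autoscale": { "min_workers": 1, "max_workers": 2 }
    }'
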
User15787040559
by Databricks Employee
  • 2312 Views
  • 1 reply
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

In addition to subscription limits, the total capacity of clusters in each workspace is a function of the masks used for the workspace's enclosing VNet and the pair of subnets associated with each cluster in the workspace. The masks can be changed if...

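A rough worked example of the mask arithmetic, under stated assumptions rather than details from the thread: Azure reserves five addresses in every subnet, and each cluster node consumes one IP in each of the workspace's two subnets, so the subnet mask caps the node count. A minimal Python sketch:

    # Assumption-labelled sketch: Azure reserves 5 IPs per subnet, and each
    # Databricks node takes one IP in each of the two workspace subnets.
    def max_nodes_per_subnet(mask_bits: int) -> int:
        total_addresses = 2 ** (32 - mask_bits)  # size of the subnet
        return total_addresses - 5               # minus the Azure-reserved IPs

    # A /26 subnet leaves 59 addresses, so at most 59 nodes across its clusters.
    print(max_nodes_per_subnet(26))  # 59
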
Srikanth_Gupta_
by Databricks Employee
  • 2383 Views
  • 2 replies
  • 0 kudos

What are the best instance types for Delta Lake on AWS, Azure and GCP?

Are there any recommendations on the best instance types for using Delta effectively? Example: i3.xlarge vs m5.2xlarge vs D3v2

Latest Reply
Mooune_DBU
Valued Contributor
  • 0 kudos

Depending on your queries, if you're looking for Delta Cache Optimized instances, here's the list per provider:
  • AWS: i3.* (i.e. i3.xlarge)
  • Azure: Ls-types (i.e. L4sv2)
  • GCP: n2-highmem-*

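Related context, not from the thread itself: on Delta Cache Optimized instance types the disk cache is typically on by default, and the flag can be checked or toggled per session. A minimal PySpark sketch, assuming the spark session that Databricks notebooks provide automatically:

    # Check/enable the Databricks disk (Delta) cache for this session.
    # 'spark' is the session object predefined in Databricks notebooks.
    spark.conf.set("spark.databricks.io.cache.enabled", "true")
    print(spark.conf.get("spark.databricks.io.cache.enabled"))  # -> true
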
1 More Reply
User16826994223
by Honored Contributor III
  • 4327 Views
  • 1 reply
  • 0 kudos

How to export full results in Databricks on Azure

What is the best way to see all the data? I see that display shows only up to 100,000 rows. Is there any way I can see all the data, or do I need to download or export it to a different file?

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Yes, Databricks displays only a limited portion of the DataFrame, and it allows you to download the data as a CSV. You can save the DataFrame as a table in the Databricks database with this: predictions.select("salry", "dept").write.saveAsTable("depsalry") Then you ca...

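One way to get every row out, as a sketch rather than the thread's definitive answer: write the full DataFrame to files instead of paging through display. The predictions DataFrame and column names are taken verbatim from the reply above; the DBFS output path is hypothetical.

    # Write the full result set to CSV instead of paging through display().
    # 'predictions' and the column names come verbatim from the reply above;
    # the output path is a hypothetical example.
    (predictions.select("salry", "dept")
        .coalesce(1)                        # single output file; fine for modest results
        .write.mode("overwrite")
        .option("header", "true")
        .csv("dbfs:/tmp/depsalry_export"))
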
Anonymous
by Not applicable
  • 860 Views
  • 0 replies
  • 0 kudos

Using multiple clouds

Are there recommendations and/or examples of leveraging AWS and Azure with Databricks? If so, are there any best practices to follow? We want to ensure we avoid expensive data transfer across clouds.

User16826994223
by Honored Contributor III
  • 5451 Views
  • 1 reply
  • 0 kudos

How to convert a DataFrame into JSON on Databricks?

Can I convert my JDBC DataFrame into JSON? When I tried it, I got an error. My script uses the pandas DataFrame function df.to_json().

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

df.toJSON()

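Some context on the reply, not from the thread itself: unlike pandas' to_json(), Spark's toJSON() returns a distributed collection of JSON strings, one per row. A minimal PySpark sketch with made-up data and a hypothetical output path:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])

    # toJSON() returns an RDD of JSON strings, one per row.
    for line in df.toJSON().collect():
        print(line)  # {"id":1,"val":"a"} ...

    # Alternatively, persist the whole DataFrame as JSON files (path is hypothetical).
    df.write.mode("overwrite").json("dbfs:/tmp/df_json")
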
microamp
by New Contributor II
  • 12817 Views
  • 12 replies
  • 0 kudos

Azure Data Lake Config Issue: No value for dfs.adls.oauth2.access.token.provider found in conf file.

Hi, I have files hosted on an Azure Data Lake Store which I can connect to from Azure Databricks, configured as per the instructions here. I can read JSON files fine; however, I'm getting the following error when I try to read an Avro file. spark.read.format("c...

Latest Reply
User16301467523
New Contributor II
  • 0 kudos

Taras's answer is correct. Because spark-avro is based on the RDD APIs, the properties must be set in the hadoopConfiguration options. Please note these docs for configuration using the RDD API: https://docs.azuredatabricks.net/spark/latest/data-sou...

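A minimal sketch of what the reply describes: setting the legacy ADLS Gen1 OAuth properties at the Hadoop-configuration level so RDD-based readers such as spark-avro can see them. The service-principal values are placeholders, and _jsc is an internal PySpark handle.

    # Set the legacy ADLS Gen1 OAuth properties on the Hadoop configuration so
    # RDD-based readers (e.g. spark-avro) pick them up. Values are placeholders.
    hconf = spark.sparkContext._jsc.hadoopConfiguration()
    hconf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
    hconf.set("dfs.adls.oauth2.client.id", "<application-id>")
    hconf.set("dfs.adls.oauth2.credential", "<service-credential>")
    hconf.set("dfs.adls.oauth2.refresh.url",
              "https://login.microsoftonline.com/<directory-id>/oauth2/token")
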
11 More Replies