Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

andreas9898
by New Contributor II
  • 3231 Views
  • 3 replies
  • 5 kudos

Getting error with spark-sftp, no such file

In a Databricks cluster with Scala 2.1.1 I am trying to read a file into a Spark DataFrame using the following code: val df = spark.read.format("com.springml.spark.sftp").option("host", "*").option("username", "*").option("password", "*")...

Latest Reply
Anonymous
Not applicable

Hi @Andreas P, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies
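A minimal sketch of the read the post describes, assuming the com.springml spark-sftp connector is attached to the cluster; the option names and the fileType requirement follow the connector's documented usage, and the host and credentials are placeholders:

```python
def sftp_read_options(host, username, password, file_type="csv"):
    """Options for the com.springml.spark.sftp connector (key names assumed
    from the connector's README); fileType is required by the connector."""
    return {
        "host": host,
        "username": username,
        "password": password,
        "fileType": file_type,
    }

# Usage on a cluster with the library attached (placeholders, not runnable here):
# df = (spark.read.format("com.springml.spark.sftp")
#         .options(**sftp_read_options("sftp.example.com", "user", "pw"))
#         .load("/remote/path/data.csv"))
```

Keeping the options in one helper also makes it easier to swap the hard-coded credentials for secret-scope lookups later.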
KVNARK
by Honored Contributor II
  • 5582 Views
  • 8 replies
  • 28 kudos

Resolved! Can we use Databricks for ETL and data engineering without learning PySpark in depth?

Can we use Databricks, or write code in Databricks, without learning PySpark in depth for ETL and data engineering purposes? Can someone throw some light on this? Currently learning PySpark (basics of Python for handling data) a...

Latest Reply
KVNARK
Honored Contributor II

Thanks All for your valuable suggestions!

7 More Replies
alxsbn
by Contributor
  • 932 Views
  • 0 replies
  • 2 kudos

Terraform x Databricks error INVALID_STATE subscription disabled

Hello, I just bootstrapped a new Databricks EC2 deployment on an AWS account with Terraform. Prior dependencies seem OK on my side (network, root storage, credentials configuration). I'm referring mainly to this guide and of course the pages related to each Databrick...

VN11111
by New Contributor III
  • 9644 Views
  • 5 replies
  • 6 kudos

Resolved! ERROR: Some streams terminated before this command could finish!

I have a Databricks notebook which reads a stream from Azure Event Hubs. My code does the following: 1. Configure the path for Event Hubs. 2. Read the stream: df_read_stream = (spark.readStream.format("eventhubs").options(**conf)...

Latest Reply
guru1
New Contributor II

I am also facing the same issue, using cluster 11.3 LTS (includes Apache Spark 3.3.0, Scala 2.12) and library com.microsoft.azure:azure-eventhubs-spark_2.12:2.3.21. Please help me with the same. conf = {} conf["eventhubs.connectionString"] = "Endpoint=sb://xxxx.ser...

4 More Replies
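A hedged sketch of assembling the configuration the posts above pass as **conf. The key names follow the azure-eventhubs-spark connector's documented options, and note that recent connector versions expect the connection string to be encrypted with EventHubsUtils.encrypt before use, which is a common cause of streams failing at start (an assumption worth verifying against your connector version):

```python
def eventhubs_conf(connection_string, consumer_group="$Default",
                   max_events_per_trigger=10000):
    """Build the options dict for spark.readStream.format('eventhubs').
    Key names follow the azure-eventhubs-spark connector docs (assumption)."""
    return {
        "eventhubs.connectionString": connection_string,
        "eventhubs.consumerGroup": consumer_group,
        "maxEventsPerTrigger": str(max_events_per_trigger),
    }

# Usage on a cluster with the connector attached (not runnable here);
# recent connector versions expect the connection string pre-encrypted:
# raw = "Endpoint=sb://<namespace>.servicebus.windows.net/;..."
# enc = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(raw)
# df = spark.readStream.format("eventhubs").options(**eventhubs_conf(enc)).load()
```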
huyd
by New Contributor III
  • 1400 Views
  • 0 replies
  • 4 kudos

Optimizing a batch load process, reading with the JDBC driver

I am doing a batch load from a database table using the JDBC driver. I am noticing in the Spark UI that there is both memory and disk spill, but only on one executor. I am also noticing that when trying to use the JDBC parallel read, it seems to run sl...

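The single-executor spill described above is what a non-partitioned JDBC read looks like: without partitioning options, Spark issues the whole query as one task. A hedged sketch of the four options that split the read (the column name and bounds are hypothetical placeholders to tune for the actual table):

```python
def jdbc_partition_options(partition_column, lower_bound, upper_bound,
                           num_partitions):
    """Spark JDBC partitioned-read options: Spark generates num_partitions
    WHERE clauses over [lower_bound, upper_bound] on partition_column,
    so each executor reads its own slice of the table."""
    return {
        "partitionColumn": partition_column,
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        "numPartitions": str(num_partitions),
    }

# Usage (placeholders; requires a JDBC URL and driver on the cluster):
# df = (spark.read.format("jdbc")
#         .option("url", jdbc_url)
#         .option("dbtable", "schema.my_table")
#         .options(**jdbc_partition_options("id", 0, 1_000_000, 16))
#         .load())
```

The partition column should be roughly uniformly distributed; a skewed column reproduces the one-hot-executor pattern even with many partitions.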
magnus778
by New Contributor III
  • 2236 Views
  • 2 replies
  • 4 kudos

Resolved! Error writing parquet to specific container in Azure Data Lake

I'm retrieving two files from container1, transforming them, and merging before writing to container2 within the same storage account in Azure. I'm mounting container1, unmounting, and mounting container2 before writing. My code for writing the parqu...

Latest Reply
Pat
Honored Contributor III

Hi @Magnus Asperud, 1. mounting container1; 2. you should persist the data somewhere. Creating a df doesn't mean that you are reading data from the container and have it accessible after unmounting. Make sure to store this merged data somewhere. Not sure if th...

1 More Replies
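The reply's point is that Spark reads lazily: a DataFrame defined over a mount is not materialized until an action runs, so unmounting container1 first leaves nothing to read. A plain-Python analogy (not actual Spark mount behavior, just the same lazy-evaluation pitfall with a generator):

```python
def lazy_read(rows):
    # Like spark.read on a mount: nothing is fetched until consumed.
    for r in rows:
        yield r

source = [1, 2, 3]
reader = lazy_read(source)   # "DataFrame" defined, nothing read yet
source.clear()               # like unmounting container1 before any action
print(list(reader))          # → [] -- the data was never materialized
```

The fix in Spark terms is either to keep both containers mounted for the duration of the job, or to write the merged result out (an action) before unmounting the source.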
Constantino
by New Contributor III
  • 2349 Views
  • 3 replies
  • 4 kudos

Resolved! cannot list all tokens with account admin

I'm trying to list all tokens (both user and service principal) for a given workspace. Using an account-level admin, I've tried both the CLI and the API endpoint to list tokens; however, each time only the admin's tokens are returned. I've confi...

Latest Reply
Pat
Honored Contributor III

Great that I could help

2 More Replies
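For reference, listing every user's and service principal's tokens goes through the workspace-level Token Management API, which requires a workspace admin identity; an account-level admin role alone does not surface other users' tokens. A hedged sketch using only the standard library (workspace URL and token are placeholders):

```python
import json
import urllib.request

def tokens_endpoint(workspace_url):
    """Workspace-level Token Management API path (per the Databricks REST docs)."""
    return workspace_url.rstrip("/") + "/api/2.0/token-management/tokens"

def list_all_tokens(workspace_url, admin_pat):
    # Requires a workspace-admin personal access token (assumption).
    req = urllib.request.Request(
        tokens_endpoint(workspace_url),
        headers={"Authorization": "Bearer " + admin_pat},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("token_infos", [])
```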
ckwan48
by New Contributor III
  • 4237 Views
  • 2 replies
  • 4 kudos

Date schema issues with pyspark dataframe creation

I'm having some issues with creating a DataFrame with a date column. Could I know what is wrong? from pyspark.sql import SparkSession from pyspark.sql.types import StructType from pyspark.sql.types import DateType, FloatType spark = SparkSession.bui...

Latest Reply
ckwan48
New Contributor III

Hi @Kaniz Fatma, I actually changed the date format to 'M/d/Y' and it didn't throw any errors. I found that my CSV file had dates like '3/1/2022'. Could that be the issue? But some dates were also like '12/1/2022', so I'm kind of confused.

1 More Replies
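One caution on the fix above: in Spark's datetime patterns, lowercase 'y' is the year while uppercase 'Y' is the week-based year, so 'M/d/yyyy' is the safer pattern for dates like '3/1/2022' and '12/1/2022'; single-digit months and days parse fine without zero-padding. The same parsing idea in plain Python, whose strptime is similarly lenient about padding (note Python's %Y is the plain year, unlike Spark's 'Y'):

```python
from datetime import date, datetime

# Both padded and unpadded month/day values parse with the same pattern.
d1 = datetime.strptime("3/1/2022", "%m/%d/%Y").date()
d2 = datetime.strptime("12/1/2022", "%m/%d/%Y").date()
print(d1, d2)  # → 2022-03-01 2022-12-01
```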
Thanapat_S
by Contributor
  • 23437 Views
  • 8 replies
  • 5 kudos

Resolved! Can I change the default of showing the first 1,000 rows to return all records when querying?

I have to query data for showing in my dashboard, but it truncated the results, showing only the first 1,000 rows. In the dashboard view, there is no option to re-execute with maximum result limits. I don't want to switch back to standard view and clic...

Latest Reply
Srihasa_Akepati
Databricks Employee

@Thanapat Sontayasara, a 10,000-row limit is available as an option in the notebook (which propagates to the dashboard after it's run in the notebook), while 1,000 rows is still the default. The 10,000 limit is experimental and it can be made the default depending on the num...

7 More Replies
Saikrishna2
by New Contributor III
  • 879 Views
  • 0 replies
  • 2 kudos

Does a Databricks SQL user have a limitation of 10 queries?

• Power BI is a publisher that uses AD group authentication to publish result sets. Since the publisher's credentials are maintained, the same user can access the Databricks database. • A number of users are retrieving the data from Power BI or i...

NSRBX
by Contributor
  • 3626 Views
  • 8 replies
  • 19 kudos

Databricks-connect not available on Databricks Runtime > 10.4

Hello Databricks Team, databricks-connect doesn't work on Databricks Runtime 11.3. Databricks recommends that we use dbx from Databricks Labs instead of databricks-connect. Databricks plans no new feature development for Databricks Connect at this time. D...

Latest Reply
xiangzhu
Contributor III

Thanks @Landan George, do you have any ETA for the public preview?

7 More Replies
AnubhavG
by Contributor
  • 5342 Views
  • 8 replies
  • 29 kudos

Resolved! Are Python External UDFs supported in Databricks SQL warehouse?

I tried running a Python UDF in the Databricks SQL warehouse, but it did not run and gave the "Python UDF is not supported" error. Can I get a clear picture of whether Python external UDFs are supported or not?

Latest Reply
youssefmrini
Databricks Employee

It's in private preview, and it will be supported only in the Pro SQL warehouse and the serverless SQL warehouse.

7 More Replies
mr_poola49
by New Contributor III
  • 1991 Views
  • 0 replies
  • 5 kudos

Azure Databricks Jobs Connection Timeout (Read Failed)

Azure Databricks jobs failed intermittently due to a connection timeout (Read Failed) while executing an MS SQL stored procedure in an Azure SQL database. My requirement is to process delta records (get delta records using the last refresh date) from Da...

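For intermittent timeouts like the one described, a common mitigation is to wrap the stored-procedure call in bounded retries with exponential backoff, so a transient network failure doesn't fail the whole job. A minimal sketch (the wrapped call is a placeholder; attempt counts and delays are assumptions to tune):

```python
import time

def with_retries(call, attempts=3, base_delay_s=2.0):
    """Run call(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise          # out of retries: surface the real error
            time.sleep(base_delay_s * (2 ** attempt))

# Usage (placeholder): wrap the JDBC/ODBC stored-procedure execution, e.g.
# with_retries(lambda: cursor.execute("EXEC dbo.process_delta ?", last_refresh))
```

In production you would likely narrow the except clause to the driver's timeout exception rather than retrying on every error.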
