Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

DoD
by New Contributor III
  • 3273 Views
  • 2 replies
  • 1 kudos

Resolved! Why are R scripts inside of Databricks notebooks creating writeLines errors?

I recently posted this on Stack Overflow. I'm using R in Databricks. RStudio runs fine and executes from the Databricks cluster. I would like to transition from RStudio to notebooks. When I start the cluster, R seems to run fine from notebooks. ...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Paul Evangelista - Thank you for letting us know. You did great! Would you be happy to mark your answer as best so that others can find your solution more easily?

1 More Replies
wyzer
by Contributor II
  • 10618 Views
  • 2 replies
  • 2 kudos

Why are database/table names in lower case?

Hello, when I run this code: CREATE DATABASE BackOffice, I see the database listed as: backoffice. Why is everything in lower case? Is it possible to configure Databricks to keep the original name? Thanks.

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

Names are managed by the Hive metastore. Because the metastore can be backed by different databases, some case-sensitive and some not, lowercasing is the safer choice (you can easily test this with standard WHERE syntax). You could probably change it with some Hive settings but i...

1 More Replies
hetadesai
by New Contributor II
  • 7709 Views
  • 1 replies
  • 3 kudos

How to download a zip file from an SFTP location, put it into Azure Data Lake, and unzip it there?

I have a zip file in an SFTP location. I want to copy that file from the SFTP location into Azure Data Lake and unzip it there using a Spark notebook. Please help me solve this.

Latest Reply
Hubert-Dudek
Databricks MVP
  • 3 kudos

I would go with @Kaniz Fatma's approach: download the data with Data Factory and, once the download succeeds, trigger the Databricks Spark notebook. Spark can also read compressed data, so you may not even need a separate unzip step.
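
If a separate unzip step does turn out to be necessary, it could look roughly like the sketch below. As far as I know, Spark's built-in readers decompress codecs such as gzip and bzip2 transparently, while a .zip archive generally has to be extracted first. The function name `unzip_to` and both paths are hypothetical; the sketch assumes the archive and the target directory are reachable through the local file API (for example, mounted storage).

```python
import zipfile
from pathlib import Path
from typing import List


def unzip_to(zip_path: str, dest_dir: str) -> List[str]:
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    # Create the target directory (and any missing parents) if needed.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()
```

A call such as `unzip_to("/path/to/landing/file.zip", "/path/to/landing/unzipped")` (both paths are placeholders) would leave the extracted files where a Spark reader can pick them up.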

BorislavBlagoev
by Databricks Partner
  • 34828 Views
  • 16 replies
  • 10 kudos

Resolved! Error in databricks-sql-connector

from databricks import sql
hostname = '<name>.databricks.com'
http_path = '/sql/1.0/endpoints/<endpoint_id>'
access_token = '<personal_token>'
connection = sql.connect(server_hostname=hostname, http_path=http_path, access_token=access_token)
cu...

Latest Reply
NiallEgan__Data
Databricks Employee
  • 10 kudos

Hi @Borislav Blagoev, thanks very much for taking the time to collect these logs. The problem here (as indicated by the `IpAclValidation` message) is that IP allow listing (enabled for your workspace) will not allow arbitrary connections from Spark c...

15 More Replies
SajiD
by Databricks Partner
  • 1967 Views
  • 0 replies
  • 0 kudos

Snowflake Connector for Databricks

Hi everyone, I am working with Databricks notebooks and I am facing an issue with the Snowflake connector: I want to run DDL/DML statements through it. Can someone please help me out with this? Thanks in advance!

Olli
by Databricks Partner
  • 5669 Views
  • 3 replies
  • 0 kudos

Resolved! Autoloader streams fail, unable to locate checkpoint/metadata or metadata/rocksdb/SSTs/sst files, after interruption from cluster termination

I have a pipeline with 20+ streams running based on Autoloader. The pipeline crashed, and after the crash I'm unable to start the streams; they fail with one of the following messages: 1) The metadata file in the streaming source checkpoint direct...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Olli Tiihonen - Thanks for letting us know. I'm glad you were able to get to the bottom of things.

2 More Replies
NextIT
by New Contributor
  • 1235 Views
  • 0 replies
  • 0 kudos

www.nextitvision.com

Online IT Training: ERP/SAP Online Training | JAVA Online Training | C++Online Training | ORACLE Online Training | Online Python Training | Machine Learning Training. If you Need more Details and Information Regarding IT Online Training. Please Visi...

sh_abrishami_ie
by New Contributor II
  • 5732 Views
  • 1 replies
  • 3 kudos

Resolved! Driver is up but is not responsive, likely due to GC.

Hi, I have a problem writing an Excel file to a mounted path. After 10 minutes I see "Driver is up but is not responsive, likely due to GC" in the event log. I'm using the following script: df.repartition(1).write .format("com.crealytics.spark....

Latest Reply
Hubert-Dudek
Databricks MVP
  • 3 kudos

This is not a solution to that problem, but I recommend handling Excel reads and writes with Koalas: https://koalas.readthedocs.io/en/latest/reference/api/databricks.koalas.DataFrame.to_excel.html. Give it a try; maybe it will solve your issue.

Robbie
by New Contributor III
  • 4081 Views
  • 1 replies
  • 2 kudos

How can I avoid this 'java.sql.SQLException: Too many connections' error?

I'm having difficulty with a job (parent) that triggers multiple parallel runs of another job (child) in batches (e.g. 10 parallel runs per batch). Occasionally some of the parallel "child" jobs will crash a few minutes in, either during or immediate...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

It is a MariaDB JDBC error, so the database you are trying to connect to probably cannot handle this number of concurrent connections (alternatively, if you are not connecting to a MariaDB database, MariaDB is also used for the Hive metastore; in your case maria...
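
If the connection limit is indeed the bottleneck, one common mitigation is to cap how many child runs are in flight at once. The following is a minimal sketch of that idea, not a fix confirmed in the thread. `run_child` is a hypothetical stand-in for whatever triggers one child job run (it is not a Databricks API), and `max_parallel` is an illustrative parameter to tune against the database's limit.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_children(run_child: Callable[[int], str],
                 child_ids: Iterable[int],
                 max_parallel: int = 4) -> List[str]:
    """Run run_child for each id with at most max_parallel runs in flight."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() returns results in input order; a child's exception is
        # re-raised here when its result is consumed.
        return list(pool.map(run_child, child_ids))
```

Lowering `max_parallel` trades total wall-clock time for fewer simultaneous connections, which may be enough to stay under the limit.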

bchaubey
by Contributor II
  • 1894 Views
  • 1 replies
  • 1 kudos

Azure Databricks Certification

@Hubert Dudek, what is the name of the Azure Databricks certification?

Latest Reply
Hubert-Dudek
Databricks MVP
  • 1 kudos

Hi @Bhagwan Chaubey, there is a Spark developer certification from Databricks - https://databricks.com/learn/training/home (and some higher levels as well). In Azure, Databricks is included in the DP-100 and DP-203 certifications (together with around 10 diff...

Ashish
by New Contributor II
  • 10719 Views
  • 4 replies
  • 3 kudos

Resolved! Cost of individual jobs running on a shared Databricks cluster

Hi all, I am working on a requirement where I need to calculate the cost of each Spark job individually on a shared Azure/AWS Databricks cluster. There can be multiple jobs running on the cluster in parallel. Cost needs to be calculated after job comple...

Latest Reply
alexott
Databricks Employee
  • 3 kudos

There is built-in functionality for getting the costs: AWS - https://docs.databricks.com/administration-guide/account-settings-e2/usage.html; Azure - via the built-in Cost Management + Billing. The main problem with that functionality is that the smallest g...
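
When several jobs share one cluster, a per-job figure has to be estimated from the cluster-level cost. One rough approximation, offered here purely as an illustrative assumption and not as something proposed in the thread, is to split the cluster's cost for a time window across jobs in proportion to each job's measured runtime. The function name `allocate_cost` and its inputs are hypothetical.

```python
from typing import Dict


def allocate_cost(total_cost: float, job_seconds: Dict[str, float]) -> Dict[str, float]:
    """Split total_cost across jobs in proportion to each job's measured seconds."""
    total_seconds = sum(job_seconds.values())
    if total_seconds <= 0:
        raise ValueError("job_seconds must add up to a positive total")
    return {job: total_cost * secs / total_seconds
            for job, secs in job_seconds.items()}
```

For instance, with a window cost of 10.0 and two jobs that ran for 30 and 10 seconds, the shares come out as 7.5 and 2.5. Runtime share ignores how many cores each job actually used, so this is only a first-order estimate.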

3 More Replies
Autel
by New Contributor II
  • 6151 Views
  • 3 replies
  • 0 kudos

Resolved! concurrent update to same hive or deltalake table

Hi, I'm interested to know whether multiple executors can append to the same Hive table using saveAsTable or insertInto in Spark SQL. Will that cause any data corruption? What configuration do I need to enable concurrent writes to the same Hive table? What about the s...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

The Hive table will not like this, as the underlying data is in Parquet format, which is not ACID compliant. Delta Lake, however, is: https://docs.delta.io/0.5.0/concurrency-control.html There you can see that inserts do not give conflicts.

2 More Replies
Maverick1
by Valued Contributor II
  • 37462 Views
  • 9 replies
  • 7 kudos

Resolved! How to deploy a databricks managed workspace model to sagemaker from databricks notebook

I want to deploy a registered model from Databricks-managed MLflow to SageMaker via a Databricks notebook. As of now, it is not able to run the mlflow sagemaker build-and-push container command directly. What configurations or steps are needed to ...

Latest Reply
User16871418122
Databricks Employee
  • 7 kudos

@Saurabh Verma, please try:
import sys
import mlflow.sagemaker as mfs
sys.stdout.fileno = lambda: 0
mfs.run_local(model_uri=model_uri, port=8000, image="test")

8 More Replies
cbynum
by New Contributor III
  • 5112 Views
  • 4 replies
  • 1 kudos

Resolved! Terraform authentication with SSO enabled

After enabling SSO on my account I now don't have any way to change my terraform for provisioning AWS workspaces because username/password is disabled. Is there a workaround for this?

Latest Reply
cbynum
New Contributor III
  • 1 kudos

Never mind, the account owner creds do work, but I had to add the account owner to all of the workspaces. Terraform didn't give me an informative error; it just hung forever when applying.

3 More Replies
Ketna
by New Contributor
  • 2818 Views
  • 1 replies
  • 0 kudos

I have included SparkJDBC42.jar in my WAR file, but when I start my application using Tomcat, I get EOFExceptions from log4j classes. I need help with what is causing this and how to resolve this issue. Please help.

Below is part of the exceptions I am getting:
org.apache.catalina.startup.ContextConfig processAnnotationsJar
SEVERE: Unable to process Jar entry [com/simba/spark/jdbc42/internal/apache/logging/log4j/core/pattern/ThreadIdPatternConverter.class] from Ja...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Ketna Khalasi! My name is Piper, and I'm a moderator here at Databricks. Thank you for posting your question and I'm sorry to hear you're having this problem. We generally give the community a chance to respond before jumping in. Thanks in ...
