Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by KellenO, New Contributor II
  • 2480 Views
  • 2 replies
  • 8 kudos

Resolved! How can I use cluster autoscaling with intensive subprocess calls?

I have a custom application/executable that I upload to DBFS and transfer to my cluster's local storage for execution. I want to call multiple instances of this application in parallel, which I've only been able to successfully do with Python's subpr...

Latest Reply
Anonymous (Not applicable)

Autoscaling works for Spark jobs only. It works by monitoring the job queue, which Python code won't enter. If it's just Python code, try a single-node cluster: https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling
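A minimal sketch of that idea (not from the thread): wrap each subprocess call in a Spark task so the work lands in the scheduler queue that autoscaling monitors. The executable path and argument sets below are hypothetical, and sc is the SparkContext a Databricks notebook provides.

import subprocess

def run_instance(args):
    # Each Spark task launches one instance of the local executable.
    # "/local_disk0/my_app" is a placeholder path, not from the post.
    result = subprocess.run(["/local_disk0/my_app"] + list(args),
                            capture_output=True, text=True)
    return result.returncode, result.stdout

arg_sets = [["--job", str(i)] for i in range(100)]  # hypothetical argument sets
results = (sc.parallelize(arg_sets, numSlices=len(arg_sets))  # one task per call
             .map(run_instance)
             .collect())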
1 More Replies
by Taha_Hussain, Databricks Employee
  • 7201 Views
  • 5 replies
  • 5 kudos

Connect a BI Tool: How do I access my lakehouse data from my BI tool?

You can find a rich ecosystem of tools that allow you to work with all your data in-place and deliver real-time business insights faster. This post will help you connect your existing tools like dbt, Fivetran, Power BI, Tableau or SAP to ingest, transf...

Latest Reply
Axserv (New Contributor II)

Hello Taha, here is a fairly recent video provided by Databricks on connecting Power BI: Demo Video: Connect to Power BI Desktop from Databricks - YouTube
4 More Replies
by ranged_coop, Valued Contributor II
  • 2013 Views
  • 2 replies
  • 3 kudos

Equivalent Machine Types between Databricks on Azure and GCP

Hi All, hope everyone is doing well. We are currently validating Databricks on GCP and Azure. We have a Python notebook that does some ETL (copy, extract zip files and process files within the zip files). Our Cluster Config on Azure: DBX Runtime - 10.4 - Dr...

Latest Reply
ranged_coop (Valued Contributor II)

Hi @Tunde Abib, I have gone through the links while updating, but did not see any major documented slowdowns mentioned in them.
1 More Replies
by Sujitha, Databricks Employee
  • 2236 Views
  • 6 replies
  • 5 kudos

KB Feedback Discussion

In addition to the Databricks Community, we have a Support team that maintains a Knowledge Base (KB). The KB contains answers to common questions about Databricks, as well as information on optimisation and troubleshooting. Thes...

Latest Reply
Ajay-Pandey (Esteemed Contributor III)

Thanks for sharing @Sujitha Ramamoorthy
5 More Replies
by Netty, New Contributor III
  • 5419 Views
  • 5 replies
  • 7 kudos

Resolved! What's the easiest way to upsert data into a table? (Azure ADLS Gen2)

I had been trying to upsert rows into a table in Azure Blob Storage (ADLS Gen 2) based on two partitions (sample code below). insert overwrite table new_clicks_table partition(client_id, mm_date) select click_id ,user_id ,click_timestamp_gmt ,ca...

Latest Reply
Ajay-Pandey (Esteemed Contributor III)

Below code might help you.

Python:
(df.write
   .mode("overwrite")
   .option("partitionOverwriteMode", "dynamic")
   .saveAsTable("default.people10m"))

SQL:
SET spark.sql.sources.partitionOverwriteMode=dynamic;
INSERT OVERWRITE TABLE default.people10m...
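For a true row-level upsert rather than a partition overwrite, the usual route is a Delta Lake MERGE. A minimal sketch, assuming the target is a Delta table and updates_df holds the new rows; the key columns are taken from the question's partitions, the rest is hypothetical:

from delta.tables import DeltaTable

target = DeltaTable.forName(spark, "new_clicks_table")
(target.alias("t")
    .merge(updates_df.alias("s"),
           "t.client_id = s.client_id AND t.mm_date = s.mm_date AND t.click_id = s.click_id")
    .whenMatchedUpdateAll()     # update rows whose keys already exist
    .whenNotMatchedInsertAll()  # insert brand-new rows
    .execute())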
4 More Replies
by KVNARK, Honored Contributor II
  • 9360 Views
  • 11 replies
  • 8 kudos

Resolved! Databricks lakehouse platform administration accreditation

How can I complete the Databricks Lakehouse Platform administration accreditation for free, just like Lakehouse Fundamentals? How can I get the accreditation for platform administrator, like Lakehouse Fundamentals?

Latest Reply
KVNARK (Honored Contributor II)

I tried it through a community partner account only.
10 More Replies
by Rishabh-Pandey, Esteemed Contributor
  • 1178 Views
  • 1 replies
  • 7 kudos

Regarding my free lakehouse 100 points

Hi @Christy Seto, I cleared the lakehouse exam before 30 November 2022 and was eligible to get 100 community points. I cleared it with the email id manpreet.kaur@celebaltech.com but I still haven't received the 100 points. I have edited my e...

Latest Reply
Ajay-Pandey (Esteemed Contributor III)

Hi @Rishabh Pandey, please raise a request via this link; this might help you.
by ncouture, Contributor
  • 5720 Views
  • 3 replies
  • 1 kudos

Resolved! How to install a JAR library via a global init script?

I have a JAR I want to be installed as a library on all clusters. I have tried both wget /databricks/jars/ some_repo and cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/; clusters start up, but the JAR is not installed as a library. I am aware th...

Latest Reply
ncouture (Contributor)

Found a solution: echo /databricks/databricks-hive /databricks/jars /databricks/glue | xargs -n 1 cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar. I had to first add the jar as a library through the GUI via Create -> Library, then upload the downloaded JAR. ...
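If you want that copy to happen via an init script as the title asks, one hedged sketch (paths and script name are placeholders, not the poster's setup) is to store the command as a script from a notebook and reference it in the cluster's init-script settings:

# dbutils is available in Databricks notebooks; the script body itself is shell.
script = """#!/bin/bash
cp /dbfs/FileStore/jars/NAME_OF_THE_JAR.jar /databricks/jars/
"""
dbutils.fs.put("dbfs:/databricks/init-scripts/install-jar.sh", script, overwrite=True)
# Then point the cluster (or global) init-script configuration at
# dbfs:/databricks/init-scripts/install-jar.sh so every start copies the JAR.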
2 More Replies
by kpendergast, Contributor
  • 1704 Views
  • 1 replies
  • 1 kudos

Resolved! Modify the Json Schema Stored in a File for AutoLoader

We are reading over an S3 bucket which contains several million JSON files. The schema from the read is stored in a JSON file in the DBFS FileStore. This file is then utilized by Auto Loader to write new files nightly to a Delta table. The schema is...

Latest Reply
kpendergast (Contributor)

If anyone is curious, I ended up just passing the schema as a string to .schema(eval(the_schema)) in StructType format and not using the file-based approach.
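A safer variant than eval, sketched here as an assumption rather than the poster's code: keep the schema as JSON (e.g. the output of df.schema.json()) and rebuild it with StructType.fromJson. The DBFS and S3 paths are placeholders.

import json
from pyspark.sql.types import StructType

# Load the schema JSON previously written from df.schema.json().
with open("/dbfs/FileStore/schemas/clicks_schema.json") as f:
    schema = StructType.fromJson(json.load(f))

df = (spark.readStream.format("cloudFiles")   # Auto Loader
        .option("cloudFiles.format", "json")
        .schema(schema)
        .load("s3://my-bucket/json-landing/"))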
by abizid, New Contributor
  • 878 Views
  • 0 replies
  • 0 kudos

.NET thrift client for SQL warehouse

I'm trying to port the python-sql thrift client to .NET and I receive a 500 error when trying to open a session. Is there a way to have a SQL warehouse server mock in order to investigate the error?

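One way to narrow down a 500 on session open, sketched with the official Python connector (hostname, HTTP path, and token are placeholders): if this succeeds against the same warehouse, the endpoint and credentials are fine and the problem is likely in the .NET port's thrift handshake.

from databricks import sql  # pip install databricks-sql-connector

with sql.connect(server_hostname="adb-<workspace-id>.azuredatabricks.net",
                 http_path="/sql/1.0/warehouses/<warehouse-id>",
                 access_token="<personal-access-token>") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")  # minimal round trip after the session opens
        print(cur.fetchall())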
by moski, New Contributor II
  • 9393 Views
  • 8 replies
  • 7 kudos

Databricks shortcut to split a cell

Is there a shortcut to split a cell into two in a Databricks notebook, as in a Jupyter notebook? In Jupyter notebook it is Ctrl+Shift+-.

Latest Reply
Harshjot (Contributor III)

Hi @mundy Jim / All, attached are two snapshots: the first shows a single cell which, when Ctrl+Alt+Minus is pressed, splits into two.
7 More Replies
by alhuelamo, New Contributor II
  • 8455 Views
  • 4 replies
  • 1 kudos

Getting non-traceable NullPointerExceptions

We're running a job that's throwing NullPointerException without traces of our job's code. Does anybody know what would be the best course of action when it comes to debugging these issues? The job is a Scala job running on DBR 11.3 LTS. In case it's rel...

Latest Reply
UmaMahesh1 (Honored Contributor III)

A NullPointerException occurs when you access an instance method on a null reference, try to access elements of a null array, or call a method on an object referred to by a null value. To give you a suggestion on how to avoid that, we might ...
3 More Replies
by DB_developer, New Contributor III
  • 1745 Views
  • 3 replies
  • 0 kudos
Latest Reply
-werners- (Esteemed Contributor III)

There is no single answer to this. If you look at Parquet, which is a very common format on data lakes: https://parquet.apache.org/docs/file-format/nulls/ and on SO
2 More Replies
by learnerbricks, New Contributor II
  • 7247 Views
  • 2 replies
  • 1 kudos

Unable to save CSV file into DBFS

Hello, I took the Azure datasets that are available for practice. I got the 10 days of data from that dataset and now I want to save this data into DBFS in CSV format. I am facing an error: "No such file or directory: '/dbfs/tmp/myfolder/mytest.c...

Latest Reply
Ajay-Pandey (Esteemed Contributor III)

You can use a Spark DataFrame to read and write CSV files.

Read:
df = spark.read.csv("Path")

Write:
df.write.csv("Path")
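The "No such file or directory" error usually just means the target folder does not exist yet. A minimal sketch, assuming a notebook where dbutils and spark are available; the paths mirror the question but are otherwise placeholders:

# Create the folder first; then local-file APIs under /dbfs can write into it.
dbutils.fs.mkdirs("dbfs:/tmp/myfolder")
df.toPandas().to_csv("/dbfs/tmp/myfolder/mytest.csv", index=False)

# Or let the Spark writer create the directory itself:
df.write.mode("overwrite").option("header", True).csv("dbfs:/tmp/myfolder/mytest_csv")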
1 More Replies
by KVNARK, Honored Contributor II
  • 1322 Views
  • 2 replies
  • 4 kudos

How much time does it take for the Databricks partner account to get created?

How much time does it take for the Databricks partner account to get created after we submit the application to Databricks?

Latest Reply
Harshjot (Contributor III)

Hi @KVNARK. On the training academy? It was instant for me.
1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group