Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

hiral_jasani
by New Contributor
  • 529 Views
  • 0 replies
  • 0 kudos

Hands-On Workshop: Simplify Data Integration for the Modern Data Stack

Do you have a lot of data that is stuck in your source systems? Data engineers too bottlenecked to build another ingest pipeline? Join us for a live, hands-on workshop on building...

PJ
by New Contributor III
  • 2316 Views
  • 3 replies
  • 3 kudos

Resolved! How should you optimize <1GB delta tables?

I have seen the following documentation that details how you can work with the OPTIMIZE function to improve storage and querying efficiency. However, most of the documentation focuses on big data, 10 GB or larger. I am working with a ~7 million row ...

Latest Reply
PJ
New Contributor III
  • 3 kudos

Thank you @Hubert Dudek​ !! So I gather from your response that it's totally fine to have a delta table that lives under 1 file that's roughly 211 MB. And I can use OPTIMIZE in conjunction with ZORDER to filter on a frequently filtered, high cardina...

  • 3 kudos
2 More Replies
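
For a small table like the one discussed in this thread, the OPTIMIZE plus ZORDER step might be sketched as below. The table name `events` and the column `user_id` are hypothetical placeholders, not names from the post, and the `spark.sql` call assumes an active Databricks session with Delta Lake, so it is left as a comment.

```python
def build_optimize_statement(table, zorder_columns=()):
    """Build a Delta Lake OPTIMIZE statement, optionally with ZORDER BY."""
    statement = f"OPTIMIZE {table}"
    if zorder_columns:
        # ZORDER BY co-locates rows with similar values of the listed columns.
        statement += " ZORDER BY (" + ", ".join(zorder_columns) + ")"
    return statement


# Hypothetical table and column names, for illustration only.
sql = build_optimize_statement("events", ["user_id"])
print(sql)
# In a Databricks notebook this could then be run with: spark.sql(sql)
```

Building the statement separately keeps the sketch runnable outside a cluster; on a real table the statement would be executed once after bulk writes rather than after every small append.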
PJ
by New Contributor III
  • 3605 Views
  • 7 replies
  • 0 kudos

Please bring back "Right Click > Clone" functionality within Databricks Repos!

After this was removed, the best way to replicate this functionality was to:
Export the file in .dbc format
Import the .dbc file back in. New file has a suffix of " (1)"
As o...

Latest Reply
PJ
New Contributor III
  • 0 kudos

Hello! Just to update the group on this question, the clone right-click functionality is working again in Repos for me. I believe this fix came with a new Databricks upgrade on 2022-04-20 / 2022-04-21.

  • 0 kudos
6 More Replies
ImAbhishekTomar
by New Contributor III
  • 3033 Views
  • 1 replies
  • 1 kudos

Resolved! Trying to Flatten My Json using CosmosDB Spark connector - Azure Databricks

Hi, using the Cosmos DB query below it is possible to achieve the expected output, but how can I do the same with Spark SQL in Databricks?
COSMOSDB QUERY: select c.ReportId,c.ReportName,i.price,p as provider from c join i in in_network join p in i.pr...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Hi @Abhishek Tomar​ , If you want to get it from Cosmos DB, use the connector with a custom query: https://github.com/Azure/azure-cosmosdb-spark
If you want to have JSON imported directly by Databricks/Spark, please go with the below solution:
SELECT ...

  • 1 kudos
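
The nested joins in the Cosmos DB query above produce one output row per (report, in_network item, provider) combination. The sketch below illustrates that flattening in plain Python; it is not the solution from the reply, whose query is cut off. The field name `providers` is an assumption, since the original query is truncated at `i.pr...`, and the sample document is invented. In Spark, the `explode` function performs the same one-row-per-array-element expansion.

```python
def flatten_reports(documents, items_key="in_network", providers_key="providers"):
    """Return one row per (document, item, provider) combination."""
    rows = []
    for doc in documents:
        for item in doc.get(items_key, []):
            # providers_key is a hypothetical field name (the source is truncated).
            for provider in item.get(providers_key, []):
                rows.append({
                    "ReportId": doc.get("ReportId"),
                    "ReportName": doc.get("ReportName"),
                    "price": item.get("price"),
                    "provider": provider,
                })
    return rows


# Invented sample document, for illustration only.
sample = [{
    "ReportId": 1,
    "ReportName": "demo",
    "in_network": [
        {"price": 10, "providers": ["a", "b"]},
        {"price": 20, "providers": ["c"]},
    ],
}]
for row in flatten_reports(sample):
    print(row)
```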
MartinB
by Contributor III
  • 23297 Views
  • 16 replies
  • 3 kudos

Does partition pruning / partition elimination not work for folder partitioned JSON files? (Spark 3.1.2)

Imagine the following setup: I have log files stored as JSON files partitioned by year, month, day and hour in physical folders:
"""
/logs
|-- year=2020
|-- year=2021
`-- year=2022
    |-- month=01
    `-- month=02
        |-- day=01
        |-- day=.....

Latest Reply
MartinB
Contributor III
  • 3 kudos

@Kaniz Fatma​  could you maybe involve a Databricks expert?

  • 3 kudos
15 More Replies
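
One workaround sometimes used with folder-partitioned files is to point the reader at the specific partition folders and set the `basePath` option so Spark still derives the partition columns. The sketch below only builds such a path for the `/logs` layout described in the question; it does not settle whether pruning works for JSON in Spark 3.1.2, and the helper name `partition_path` is hypothetical.

```python
def partition_path(base, **partitions):
    """Build a Hive-style partition folder path such as /logs/year=2022/month=02."""
    # Keyword arguments keep their call order, so year/month/day nest as written.
    parts = [f"{key}={value}" for key, value in partitions.items()]
    return "/".join([base.rstrip("/")] + parts)


path = partition_path("/logs", year="2022", month="02", day="01")
print(path)
# In Spark, assuming an active session named `spark`:
# df = spark.read.option("basePath", "/logs").json(path)
```

Reading an explicit folder limits file listing to that folder, at the cost of building paths in application code instead of relying on filter pushdown.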
Michael_Galli
by Contributor III
  • 10593 Views
  • 6 replies
  • 3 kudos

Resolved! com.microsoft.sqlserver.jdbc.SQLServerException: The driver could not establish a secure connection to SQL Server by using SSL encr. Error: "Unexpected rethrowing"

Hi all, there is a random error when pushing data from Databricks to an Azure SQL Database. Has anyone else had this problem? Any ideas are appreciated. See stacktrace attached.
Target: Azure SQL Database, Standard S6: 400 DTUs
Databricks Cluster config:"...

Latest Reply
Michael_Galli
Contributor III
  • 3 kudos

@Pearl Ubaru​ TLS 1.1 is already deprecated. Are there any concerns from your side to set TLS 1.2 in the connection string?

  • 3 kudos
5 More Replies
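
Setting TLS 1.2 in the connection string, as raised in the reply above, could look roughly like this with the Microsoft JDBC driver's `sslProtocol` connection property. The host and database names are hypothetical, and the visible excerpt does not confirm that this setting resolves the error.

```python
def sqlserver_jdbc_url(host, database, port=1433, ssl_protocol="TLSv1.2"):
    """Build a SQL Server JDBC URL that requests encryption and a TLS version."""
    properties = {
        "databaseName": database,
        "encrypt": "true",
        "trustServerCertificate": "false",
        # sslProtocol is a connection property of the Microsoft JDBC driver.
        "sslProtocol": ssl_protocol,
    }
    suffix = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:sqlserver://{host}:{port};{suffix}"


# Hypothetical server and database names, for illustration only.
url = sqlserver_jdbc_url("myserver.database.windows.net", "mydb")
print(url)
```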
JakeP
by New Contributor III
  • 2211 Views
  • 3 replies
  • 1 kudos

Resolved! Is there a way to create a path under /Repos via API?

Trying to use Repos API to automate creation and updates to repos under paths not specific to a user, i.e. /Repos/Admin/<repo-name>. It seems that creating a repo via POST to /api/2.0/repos will fail if you don't include a path, and will also fail i...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

https://docs.databricks.com/dev-tools/api/latest/workspace.html#mkdirs
Try through the Workspace API:
curl --netrc --request POST \
https://dbc-a1b2345c-d6e7.cloud.databricks.com/api/2.0/workspace/mkdirs \
--header 'Accept: application/json' \
--dat...

  • 1 kudos
2 More Replies
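
The mkdirs call suggested in the reply could be sketched in Python as below. The workspace host, the token placeholder, and the `/Repos/Admin` path are assumptions for illustration. The sketch uses a bearer token header instead of the `--netrc` option from the curl command, and it only builds the request without sending it.

```python
import json
import urllib.request


def build_mkdirs_request(host, token, path):
    """Build (but do not send) a POST request to the Workspace API mkdirs endpoint."""
    url = f"https://{host}/api/2.0/workspace/mkdirs"
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical host, token placeholder, and folder path.
request = build_mkdirs_request("example.cloud.databricks.com", "<token>", "/Repos/Admin")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Once the parent folder exists, a later POST to /api/2.0/repos can name a path underneath it.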
mroy
by Contributor
  • 2896 Views
  • 3 replies
  • 0 kudos

Resolved! Bug Report: "Unsubscribed from" emails for deleted jobs have bad templating

I guess someone inverted the tokens in the template, because the emails look like this:
Subject: "[user@company.com] Unsubscribed from 'Job'"
Body: "This job has been deleted by dbc-12345678-1234."
But it should look like this instead:
Subject: "[dbc-123...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 0 kudos

The bug reported has been fixed and merged. It will be deployed in the next release, which is planned for tomorrow in the PST time zone. !!!! Thanks to @Marco Roy​ 

  • 0 kudos
2 More Replies
Dunken
by New Contributor III
  • 3883 Views
  • 3 replies
  • 0 kudos

Resolved! SSO with Auth0?

Do you support SSO with any IdP which supports SAML 2.0 (e.g. Auth0) or is it limited to https://docs.databricks.com/administration-guide/users-groups/single-sign-on/index.html#supported-identity-providers?

Latest Reply
525374
New Contributor II
  • 0 kudos

I currently have a few applications (say App1, App2) along with Databricks, all integrated with Auth0. What I want to achieve is that when we log in to, say, Databricks and then access another app's URL in another tab, it should not ask for login in...

  • 0 kudos
2 More Replies
_r_vind1199
by New Contributor II
  • 3695 Views
  • 3 replies
  • 3 kudos

Resolved! Pyspark installation issue

When I try to start a PySpark session in PyCharm, it throws this error: RuntimeError("Java gateway process exited before sending its port number"). Could anyone help me solve this?

Latest Reply
_r_vind1199
New Contributor II
  • 3 kudos

@Aashita Ramteke​ , Pyspark version 3.2.1

  • 3 kudos
2 More Replies
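
A frequent cause of the "Java gateway process exited" error is that PySpark cannot locate a Java runtime, although the excerpt does not confirm that this was the cause here. A minimal diagnostic sketch, with a hypothetical helper name, that reports what the Python process can see:

```python
import os
import shutil


def java_diagnostics(environ=None, which=shutil.which):
    """Report the JAVA_HOME setting and the java executable found on PATH, if any."""
    environ = os.environ if environ is None else environ
    return {
        "java_home": environ.get("JAVA_HOME"),
        "java_on_path": which("java"),
    }


print(java_diagnostics())
```

If both values come back as `None`, installing a JDK and setting `JAVA_HOME` in the PyCharm run configuration is a reasonable first step.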
Chennaiyan
by New Contributor
  • 608 Views
  • 0 replies
  • 0 kudos

IntelliMindz is the best IT Training in Chennai with Placement, offering 200 and more software courses with 100% Placement Assistance. Start learning with us at IntelliMindz and become an expert in Online Training. Contact 9655877577 for more details.
S...

Karik
by New Contributor II
  • 2699 Views
  • 1 replies
  • 2 kudos

No module named 'dependencies.spark'

Can anyone help me solve this bug? No module named 'dependencies.spark'
Source code:
from pyspark.sql import Row
from pyspark.sql.functions import col, concat_ws, lit
from dependencies.spark import start_spark

Latest Reply
Anonymous
Not applicable
  • 2 kudos

What are you trying to do?

  • 2 kudos
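
The import in the question only succeeds if a `dependencies` package containing a `spark` module is on the Python path of the process running the code. As a rough diagnostic, assuming nothing about the poster's project layout, the standard library can report whether a module is resolvable. The helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be resolved on the current path."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (here: 'dependencies') cannot be imported.
        return False


print(module_available("dependencies.spark"))
```

If this prints `False`, the usual options are to run from the project root, add that folder to `sys.path`, or ship the package to the cluster with the job.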
User16826987838
by Contributor
  • 1262 Views
  • 2 replies
  • 0 kudos

Looking for information on security design on how JDBC connections to clusters function

I am looking for more information around the security design around how JDBC connections to clusters function:
What security controls are in operation to safeguard the Databricks clusters?
Is the API gateway abstracted from the Databricks cluster in th...

Latest Reply
Albina228
New Contributor II
  • 0 kudos

In fact, I have no idea what kind of design we are talking about, it causes associations with Cloud Ceilings

  • 0 kudos
1 More Replies
