Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Divya_Bhadauria
by New Contributor III
  • 1200 Views
  • 0 replies
  • 0 kudos

Number of rows displayed in SQL cell

Even though the default limit on rows displayed is 10,000, the SQL cell shows fewer rows than that limit when my result set has more than 10k rows. It should at least show the default limit.

source2sea
by Contributor
  • 7650 Views
  • 4 replies
  • 2 kudos

Resolved! How do I make a Databricks job fail when the application has already returned "exit code 1"?

object OurMainObject extends LazyLogging with IOApp {
  def run(args: List[String]): IO[ExitCode] = {
    logger.info("Started the application")
    val conf = defaultOverrides.withFallback(defaultApplication).withFallback(defaultReference)
    val...

Latest Reply
source2sea
Contributor
  • 2 kudos

My workaround for now is to make the code like below, so the Databricks job becomes a failure (the logging effect is sequenced with *> so it actually runs before the error is raised):

case Left(ex) =>
  IO(logger.error("Glue failure", ex)) *> IO.raiseError(ex)

3 More Replies
DomDuf
by New Contributor II
  • 8195 Views
  • 3 replies
  • 3 kudos

Resolved! Roll back to previous version of an AutoLoader checkpoint file

I know that to "reset" Auto Loader you can delete the checkpoint file entirely. I was wondering if it's possible to, and how someone would:
- Get the checkpoint file back to a previous version so I can reload certain files that were already processed
- Delete certai...

Latest Reply
MRTN
New Contributor III
  • 3 kudos

This would for sure be a useful feature.
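Until such a feature exists, a common workaround is to replay specific files through a one-off stream with a fresh checkpoint. A minimal sketch, assuming JSON input; all paths and the table name are hypothetical:

```
# Replaying already-processed files without rolling back the checkpoint:
# point a one-off Auto Loader stream at a FRESH checkpoint location and
# narrow the input path with a glob so only the target files are re-read.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

replay = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/meta/_schemas/replay")
    .load("/mnt/raw/events/2023/04/28/*.json")  # glob limits the replay scope
)

(
    replay.writeStream
    # a new checkpoint means Auto Loader has no memory of ingested files
    .option("checkpointLocation", "/mnt/meta/_checkpoints/replay")
    .trigger(availableNow=True)  # drain the matching backlog once, then stop
    .toTable("bronze.events_replayed")
)
```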

2 More Replies
Hubert-Dudek
by Databricks MVP
  • 2290 Views
  • 0 replies
  • 4 kudos

Databricks Runtime 13.1 has added the sql_keywords() function, which lists all SQL keywords. It is a good practice to refrain from using these keyword...

Databricks Runtime 13.1 has added the sql_keywords() function, which lists all SQL keywords. It is good practice to refrain from using these keywords as names for tables or fields, although with ANSI mode disabled it will work without problem...
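A quick sketch of using it from a notebook (assuming a Databricks notebook where `spark` is predefined; the returned columns are assumed to be keyword/reserved per the docs):

```
# List SQL keywords on Databricks Runtime 13.1+ and check your own
# identifiers against them.
keywords_df = spark.sql("SELECT * FROM sql_keywords()")
keywords_df.show(10, truncate=False)

keywords = {row["keyword"].lower() for row in keywords_df.collect()}
my_identifiers = ["order", "sales_2023", "user"]  # hypothetical names
print([name for name in my_identifiers if name.lower() in keywords])
```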

KVNARK
by Honored Contributor II
  • 5330 Views
  • 2 replies
  • 1 kudos

Resolved! Notebook activity is getting timed out in ADF pipeline.

The notebook activity is timing out after running for a certain time (5 hours) in an ADF pipeline and failing with a timeout error. The problem is that this job processes TBs of data daily. Does anyone have any idea how to fix this?

Latest Reply
KVNARK
Honored Contributor II
  • 1 kudos

@Daniel Sahal​ - Noted. Thanks Daniel!

1 More Replies
Hubert-Dudek
by Databricks MVP
  • 6948 Views
  • 2 replies
  • 9 kudos

You can use Apache Hudi in Databricks without a problem: - in cluster settings, install Maven library org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.0...

You can use Apache Hudi in Databricks without a problem:
- In cluster settings, install the Maven library org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.0 for Databricks 12.2 LTS
- In the cluster Spark config, add three lines: spark.sql.extensions org.apache.sp...
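The three truncated config lines are presumably the standard ones from the Hudi quickstart; a sketch under that assumption (verify against the Hudi 0.13 docs for your version):

```
# Assumed standard Hudi quickstart Spark configs (verify against Hudi docs)
hudi_confs = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.sql.extensions": "org.apache.spark.sql.hudi.HoodieSparkSessionExtension",
    "spark.sql.catalog.spark_catalog": "org.apache.hudi.catalog.HoodieCatalog",
}
# In the cluster UI each entry goes into "Spark config" as a `key value` pair:
for key, value in hudi_confs.items():
    print(key, value)
```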

Latest Reply
ros
New Contributor III
  • 9 kudos

I tried installing the library and configuring the Spark configs, restarted the cluster, and then ran the create command in a notebook, but it gives me an error stating java.io.FileNotFoundException: No such file or directory: s3://incred-databricks-data/hudi_dms_data/...

1 More Replies
_deepak_
by New Contributor II
  • 2353 Views
  • 1 reply
  • 2 kudos

Resolved! Shallow copy in databricks

Hi, I am new to Databricks. I need to set up a non-prod environment, for which I need prod data to be cloned into non-prod. I explored a bit and learned about shallow copy. Is it possible to do a shallow copy across environments? Or is it possible to d...

Latest Reply
daniel_sahal
Databricks MVP
  • 2 kudos

@deepak prasad​ I'm not sure it's possible to do that. Even with Unity Catalog enabled, you cannot use shallow clone. You can do two things here:
- Without UC - simply recreate an empty table in your non-prod environment and do SELECT * from prod st...
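A sketch of two common ways to do this copy; all table names and the storage path are hypothetical, and it assumes the non-prod workspace can read the prod storage location:

```
# Two ways to copy a prod Delta table into non-prod when shallow clone
# isn't available across environments.
prod_path = "abfss://prod@storageacct.dfs.core.windows.net/tables/sales"

# Option 1: recreate the table and copy the data (the "SELECT * from prod"
# approach from the reply)
spark.sql(f"""
    CREATE OR REPLACE TABLE nonprod.sales AS
    SELECT * FROM delta.`{prod_path}`
""")

# Option 2: DEEP CLONE, which copies data plus metadata and can be re-run
# to sync incremental changes
spark.sql(f"""
    CREATE OR REPLACE TABLE nonprod.sales_clone
    DEEP CLONE delta.`{prod_path}`
""")
```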

SenthilJ
by New Contributor III
  • 2207 Views
  • 1 reply
  • 2 kudos

Resolved! Databricks Account

Hi, in my org we are using Azure Databricks. As an Azure AD user, my project team and I have access to Databricks workspaces. In our context, what exactly is meant by a Databricks Account? I understand it's a group of workspaces used for billing, but at...

Latest Reply
daniel_sahal
Databricks MVP
  • 2 kudos

@Senthilnathan J​ The Databricks Account is like a top-level administration layer for everything that's going on in your tenant.

alexiswl
by Contributor
  • 2971 Views
  • 1 reply
  • 2 kudos

Resolved! Version Controlling SQL Query snippets

Hello, I have a suite of SQL queries for creating table views, e.g. ``` CREATE OR REPLACE VIEW silver.filtered_samples.metadata_table AS ( SELECT * FROM bronze.samples.table WHERE sample_status = 'pass' ) ``` I have tried moving these to a repo but I ...

Latest Reply
daniel_sahal
Databricks MVP
  • 2 kudos

@Alexis Lucattini​ You're right, the best approach would be to use notebooks with a single %sql magic block.
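As a sketch of what that looks like in version control: a notebook stored as a .py source file (the special comments are how Databricks marks notebook cells in repos), holding the view definition from the question as one %sql cell:

```
# Databricks notebook source
# A repo-friendly .py file: the header comment above makes Databricks Repos
# render it as a notebook, and the MAGIC lines form a single %sql cell.
# COMMAND ----------
# MAGIC %sql
# MAGIC CREATE OR REPLACE VIEW silver.filtered_samples.metadata_table AS (
# MAGIC   SELECT * FROM bronze.samples.table
# MAGIC   WHERE sample_status = 'pass'
# MAGIC )
```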

Chinu
by New Contributor III
  • 2797 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks query history api with filter_by warehouse_id

Hi, I'm trying to pull query history filtered by warehouse id, but my URL is not working. Do you have an example of what the URL should look like? I tried this --> https://**.cloud.databricks.com/api/2.0/sql/history/queries?filter_by={"warehouse_id":"193b...

Latest Reply
Chinu
New Contributor III
  • 1 kudos

Oh, looks like I need to send this as the raw request body: { "filter_by": { "warehouse_ids": "193b15a590ed23d2" } }
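For reference, a sketch of the same call from Python; the host and token are placeholders, the filter goes in a JSON request body rather than the query string, and the API docs define warehouse_ids as a list:

```
import requests

host = "https://<workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                   # placeholder

resp = requests.get(
    f"{host}/api/2.0/sql/history/queries",
    headers={"Authorization": f"Bearer {token}"},
    # GET with a JSON body is unusual, but this is how the query history
    # API accepts filter_by
    json={"filter_by": {"warehouse_ids": ["193b15a590ed23d2"]}},
)
resp.raise_for_status()
for query in resp.json().get("res", []):
    print(query.get("query_id"), query.get("status"))
```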

kll
by New Contributor III
  • 1364 Views
  • 0 replies
  • 0 kudos

Spark DataFrame apply Databricks geospatial indexing functions

I have a Spark DataFrame with `h3` hex ids and I am trying to obtain the polygon geometries.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, expr
from pyspark.databricks.sql.functions import *
from mosaic import enable_m...
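A sketch of the built-in H3 route (DBR 11.3+, in a Databricks notebook where `spark` is predefined); the column name and sample cell id are hypothetical, and h3_boundaryaswkt accepts hex-string or BIGINT cell ids:

```
# Turn H3 cell ids into polygon geometries as WKT using the built-in
# Databricks H3 functions.
from pyspark.sql.functions import col
from pyspark.databricks.sql.functions import h3_boundaryaswkt

df = spark.createDataFrame([("85283473fffffff",)], ["h3_id"])
df.withColumn("geometry_wkt", h3_boundaryaswkt(col("h3_id"))).show(truncate=False)
```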

rami1
by New Contributor III
  • 13463 Views
  • 2 replies
  • 6 kudos

METASTORE_DOWN: Cannot connect to metastore

I am trying to view databases and tables, default as well as user-created, but it looks like the created cluster is not able to connect. I am using the Databricks default Hive metastore. Viewing the cluster logs shows the following event: METASTORE_DOWN Metastore is...

Latest Reply
Anonymous
Not applicable
  • 6 kudos

@rami​: If the metastore is down, it means that the Databricks cluster is not able to connect to the metastore. Here are a few things you can try to resolve the issue:
- Check if the Hive metastore is up and running. You can try to connect to the metast...
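A quick way to run that first check from a notebook (assuming `spark` is predefined, as in any Databricks notebook):

```
# A catalog call will surface the underlying connection error if the
# metastore is actually unreachable.
try:
    spark.sql("SHOW DATABASES").show()
    print("Metastore reachable")
except Exception as err:  # typically a Thrift/connection error when down
    print(f"Metastore check failed: {err}")
```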

1 More Replies
Mumrel
by Contributor
  • 3599 Views
  • 2 replies
  • 2 kudos

Resolved! Error 95 when importing one Notebook into another

When I follow the instructions in "Modularize your code using files" I get the error shown in the attached image. I am on Azure, use DBR 12.2 LTS, and use ADLS as storage; I am happy to provide more details if needed. My research suggests that the reason is that the DBFS FUSE...

Latest Reply
-werners-
Esteemed Contributor III
  • 2 kudos

import works for .py files; %run is for notebooks. Is lib a .py file or a notebook?
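A sketch of the distinction, under the assumption that the workspace-files feature is enabled:

```
# In Databricks, `import` and `%run` are not interchangeable:
#   - `import lib` works when lib.py is a workspace *file* (with files in
#     repos, the notebook's directory is placed on sys.path automatically)
#   - `%run ./lib` works when lib is a *notebook*, and the magic must sit
#     alone in its own cell
# A quick way to see where Python will actually import from:
import sys

print(sys.path[:3])  # the notebook's working directory should appear here
```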

1 More Replies
Thijs
by New Contributor III
  • 4385 Views
  • 3 replies
  • 4 kudos

How do I define & run jobs that execute scripts that are copied inside a custom Databricks container?

Hi all, we are building custom Databricks containers (https://docs.databricks.com/clusters/custom-containers.html). During the container build process we install dependencies and also Python source code scripts. We now want to run some of these scrip...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Thijs van den Berg​ Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

2 More Replies
frank7
by New Contributor II
  • 4758 Views
  • 2 replies
  • 1 kudos

Resolved! Is it possible to write a pyspark dataframe to a custom log table in Log Analytics workspace?

I have a PySpark DataFrame that contains information about the tables that I have in a SQL database (creation date, number of rows, etc.). Sample data: { "Day": "2023-04-28", "Environment": "dev", "DatabaseName": "default", "TableName": "discount"...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Bruno Simoes​: Yes, it is possible to write a PySpark DataFrame to a custom log table in a Log Analytics workspace using the Azure Log Analytics Workspace API. Here's a high-level overview of the steps you can follow:
- Create an Azure Log Analytics Works...
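A sketch of the posting step via the Log Analytics HTTP Data Collector API; the workspace id, key, and log type are placeholders, and only small DataFrames should be collected to the driver before posting:

```
import base64
import datetime
import hashlib
import hmac
import json

import requests

WORKSPACE_ID = "<log-analytics-workspace-id>"   # placeholder
SHARED_KEY = "<primary-or-secondary-key>"       # placeholder
LOG_TYPE = "TableStats"  # records land in the custom table TableStats_CL


def post_rows(rows):
    """POST a list of dicts as JSON records, signed per the Data Collector API."""
    body = json.dumps(rows)
    date = datetime.datetime.utcnow().strftime("%a, %d %b %Y %H:%M:%S GMT")
    string_to_sign = (
        f"POST\n{len(body)}\napplication/json\nx-ms-date:{date}\n/api/logs"
    )
    signature = base64.b64encode(
        hmac.new(
            base64.b64decode(SHARED_KEY),
            string_to_sign.encode("utf-8"),
            hashlib.sha256,
        ).digest()
    ).decode()
    resp = requests.post(
        f"https://{WORKSPACE_ID}.ods.opinsights.azure.com/api/logs"
        "?api-version=2016-04-01",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"SharedKey {WORKSPACE_ID}:{signature}",
            "Log-Type": LOG_TYPE,
            "x-ms-date": date,
        },
        data=body,
    )
    resp.raise_for_status()


# e.g. post_rows([row.asDict() for row in df.collect()])
```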

1 More Replies
