Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Erik_L
by Contributor II
  • 2799 Views
  • 3 replies
  • 4 kudos

Resolved! Support for Parquet brotli compression or a workaround

Spark 3.3.1 supports the brotli compression codec, but when I use it to read parquet files from S3, I get: INVALID_ARGUMENT: Unsupported codec for Parquet page: BROTLI. Example code: df = (spark.read.format("parquet") .option("compression", "brotli")...

Latest Reply
Erik_L
Contributor II
  • 4 kudos

Given the new information I appended, I looked into the Delta caching and I can disable it: .option("spark.databricks.io.cache.enabled", False). This works as a workaround while I read these files in to save them locally in DBFS, but does it have perfo...
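
For later readers, a minimal sketch of this workaround, assuming the cache conf is set for the session via spark.conf rather than as a reader option (the S3 and DBFS paths are placeholders):

    # Disable the Databricks IO (Delta) cache before reading Brotli Parquet.
    spark.conf.set("spark.databricks.io.cache.enabled", "false")

    # The Parquet footer already records the codec, so no compression
    # option is needed on read.
    df = spark.read.parquet("s3://my-bucket/brotli-data/")
    df.write.format("delta").save("dbfs:/tmp/brotli_import")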

2 More Replies
KVNARK
by Honored Contributor II
  • 847 Views
  • 1 reply
  • 4 kudos

Resolved! REVOKE access from users

There is a use case where we want to REVOKE access from users so that they can't run the VACUUM command on a Delta table. Can anyone please help here?

Latest Reply
Priyanka_Biswas
Valued Contributor
  • 4 kudos

Hello @KVNARK. We cannot specifically restrict the VACUUM operation alone. You need to remove "MODIFY" access on the table and restrict users to the "read" (SELECT) operation. Please note that if you restrict to only "read", it will also affect all the write, up...
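
A minimal sketch of that approach, assuming table access control (or Unity Catalog) is enabled on the workspace; the table and user names are placeholders:

    # Remove write access so VACUUM (and all other writes) are blocked,
    # then leave read-only access in place. Names are hypothetical.
    spark.sql("REVOKE MODIFY ON TABLE main.default.my_table FROM `user@example.com`")
    spark.sql("GRANT SELECT ON TABLE main.default.my_table TO `user@example.com`")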

229031
by New Contributor II
  • 973 Views
  • 1 reply
  • 1 kudos

Using your own Docker container to launch a Databricks cluster

When using your own Docker container while creating a Databricks cluster, what is the mapping between the number of containers launched and the nodes launched? Is it a 1:1 mapping? Or is it similar to other orchestration frameworks like Kubernetes? Or is ...

Latest Reply
FRG96
New Contributor III
  • 1 kudos

+1

Asterol
by New Contributor III
  • 1694 Views
  • 4 replies
  • 5 kudos

Data Engineer Associate and Professional title holders count

How many people hold the titles of certified Databricks Data Engineer Associate/Professional right now? Is there any place I can check the global certificate count?

Latest Reply
sher
Valued Contributor II
  • 5 kudos

check here: https://credentials.databricks.com/collection/da21363e-5c7d-410a-b144-dd07d3e22942?_ga=2.163643839.1823848454.1674389186-2106443313.1667211405&_gac=1.49521364.1672812437.CjwKCAiAwc-dBhA7EiwAxPRylBN9S-JeQ8779ec3GXJYBQPfnu_qkv5l_MKO1u4jw2w-...

3 More Replies
TeachingWithDat
by New Contributor II
  • 5221 Views
  • 3 replies
  • 3 kudos

I am getting this error: com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: com.databricks.rpc.UnknownRemoteException: Remote exception occurred:

I am teaching a class for BYU Idaho and every table in every database has been imploded for my class. We keep getting this error: com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: com.databricks.rpc.UnknownRemoteException: ...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Databricks University Alliance, we haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please do share it with the community...

2 More Replies
Ogi
by New Contributor II
  • 1925 Views
  • 3 replies
  • 1 kudos

Resolved! Azure CosmosDB change feed ingestion via DLT

Is there a way to ingest Azure CosmosDB data via Delta Live Tables? If I use regular workflows it works well, but with DLT I'm not able to set up the CosmosDB connector on a cluster.

Latest Reply
Ogi
New Contributor II
  • 1 kudos

Thanks a lot! Just wanted to double-check whether this natively exists.

2 More Replies
andrew0117
by Contributor
  • 2009 Views
  • 2 replies
  • 0 kudos

depth of view exceeds the maximum view resolution depth (100).

I got this error after updating a view. How can I increase the value of spark.sql.view.maxNestedViewDepth to work around this? Thanks!

Latest Reply
Debayan
Esteemed Contributor III
  • 0 kudos

Hi, could you please confirm whether you are showing the view (https://docs.databricks.com/sql/language-manual/sql-ref-syntax-aux-show-views.html)? Also, it would be helpful if you posted a screenshot of the error.
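
For anyone hitting the same limit, a minimal sketch of raising it, assuming the conf from the question (with the typo fixed); it can also be set in the cluster's Spark config:

    # Raise the nested-view resolution limit for this session (default 100).
    spark.conf.set("spark.sql.view.maxNestedViewDepth", "200")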

1 More Replies
User16765131552
by Contributor III
  • 1236 Views
  • 1 reply
  • 3 kudos

Delta Sharing Costs

When Delta Sharing is enabled and a link is shared, I understand that the data transfer happens directly and not through the sharing server. I'm curious how costs are calculated. Is the entity making the share available charged for data egress and ...

Latest Reply
Databricks_love
New Contributor II
  • 3 kudos

Any news?

blackcoffeeAR
by Contributor
  • 3026 Views
  • 5 replies
  • 2 kudos

Cannot install com.microsoft.azure.kusto:kusto-spark

Hello, I'm trying to install/update the library com.microsoft.azure.kusto:kusto-spark_3.0_2.12:3.1.x. I tried to install it from the Maven Central repository and using Terraform. It was working previously, and now the installation always ends with an error: │ Error: c...

Latest Reply
phisolani
New Contributor II
  • 2 kudos

I have the same problem with a slightly different version of the connector (a change in the minor version). I have a job that runs every hour, and this specifically started to happen from the 23rd of January onwards. The error indeed says the same: Ru...

4 More Replies
Dipesh
by New Contributor II
  • 3170 Views
  • 4 replies
  • 2 kudos

Pausing a scheduled Azure Databricks job after failure

Hi all, I have a job/workflow scheduled in Databricks to run every hour. How can I configure my job to pause whenever a run fails? (Pause the job/workflow on the first failure.) I would want to prevent triggering multiple runs due to the scheduled/...

Latest Reply
Dipesh
New Contributor II
  • 2 kudos

Hi @Hubert Dudek, thank you for your suggestion. I understand that we can use the Jobs API to change the pause_status of a job on errors, but sometimes we observed that the workflow/job fails due to cluster issues (while the job clusters are getting creat...
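
For reference, a hedged sketch of the Jobs API call mentioned above; the host, token, job_id, and cron values are placeholders, and jobs/update replaces the whole schedule block, so the cron expression and timezone must be resent along with pause_status:

    import requests

    host = "https://<your-workspace>.cloud.databricks.com"  # placeholder
    token = "<personal-access-token>"                       # placeholder

    # Pause the job by updating its schedule via Jobs API 2.1.
    resp = requests.post(
        f"{host}/api/2.1/jobs/update",
        headers={"Authorization": f"Bearer {token}"},
        json={
            "job_id": 12345,  # placeholder
            "new_settings": {
                "schedule": {
                    "quartz_cron_expression": "0 0 * * * ?",  # hourly (placeholder)
                    "timezone_id": "UTC",
                    "pause_status": "PAUSED",
                }
            },
        },
    )
    resp.raise_for_status()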

3 More Replies
User16783853906
by Contributor III
  • 1196 Views
  • 1 reply
  • 1 kudos

Understanding file retention with Vacuum

I have seen a few instances where users reported that they run OPTIMIZE on the past week's worth of data and follow with VACUUM with RETAIN 168 HOURS (for example), yet the old files aren't being deleted: "VACUUM is not removing old files from the tab...

Latest Reply
Priyanka_Biswas
Valued Contributor
  • 1 kudos

Hello @Venkatesh Kottapalli. VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. ...
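
To make the retention rule concrete, a minimal sketch assuming a Delta table named "events" (the name is a placeholder):

    # Only files that are BOTH unreferenced by the latest table state AND
    # older than the retention threshold are removed.
    spark.sql("VACUUM events RETAIN 168 HOURS")

    # DRY RUN lists the files that would be deleted without removing them.
    spark.sql("VACUUM events RETAIN 168 HOURS DRY RUN")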

Kaniz_Fatma
by Community Manager
  • 1026 Views
  • 1 reply
  • 1 kudos
Latest Reply
Priyanka_Biswas
Valued Contributor
  • 1 kudos

Here is the command to create a cluster using the Databricks CLI:

databricks clusters create --json-file create-cluster.json

create-cluster.json:

{
  "cluster_name": "my-cluster",
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "spark_conf": ...
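
For reference, a hedged sketch that writes a complete version of that file; every value below, including num_workers, is illustrative rather than taken from the original (truncated) post:

    import json

    cluster_spec = {
        "cluster_name": "my-cluster",
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "spark_conf": {"spark.speculation": True},  # assumed example conf
        "num_workers": 2,                           # assumed addition
    }

    # Produces the file consumed by: databricks clusters create --json-file create-cluster.json
    with open("create-cluster.json", "w") as f:
        json.dump(cluster_spec, f, indent=2)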

User16826992666
by Valued Contributor
  • 910 Views
  • 1 reply
  • 3 kudos

When developing Delta Live Tables, is there a way to see the query history?

I am not sure where I can look currently to see how my DLT queries are performing. How can I investigate the query plan for past DLT runs?

Latest Reply
Priyanka_Biswas
Valued Contributor
  • 3 kudos

Hello @Trevor Bishop. You can check the query plan in the Spark UI, SQL tab. You would need to select the past run from the dropdown and click on Spark UI. Additionally, an event log is created and maintained for every Delta Live Tables pipeline. The event ...
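
A hedged sketch of querying that event log: for pipelines configured with a storage location, it is kept as a Delta table under <storage>/system/events (the path below is a placeholder):

    # Load the DLT event log and inspect flow progress events.
    events = spark.read.format("delta").load("dbfs:/pipelines/my-pipeline/system/events")
    (events
     .filter("event_type = 'flow_progress'")
     .select("timestamp", "message", "details")
     .show(truncate=False))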

databicky
by Contributor II
  • 1299 Views
  • 2 replies
  • 1 kudos

How to get the status of a notebook from a different notebook

I want to run two notebooks: if the count is not equal to zero, first I want to trigger the first notebook, then check whether that particular notebook succeeded or not. Until it succeeds, it needs to wait (like a sleep); if it succeeds, then ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

You can use dbutils.notebook.run() to execute a notebook from another notebook if conditions are met in your custom logic; you can also use dbutils.jobs.taskValues to pass values between notebooks: https://docs.databricks.com/workflows/jobs/how-to-sha...
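
A minimal sketch of that polling pattern; the notebook name and the "SUCCESS" convention are hypothetical, and the child notebook is assumed to end with dbutils.notebook.exit("SUCCESS"):

    import time

    if count != 0:  # 'count' comes from your own check
        status = None
        while status != "SUCCESS":
            try:
                # Returns the child notebook's dbutils.notebook.exit() value.
                status = dbutils.notebook.run("first_notebook", 3600)
            except Exception:
                status = "FAILED"  # run() raises if the child notebook fails
            if status != "SUCCESS":
                time.sleep(60)  # wait before retrying, as described above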

1 More Replies
Dipesh
by New Contributor II
  • 1424 Views
  • 1 reply
  • 1 kudos

Resolved! Bulk updating Delta tables in Databricks

Hi all, I have some data in a Delta table with multiple columns, and each record has a unique identifier. I want to update some columns as per the new values coming in for each of these unique records. However, updating one record at a time is taking a lot...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Yes, by using the MERGE statement.
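
A minimal sketch, assuming a Delta table "target" keyed by "id" and the incoming values registered as a temp view "updates" (all names are placeholders):

    # Update matching records in bulk and insert new ones in a single pass.
    spark.sql("""
        MERGE INTO target AS t
        USING updates AS u
        ON t.id = u.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)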
