Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AndyM
by New Contributor II
  • 1304 Views
  • 1 reply
  • 0 kudos

Databricks import api with lakeview dashboard "error_code":"INVALID_PARAMETER_VALUE"

Hi Community! I was trying to use the import API to replicate a Lakeview dashboard in a workspace, but I keep bumping into an INVALID_PARAMETER_VALUE error. After spending some time getting the "content" property to a (probably) correct base64 string ...

Latest Reply
SergeRielau
Databricks Employee
  • 0 kudos

There may be more information in the log. Look for "Cannot parse the zip archive."
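For reference, a rough sketch of the Workspace Import call in Python (the host, token, target path, and the exact "format" value are assumptions here, not taken from the thread):

import base64
import requests

host = "https://<workspace-host>"
token = "<personal-access-token>"

# Base64-encode the exported dashboard definition before sending it
with open("my_dashboard.lvdash.json", "rb") as f:
    content_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    f"{host}/api/2.0/workspace/import",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "path": "/Users/someone@example.com/my_dashboard.lvdash.json",
        "format": "AUTO",  # let the service infer the file type from the extension
        "content": content_b64,
        "overwrite": True,
    },
)
resp.raise_for_status()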

zyang
by Contributor
  • 14610 Views
  • 12 replies
  • 13 kudos

Option "delta.columnMapping.mode","name" introduces unexpected result

Hi, I am trying to write and create a Delta table with "delta.columnMapping.mode" set to "name", partitioned by date. But I found that when I enable this option, the partition folder names are no longer the date values; instead they are some random two-letter names. A...
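For context, a minimal sketch of enabling the option at table creation time (the table, column, and partition names below are hypothetical, not from the original post):

spark.sql("""
  CREATE TABLE IF NOT EXISTS events (
    id BIGINT,
    payload STRING,
    date DATE
  )
  USING DELTA
  PARTITIONED BY (date)
  TBLPROPERTIES (
    'delta.columnMapping.mode' = 'name',
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5'
  )
""")
# Note: with column mapping enabled, Delta no longer relies on human-readable
# partition directory names; partition pruning is driven by the transaction log,
# which is why the folder names on storage look random.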

Latest Reply
CkoockieMonster
New Contributor II
  • 13 kudos

Hello, I'm a bit late to the party, but I'll put this here for posterity: there's a way to rename your odd two-letter folders and still have your table working, but it violates the best-practice guidelines suggested by Databricks, and I don't thi...

11 More Replies
melbourne
by Contributor
  • 1442 Views
  • 2 replies
  • 1 kudos

Unable to write to Volume from DLT pipeline

Hi, I have a DLT pipeline running in Unity Catalog, and one of the tasks is to write content into a file within a volume. I was able to write to a file within the volume using plain PySpark; however, when I do the same in DLT, I get an error: OSError: [Errno 30] Rea...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

It seems like a permission error. Can you check whether the managed identity has the correct permissions to write to the volume? Writing to volumes from DLT should be supported.
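For a quick sanity check outside DLT, writing to a volume path from plain Python looks roughly like this (the catalog/schema/volume names are placeholders):

# Unity Catalog volumes are exposed under /Volumes/<catalog>/<schema>/<volume>
volume_file = "/Volumes/my_catalog/my_schema/my_volume/exports/output.txt"
with open(volume_file, "w") as f:
    f.write("pipeline output")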

1 More Replies
param_sen
by New Contributor II
  • 3028 Views
  • 1 reply
  • 0 kudos

What is the best practice for data model in silver layer in lakehouse

As per the Databricks glossary https://www.databricks.com/glossary/medallion-architecture, the silver layer typically represents the "enterprise view" with improved quality compared to bronze (cleansed, deduplicated, augmented) and mostly holds 3NF-like normalised data. The...

Latest Reply
Palash01
Valued Contributor
  • 0 kudos

Hey @param_sen, given your concerns about expensive joins and prioritizing analytics with flat raw data, here are some suggestions: analyze the most common queries and reports you anticipate. Do they heavily rely on joins across dimensions? If not, the...

VitaliiK
by New Contributor
  • 3426 Views
  • 1 reply
  • 1 kudos

Resolved! Asset Bundles Scala

Do asset bundles support Scala notebooks? I am trying to run a simple job that uses a Scala notebook and I'm getting an error: "run failed with error message Your administrator has only allowed sql and python commands on this cluster. This execution contai...

Data Engineering
asset bundles
Latest Reply
Ayushi_Suthar
Databricks Employee
  • 1 kudos

Hi @VitaliiK, thanks for bringing up your concerns; I am always happy to help. The error means that the cluster is configured to allow only the SQL and Python languages in notebooks, and the notebook you are trying to run contains Scala langua...
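If the cluster definition is under your control (for example, a job cluster in the bundle), the restriction usually comes from the allowed-languages Spark conf; a hypothetical cluster spec, not taken from the thread, might look like:

# Hypothetical job-cluster spec; the relevant key is the allowedLanguages conf
new_cluster = {
    "spark_version": "14.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 1,
    "spark_conf": {
        # include "scala" so Scala notebook cells are permitted on the cluster
        "spark.databricks.repl.allowedLanguages": "python,sql,scala",
    },
}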

OLAPTrader
by New Contributor III
  • 2786 Views
  • 3 replies
  • 1 kudos

Resolved! autoloader stops working if I do not drop table each time

I first create a catalog and schema and ingest some data into it as follows:
catalogName = 'neo'
schemaName = 'indicators'
locationPath = 's3a://databricks-workspace-olaptrader-stack-1-bucket/unity-catalog/99999999xxx'
sqlContext.sql(f"CREATE CATALOG IF N...

Latest Reply
OLAPTrader
New Contributor III
  • 1 kudos

My issue was due to the fact that I have over 300 columns; because of datatype mismatches, the rows were actually written to the table, but the values were all null. That's why I didn't get any errors. I am doing manual datatype mapping now and I am able t...
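A sketch of supplying explicit type hints to Auto Loader so column types are pinned up front (the paths, columns, and target table are placeholders):

df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "s3a://my-bucket/_schemas/indicators")
      # pin the types you care about; unlisted columns are still inferred
      .option("cloudFiles.schemaHints", "price DOUBLE, volume BIGINT, ts TIMESTAMP")
      .load("s3a://my-bucket/indicators/"))

(df.writeStream
   .option("checkpointLocation", "s3a://my-bucket/_checkpoints/indicators")
   .trigger(availableNow=True)
   .toTable("neo.indicators.prices"))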

2 More Replies
Anonymous
by Not applicable
  • 2111 Views
  • 2 replies
  • 0 kudos

Resolved! Auto optimize config

Does auto-optimize work for existing tables only, or will it work for both existing and new tables when we enable it at the cluster config level?

Latest Reply
Mooune_DBU
Valued Contributor
  • 0 kudos

If you're referring to Delta tables, Auto Optimize will work for both.
For new tables:
CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true, delta.autoOptimize.autoCompact = true)
For existing tables...
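For existing tables, the same properties can be applied afterwards; a small sketch using the reply's example table name:

spark.sql("""
  ALTER TABLE student SET TBLPROPERTIES (
    'delta.autoOptimize.optimizeWrite' = 'true',
    'delta.autoOptimize.autoCompact' = 'true'
  )
""")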

1 More Replies
karthik-kobai
by New Contributor II
  • 1421 Views
  • 0 replies
  • 0 kudos

Databricks-jdbc and vulnerabilities CVE-2021-36090 CVE-2023-6378 CVE-2023-6481

The latest version of Databricks-jdbc available through Maven (2.6.36) now has these three vulnerabilities:
https://www.cve.org/CVERecord?id=CVE-2021-36090
https://www.cve.org/CVERecord?id=CVE-2023-6378
https://www.cve.org/CVERecord?id=CVE-2023-6481
All ...

Christoph
by New Contributor II
  • 1422 Views
  • 3 replies
  • 0 kudos

Internal Error when querying a doubleType column of a delta table using ">" "<" operators

Hi there, we are currently facing a pretty confusing issue: we have a Delta table (~2 TB) which has been working just fine over the last few years and months. For a few days or weeks now, querying the table on one of its columns, let's call it double_co...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

It might be a bug which is already logged, or a new one. You can check the Spark Jira pages.

2 More Replies
Mohamednazeer
by New Contributor III
  • 2519 Views
  • 1 reply
  • 0 kudos

Resolved! IllegalArgumentException: Mount failed due to invalid mount source

We are trying to create mounts for containers from two different storage accounts. We are using Azure Storage Accounts and Azure Databricks. We were able to create a mount for containers from one storage account, but when we try to create the mount for ...

Latest Reply
Mohamednazeer
New Contributor III
  • 0 kudos

Hi community, the issue was because of cross-vnet access. The storage account and the Databricks workspace are in different vnets, so we had to create a private endpoint to access the cross-vnet resources. Once we created the private endpoint ...
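For reference, a minimal sketch of mounting a container from the second storage account with OAuth (all identifiers and the secret scope are placeholders):

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="my-scope", key="sp-secret"),
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://container2@secondstorageaccount.dfs.core.windows.net/",
    mount_point="/mnt/container2",
    extra_configs=configs,
)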

rhevarr
by New Contributor II
  • 1185 Views
  • 0 replies
  • 0 kudos

Course: Apache Spark Programming with Databricks ID: E-P0W7ZV // Issue Classroom-Setup

Hello, I am trying to run the Classroom-Setup from the course files notebook (ASP 1.1 - Databricks Platform) (Course: Apache Spark™ Programming with Databricks, ID: E-P0W7ZV). Instructions: "Setup: Run classroom setup to mount Databricks training datasets an...

Data Engineering
academy
Course
Databricks
spark
Hardy
by New Contributor III
  • 7781 Views
  • 6 replies
  • 3 kudos

upload files to dbfs:/volume using databricks cli

In our Azure pipeline we are using the databricks-cli command to upload jar files to the dbfs:/FileStore location, and that works perfectly fine. But when we try to use the same command to upload files to dbfs:/Volume/dev/default/files, it does not work and g...

Latest Reply
saikumar246
Databricks Employee
  • 3 kudos

@Hardy I think you are using the word Volume in the path, but it should be Volumes (plural), not Volume (singular). Try copying the volume path directly from the Workspace and using that.
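If you want to sanity-check the corrected path from a notebook before changing the pipeline, a quick sketch (the jar name is a placeholder):

# Copy a file to the Unity Catalog volume; note the plural "Volumes" prefix
dbutils.fs.cp(
    "dbfs:/FileStore/jars/app.jar",
    "/Volumes/dev/default/files/app.jar",
)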

5 More Replies
Volker
by New Contributor III
  • 1843 Views
  • 2 replies
  • 2 kudos

Preferred compression format for ingesting large amounts of JSON files with Autoloader

Hello Databricks Community, in an IoT context we plan to ingest a large number of JSON files (~2 million per day). The JSON files are in JSON Lines format and need to be compressed on the IoT devices. We can provide suggestions for the type of compres...

Latest Reply
Volker
New Contributor III
  • 2 kudos

Hi, sorry, I guess my response wasn't sent. The source is JSON files that are uploaded to an S3 bucket. The sink will be a Delta table, and we are using Auto Loader. The question was about the compression format of the incoming JSON files, e.g. if it wo...
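For what it's worth, gzip-compressed JSON Lines files are typically read transparently; a minimal Auto Loader sketch (the bucket, schema location, and target table are placeholders):

# Auto Loader picks up *.json.gz alongside plain *.json; gzip is not splittable,
# which is usually acceptable for many small per-device files.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "s3://iot-bucket/_schemas/events")
      .load("s3://iot-bucket/events/"))

(df.writeStream
   .option("checkpointLocation", "s3://iot-bucket/_checkpoints/events")
   .trigger(availableNow=True)
   .toTable("iot.bronze.events"))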

1 More Replies
FerArribas
by Contributor
  • 1149 Views
  • 1 reply
  • 0 kudos

Custom JobGroup in Spark UI for cluster with multiple executions

Does anyone know what the first digits of the job group shown in the Spark UI mean when using all-purpose clusters to launch multiple jobs? Right now the pattern is something like: [id_random]_job_[job_id]_run-[run_id]_action_[action].

Latest Reply
saikumar246
Databricks Employee
  • 0 kudos

Hi @FerArribas, the first digits of the job group shown in the Spark UI are the execContextId and cmdId (command ID). You can think of the execContextId as a kind of "REPL ID". For example, if you take the below job group ID as an example, jobGr...
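If the goal is to tag executions with your own identifier, Spark's standard mechanism is setJobGroup; a small sketch (the group ID and description are arbitrary, and Databricks may still assign its own group ID to notebook commands):

sc = spark.sparkContext
sc.setJobGroup("nightly-etl", "daily aggregation run", interruptOnCancel=True)
result = spark.range(1_000_000).selectExpr("sum(id) AS total").collect()
sc.clearJobGroup()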

luriveros
by New Contributor
  • 5386 Views
  • 1 reply
  • 0 kudos

implementing liquid clustering for DataFrames directly

Hi! I have a question: is it possible to implement liquid clustering for DataFrames saved directly as Delta files (df.write.format("delta").save("path")), rather than the conventional approach involving table creation?

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Hi, hopefully this question is related to testing and any production data would get persisted to a table, but one example is:
df = (spark.range(10).write.format("delta").mode("append").save("file:/tmp/data"))
ALTER TABLE delta.`file:/tmp/data` CLUSTER BY...
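Expanding the reply into a runnable sketch (the path and clustering column are placeholders; applying CLUSTER BY to an existing path-based table may require a recent runtime):

(spark.range(10)
     .write.format("delta")
     .mode("append")
     .save("/tmp/data"))

# Enable liquid clustering on the path-based table, then cluster the existing files
spark.sql("ALTER TABLE delta.`/tmp/data` CLUSTER BY (id)")
spark.sql("OPTIMIZE delta.`/tmp/data`")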


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group