- 4880 Views
- 4 replies
- 4 kudos
Spark 3.3.1 supports the Brotli compression codec, but when I use it to read Parquet files from S3, I get: INVALID_ARGUMENT: Unsupported codec for Parquet page: BROTLI. Example code:
df = (spark.read.format("parquet")
    .option("compression", "brotli")...
Latest Reply
Given the new information I appended, I looked into Delta caching, and I can disable it: .option("spark.databricks.io.cache.enabled", False). This works as a workaround while I read these files in to save them locally in DBFS, but does it have perfo...
3 More Replies
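For readers hitting the same error, here is a minimal sketch of that workaround. The S3 path is a placeholder, and the cache flag is shown as a session-level Spark configuration rather than a reader option:

```python
# Sketch of the workaround above: disable the Databricks disk cache, which
# appears to be what raises "Unsupported codec for Parquet page: BROTLI".
spark.conf.set("spark.databricks.io.cache.enabled", "false")

# Placeholder S3 path; any Brotli-compressed Parquet location works the same way.
df = spark.read.format("parquet").load("s3://my-bucket/brotli-data/")

# Re-save locally (e.g. as Delta in DBFS) so later reads avoid the workaround.
df.write.format("delta").mode("overwrite").save("dbfs:/tmp/brotli_copy")
```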
- 2064 Views
- 2 replies
- 7 kudos
Rename and drop columns with Delta Lake column mapping. Hi all, Databricks now supports column rename and drop. Column mapping requires the following Delta protocol versions: Reader version 2 or above; Writer version 5 or above. Blog URL##Available in D...
Latest Reply
The above-mentioned feature is not working in the DLT pipeline if the script has more than 4 columns.
1 More Replies
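For a standard (non-DLT) Delta table, a minimal sketch of the steps described above; the table and column names are placeholders:

```python
# Sketch: enable Delta column mapping (requires reader v2 / writer v5),
# then rename and drop columns. `my_table` and column names are placeholders.
spark.sql("""
  ALTER TABLE my_table SET TBLPROPERTIES (
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5',
    'delta.columnMapping.mode' = 'name'
  )
""")
spark.sql("ALTER TABLE my_table RENAME COLUMN old_name TO new_name")
spark.sql("ALTER TABLE my_table DROP COLUMN obsolete_col")
```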
- 15425 Views
- 9 replies
- 1 kudos
Hi everybody, I have 20 years of data, 600M rows. I have partitioned them on year and month to generate a file size that seems reasonable (128 MB). All data is queried using timestamp, as all queries need to filter on the exact hours. So my requirement...
Latest Reply
Hi guys, thanks for your advice. I found a solution: we upgraded the Databricks Runtime to 12.2, and now the pushdown of the partition filter works. The documentation said that 10.4 would be adequate, but obviously it wasn't.
8 More Replies
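A minimal sketch of the layout discussed in this thread, assuming a hypothetical events table with an event_ts timestamp column; filtering on the derived partition columns, not just the raw timestamp, is what lets the engine prune partitions:

```python
from pyspark.sql import functions as F

# Sketch: partition by year/month derived from the event timestamp.
# Paths and column names are placeholders.
events = spark.read.format("delta").load("/data/events")
(events
 .withColumn("year", F.year("event_ts"))
 .withColumn("month", F.month("event_ts"))
 .write.format("delta")
 .partitionBy("year", "month")
 .mode("overwrite")
 .save("/data/events_partitioned"))

# Include the partition columns in the filter so pruning can kick in;
# per the reply above, the pushdown worked once the runtime was on 12.2.
hour_df = (spark.read.format("delta").load("/data/events_partitioned")
           .where("year = 2020 AND month = 5")
           .where(F.col("event_ts").between("2020-05-01 10:00:00",
                                            "2020-05-01 11:00:00")))
```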
- 1764 Views
- 0 replies
- 1 kudos
I have a large Delta table that I need to analyze in native R. The only option I currently have is to query the Delta table and then use collect() to bring that Spark DataFrame into an R dataframe. Is there an alternative method that would allow me to qu...
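No replies yet, but one possible alternative, sketched in Python: the open-source deltalake (delta-rs) package reads a Delta table straight into Arrow without a Spark cluster, and Arrow data can then be handed to R via the arrow bindings. The path is a placeholder:

```python
# Sketch: read a Delta table without Spark using the `deltalake` package
# (delta-rs). The path is a placeholder.
from deltalake import DeltaTable

dt = DeltaTable("/mnt/data/my_delta_table")
arrow_table = dt.to_pyarrow_table()  # Arrow table, no collect() needed
pdf = arrow_table.to_pandas()        # or keep it in Arrow for R interop
```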
- 3283 Views
- 8 replies
- 18 kudos
Dear Team, while I was doing hands-on practice in the course Delta Lake Rapid Start with Python (https://customer-academy.databricks.com/learn/course/97/delta-lake-rapid-start-with-python), I came across false as the output of dbutils.fs.rm(health_t...
Latest Reply
Could you give more detail about your issue (a screenshot or something)? We hope to help you find the issue.
7 More Replies
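For anyone seeing the same output: dbutils.fs.rm returns a boolean, and False typically means nothing was deleted, e.g. the path does not exist or a non-empty directory was removed without the recurse flag. A minimal sketch with a placeholder path:

```python
# Sketch: dbutils.fs.rm returns False when nothing was deleted, e.g. a missing
# path or a non-empty directory without recurse=True. Path is a placeholder.
removed = dbutils.fs.rm("dbfs:/tmp/health_tracker", recurse=True)
print(removed)  # True if the path was actually deleted
```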
- 7628 Views
- 9 replies
- 4 kudos
Hi there, I imported the Delta Lake demo notebook from the Databricks link, and at command 12 it errors out. I tried other ways and paths but couldn't get past the error. Maybe the notebook is outdated? https://www.databricks.com/notebooks/Demo_Hub-Delta_La...
Latest Reply
Hi @AJ DJ, does @Hubert Dudek's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
8 More Replies
- 2658 Views
- 4 replies
- 3 kudos
Hey guys, we're considering Delta Lake as the storage for our project and have a couple of questions. The first is: what's the pricing for Delta Lake? I can't seem to find a page that says x amount costs y. The second question is more technical: if we...
Latest Reply
Delta Lake itself is free; it is a file format. But you will have to pay for storage and compute, of course. If you want to use Databricks with Delta Lake, it will not be free unless you use the Community Edition. Depending on what you are planning to...
3 More Replies
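To illustrate the "it is a file format" point, a minimal sketch of the open-source delta-spark package running on plain Apache Spark at no license cost (requires pip install delta-spark; the app name and path are placeholders):

```python
# Sketch: open-source Delta Lake on plain Apache Spark, no Databricks needed.
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

builder = (SparkSession.builder
           .appName("oss-delta-demo")
           .config("spark.sql.extensions",
                   "io.delta.sql.DeltaSparkSessionExtension")
           .config("spark.sql.catalog.spark_catalog",
                   "org.apache.spark.sql.delta.catalog.DeltaCatalog"))
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Write a tiny Delta table to a local placeholder path.
spark.range(5).write.format("delta").mode("overwrite").save("/tmp/oss_delta")
```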
- 6916 Views
- 6 replies
- 4 kudos
I am trying to use the Databricks Delta Lake Sink Connector (Confluent Cloud) to write to S3. The connector starts up with the following error; any help would be appreciated: org.apache.kafka.connect.errors.ConnectException: java.sql.SQLExcepti...
Latest Reply
Hi @Kaniz Fatma, yes we did; it looks like it was indeed a whitelisting issue. Thanks @Hubert Dudek @Kaniz Fatma!
5 More Replies
- 1445 Views
- 1 reply
- 0 kudos
I've been doing some research on optimizing data storage while implementing Delta; however, I'm not sure which instance type would be best for this.
Latest Reply
OPTIMIZE, as you alluded, has two operations: bin-packing and multi-dimensional clustering (Z-Ordering). Bin-packing optimization is idempotent, meaning that if it is run twice on the same dataset, the second run has no effect. Z-Ordering is not idempotent b...
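A minimal sketch of the two operations on a hypothetical events table:

```python
# Sketch: bin-packing vs. Z-Ordering on a placeholder table `events`.
# Plain OPTIMIZE (bin-packing) is idempotent; rerunning it on unchanged
# data is a no-op.
spark.sql("OPTIMIZE events")

# OPTIMIZE ... ZORDER BY is not idempotent: it re-clusters data files each
# time it runs, even if the data has not changed since the last run.
spark.sql("OPTIMIZE events ZORDER BY (event_ts)")
```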