Welcome to the Databricks Community

Discover the latest insights, collaborate with peers, get help from experts and make meaningful connections

102397 members
52767 posts
Registration now open! Databricks Data + AI Summit 2024

Join tens of thousands of data leaders, engineers, scientists and architects from around the world at Moscone Center in San Francisco, June 10–13.  Explore the latest advances in Apache Spark™, Delta Lake, MLflow, LangChain, PyTorch, dbt, Prest...

  • 7653 Views
  • 1 replies
  • 4 kudos
02-12-2024
Meet DBRX, the New Standard for High-Quality LLMs

Get your first look at DBRX April 25, 2024 | 8 AM PT If you’re using off-the-shelf LLMs to build GenAI applications, you’re probably struggling with quality, privacy and governance issues. What you need is a way to cost-effectively build a custom LLM...

  • 2307 Views
  • 3 replies
  • 2 kudos
2 weeks ago
Data Warehousing in the Era of AI

AI has the power to address the data warehouse’s biggest challenges — performance, governance and usability — thanks to its deeper understanding of your data and how it’s used. This is data intelligence and it’s revolutionizing the way you query, man...

  • 3026 Views
  • 5 replies
  • 1 kudos
2 weeks ago

Community Activity

HaripriyaP
by New Contributor
  • 56 Views
  • 1 replies
  • 0 kudos

Multiple Tables Migration from one workspace to another.

Hi all! I need to copy multiple tables, along with their metadata, from one workspace to another. Is there a way to do this? Please reply as soon as possible.

Latest Reply
shan_chandra
Honored Contributor III
  • 0 kudos

@HaripriyaP - Depending on your use case, either of the approaches below can be used. 1) Use Delta DEEP CLONE to clone the tables to the new workspace. 2) Use the same cluster policy/instance profile as the old workspace to access them in the new worksp...
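
A minimal sketch of the first option, assuming the source tables are reachable from the workspace where the statements run. The schema and table names (`old_catalog.sales`, `new_catalog.sales`, `orders`, `customers`) are hypothetical placeholders. The helper only builds the `CREATE OR REPLACE TABLE ... DEEP CLONE ...` statements; on a cluster each one would be passed to `spark.sql`.

```python
def deep_clone_statements(tables, source_schema, target_schema):
    """Build one DEEP CLONE statement per table name."""
    return [
        f"CREATE OR REPLACE TABLE {target_schema}.{name} "
        f"DEEP CLONE {source_schema}.{name}"
        for name in tables
    ]


# Hypothetical table and schema names, for illustration only.
statements = deep_clone_statements(
    ["orders", "customers"], "old_catalog.sales", "new_catalog.sales"
)
for stmt in statements:
    print(stmt)  # on a Databricks cluster: spark.sql(stmt)
```

Generating the statements from a list keeps the migration repeatable when many tables are involved.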

liormayn
by Visitor
  • 118 Views
  • 1 replies
  • 0 kudos

OSError: [Errno 78] Remote address changed

Hello :) As part of deploying an app that previously ran directly on EMR to Databricks, we are running experiments using LTS 9.1 and getting the following error: PythonException: An exception was thrown from a UDF: 'pyspark.serializers.SerializationEr...

Latest Reply
shan_chandra
Honored Contributor III
  • 0 kudos

@liormayn - Could you please let us know if you had a chance to run it on DBR 10.4 LTS?

jenshumrich
by New Contributor III
  • 175 Views
  • 1 replies
  • 0 kudos

Long running jobs get lost

Hello, I tried to schedule a long-running job and, surprisingly, it seems to neither terminate (and thus does not let the cluster shut down) nor continue running, even though the state is still "Running": But the truth is that the job has miserably ...

jenshumrich_0-1712742957610.png jenshumrich_2-1712743008070.png jenshumrich_3-1712743098546.png
Latest Reply
shan_chandra
Honored Contributor III
  • 0 kudos

@jenshumrich - There is not much information to go on. However, to begin with, can you please check whether there is enough parallelism for the task to execute (spark.sql.shuffle.partitions and the number of cores on the cluster)?
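
As a rough illustration of relating `spark.sql.shuffle.partitions` to the core count, here is a sketch of a sizing helper. The multiplier of 2 tasks per core and the floor of 8 are rule-of-thumb assumptions, not values from the reply, and the function name is hypothetical.

```python
def suggest_shuffle_partitions(total_cores, tasks_per_core=2, minimum=8):
    """Suggest a shuffle partition count as a multiple of the cluster's cores.

    tasks_per_core and minimum are assumed heuristics, not Databricks defaults.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be at least 1")
    return max(minimum, total_cores * tasks_per_core)


suggested = suggest_shuffle_partitions(total_cores=16)
print(suggested)
# On a cluster, the value could then be applied with:
# spark.conf.set("spark.sql.shuffle.partitions", str(suggested))
```

On a running cluster, `spark.sparkContext.defaultParallelism` is one way to obtain a core-based figure to pass in as `total_cores`.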

moh3th1
by Visitor
  • 4 Views
  • 0 replies
  • 0 kudos

Optimal Cluster Configuration for Training on Billion-Row Datasets

Hello Databricks Community, I am currently facing a challenge in configuring a cluster for training machine learning models on a dataset consisting of approximately a billion rows and 40 features. Given the volume of data, I want to ensure that the cl...

bozhu
by Contributor
  • 919 Views
  • 4 replies
  • 0 kudos

Delta Live Tables Materialised View Column Comment Error

While the materialised view doc says MVs support column comments, this does not seem to be the case for MVs created by DLT. For example, when trying to add a comment to an MV created by DLT, it errors: Any ideas on when this will be fixed/supported?

bozhu_0-1692702233893.png
Latest Reply
bozhu
Contributor
  • 0 kudos

Just to close the loop here: it seems DLT-generated MVs now support column comments.

3 More Replies
Chinu
by New Contributor III
  • 977 Views
  • 1 replies
  • 0 kudos

How do I access the DLT advanced configuration from a Python notebook?

Hi Team, I'm trying to get a DLT Advanced Configuration value from the Python DLT notebook. For example, I set "something": "some path" in Advanced configuration in DLT and I want to get the value from my DLT notebook. I tried "dbutils.widgets.get("some...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

The following docs will help. Please check the examples: https://docs.databricks.com/en/delta-live-tables/settings.html#parameterize-pipelines
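
A small sketch of the pattern those docs describe, where pipeline configuration values are read through the Spark configuration rather than through widgets. The helper name `get_pipeline_setting` is hypothetical; it accepts any object with a `get(key, default)` method, so a plain dict stands in for `spark.conf` when run outside a pipeline.

```python
def get_pipeline_setting(conf, key, default=""):
    """Return the configured value for `key`, or `default` when it is unset."""
    return conf.get(key, default)


# Local stand-in for spark.conf, using the key/value from the question.
local_conf = {"something": "some path"}
print(get_pipeline_setting(local_conf, "something"))
print(get_pipeline_setting(local_conf, "missing.key", default="fallback"))

# Inside a DLT notebook the same helper would be called as:
# path = get_pipeline_setting(spark.conf, "something")
```

Supplying a default avoids an exception when a key has not been set in the pipeline's configuration.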

jenshumrich
by New Contributor III
  • 59 Views
  • 1 replies
  • 0 kudos

Filter not using partition

I have the following code: spark.sparkContext.setCheckpointDir("dbfs:/mnt/lifestrategy-blob/checkpoints") result_df.repartitionByRange(200, "IdStation") result_df_checked = result_df.checkpoint(eager=True) unique_stations = result_df.select("IdStation...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Please check the physical query plan. Add an .explain() call to your existing query and look in the plan for any filter push-down happening in your query.
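
One way to follow this suggestion, sketched under assumptions: `DataFrame.explain` prints the plan rather than returning it, so the helper below captures standard output and keeps only lines mentioning `PartitionFilters` or `PushedFilters`, which file-based scan nodes typically report. The helper names and the `df` variable are hypothetical, and a plan read from a checkpointed DataFrame may show no such lines at all.

```python
import contextlib
import io


def capture_plan(df):
    """Return the text printed by df.explain(mode="formatted")."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.explain(mode="formatted")
    return buffer.getvalue()


def filter_lines(plan_text):
    """Keep only plan lines that mention partition or pushed filters."""
    markers = ("PartitionFilters", "PushedFilters")
    return [
        line.strip()
        for line in plan_text.splitlines()
        if any(marker in line for marker in markers)
    ]


# On a cluster, with `df` being the filtered DataFrame under investigation:
# for line in filter_lines(capture_plan(df)):
#     print(line)
```

An empty result suggests the filter is being applied after the scan instead of being pushed into it.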

toolhater
by New Contributor II
  • 67 Views
  • 1 replies
  • 0 kudos

Installing dlt causing error

I'm trying to use the example in big book of engineering 2nd edition-final.pdf and I had an issue with the statement import dlt. So I created another cell and installed it, and I noticed I was getting this error: "dataclass_transform() got an unexpected k...

Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Could you share the full error stack trace, please?

anish2102
by New Contributor
  • 77 Views
  • 1 replies
  • 0 kudos

PySpark operations slowness in cluster 14.3 LTS as compared to 13.3 LTS

In my notebook, I am performing a few join operations that take more than 30s on a 14.3 LTS cluster, whereas the same operations take less than 4s on a 13.3 LTS cluster. Can someone help me optimize PySpark operations like joins and withColum...

Data Engineering
clustr-14.3
spark-3.5
Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Check the physical query plan on both DBR 14.3 and DBR 13.3 and compare them. If they differ, check the Spark UI to identify where the change occurred.
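
A sketch of how the two plans could be compared once each has been saved as text (for example, copied from the `.explain()` output on each runtime). The helper name `diff_plans` is hypothetical. Expression IDs such as `#123` change from run to run, so they are normalized before diffing to reduce noise.

```python
import difflib
import re


def diff_plans(plan_a, plan_b, label_a="DBR 13.3", label_b="DBR 14.3"):
    """Return unified-diff lines between two plan texts, ignoring expression IDs."""

    def normalize(text):
        # Replace run-specific expression IDs like "#123" with a bare "#".
        return [re.sub(r"#\d+", "#", line) for line in text.splitlines()]

    return list(
        difflib.unified_diff(
            normalize(plan_a),
            normalize(plan_b),
            fromfile=label_a,
            tofile=label_b,
            lineterm="",
        )
    )


# plan_13 and plan_14 would hold the plan text captured on each runtime:
# for line in diff_plans(plan_13, plan_14):
#     print(line)
```

Lines prefixed with `-` or `+` in the output point to the operators that differ, which narrows down where to look in the Spark UI.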

Hubert-Dudek
by Esteemed Contributor III
  • 74 Views
  • 1 replies
  • 0 kudos

Spot databricks VMs - eviction rates

Before using Spot machines in #databricks, it's a good idea to check their eviction rates in your region. Azure Resource Graph Explorer and that simple query will help. SpotResources | where type =~ 'microsoft.compute/skuspotevictionrate/location' ...

spots.png
Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Thank you for sharing this @Hubert-Dudek 

Karlo_Kotarac
by New Contributor II
  • 34 Views
  • 1 replies
  • 0 kudos

Run failed with error message ContextNotFound

Hi all! Recently we've been getting lots of these errors when running Databricks notebooks: At that time we observed a DRIVER_NOT_RESPONDING (Driver is up but is not responsive, likely due to GC.) log on the single-user cluster we use. Previously when thi...

Karlo_Kotarac_0-1713422302017.png
Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Are you able to get the full error stack trace from the driver's logs?

Hubert-Dudek
by Esteemed Contributor III
  • 42 Views
  • 1 replies
  • 0 kudos

Nulls in Merge

If you are going to handle any null values in your MERGE condition, better watch out for your syntax #databricks
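
The details of the post are in the attached image, which is not reproduced in the text. As an assumed illustration of one common pitfall, the sketch below emulates in plain Python how SQL's `=` yields NULL (unknown) when either side is NULL, so such rows never satisfy a match condition, while Spark SQL's null-safe operator `<=>` treats two NULLs as equal. The function names are hypothetical.

```python
def sql_equals(a, b):
    """Emulate SQL `=`: the result is None (unknown) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_null_safe_equals(a, b):
    """Emulate SQL `<=>`: NULL <=> NULL is True, NULL <=> value is False."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


# A condition only matches when it evaluates to True, so the None result
# from sql_equals(None, None) behaves like "no match".
print(sql_equals(None, None))
print(sql_null_safe_equals(None, None))
```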

merge_danger.png
Latest Reply
jose_gonzalez
Moderator
  • 0 kudos

Thank you for sharing @Hubert-Dudek 

mvmiller
by New Contributor III
  • 12 Views
  • 0 replies
  • 0 kudos

Workflow file arrival trigger - does it apply to overwritten files?

I am exploring the use of the "file arrival" trigger for a workflow for a use case I am working on.  I understand from the documentation that it checks every minute for new files in an external location, then initiates the workflow when it detects a ...

Ajay-Pandey
by Esteemed Contributor III
  • 636 Views
  • 3 replies
  • 2 kudos

Resolved! Update regarding Community Reward Store

Hi Team, is there any update on the Community Reward Store? It has been discontinued from the old portal, and we still can't see it in the new portal. Is there an expected date when it will be available for community members?

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 2 kudos

Thanks for the update.

2 More Replies
chloeh
by New Contributor
  • 158 Views
  • 1 replies
  • 0 kudos

Using SQL for Structured Streaming

Hi! I'm new to Databricks. I'm trying to create a data pipeline with structured streaming. A minimal example data pipeline would look like: read from upstream Kafka source, do some data transformation, then write to downstream Kafka sink. I want to do...

Latest Reply
chloeh
New Contributor
  • 0 kudos

Ok, I figured out why I was getting an error on the usage of `read_kafka`. My default cluster was set up with the wrong Databricks runtime.
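
A hedged sketch of a guard against this kind of mismatch: Databricks clusters expose the runtime version in the `DATABRICKS_RUNTIME_VERSION` environment variable, which can be compared against a required minimum before a feature is used. The helper names are hypothetical, and the minimum of "13.3" in the usage comment is an assumption to verify against the `read_kafka` documentation.

```python
import os


def parse_runtime(version):
    """Parse a version string like "13.3" into a (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def runtime_at_least(minimum, current=None):
    """Return True when the current runtime is at or above `minimum`.

    Falls back to the DATABRICKS_RUNTIME_VERSION environment variable and
    returns False when the version is missing or not in major.minor form.
    """
    if current is None:
        current = os.environ.get("DATABRICKS_RUNTIME_VERSION", "")
    try:
        return parse_runtime(current) >= parse_runtime(minimum)
    except ValueError:
        return False


# Assumed minimum; check the read_kafka documentation for the real requirement.
# if not runtime_at_least("13.3"):
#     raise RuntimeError("This cluster's runtime may be too old for read_kafka")
```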


Latest from our Blog

Attributing Costs in Databricks Model Serving

Databricks Model Serving provides a scalable, low-latency hosting service for AI models. It supports models ranging from small custom models to best-in-class large language models (LLMs). In this blog...

2504 Views 1 kudos

MLOps Gym - Unity Catalog Setup for MLOps

Unity Catalog (UC) is Databricks' unified governance solution for all data and AI assets on the Data Intelligence Platform. UC is central to implementing MLOps on Databricks as it is where all your as...

2784 Views 1 kudos