Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Mirko
by Contributor
  • 2117 Views
  • 3 replies
  • 0 kudos

Resolved! Location for DB and for specific tables in DB

The following situation: I am creating a database with a location somewhere in my Azure Lake Gen 2: CREATE SCHEMA IF NOT EXISTS curated LOCATION 'somelocation'. Then I want a specific table in curated to be in a subfolder in 'somelocation': CREATE TABLE IF...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Mirko Ludewig​ - Thanks for letting us know. I don't like strange all that much, but I do like working as desired!

  • 0 kudos
2 More Replies
data_scientist
by New Contributor II
  • 1933 Views
  • 1 replies
  • 1 kudos

how to load a .w2v format saved model in databricks

Hi, I am trying to load a pre-trained word2vec model that was saved in .w2v format in Databricks. I am not able to load this file. Please help me with the correct command.
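The post does not say which tool wrote the .w2v file, so the following is only a minimal sketch under one assumption: that the file uses the plain-text word2vec layout (a header line with vocabulary size and vector dimension, then one word and its values per line). The function names are hypothetical and the file may well be in another format.

```python
from typing import Dict, Iterable, List


def parse_word2vec_text(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse the plain-text word2vec layout into a {word: vector} dict.

    Assumed layout (not confirmed by the post): a header line
    "<vocab_size> <dimensions>", then one "<word> <v1> ... <vN>" line per word.
    """
    iterator = iter(lines)
    vocab_size, dims = (int(x) for x in next(iterator).split())
    vectors: Dict[str, List[float]] = {}
    for line in iterator:
        parts = line.rstrip().split(" ")
        if len(parts) != dims + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError(f"expected {vocab_size} words, found {len(vectors)}")
    return vectors


def load_word2vec_text(path: str) -> Dict[str, List[float]]:
    # On Databricks, DBFS files are usually reachable through the /dbfs/ mount,
    # so a path such as "/dbfs/FileStore/..." can be opened like a local file.
    with open(path, encoding="utf-8") as handle:
        return parse_word2vec_text(handle)
```

If the file was written by gensim, its own loaders (KeyedVectors.load_word2vec_format for the text or binary word2vec layout, Word2Vec.load for models stored with save) are probably the better route.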

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi there and welcome to the community! My name is Piper, and I'm a moderator for the community. Thank you for coming to us with your question. We will give it a bit to see how your peers respond and then we will circle back if we need to.

  • 1 kudos
Balaramya
by New Contributor II
  • 1401 Views
  • 2 replies
  • 1 kudos

Hi Team, I took the Databricks Apache Spark 3.0 (Scala) exam on 25th January 2022 (IST 9 AM to 11 AM) and passed it but still have not received m...

Hi Team, I took the Databricks Apache Spark 3.0 (Scala) exam on 25th January 2022 (IST 9 AM to 11 AM) and passed it, but I still have not received my badge. I have contacted the support team twice but still have no response. @Kaniz Fatma, kindly help to m...

Latest Reply
Balaramya
New Contributor II
  • 1 kudos

Databricks team, kindly help with the above.

  • 1 kudos
1 More Replies
Mirko
by Contributor
  • 11030 Views
  • 12 replies
  • 2 kudos

Resolved! strange error with dbutils.notebook.run(...)

The situation is as follows: I have a scheduled job that uses dbutils.notebook.run(path, timeout). During the last week everything worked smoothly. During the weekend the job began to fail at the dbutils.notebook.run(path, timeout) command. I get th...

Latest Reply
User16753724663
Valued Contributor
  • 2 kudos

Hi @Florent POUSSEROT​ Apologies for the delay. Could you please confirm if you are still facing the issue?

  • 2 kudos
11 More Replies
ST
by New Contributor II
  • 2442 Views
  • 1 replies
  • 2 kudos

Resolved! Convert Week of Year to Month in SQL?

Hi all, I was wondering if there is any built-in function or code that I could use to convert a single week-of-year integer (i.e. 1 to 52) into a value representing the month (i.e. 1-12)? The assumption is that a week starts on a Monday and ends on a...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

We need the old parser, as the new one doesn't support weeks. Then we can map what we need using w (week of year) and u (day of the week): spark.sql("set spark.sql.legacy.timeParserPolicy=LEGACY") spark.sql(""" SELECT extract( month from to_date...
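For comparison, the same mapping can be sketched outside Spark in plain Python. This is an illustration rather than the approach from the reply: it assumes ISO 8601 week numbering (weeks start on Monday), and it needs a year, which the question does not supply. The function name week_to_month is hypothetical.

```python
from datetime import date


def week_to_month(iso_year: int, iso_week: int, weekday: int = 1) -> int:
    """Return the month (1-12) of the given day of an ISO week.

    weekday follows ISO numbering: 1 = Monday ... 7 = Sunday.
    Assumes ISO 8601 weeks; requires Python 3.8+ for fromisocalendar.
    """
    return date.fromisocalendar(iso_year, iso_week, weekday).month


print(week_to_month(2022, 6))      # Monday of week 6 is 7 Feb 2022 -> 2
print(week_to_month(2020, 1))      # Monday of week 1 is 30 Dec 2019 -> 12
print(week_to_month(2020, 1, 4))   # Thursday of week 1 is 2 Jan 2020 -> 1
```

Weeks that straddle a month or year boundary can land in either month depending on which weekday is used, so picking Thursday instead of Monday keeps week 1 in January.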

  • 2 kudos
Constantine
by Contributor III
  • 3543 Views
  • 1 replies
  • 2 kudos

Resolved! OPTIMIZE throws an error after doing MERGE on the table

I have a table on which I do an upsert, i.e. MERGE INTO table_name ... After that I run OPTIMIZE table_name, which throws an error: java.util.concurrent.ExecutionException: io.delta.exceptions.ConcurrentDeleteReadException: This transaction attempted to read...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

You can try changing the isolation level: https://docs.microsoft.com/en-us/azure/databricks/delta/optimizations/isolation-level In the merge, it is good to specify all partitions in the merge conditions. The error can also happen when the script is running concurrently.

  • 2 kudos
Jan_A
by New Contributor III
  • 4351 Views
  • 3 replies
  • 3 kudos

Resolved! How to include notebook dashboards in repos (github)?

Goal: I would like dashboards in notebooks to be added to repos (GitHub). When I commit and push changes to GitHub, the dashboard part is not included. Is there a way to include the dashboard in the repo? When later pull data, only notebook code is...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

There is an API to get dashboards. So you would need to set up a custom CI/CD deployment with a step that gets the dashboard and dashboard elements through the API and then saves the returned JSON to git. You could also deploy some script to an Azure Function or AWS Lambda to d...

  • 3 kudos
2 More Replies
pjp94
by Contributor
  • 9461 Views
  • 3 replies
  • 4 kudos

Resolved! Difference between DBFS and Delta Lake?

Would like a deeper dive/explanation into the difference. When I write to a table with the following code: spark_df.write.mode("overwrite").saveAsTable("db.table") the table is created and can be viewed in the Data tab. It can also be found in some DBF...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

Tables in Spark, Delta Lake-backed or not, are basically just semantic views on top of the actual data. On Databricks, the data itself is stored in DBFS, which is an abstraction layer on top of the actual storage (like S3, ADLS, etc.). This can be parq...

  • 4 kudos
2 More Replies
MRH
by New Contributor II
  • 2069 Views
  • 4 replies
  • 4 kudos

Resolved! Simple Question

Does Spark SQL have both materialized and non-materialized views? With materialized views, it reads from cache for unchanged data, and only from the table for new/changed rows since the view was last accessed? Thanks!

Latest Reply
Anonymous
Not applicable
  • 4 kudos

AWESOME!

  • 4 kudos
3 More Replies
MichaelO
by New Contributor III
  • 12150 Views
  • 2 replies
  • 2 kudos

Resolved! Transfer files saved in filestore to either the workspace or to a repo

I built a machine learning model: lr = LinearRegression(); lr.fit(X_train, y_train), which I can save to the filestore by: filename = "/dbfs/FileStore/lr_model.pkl"; with open(filename, 'wb') as f: pickle.dump(lr, f). Ideally, I wanted to save the model ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

Workspace and Repos are not fully available via DBFS, as they have separate access rights. It is better to use MLflow for your models, as it is like git but for ML. I think that with MLOps you can then also put your model into git.

  • 2 kudos
1 More Replies
BorislavBlagoev
by Valued Contributor III
  • 4051 Views
  • 8 replies
  • 4 kudos

Resolved! Spark data limits

How much data is too much for Spark, and what is the best strategy to partition 2 GB of data?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

2 GB is quite small, so the default settings are usually best (in most cases the better result comes from not setting anything like repartition and leaving everything to the Catalyst optimizer). If you want to set custom partitioning: please remember about avoidi...
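To put "2 GB is quite small" into numbers, here is a rough sketch of how many input partitions such a dataset would be split into. It assumes a target of 128 MiB per partition, which matches the default of spark.sql.files.maxPartitionBytes; the helper name estimate_partitions is hypothetical, and real partition counts also depend on file layout and compression.

```python
import math

MIB = 1024 * 1024


def estimate_partitions(data_bytes: int, target_partition_bytes: int = 128 * MIB) -> int:
    """Number of partitions needed so that none exceeds the target size."""
    if data_bytes <= 0:
        return 1
    return max(1, math.ceil(data_bytes / target_partition_bytes))


# 2 GiB at 128 MiB per partition
print(estimate_partitions(2 * 1024 * MIB))  # 16
```

Sixteen partitions is well within what a small cluster handles without tuning, which is consistent with the advice to leave the defaults alone.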

  • 4 kudos
7 More Replies
User16844513407
by New Contributor III
  • 480 Views
  • 0 replies
  • 0 kudos

Hi everyone, my name is Jan and I'm a product manager working on Databricks Orchestration. We are excited to work with you to build the best Airfl...

Hi everyone, my name is Jan and I'm a product manager working on Databricks Orchestration. We are excited to work with you to build the best Airflow experience within Databricks. Feel free to ask or discuss anything around this integration!

Yogita
by New Contributor
  • 1079 Views
  • 1 replies
  • 0 kudos

Haven't received Databricks Certified Associate Developer for Apache Spark 3.0 certification yet?

I took my Spark 3.0 Associate Developer certification exam via the Webassessor site on 30th Dec 2021 and it said Passed, but I am still waiting to get the certificate and badge details from Databricks. Could you please look into this and provide me with the c...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hello, @Yogita Nesargi! We have an answer for you. Would you please check out these announcements?
https://community.databricks.com/s/question/0D53f00001ebiUOCAY/databricks-courses
https://community.databricks.com/s/feed/0D53f00001dq6W6CAI

  • 0 kudos
caroline123
by New Contributor III
  • 4541 Views
  • 10 replies
  • 1 kudos

Resolved! Haven't received any updates of certificate after more than one week

Hi team, I took the exam on Jan 14th and passed the exam with 91.66% score. I got an email right after the exam saying I should receive the certificate within one week. But it has been more than 10 days and I haven't heard anything from Databricks te...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hello, @Luwei Lei - We have an answer now. Please check out these announcements.
https://community.databricks.com/s/question/0D53f00001ebiUOCAY/databricks-courses
https://community.databricks.com/s/feed/0D53f00001dq6W6CAI

  • 1 kudos
9 More Replies
