Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Gustavo_Az
by Contributor
  • 5137 Views
  • 9 replies
  • 27 kudos

Resolved! When will the course "Data Engineering with Databricks V3" be available in Databricks Academy?

The V2 documentation says it would be released 4 days ago, but searching for it in the Academy only returns results for V1 and V2.

Latest Reply
Anonymous
Not applicable
  • 27 kudos

@Gustavo Amadoz Navarro Updated info: This course will be part of the data engineer learning path once the Databricks Certified Data Engineer Associate V3 exam is released (November 19, 2022). BEFORE YOU GET STARTED: Please note that this course,...

8 More Replies
Milind
by New Contributor III
  • 4799 Views
  • 7 replies
  • 23 kudos

Resolved! Is there a syllabus change in the self-paced Data Engineering with Databricks course video?

Is there a syllabus change in the self-paced Data Engineering with Databricks course video? Last week I started that video lecture, but today I found that everything has changed. https://partner-academy.databricks.com/learn/course/62/data-engineering-with-datab...

Latest Reply
DeepakMakwana74
New Contributor III
  • 23 kudos

Hi @Milind Singh, yes, the syllabus is updated continually, so you need to keep up to date with the self-paced course.

6 More Replies
HariSelvarajan
by Databricks Employee
  • 757 Views
  • 0 replies
  • 5 kudos

DAIWT22_RadicalSpeenInLakehouse_Photon

Topic: Radical Speed on the Lakehouse: Photon under the hood. I am Hari and I work as a Specialist Solutions Architect at Databricks. I specialise in data engineering and cloud platform problems, helping clients in EMEA. Purpose: I recently presented a t...

Dave_Nithio
by Contributor
  • 2180 Views
  • 3 replies
  • 0 kudos

Resolved! Data Engineering with Databricks Module 6.3L Error: Autoload CSV

I am currently taking the Data Engineering with Databricks course and have run into an error. I have also attempted this with my own data and had a similar error. In the lab, we are using Auto Loader to read a Spark stream of CSV files saved in the DB...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

As a small aside, you don't need the third argument in the StructFields.

2 More Replies
Gim
by Contributor
  • 11338 Views
  • 7 replies
  • 10 kudos

Resolved! Delta Table storage best practices

Hi! We have a project where we do some data engineering for a client. I implemented scheduled batch processing with Databricks Auto Loader (stream with availableNow), since they primarily have numerous file exports from several sources. We wanted to foll...
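
The pattern described in this post (Auto Loader run as a scheduled batch with `availableNow`) might look roughly like the sketch below. This is not the poster's code: the function name, parameters, and the choice to reuse the checkpoint directory as the schema location are all assumptions made for illustration. The `cloudFiles` source is only available on Databricks, so the function takes the `spark` session as an argument rather than creating one.

```python
def autoload_available_now(spark, source_path, source_format, table_name, checkpoint_dir):
    """Ingest all files currently in source_path, then stop (hypothetical helper)."""
    reader = (
        spark.readStream
        .format("cloudFiles")                                  # Auto Loader source
        .option("cloudFiles.format", source_format)            # e.g. "csv" or "json"
        .option("cloudFiles.schemaLocation", checkpoint_dir)   # assumption: reuse checkpoint dir
        .load(source_path)
    )
    return (
        reader.writeStream
        .option("checkpointLocation", checkpoint_dir)          # tracks which files were processed
        .trigger(availableNow=True)                            # process the backlog, then terminate
        .toTable(table_name)
    )
```

With `availableNow=True` the query processes whatever has arrived since the last checkpoint and then terminates, which is what lets a scheduled job treat the stream like an incremental batch.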

Latest Reply
jose_gonzalez
Databricks Employee
  • 10 kudos

Hi @Gimwell Young, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

6 More Replies
LearnerShahid
by New Contributor II
  • 5802 Views
  • 6 replies
  • 4 kudos

Resolved! Lesson 6.1 of Data Engineering. Error when reading stream - java.lang.UnsupportedOperationException: com.databricks.backend.daemon.data.client.DBFSV1.resolvePathOnPhysicalStorage(path: Path)

Below function executes fine: def autoload_to_table(data_source, source_format, table_name, checkpoint_directory):  query = (spark.readStream         .format("cloudFiles")         .option("cloudFiles.format", source_format)         .option("cloudFile...

I have verified that source data exists.
Latest Reply
Anonymous
Not applicable
  • 4 kudos

Auto Loader is not supported on Community Edition.

5 More Replies
harrisriaz
by New Contributor
  • 3453 Views
  • 2 replies
  • 5 kudos

Resolved! What are the key data engineering problems that Databricks solves?

What problems does Databricks address from a typical data engineering perspective, compared with other cloud DE tools?

Latest Reply
Rheiman
Contributor II
  • 5 kudos

Annoying things Databricks solves:
  • Sane Data Movement (Fast Parallelized Compute, Table Versioning and History)
  • Environment Management (Spark + Delta + Java are installed out-of-the-box)
  • Cost and Job Monitoring (Overwatch)
I've only worked with it for 6 m...

1 More Replies
Eyespoop
by New Contributor II
  • 21431 Views
  • 3 replies
  • 2 kudos

Resolved! PySpark: Writing Parquet Files to the Azure Blob Storage Container

I am currently having issues writing a Parquet file to the storage container. My code runs, but whenever the DataFrame writer puts the Parquet output in Blob Storage, instead of the Parquet file type, it is created as a f...

Latest Reply
User16764241763
Honored Contributor
  • 2 kudos

Hello @Karl Saycon, can you try setting this config to prevent additional Parquet summary and metadata files from being written? The result of the DataFrame write to storage should be a single file. https://community.databricks.com/s/question/0D53f00001...

2 More Replies
Thom
by New Contributor
  • 468 Views
  • 0 replies
  • 0 kudos

There seem to be missing lesson files in the repo I downloaded for the Data Engineering with Databricks course. The lesson Advanced SQL Transformati...

There seem to be missing lesson files in the repo I downloaded for the Data Engineering with Databricks course. The lesson Advanced SQL Transformations refers to files that aren't in the repo. One or two other lessons were missing as well.

TimK
by New Contributor II
  • 3791 Views
  • 2 replies
  • 1 kudos

Resolved! Cannot Get Databricks SQL to read external Hive Metastore

I have followed the documentation and am using the same metastore config that works in the Data Engineering context. When attempting to view the databases, I get the error: Encountered an internal error. The following information failed to load: The li...

Latest Reply
TimK
New Contributor II
  • 1 kudos

@Bilal Aslam​  I didn't think to look there before since I hadn't tried to run any queries. I see the failed SHOW DATABASES queries in history and they identify the error: Builtin jars can only be used when hive execution version == hive metastore v...

1 More Replies
Chris_Shehu
by Valued Contributor III
  • 7938 Views
  • 7 replies
  • 2 kudos

Resolved! Can I disable the workspace directory for specific user groups?

We want to use the REPO directory in our production environment only and have a dev environment with fewer restrictions. If I use the checkbox on the group admin screen to disable workspace access, it locks out the entire Data Engineering section.

Latest Reply
Chris_Shehu
Valued Contributor III
  • 2 kudos

So I found a way to get 85% of the way there:
1) Disable workspace access for the users group.
2) Create a new group or use another group that you created for the next step.
3) Go to the workspace and right click on whitespace in the root directory.
4) A...

6 More Replies
Anonymous
by Not applicable
  • 2917 Views
  • 3 replies
  • 7 kudos

Resolved! How does 73% of the data go unused for analytics or decision-making?

Is Lakehouse the answer? Here's a good resource that was just published: https://dbricks.co/3q3471X

Latest Reply
Anonymous
Not applicable
  • 7 kudos

@Alexis Lopez​ - If @Dan Zafar​ 's or @Harikrishnan Kunhumveettil​'s answers solved the issue, would you be happy to mark one of their answers as best so other members can find the solution more easily?

2 More Replies
UM
by New Contributor II
  • 2552 Views
  • 2 replies
  • 4 kudos

Resolved! Identifying the right tools for the job

Hi all, thank you for taking the time to read my post. As background, my team and I have been prototyping an ML model that we would like to push into the production and deployment phase. We have been prototyping in Jupyter Notebooks bu...

Latest Reply
Dan_Z
Databricks Employee
  • 4 kudos

For production model serving, why not just use MLflow Model Serving? You just code it up/import it with the notebooks, then Log it using MLflow, then Register it with the MLflow Registry, then there is a nice UI to serve it using Model Serving. It wi...

1 More Replies
Nick_Hughes
by New Contributor III
  • 2135 Views
  • 3 replies
  • 3 kudos

Is there an alerting API please?

Is there an alerting API so that alerts can be source controlled and automated, please? https://docs.databricks.com/sql/user/alerts/index.html

Latest Reply
Dan_Z
Databricks Employee
  • 3 kudos

Hello @Nick Hughes​ , as of today we do not expose or document the API for these features. I think it will be a useful feature so I created an internal feature request for it (DB-I-4289). If you (or any future readers) want more information on this f...

2 More Replies
User16752246002
by Databricks Employee
  • 2184 Views
  • 2 replies
  • 6 kudos

Resolved! New Bill Inmon Book, What are your thoughts?

Have you checked out the new Bill Inmon Book, Building the Data Lakehouse? https://dbricks.co/3uxCXjO What were your thoughts if you read it?

Latest Reply
-werners-
Esteemed Contributor III
  • 6 kudos

The quality of the book depends on the audience IMO. For people who have no background in data warehousing it will be interesting to read. For the others the book is too general and descriptive. The 'HOW do you do x' is missing.

1 More Replies