Machine Learning

by Kash • Contributor III

07-01-2022 7:44:42 AM

3501 Views
3 replies
1 kudos

Building a Data Quality pipeline with alerting

Hi there, my question is: how do we set up a data-quality pipeline with alerting? Background: We would like to set up a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...
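A minimal sketch of the kind of daily check the question describes, written in plain Python so it runs anywhere. The post is truncated and does not name its metrics, so the row-count and null-fraction checks, the thresholds, and the `send_alert` callback are all hypothetical assumptions rather than the poster's design.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping, Optional


@dataclass
class QualityReport:
    row_count: int
    null_fraction: dict  # column name -> fraction of rows where it is missing
    failures: list       # human-readable descriptions of failed checks


def check_batch(
    rows: Iterable[Mapping[str, object]],
    required_columns: list,
    min_rows: int = 1,
    max_null_fraction: float = 0.05,
    send_alert: Optional[Callable[[str], None]] = None,
) -> QualityReport:
    """Check one day's batch for completeness and alert on any failure."""
    rows = list(rows)
    failures = []

    # Completeness: did we receive at least the volume we expect?
    if len(rows) < min_rows:
        failures.append(f"row_count {len(rows)} is below minimum {min_rows}")

    # Consistency: required fields should rarely be missing.
    null_fraction = {}
    for col in required_columns:
        missing = sum(1 for r in rows if r.get(col) is None)
        frac = missing / len(rows) if rows else 1.0
        null_fraction[col] = frac
        if frac > max_null_fraction:
            failures.append(f"{col} is null in {frac:.0%} of rows")

    # Alerting is a pluggable callback (email, webhook, ...); hypothetical.
    if failures and send_alert is not None:
        send_alert("; ".join(failures))

    return QualityReport(len(rows), null_fraction, failures)
```

Passing the alert channel in as a callback keeps the check itself independent of whichever notification system is actually used.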

Latest Reply

dataoculus_app
New Contributor III

06-18-2025 12:21:50 AM

1 kudos

Hi Kash, on the 4th point: do you have real-time ingestion to the model, or is it batch? In the case of batch, DLT will be fine, I guess, but I would love to know more. I have never seen real-time model updates before.

2 More Replies

by david_torres • New Contributor II

06-17-2023 8:57:49 AM

4588 Views
3 replies
4 kudos

Can you use autoloader with a fixed width file?

I have a collection of fixed-width files that I would like to ingest monthly with Autoloader, but I can't seem to find an example. I can read the files into DataFrames using a Python function to map the index and length of each field with no issues but ...

Latest Reply

david_torres
New Contributor II

06-21-2023 8:29:41 AM

4 kudos

I found a way to get what I needed, and I can apply this to any fixed-width file. I will share it for anyone trying to do the same thing. I accomplished this in a Python notebook and will explain the code: Import the libraries needed and define a schema. i...
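The reply's code is cut off in this listing, so the following is only a minimal sketch of the general idea mentioned in the question (mapping each field's index and length), not the poster's actual solution. The `LAYOUT` field names, offsets, and sample record are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, zero-based start index, length).
LAYOUT: List[Tuple[str, int, int]] = [
    ("id", 0, 5),
    ("name", 5, 10),
    ("amount", 15, 8),
]


def parse_fixed_width(line: str, layout: List[Tuple[str, int, int]]) -> Dict[str, str]:
    """Slice one fixed-width record into named, whitespace-trimmed fields."""
    return {name: line[start:start + length].strip() for name, start, length in layout}


# Build a sample record with explicit padding so the column widths are exact.
sample = "00042" + "Jane Doe".ljust(10) + "12.50".rjust(8)
record = parse_fixed_width(sample, LAYOUT)
print(record)
```

In Spark, the same slicing could plausibly be applied to the single `value` column of a file read as text, using `pyspark.sql.functions.substring`. Note that `substring` is 1-based, unlike the zero-based Python slices here.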

2 More Replies

by js54123875 • New Contributor III

06-12-2023 11:58:34 AM

6142 Views
4 replies
3 kudos

Resolved! How to enforce schema with Autoloader?

I have a number of CSV files that I am working to ingest using Autoloader. There is an ID field that I want to require to be a STRING, but using schemaHints is not working and is instead setting it as an INT. The first few CSV files have just integer va...
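For reference, a minimal sketch of how a schema hint is usually passed to Auto Loader. It does not diagnose why the hint was ignored in this thread, because the accepted answer is not shown in this listing. The helper name `autoloader_csv_options` and the example schema location are hypothetical; the option keys are standard Auto Loader options.

```python
from typing import Dict


def autoloader_csv_options(schema_location: str, hints: Dict[str, str]) -> Dict[str, str]:
    """Build Auto Loader options for CSV ingestion with per-column type hints."""
    # cloudFiles.schemaHints takes a DDL-style string such as "ID STRING, ts TIMESTAMP".
    hint_string = ", ".join(f"{col} {dtype}" for col, dtype in hints.items())
    return {
        "cloudFiles.format": "csv",
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaHints": hint_string,
        "header": "true",
    }


options = autoloader_csv_options("/tmp/example/_schema", {"ID": "STRING"})

# On a Databricks cluster the options would be applied roughly like this
# (not executed here because it needs a Spark session):
#   spark.readStream.format("cloudFiles").options(**options).load(source_path)
```

If a column type must never be inferred, supplying a full explicit schema with `.schema(...)` instead of relying on hints is another option worth testing.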

Latest Reply

Anonymous
Not applicable

06-14-2023 11:36:32 PM

3 kudos

Hi @Jennette Shepard, we haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

3 More Replies

by lurban • New Contributor II

01-25-2023 9:56:15 AM

4446 Views
1 reply
0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Autoloader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history but switch the file path to a different location in storage. When I test a file path change, I rec...

Latest Reply

DD_Sharma
Databricks Employee

04-14-2023 12:15:03 AM

0 kudos

Autoloader doesn't support changing the source path for a running job, so if you change your source path, your stream fails. However, if you really want to change the path, you can do so by using the new checkpoint ...

by Raymond_Garcia • Contributor II

11-14-2022 9:19:02 AM

3786 Views
1 reply
1 kudos

Resolved! Problem with Autoloader, S3, and wildcard

Hello, I have Autoloader code that is pretty standard: we have a file path variable that points to an S3 bucket. Example #2 executed successfully and example #1 throws an exception. It seems like source 1 always throws an exception whereas sour...

Latest Reply

Raymond_Garcia
Contributor II

11-16-2022 7:43:14 AM

1 kudos

The error was mostly related to a number of things on the AWS side, so we deleted and cleared the SQS and SNS resources. We also used the CloudFilesAWSResourceManager: val manager = CloudFilesAWSResourceManager.newManager.option("path", filePath).create...

by MadelynM • Databricks Employee

10-01-2021 2:10:35 PM

2534 Views
1 reply
7 kudos

Hassle-Free Data Ingestion Webinar (2021-07)

Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...

Latest Reply

Emily_S
Databricks Employee

11-09-2021 6:32:13 AM

7 kudos

Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.

Databricks Community