Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Forum Posts

Kash
by Contributor III
  • 1490 Views
  • 2 replies
  • 1 kudos

Building a Data Quality pipeline with alerting

Hi there, My question is: how do we set up a data-quality pipeline with alerting? Background: We would like to set up a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...

Latest Reply
joarobles
New Contributor III
  • 1 kudos

Hi Kash! I know it might be too late, but if you managed to build this yourself and are struggling to scale the solution, you could take a look at Rudol Data Quality. It covers pretty much everything you mentioned, with a focus on enabling no...

1 More Replies
david_torres
by New Contributor II
  • 2792 Views
  • 3 replies
  • 4 kudos

Can you use Auto Loader with a fixed-width file?

I have a collection of fixed-width files that I would like to ingest monthly with Auto Loader, but I can't seem to find an example. I can read the files into DataFrames using a Python function to map the index and length of each field with no issues but ...

Latest Reply
david_torres
New Contributor II
  • 4 kudos

I found a way to get what I needed, and I can apply it to any fixed-width file. I will share it for anyone trying to do the same thing. I accomplished this in a Python notebook and will explain the code: Import the libraries needed and define a schema.i...
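
The reply above is truncated, so the poster's actual code is not shown here. As a minimal sketch of the general idea described in the question (mapping the index and length of each field), the following plain-Python function slices one fixed-width record into named fields. The layout, field names, and sample record are hypothetical and chosen only for illustration; this is not the poster's solution.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (field name, start index, length).
# These fields are assumptions for illustration, not taken from the post.
LAYOUT: List[Tuple[str, int, int]] = [
    ("id", 0, 5),
    ("name", 5, 10),
    ("amount", 15, 8),
]


def parse_fixed_width(line: str, layout: List[Tuple[str, int, int]] = LAYOUT) -> Dict[str, str]:
    """Slice one fixed-width record into named, whitespace-trimmed string fields."""
    return {
        name: line[start:start + length].strip()
        for name, start, length in layout
    }


# Sample record: 5-character id, 10-character name, 8-character amount.
record = parse_fixed_width("00042Jane Doe  00012.50")
print(record)
```

Every field is kept as a string, so leading zeros survive. One plausible way to apply the same slicing in a streaming job would be to read each line as a single text column and cut it with `pyspark.sql.functions.substring` (which is 1-based), but that is an assumption about the approach, not something the thread confirms.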

2 More Replies
js54123875
by New Contributor III
  • 3663 Views
  • 4 replies
  • 3 kudos

Resolved! How to enforce schema with Auto Loader?

I have a number of CSV files that I am working to ingest using Auto Loader. There is an ID field that I want to require to be a STRING, but using SchemaHints is not working, and the field is instead being set as an INT. The first few CSV files have just integer va...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jennette Shepard, We haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and in return, we will provide you with the necessary solution. Thanks and Regards

3 More Replies
lurban
by New Contributor
  • 1204 Views
  • 1 reply
  • 0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Auto Loader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history, but switch the file path to a different location in storage. When I test a file path change, I rec...

Latest Reply
DD_Sharma
New Contributor III
  • 0 kudos

Auto Loader doesn't support changing the source path of a running job, so if you change the source path, your stream fails. However, if you really want to change the path, you can do so by using the new checkpoint ...

Raymond_Garcia
by Contributor II
  • 2500 Views
  • 1 reply
  • 1 kudos

Resolved! Problem with Auto Loader, S3, and wildcard

Hello, I have some Auto Loader code that is pretty standard, with a file path variable that points to an S3 bucket. Example 2 executed successfully and example 1 throws an exception. It seems like source 1 always throws an exception whereas sour...

Latest Reply
Raymond_Garcia
Contributor II
  • 1 kudos

The error was mostly related to several things on the AWS side, so we deleted and cleared the SQS and SNS resources. We also used the CloudFilesAWSResourceManager: val manager = CloudFilesAWSResourceManager.newManager.option("path", filePath).create...

MadelynM
by Databricks Employee
  • 1744 Views
  • 1 reply
  • 7 kudos

Hassle-Free Data Ingestion Webinar

Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...

Latest Reply
Emily_S
New Contributor III
  • 7 kudos

Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.
