by
Kash
• Contributor III
- 1864 Views
- 2 replies
- 1 kudos
Hi there, my question is: how do we set up a data-quality pipeline with alerting? Background: We would like to set up a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...
Latest Reply
Hi Kash! I know it might be too late, but if you managed to build this yourself and are struggling to scale the solution, you could take a look at Rudol Data Quality. It covers pretty much everything you mentioned, with a focus on enabling no...
1 More Reply
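The thread above asks for a daily data-quality pipeline with alerting on key metrics. As a rough illustration of the pattern (compute metrics, compare against expected ranges, raise alerts for anything out of bounds), here is a minimal pure-Python sketch. The metric names and thresholds are hypothetical; in practice the metrics would be computed from the bronze tables and the alerts routed to email, Slack, or a webhook.

```python
# Minimal data-quality check with alerting, assuming metrics such as row
# counts and null rates have already been computed from the bronze layer.
# Metric names and thresholds below are hypothetical examples.

def check_metrics(metrics, thresholds):
    """Return alert messages for any metric outside its expected range."""
    alerts = []
    for name, value in metrics.items():
        lo, hi = thresholds.get(name, (float("-inf"), float("inf")))
        if not (lo <= value <= hi):
            alerts.append(f"{name}={value} outside expected range [{lo}, {hi}]")
    return alerts

# Example: yesterday's metrics vs. expected ranges
metrics = {"row_count": 9500, "null_rate_user_id": 0.02}
thresholds = {"row_count": (10000, 50000), "null_rate_user_id": (0.0, 0.01)}

for alert in check_metrics(metrics, thresholds):
    print(alert)  # in a real pipeline, send these to an alerting channel
```

The same shape scales up naturally: schedule the check as a daily job after ingestion, and fail the job (or notify) whenever the alert list is non-empty.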
- 3348 Views
- 3 replies
- 4 kudos
I have a collection of fixed-width files that I would like to ingest monthly with Auto Loader, but I can't seem to find an example. I can read the files into DataFrames using a Python function that maps the index and length of each field with no issues, but ...
Latest Reply
I found a way to get what I needed, and I can apply it to any fixed-width file. I'll share it for anyone trying to do the same thing. I accomplished this in a Python notebook and will explain the code: import the libraries needed and define a schema. i...
2 More Replies
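The approach described in the reply above, mapping each field's start index and length, can be sketched in plain Python as follows. The field names and widths here are hypothetical; in a notebook, the same mapping can be applied with `substring()` on a single-column Spark DataFrame read by Auto Loader as text.

```python
# Sketch of the index/length mapping approach for fixed-width records.
# Field layout below is a hypothetical example: (name, start, length).
FIELDS = [
    ("id", 0, 5),
    ("name", 5, 10),
    ("amount", 15, 8),
]

def parse_fixed_width(line, fields=FIELDS):
    """Slice one fixed-width record into a dict of stripped field values."""
    return {name: line[start:start + length].strip()
            for name, start, length in fields}

record = parse_fixed_width("00042John Doe  00123.45")
print(record)
```

Because the layout lives in one `FIELDS` table, adding or reordering columns only requires editing that table, not the parsing logic.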
- 4521 Views
- 4 replies
- 3 kudos
I have a number of CSV files that I am working to ingest using Auto Loader. There is an ID field that I want to require to be a STRING, but using schemaHints is not working; the field is instead being set as an INT. The first few CSV files have just integer va...
Latest Reply
Hi @Jennette Shepard, we haven't heard from you since the last response from @Suteja Kanuri. Kindly share the information with us, and we will provide you with the necessary solution. Thanks and regards.
3 More Replies
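For readers hitting the same issue as the question above: Auto Loader exposes a `cloudFiles.schemaHints` option that overrides inference for named columns. A minimal sketch, assuming the column is named `id` and the files are CSV (paths are placeholders):

```python
# Sketch: force the "id" column to STRING instead of the inferred INT.
# Requires a Databricks/Spark environment; paths are placeholders.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")
      .option("cloudFiles.schemaLocation", "/path/to/schema")
      .option("cloudFiles.schemaHints", "id STRING")  # overrides inference
      .load("/path/to/csv"))
```

Note that if an INT type has already been recorded in the schema location from earlier runs, the hint applies to newly inferred schemas; clearing or updating the stored schema may also be needed.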
- 1479 Views
- 1 replies
- 0 kudos
My team currently uses Auto Loader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history but switch the file path to a different location in storage. When I test a file-path change, I rec...
Latest Reply
Auto Loader doesn't support changing the source path for a running job, so if you change your source path, the stream fails because the source path has changed. However, if you really want to change the path, you can do so by using the new checkpoint ...
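The restart-with-a-new-checkpoint approach in the reply above could look roughly like this. Paths and the table name are placeholders; note that with a fresh checkpoint, Auto Loader treats every file in the new source path as new, so deduplication against already-ingested data must be handled separately if the old files were moved rather than replaced.

```python
# Sketch: restart the stream against the new source path with a fresh
# checkpoint, writing into the existing table so its history is kept.
# Requires a Databricks/Spark environment; paths are placeholders.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/new/schema/location")
      .load("/new/source/path"))

(df.writeStream
   .option("checkpointLocation", "/new/checkpoint/path")  # fresh checkpoint
   .toTable("existing_table"))  # same target table keeps its history
```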
- 2964 Views
- 1 replies
- 1 kudos
Hello, I have some Auto Loader code, and it is pretty standard; we have a file-path variable that points to an S3 bucket. Example #2 executed successfully, and example #1 throws an exception. It seems like source 1 always throws an exception, whereas sour...
Latest Reply
The error was more related to a lot of things on the AWS side, so we deleted and cleared the SQS and SNS resources. We also used the CloudFilesAWSResourceManager:

val manager = CloudFilesAWSResourceManager
  .newManager
  .option("path", filePath)
  .create...
- 1927 Views
- 1 replies
- 7 kudos
Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...
Latest Reply
Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.