cancel
Showing results for 
Search instead for 
Did you mean: 
Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

david_torres
by New Contributor II
  • 1667 Views
  • 3 replies
  • 4 kudos

Can you use autoloader with a fixed width file?

I have a collection fixed width files that I would like to ingest monthly with autoloader but I can't seem to find an example. I can read the files into Dataframes using a python function to map the index and length of each field with no issues but ...

  • 1667 Views
  • 3 replies
  • 4 kudos
Latest Reply
david_torres
New Contributor II
  • 4 kudos

I found a way to get what I needed and I can apply this to any fixed width file. Will share for anyone trying to do the same thing. I accomplished this in a Python notebook and will explain the code:Import the libraries needed and define a schema.i...

  • 4 kudos
2 More Replies
js54123875
by New Contributor III
  • 2090 Views
  • 4 replies
  • 3 kudos

Resolved! How to enforce schema with Autoloader?

I have a number of csv files that I am working to ingest using autoloader. There is an ID field that I want to require to be a STRING, but using SchemaHints is not working and is instead setting as an INT.The first few csv files have just integer va...

  • 2090 Views
  • 4 replies
  • 3 kudos
Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Jennette Shepard​ We haven't heard from you since the last response from @Suteja Kanuri​  . Kindly share the information with us, and in return, we will provide you with the necessary solution.Thanks and Regards

  • 3 kudos
3 More Replies
lurban
by New Contributor
  • 704 Views
  • 1 replies
  • 0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Autoloader and Delta Live Tables to process incremental data from ADLS storage. We are needing to keep the same table and history, but switch the filepath to a different location in storage. When I test a filepath change, I rec...

  • 704 Views
  • 1 replies
  • 0 kudos
Latest Reply
DD_Sharma
New Contributor III
  • 0 kudos

Autoloader doesn't support changing the source path for running job so if you change your source path your stream fails because the source path has changed. However, if you really want to change the path you can change it by using the new checkpoint ...

  • 0 kudos
Raymond_Garcia
by Contributor II
  • 1618 Views
  • 1 replies
  • 1 kudos

Resolved! Problem with Autoloader, S3, and wildcard

Hello, I have an autoloader code and it is pretty standard, we have this variable file path that points to an S3 bucket. example #2 executed successfully and example 1 throws an exception.it seems like source 1 always throws an exception whereas sour...

  • 1618 Views
  • 1 replies
  • 1 kudos
Latest Reply
Raymond_Garcia
Contributor II
  • 1 kudos

The error was more related to a lot of stuff on the AWS side, so we deleted and cleared the SQS and SNS. we also used the CloudFilesAWSResourceManagerval manager = CloudFilesAWSResourceManager .newManager .option("path", filePath) .create...

  • 1 kudos
Kash
by Contributor III
  • 838 Views
  • 1 replies
  • 1 kudos

Building a Data Quality pipeline with alerting

Hi there,My question is how do we setup a data-quality pipeline with alerting?Background: We would like to setup a data-quality pipeline to ensure the data we collect each day is consistent and complete. We will use key metrics found in our bronze JS...

  • 838 Views
  • 1 replies
  • 1 kudos
Latest Reply
User16753725469
Contributor II
  • 1 kudos

Hi @Avkash Kana​  I would suggest using Delta Live Table (DLT) it has the features you are looking for https://docs.databricks.com/workflows/delta-live-tables/index.html

  • 1 kudos
MadelynM
by New Contributor III
  • 1256 Views
  • 1 replies
  • 7 kudos

2021-07-Webinar--Hassle-Free-Data-Ingestion-Social-1200x628

Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...

  • 1256 Views
  • 1 replies
  • 7 kudos
Latest Reply
Emily_S
New Contributor III
  • 7 kudos

Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.

  • 7 kudos
Labels