by
aladda
• Honored Contributor II
- 1654 Views
- 2 replies
- 0 kudos
I've reviewed the COPY INTO docs here (https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-copy-into.html#examples), but there's only one simple example. Looking for some additional examples that show loading data from CSV - with ...
Latest Reply
Here's an example for a predefined schema. Using COPY INTO with a predefined table schema – the trick here is to CAST the CSV dataset into your desired schema in the SELECT statement of COPY INTO. Example below: %sql CREATE OR REPLACE TABLE copy_into_bronze_te...
1 More Replies
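The reply above is truncated, but its core idea – CSV fields arrive as strings and get cast to the target schema column by column during the select – can be sketched outside Spark in plain Python. The schema and column names below are made up for illustration:

```python
import csv
import io
from datetime import date

# Hypothetical target schema: column name -> casting function.
# This mirrors the CASTs you would write in the SELECT of COPY INTO.
schema = {"id": int, "amount": float, "order_date": date.fromisoformat}

raw = "id,amount,order_date\n1,19.99,2023-01-15\n2,5.00,2023-02-20\n"

# Every CSV field is parsed as a string by default; casting is explicit.
rows = [
    {col: schema[col](value) for col, value in record.items()}
    for record in csv.DictReader(io.StringIO(raw))
]

print(rows[0])  # {'id': 1, 'amount': 19.99, 'order_date': datetime.date(2023, 1, 15)}
```

In the actual COPY INTO statement, the same casts appear as expressions in the inner SELECT over the CSV source, targeting a table created with the desired schema.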
- 3401 Views
- 5 replies
- 6 kudos
Hello everybody, I am absolutely new to Databricks, so I need your help. Details: Task: merge 12 CSV files in Databricks in the best way. Location of files: I will describe it in detail, because I cannot orient myself well yet. If I go to Data -> Browse ...
Latest Reply
It seems that all your CSV files are present under one folder, and since you are able to union them, all these files must have the same schema as well. Given the above conditions, you can simply read all the data by referring to the folder name instead of ref...
4 More Replies
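The reply's point (point the reader at the folder, not at individual files, which only works because every file shares a schema) can be sketched in plain Python with hypothetical file and column names:

```python
import csv
import glob
import os
import tempfile

# Create a folder holding two same-schema CSV files (stand-ins for the 12 files).
folder = tempfile.mkdtemp()
for name, body in [("jan.csv", "city,sales\nParis,10\n"),
                   ("feb.csv", "city,sales\nParis,7\nRome,3\n")]:
    with open(os.path.join(folder, name), "w", newline="") as f:
        f.write(body)

# "Refer to the folder" instead of listing files one by one: glob every CSV
# in the directory and union the rows.
merged = []
for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
    with open(path, newline="") as f:
        merged.extend(csv.DictReader(f))

print(len(merged))  # 3 rows across both files
```

In Spark itself the same idea is a single call, e.g. `spark.read.option("header", True).csv("/path/to/folder")`, which reads every CSV under the folder into one DataFrame.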
- 8936 Views
- 1 replies
- 5 kudos
How do I ingest a .csv file with spaces in column names using Delta Live Tables into a streaming table? All of the fields should be read using the default behavior for .csv files with the DLT Auto Loader – as strings. Running the pipeline gives me an error about in...
Latest Reply
After some additional googling on "withColumnRenamed", I was able to replace all spaces in column names with "_" all at once by using select and alias instead:
@dlt.view(
    comment=""
)
def vw_raw():
    return (
        spark.readStream.format("cloudF...
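The snippet above is cut off, but the select-and-alias rename it describes is easy to show in isolation. A plain-Python sketch of building the rename mapping (the column names are hypothetical):

```python
# Columns as they appear in the raw CSV header, spaces included.
raw_columns = ["Customer Name", "Order Total", "Ship Date"]

# One rename per column, replacing spaces with underscores in a single pass --
# this mapping is what each alias in the select would encode.
aliases = {c: c.replace(" ", "_") for c in raw_columns}

print(aliases["Customer Name"])  # Customer_Name
```

In the DLT view itself this typically becomes `df.select([col(c).alias(c.replace(" ", "_")) for c in df.columns])`, which renames every column in one select instead of chaining `withColumnRenamed` calls.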
by
Giorgi
• New Contributor III
- 1827 Views
- 2 replies
- 1 kudos
Hello, can I programmatically access an artifact file (CSV) via artifact_uri and read it? Tried the following, but it didn't work; it says no such file or directory: mlflow.pyfunc.pandas.read_csv(artifact_uri+'/xgb-classifier-test-8/dataset_statistics.csv') pan...
Latest Reply
Maybe there are better solutions; here is what I've found:
import pandas as pd
from mlflow.tracking import MlflowClient

client = MlflowClient()
pd.read_csv(client.download_artifacts(run_id, "xgb-classifier-test-8/dataset_statistics.csv"))
1 More Replies
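The fix in the reply is to download the artifact to a local path first and only then read it. A runnable stdlib sketch of that download-then-read pattern, with a stand-in for `MlflowClient.download_artifacts` (the real call needs a tracking server, and its exact signature differs from this stand-in):

```python
import csv
import os
import tempfile

def download_artifacts(run_id: str, artifact_path: str, dest_dir: str) -> str:
    """Stand-in for MlflowClient.download_artifacts: writes a small CSV
    so the read step below runs without an MLflow tracking server."""
    local = os.path.join(dest_dir, os.path.basename(artifact_path))
    with open(local, "w", newline="") as f:
        csv.writer(f).writerows([["column", "mean"], ["age", "41.5"]])
    return local

with tempfile.TemporaryDirectory() as d:
    # Download (here: fabricate) the artifact, then read it from the local path.
    local = download_artifacts("run-123", "xgb-classifier-test-8/dataset_statistics.csv", d)
    with open(local, newline="") as f:
        stats = list(csv.DictReader(f))

print(stats[0])  # {'column': 'age', 'mean': '41.5'}
```

The key point is the same as in the reply: reading directly from `artifact_uri` fails because it is not a local filesystem path, so the artifact has to be materialized locally before pandas (or the csv module) can open it.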
- 916 Views
- 2 replies
- 0 kudos
We have trained the model in Databricks and deployed it in SageMaker. After deployment, we set the baseline for the model and enabled model monitoring. After enabling data capture for the SageMaker endpoint, we receive the following error when we do t...
Latest Reply
Hi @Gopichandran N, this link might help you: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateEndpointConfig.html
1 More Replies
- 1160 Views
- 1 replies
- 7 kudos
Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...
Latest Reply
Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.