08-21-2024 09:05 AM
Hi,
I have completed the Bronze and Silver layers. Now I am trying to save a table to the Gold container, but I am unable to store it.
I created an external database.
I want to store the data as PARQUET, but that is not supported; only DELTA is.
Only MANAGED LOCATION is supported, but I am unable to create the location directly.
Can anyone please help me urgently?
Regards,
Praveen.
08-22-2024 12:22 AM
# Import the functions module under an alias so Python's built-in round and sum are not shadowed
from pyspark.sql import functions as F
# Step 1: Read the data from the source table
invoice_df = spark.table("invoice_tbl")
# Step 2: Perform the transformation
# Aggregate total sales by country and invoice_date
aggregated_df = (
    invoice_df.groupBy("country", "invoice_date")
    .agg(F.round(F.sum(F.col("quantity") * F.col("unit_price")), 2).alias("total_sales"))
)
# Step 3: Write the result in Parquet format
# Define the output path (abfss URIs use the dfs endpoint, not the blob endpoint)
parquet_path = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/path/to/gold/location/country_wise_daily_sales.parquet"
# Save the DataFrame in Parquet format (Spark writes a directory of part files at this path)
aggregated_df.write.format("parquet").mode("overwrite").save(parquet_path)
print("Data has been saved in Parquet format.")
@PraveenReddy21 Try this. The code writes the gold-layer result to external storage in Parquet format.
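One detail worth checking in the output path above: the `abfss://` scheme addresses the ADLS Gen2 endpoint, so the host part is normally `<storage-account>.dfs.core.windows.net` rather than `blob.core.windows.net`. Below is a minimal sketch of a helper that assembles such a path. The function name `abfss_uri`, the container name `gold`, and the account name `mystorageaccount` are placeholders made up for illustration, not values from this thread.

```python
def abfss_uri(container: str, account: str, path: str = "") -> str:
    """Build an abfss:// URI for a path in an ADLS Gen2 container.

    Hypothetical helper: only the URI layout
    (container@account.dfs.core.windows.net/path) is standard.
    """
    if not container or not account:
        raise ValueError("container and account must be non-empty")
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    clean = path.strip("/")  # drop leading/trailing slashes so they are not doubled
    return f"{base}/{clean}" if clean else base


# Placeholder names; substitute your own container and storage account.
gold_path = abfss_uri("gold", "mystorageaccount", "sales/country_wise_daily_sales")
print(gold_path)
```

The resulting string can be passed to `save(...)` in place of a hand-typed path, which makes it harder to mix up the `blob` and `dfs` endpoints.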
08-21-2024 09:19 AM
Hi @PraveenReddy21, can you share a screenshot of the code you are using to save to Parquet format, and describe the steps you are following, so I can understand the problem better? This does not normally happen, so I suspect there is an issue somewhere in your flow.
08-21-2024 09:44 AM
Hi, please find the code below.
CREATE DATABASE IF NOT EXISTS sales_dbdb
MANAGED LOCATION 'abfss://unity-catalog-storage@dbstorage7wauw2kjcu3u6.dfs.core.windows.net/107512296614625'
and
create table sales_dbdb.country_wise_daily_salesale
using delta as
select country, invoice_date, round(sum(quantity*unit_price),2) as total_sales from invoice_tbl group by country, invoice_date
I want to create the table in PARQUET format and transfer it to BLOB-CONTAINER-GOLD.
08-21-2024 10:11 AM
Hi Rishabh,
I tried this first, but it is not working.
I am trying to create:
%sql
CREATE DATABASE IF NOT EXISTS sales_dbdb
LOCATION 'abfss://<storagelocation>.dfs.core.windows.net/<container>/sales_dbdb'
and
create table sales_dbdb.country_wise_daily_salesale
using PARQUEST as
select country, invoice_date, round(sum(quantity*unit_price),2) as total_sales from invoice_tbl group by country, invoice_date
It is not working.
If you have the steps, please tell me.
Should we create the database directly in the Blob container?
Thank you.
Praveen.
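For readers hitting the same "only DELTA" message: as far as I know, Unity Catalog managed tables are stored as Delta, so a Parquet table generally has to be an external table with its own `LOCATION`, and that path must fall under an external location you have access to. The sketch below only assembles such a statement as a string. The helper name `parquet_ctas` and the gold path are assumptions for illustration; the table name and query are taken from the post above.

```python
def parquet_ctas(table: str, location: str, select_sql: str) -> str:
    """Return a CREATE TABLE ... USING PARQUET LOCATION ... AS SELECT statement.

    Hypothetical helper; it only formats text and does not contact Spark.
    """
    if "'" in location:
        raise ValueError("location must not contain single quotes")
    return (
        f"CREATE TABLE {table}\n"
        f"USING PARQUET\n"
        f"LOCATION '{location}'\n"
        f"AS {select_sql.strip()}"
    )


select_sql = """
SELECT country, invoice_date,
       ROUND(SUM(quantity * unit_price), 2) AS total_sales
FROM invoice_tbl
GROUP BY country, invoice_date
"""

statement = parquet_ctas(
    "sales_dbdb.country_wise_daily_salesale",
    # Placeholder path: replace the bracketed parts with your own values.
    "abfss://<container>@<storage-account>.dfs.core.windows.net/gold/country_wise_daily_sales",
    select_sql,
)
print(statement)
# In a Databricks notebook, where `spark` is predefined, this could be run with:
# spark.sql(statement)
```

Note the spelling `PARQUET` in the `USING` clause, and that the `abfss` path puts the container before the `@` and the storage account after it.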
08-21-2024 11:51 AM - edited 08-21-2024 11:55 AM
Why are you using PARQUEST? Can you replace it with parquet and let me know exactly what error you get? From what I understand, you have finished the bronze and silver layer tables and now want to create the gold layer and gold table. Before drawing a final conclusion, I have some questions:
1-Are the bronze and silver tables managed or external tables?
2-Are the bronze and silver tables Delta tables or Parquet?
3-Do you want to create an external gold table in Parquet format?
08-21-2024 07:52 PM
Hi,
I am using Parquet format only.
Bronze and Silver are both Parquet tables; they are not managed tables.
---
3-Do you want to create an external gold table in Parquet format? Yes, I want to create the table in the gold blob container.
08-22-2024 03:14 AM
Thank you, Rishabh.