02-13-2022 10:24 PM
You can write your ETL logic in notebooks, run the notebook on a cluster and write the data to a location where your S3 bucket is mounted.
Next, you can register that table with the Hive metastore and access the same table in Databricks SQL.
To see the table, go to the Data tab and select your schema/database to see the registered tables.
Two ways to do this:
Option 1:
df.write.option("path", "<s3-path-of-table>").saveAsTable(tableName)
Option 2:
%python
df.write.format("delta").save("<s3-path-of-table>")
%sql
CREATE TABLE <table-name>
USING DELTA
LOCATION '<s3-path-of-table>'
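Putting Option 1 together end-to-end, here is a minimal sketch; the source path, column name, table name and mounted S3 path are placeholders for illustration, not values from this thread:
%python
# Hypothetical ETL step: read raw data from the mounted bucket, transform it,
# and register the result as a Delta table backed by the mounted S3 location.
raw_df = spark.read.format("json").load("/mnt/my-bucket/raw/events")
clean_df = raw_df.dropDuplicates().filter("event_type IS NOT NULL")
clean_df.write.format("delta") \
    .option("path", "/mnt/my-bucket/tables/events_clean") \
    .saveAsTable("events_clean")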
02-14-2022 03:05 AM
@Aman Sehgal so basically you are saying to write the transformed data from Databricks PySpark into ADLS Gen2 and then use Databricks SQL analytics to do what you said below...
02-14-2022 05:11 AM
Right. Databricks is a platform to perform transformations. Ideally you should mount either your S3 bucket or ADLS Gen2 location in DBFS.
You can read/write/update/delete your data there. To run SQL analytics from the SQL tab, you'll have to register a table and start an endpoint.
You can also query the data via notebooks by using SQL in a cell. The only difference is that you'll have to spin up a cluster instead of an endpoint.
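A minimal mount sketch, assuming the cluster already has credentials (e.g. an instance profile) for the bucket; the bucket name and mount point below are placeholders:
%python
# Mount an S3 bucket into DBFS so notebooks and registered tables can use a stable path.
# Assumes the cluster can access the bucket; names are placeholders.
dbutils.fs.mount(
    source="s3a://my-etl-bucket",   # hypothetical bucket
    mount_point="/mnt/etl-data"     # hypothetical mount point
)

# Verify the mount by listing its contents
display(dbutils.fs.ls("/mnt/etl-data"))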
02-15-2022 01:09 AM
@Aman Sehgal you are making me confused... we need to spin up the cluster if we use a SQL endpoint, right?
And can we not use the magic command "%sql" within the same notebook to write the PySpark data to the SQL endpoint as a table?
02-15-2022 05:58 AM
When you're in the Data Engineering tab in the workspace, you need to spin up a cluster. After spinning up the cluster, you can create a notebook and use %sql to write SQL commands and query your table.
When you're in the SQL tab in the workspace, you need to spin up a SQL endpoint. After spinning up the endpoint, go to the Queries tab and you can write a SQL query against your tables.
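For example, from a notebook cell on a running cluster you could query the table registered earlier like this (the table name is a placeholder):
%sql
-- Query the registered table from a notebook cell on a cluster
SELECT * FROM <table-name> LIMIT 10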
02-18-2022 08:03 AM
@Aman Sehgal Can we write data from the Data Engineering workspace to a SQL endpoint in Databricks?
02-19-2022 03:39 PM
You can write data to a table (e.g. default.my_table) from the Data Engineering workspace and consume data from the same table using a SQL endpoint.
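A rough sketch of that flow, assuming a transformed DataFrame df already exists in a Data Engineering notebook (the table name comes from the example above; everything else is illustrative):
%python
# Write the transformed DataFrame as a managed Delta table in the default database.
# Once written, the same table is visible to SQL endpoints in the SQL tab.
df.write.format("delta").mode("overwrite").saveAsTable("default.my_table")

Then, from the Queries tab on a running SQL endpoint:
SELECT * FROM default.my_table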