Data Engineering

Forum Posts

4kb_nick
by New Contributor III
  • 32 Views
  • 0 replies
  • 0 kudos

Unity Catalog Lineage Not Working on GCP

Hello, we have set up a lakehouse in Databricks for one of our clients. One of the features our client would like to use is the Unity Catalog data lineage view. This is a handy feature that we have used with other clients (in both AWS and Azure) witho...

shadowinc
by New Contributor
  • 77 Views
  • 0 replies
  • 0 kudos

spark/databricks temporary views and uuid

Hi all, we have a table with an id column generated by uuid(). For ETL we use Databricks/Spark SQL temporary views. We observed strange behavior that differs between a Databricks SQL temp view (create or replace temporary view) and a Spark SQL temp view (df.creat...

Data Engineering
Databricks SQL
spark sql
temporary views
uuid
as999
by New Contributor III
  • 7437 Views
  • 8 replies
  • 6 kudos

Databricks Hive metastore location?

In Databricks, where is the Hive metastore located: in the control plane or the data plane? For production systems, what precautions should be taken to secure the Hive metastore?

Latest Reply
Prabakar
Esteemed Contributor III
  • 6 kudos

@as999 The default metastore is managed by Databricks. If you are concerned about security and would like to have your own metastore, you can go for the external metastore setup. The doc below has detailed steps for setting up the external...

7 More Replies
MarkusFra
by New Contributor II
  • 1113 Views
  • 3 replies
  • 0 kudos

Re-establish SparkSession using Databricks Connect after cluster restart

Hello, when developing locally using Databricks Connect, how do I re-establish the SparkSession after the cluster has restarted? getOrCreate() seems to return the old, invalid SparkSession even after a cluster restart instead of creating a new one, or am I missing...

Data Engineering
databricks-connect
Latest Reply
Michael_Chein
New Contributor
  • 0 kudos

If anyone encounters this problem, the solution that worked for me was to restart the Jupyter kernel. 

2 More Replies
dbengineer516
by New Contributor
  • 130 Views
  • 1 reply
  • 0 kudos

/api/2.0/preview/sql/queries API only returning certain queries

Hello, when using /api/2.0/preview/sql/queries to list all available queries, I noticed that some queries were returned while others were not. I did a small test on my home workspace, and it was able to recognize certain queries when I defin...

Latest Reply
brockb
New Contributor III
  • 0 kudos

Hi, how many queries were returned in the API call in question? The List Queries documentation describes this endpoint as supporting pagination with a default page size of 25. Is that how many you saw returned? Query parameters page_size integer <= 10...
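If pagination is the cause, the usual fix is to keep requesting pages until one comes back short or empty. Below is a minimal sketch of that loop, not the official client. It assumes the endpoint accepts page and page_size query parameters and returns the items under a "results" key; only page_size and the default of 25 are mentioned in the reply above, so check the rest against the List Queries documentation. fetch_page and fake_fetch are hypothetical stand-ins for the real HTTP call.

```python
from typing import Callable, Dict, Iterator, List


def iter_all_queries(
    fetch_page: Callable[[int, int], Dict[str, List[dict]]],
    page_size: int = 25,
) -> Iterator[dict]:
    """Yield every item by walking pages until one is short or empty.

    fetch_page(page, page_size) is a hypothetical callable that performs the
    HTTP GET and returns the decoded JSON body. The "results" key is an
    assumption about the response shape.
    """
    page = 1
    while True:
        body = fetch_page(page, page_size)
        results = body.get("results", [])
        if not results:
            return
        yield from results
        if len(results) < page_size:
            # A short page means there is nothing left to fetch.
            return
        page += 1


def fake_fetch(page: int, page_size: int) -> Dict[str, List[dict]]:
    """In-memory stand-in for the API: 60 fake queries, sliced into pages."""
    data = [{"id": i} for i in range(60)]
    start = (page - 1) * page_size
    return {"results": data[start:start + page_size]}


all_queries = list(iter_all_queries(fake_fetch))
print(len(all_queries))
```

A single call with the default page size would only see the first 25 of the 60 fake queries here, which matches the symptom of "some queries show up and others do not".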

prabhu26
by New Contributor
  • 105 Views
  • 1 reply
  • 0 kudos

Unable to enforce schema on data read from jsonl file in Azure Databricks using pyspark

I'm trying to build an ETL pipeline in which I read JSONL files from Azure Blob Storage, then transform the data and load it into Delta tables in Databricks. I have created the schema below for loading my data: schema = StructType([ S...

Latest Reply
DataEngineer
New Contributor II
  • 0 kudos

Try this: add option("multiline","true").
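Whether the multiline option helps depends on how the files are laid out: Spark's JSON reader expects one complete JSON record per line by default (the JSON Lines layout), and the multiline option is meant for records that span several lines. A quick way to check which layout a file has, sketched here with the Python standard library only (the sample strings are made up for illustration, and parses_line_by_line is a hypothetical helper):

```python
import json


def parses_line_by_line(text: str) -> bool:
    """Return True if every non-empty line is a complete JSON value.

    That is the JSON Lines layout. A pretty-printed record that spans
    several lines fails this check.
    """
    try:
        for line in text.splitlines():
            if line.strip():
                json.loads(line)
        return True
    except json.JSONDecodeError:
        return False


# Made-up samples: one record per line vs. one record spread over lines.
jsonl_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
pretty_text = '{\n  "id": 1,\n  "name": "a"\n}\n'

print(parses_line_by_line(jsonl_text))
print(parses_line_by_line(pretty_text))
```

If the first few lines of a file pass this check, the file is already line-delimited and the default reader settings should apply to it.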

MarkD
by New Contributor II
  • 443 Views
  • 8 replies
  • 0 kudos

SET configuration in SQL DLT pipeline does not work

Hi, I'm trying to set a dynamic value to use in a DLT query, and the code from the example documentation does not work: SET startDate='2020-01-01'; CREATE OR REFRESH LIVE TABLE filtered AS SELECT * FROM my_table WHERE created_at > ${startDate}; It is g...

Data Engineering
Delta Live Tables
dlt
sql
Latest Reply
Hkesharwani
Contributor
  • 0 kudos

Hi @MarkD, you may use set variable_name.var = '1900-01-01' to set the value of a variable, and to read the value back use ${automated_date.var}. Example: set automated_date.var = '1800-01-01' select * from my table where date = CAST(${autom...

7 More Replies
pshuk
by New Contributor III
  • 160 Views
  • 2 replies
  • 1 kudos

upload file/table to delta table using CLI

Hi, I am using the CLI to transfer local files to a Databricks Volume. At the end of my upload, I want to create a meta table (storing file name, location, and some other information) and have it as a table on the Databricks Volume. I am not sure how to create ...

Latest Reply
Ayushi_Suthar
Honored Contributor
  • 1 kudos

Hi @pshuk, Greetings! We understand that you are looking for a CLI command to create a table. At this moment Databricks doesn't support a CLI command for creating tables, but you can use the SQL Execution API - https://docs.databricks.com/api/workspace/...

1 More Replies
JOFinancial
by New Contributor
  • 73 Views
  • 1 reply
  • 0 kudos

No Data for External Table from Blob Storage

Hi all, I am trying to create an external table from an Azure Blob Storage container. I receive no errors, but there is no data in the table. The Blob Storage container holds 4 CSV files with the same columns and about 10k rows of data. Am I missing someth...

Latest Reply
Hkesharwani
Contributor
  • 0 kudos

Hi, the code looks completely fine. Please check whether the files use a delimiter other than a comma. If your CSV files use a different delimiter, you can specify it in the table definition using the OPTIONS clause. Just to confirm I created a sample table a...
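One quick way to check the delimiter before changing the table definition is to sniff a few lines of one of the files locally. This is a rough sketch using Python's standard csv module; the sample text and the candidate delimiter list are made up for illustration, and detect_delimiter is a hypothetical helper, not a Databricks feature.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ",;|\t") -> str:
    """Guess the delimiter of a CSV sample from a set of candidate characters."""
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Made-up sample: the first few lines of a semicolon-separated file.
sample = "id;name;amount\n1;alice;10\n2;bob;20\n3;carol;30\n"
print(repr(detect_delimiter(sample)))
```

If the detected character is not a comma, that is the value to pass as the delimiter in the OPTIONS clause mentioned above.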
