Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16765131552
by Databricks Employee
  • 8998 Views
  • 5 replies
  • 1 kudos

How to register a JDBC Spark dialect in Python?

I am trying to read from a Databricks table. I have used the URL from a cluster in Databricks. I am getting this error: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int. After these statements: jdbcConnUrl= "jdbc:spark:...

Latest Reply
KKDataEngineer
New Contributor III
  • 1 kudos

is there a solution for this?

4 More Replies
stephansmit
by New Contributor III
  • 26266 Views
  • 3 replies
  • 11 kudos

How do I access the account console of Databricks in Azure?

To create a Unity metastore, the docs refer me to the account console in Databricks, see: https://docs.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/create-metastore However, when I go to manage account, I get redirected to select wo...

Latest Reply
Anonymous
Not applicable
  • 11 kudos

Please refer here - https://community.databricks.com/s/question/0D58Y000098lIqgSAE/unity-catalog-azure-account-console-how-to-access You must be an Azure Databricks account admin. The first Azure Databricks account admin must be an Azure Active Directo...

2 More Replies
raduq
by Contributor
  • 50255 Views
  • 10 replies
  • 12 kudos

How to efficiently process a 50 GB JSON file and store it in Delta?

Hi, I'm a fairly new user and I am using Azure Databricks to process a ~50 GB JSON file containing real estate data. I uploaded the JSON file to Azure Data Lake Gen2 storage and read the JSON file into a dataframe. df = spark.read.option('multiline', '...

Latest Reply
Renzer
Databricks Partner
  • 12 kudos

The Spark connector is super slow. I found that loading the JSON into Azure Cosmos DB and then writing queries to get sections of data out was 25x faster, because Cosmos DB indexes the JSON. You can stream read data from Cosmos DB. You can find Python code sn...

9 More Replies
Fredolebeau80
by New Contributor II
  • 2453 Views
  • 2 replies
  • 1 kudos

Refresh delta

How do I refresh a Delta table with new raw data from a CDC JSON file?

Latest Reply
Vinay_M_R
Databricks Employee
  • 1 kudos

To refresh a Delta table with new raw data from a CDC JSON file, you can use change data capture (CDC) to update tables based on changes in source data. Here are the steps: 1. Create a streaming table using the CREATE OR REFRESH STREAMING TABLE statem...

1 More Replies
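
To illustrate what "refreshing a table from CDC data" means in the thread above, here is a minimal plain-Python sketch, not the Databricks implementation. It assumes each line of the CDC file is a JSON object with hypothetical fields `id` (key), `op` (`insert`, `update`, or `delete`), `seq` (ordering column), and `data` (the new row); none of these names come from the original post.

```python
import json

def apply_cdc(table, cdc_lines):
    """Apply JSON-lines CDC events to `table`, a dict keyed by record id.

    The field names (id, op, seq, data) are assumptions for illustration.
    """
    events = [json.loads(line) for line in cdc_lines if line.strip()]
    # Apply events in sequence order so that, within this batch,
    # an out-of-order older event cannot overwrite newer state.
    for event in sorted(events, key=lambda e: e["seq"]):
        if event["op"] == "delete":
            table.pop(event["id"], None)
        else:
            # "insert" and "update" are both treated as upserts.
            table[event["id"]] = event["data"]
    return table

table = {1: {"name": "Ada"}}
cdc_lines = [
    '{"id": 2, "op": "insert", "seq": 1, "data": {"name": "Bob"}}',
    '{"id": 1, "op": "update", "seq": 3, "data": {"name": "Ada L."}}',
    '{"id": 1, "op": "update", "seq": 2, "data": {"name": "Ada Lovelace"}}',
    '{"id": 2, "op": "delete", "seq": 4}',
]
result = apply_cdc(table, cdc_lines)
print(result)
```

The key idea is the combination of a record key and an ordering column: the key decides which row is touched, and the ordering column decides which change wins.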
Manasi_Sarang
by New Contributor II
  • 6260 Views
  • 4 replies
  • 1 kudos

Facing issue while creating Delta Live Table on top of csv file

Hello Everyone, I am trying to create a Delta Live Table on top of a CSV file using the syntax below: CREATE OR REFRESH LIVE TABLE employee_bronze_dlt COMMENT "The bronze employee dataset, ingested from /mnt/lakehouse/PoC/DLT/Source/." AS SELECT * FROM csv.`/mnt...

Latest Reply
pvignesh92
Honored Contributor
  • 1 kudos

Hi @Manasi_Sarang, I believe Delta is unable to infer the schema because you are using a SELECT statement to read the entire content of the CSV file, and I think inferSchema won't work here. Instead, you can try to create a temp live table or live view wit...

3 More Replies
Anonymous
by Not applicable
  • 5229 Views
  • 2 replies
  • 0 kudos

INTERNAL ERROR

I have the following query: select customer_id, first(if(name_type = 'Official', name, null), true) official_name, first(if(name_type = 'Preferred', name, null), true) preferred_name from ( select customer_id, ifnull(name_type, 'Official...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

I experienced similar issues from time to time. What helped was refreshing the browser page. If that does not work, restart the SQL warehouse. The internal error is indeed pretty vague, but my experience is that this is not related to a wrong SQL scrip...

1 More Replies
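
For readers unfamiliar with the `first(if(...), true)` pattern in the query above, the following plain-Python sketch shows the conditional pivot it appears to express: one row per customer, with the first non-null name of each type. It assumes the truncated query groups by `customer_id`, and it treats "first" as "first in input order" (in Spark, `first` without an explicit ordering is not deterministic). The sample rows are made up.

```python
def pivot_names(rows):
    """rows: iterable of (customer_id, name_type, name) tuples."""
    columns = {"Official": "official_name", "Preferred": "preferred_name"}
    result = {}
    for customer_id, name_type, name in rows:
        entry = result.setdefault(
            customer_id, {"official_name": None, "preferred_name": None}
        )
        column = columns.get(name_type)
        # Mimic first(expr, true): skip nulls and keep the first value seen.
        if column is not None and name is not None and entry[column] is None:
            entry[column] = name
    return result

rows = [
    (1, "Official", "Robert"),
    (1, "Preferred", "Bob"),
    (1, "Official", "Rob"),    # ignored: an official name is already set
    (2, "Preferred", None),    # ignored: null value
    (2, "Official", "Alice"),
]
pivoted = pivot_names(rows)
print(pivoted)
```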
Zhudocode
by New Contributor II
  • 14730 Views
  • 1 reply
  • 2 kudos

Resolved! Difference between using DBT and Databricks's lineage tool

So my team is using DBT for a lot of data lineage items, but at the data summit it was shown that Databricks also has a similar tool that is in fact better because it does lineage on columns. So what's the main draw of DBT at this point?

Latest Reply
Dk_1802
New Contributor III
  • 2 kudos

DBT (Data Build Tool) remains popular for its extensive templating capabilities, modularity, and open-source nature, which allows for customization and integration with various data platforms. While Databricks may offer more advanced lineage features...

Furro33
by New Contributor
  • 973 Views
  • 0 replies
  • 0 kudos

2023 summit feedback

The event covered everything a data engineer would dream of. My favorite discussions:
- Spark Connect
- AI on top of Unity Catalog
- Delta Live Tables pipelines for streaming
#Summit23

Atius
by New Contributor
  • 738 Views
  • 0 replies
  • 0 kudos

Expo experience

Great partners and SaaS solutions to jump start on floor 

Sappy
by New Contributor
  • 863 Views
  • 0 replies
  • 0 kudos

Delta sharing

What are the prerequisites for enabling Delta Sharing between multiple cloud data warehouses?

Deepeshn1988
by New Contributor
  • 951 Views
  • 0 replies
  • 0 kudos

Databricks summit

The Data+AI event was too good. Kudos to everyone involved! Here's a glimpse.

Data Engineering
dataaisummit
Databricks