cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Michael42
by New Contributor III
  • 776 Views
  • 2 replies
  • 1 kudos

Would like to start a discussion regarding techniques for joining two relatively large tables of roughly equal size on a daily basis. I realize this may be a bit of a conundrum with databricks, but review the details.

Input Data:One batch load of a daily dataset, roughly 10 million items a day of transactions.Another daily batch load of roughly the same size.Each row in one dataset should have a corresponding row in the other dataset.Problem to solve:The problem i...

  • 776 Views
  • 2 replies
  • 1 kudos
Latest Reply
Lennart
New Contributor II
  • 1 kudos

I've dealt with something similar in the past.There was an order system that had order items that was supposed to be matched up against corresponding products in another system that acted as a master and handled invoicing.As for unqiue considerations...

  • 1 kudos
1 More Replies
labromb
by Contributor
  • 3891 Views
  • 6 replies
  • 8 kudos

Resolved! Create Databricks tables dynamically

Hi, I would like to be able to do something like this...create table if not exists table1using parquetlocation = '/mnt/somelocationsome location needs to be a concatenation of static and code generated string. Documentation suggests that location onl...

  • 3891 Views
  • 6 replies
  • 8 kudos
Latest Reply
Kaniz
Community Manager
  • 8 kudos

Hi @Brian Labrom​ â€‹, We haven’t heard from you since the last response from @Prasanth Mathesh​ and @Pat Sienkiewicz​, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community, as it can...

  • 8 kudos
5 More Replies
hisham1
by New Contributor
  • 1100 Views
  • 2 replies
  • 2 kudos

Resolved! unable to read Csv files from Databricks Database Tables

I amTrying to read a csv file stored in database tables of databricks, but getting error . It is runnin gfine for dbfs but same format not working for Database Tables.

error_db
  • 1100 Views
  • 2 replies
  • 2 kudos
Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Sayed Ali​ , We haven’t heard from you on the last response from me and I was checking back to see if my suggestions helped you. Or else, If you have any solution, please do share that with the community as it can be helpful to others.Also, Pleas...

  • 2 kudos
1 More Replies
Barb
by New Contributor III
  • 2379 Views
  • 2 replies
  • 0 kudos

How do I get a list of the tables that I personally created?

I know that I can get a list of all of the table names in a given 'database' by using (if the 'database' was named "scratch"): show tables from scratchHow do I get a list just like that, but that only lists the tables that I created?

  • 2379 Views
  • 2 replies
  • 0 kudos
Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Barb Krienke​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thank...

  • 0 kudos
1 More Replies
Constantine
by Contributor III
  • 1323 Views
  • 2 replies
  • 5 kudos

Resolved! Delta Table created on s3 has all null values

I have data in a Spark Dataframe and I write it to an s3 location. It has some complex datatypes like structs etc. When I create the table on top on the s3 location by using CREATE TABLE IF NOT EXISTS table_name USING DELTA LOCATION 's3://.../...';Th...

  • 1323 Views
  • 2 replies
  • 5 kudos
Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @John Constantine​ , Did you try the above suggestions?

  • 5 kudos
1 More Replies
alejandrofm
by Valued Contributor
  • 1241 Views
  • 3 replies
  • 1 kudos

Resolved! Recommendations to execute OPTIMIZE on tables

Hi, have Databricks running on AWS, I'm looking for a way to know when is a good time to run optimize on partitioned tables. Taking into account that it's an expensive process, especially on big tables, how could I know if it's a good time to run it ...

  • 1241 Views
  • 3 replies
  • 1 kudos
Latest Reply
Anonymous
Not applicable
  • 1 kudos

@Alejandro Martinez​ - If Jose's answer resolved your question, would you be happy to mark his answer as best? That helps other members find the answer more quickly.

  • 1 kudos
2 More Replies
BenzDriver
by New Contributor II
  • 1310 Views
  • 2 replies
  • 1 kudos

Resolved! SQL command FSCK is not found

Hello there,I currently have the problem of deleted files still being in the transaction log when trying to call a delta table. What I found was this statement:%sql FSCK REPAIR TABLE table_name [DRY RUN]But using it returned following error:Error in ...

  • 1310 Views
  • 2 replies
  • 1 kudos
Latest Reply
RKNutalapati
Valued Contributor
  • 1 kudos

Remove square brackets and try executing the command%sqlFSCK REPAIR TABLE table_name DRY RUN

  • 1 kudos
1 More Replies
Nilave
by New Contributor III
  • 2679 Views
  • 4 replies
  • 2 kudos

Resolved! Solution for API hosted on Databricks

I'm using Azure Databricks Python notebooks. We are preparing a front end to display the Databricks tables via API to query the tables. Is there a solution from Databricks to host callable APIs for querying its table and sending it as response to fro...

  • 2679 Views
  • 4 replies
  • 2 kudos
Latest Reply
Nilave
New Contributor III
  • 2 kudos

@Prabakar Ammeappin​  Thanks for the linkAlso was wondering for web page front end will it be more effective to query from SQL Database or from Azure Databricks tables. If from Azure SQL database, is there any efficient way to sync the tables from Az...

  • 2 kudos
3 More Replies
William_Scardua
by Valued Contributor
  • 4231 Views
  • 5 replies
  • 12 kudos

The database and tables disappears when I delete the cluster

Hi guys,I have a trial databricks account, I realized that when I shutdown the cluster my databases and tables is disappear .. that is correct or thats is because my account is trial ?

  • 4231 Views
  • 5 replies
  • 12 kudos
Latest Reply
Prabakar
Esteemed Contributor III
  • 12 kudos

@William Scardua​ if it's an external hive metastore or Glue catalog you might be missing the configuration on the cluster. https://docs.databricks.com/data/metastores/index.htmlAlso as mentioned by @Hubert Dudek​ , if it's a community edition then t...

  • 12 kudos
4 More Replies
aimas
by New Contributor III
  • 4055 Views
  • 8 replies
  • 5 kudos

Resolved! error creating tables using UI

Hi, i try to create a table using UI, but i keep getting the error "error creating table <table name> create a cluster first" even when i have a cluster alread running. what is the problem?

  • 4055 Views
  • 8 replies
  • 5 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 5 kudos

Be sure that cluster is selected (arrow in database) and at least there is Default database.

  • 5 kudos
7 More Replies
User16826987838
by Contributor
  • 1060 Views
  • 1 replies
  • 0 kudos

Refreshing external tables

After I vacuum the tables, do i need to update the manifest table and parquet table to refresh my external tables for integrations to work?

  • 1060 Views
  • 1 replies
  • 0 kudos
Latest Reply
Taha
New Contributor III
  • 0 kudos

Manifest files need to be re-created when partitions are added or altered. Since a VACUUM only deletes all historical versions, you shouldn't need to create an updated manifest file unless you are also running an OPTIMIZE.

  • 0 kudos
User16826992666
by Valued Contributor
  • 1478 Views
  • 1 replies
  • 0 kudos

When using Delta Live Tables, how do I set a table to be incremental vs complete using Python?

When using SQL, I can use the Create Live Table command and the Create Incremental Live Table command to set the run type I want the table to use. But I don't seem to have that same syntax for python. How can I set this table type while using Python?

  • 1478 Views
  • 1 replies
  • 0 kudos
Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

The documentation at https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-user-guide.html#mixing-complete-tables-and-incremental-tables has an example the first two functions load data incrementally and the last one loads...

  • 0 kudos
Labels