Input Data:
One daily batch load of transactions, roughly 10 million items a day.
Another daily batch load of roughly the same size.
Each row in one dataset should have a corresponding row in the other dataset.
Problem to solve:
The problem i...
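A reconciliation like this usually boils down to a full outer join on the shared key, keeping the rows that fail to match. A minimal sketch in pandas (at 10 million rows/day on Databricks this would more likely be a Spark join; the `transaction_id` key and column names are assumptions, not from the post):

```python
import pandas as pd

# Hypothetical daily batches sharing a transaction_id key.
batch_a = pd.DataFrame({"transaction_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
batch_b = pd.DataFrame({"transaction_id": [1, 3, 4], "amount": [10.0, 30.0, 40.0]})

# A full outer join with an indicator column flags rows missing a counterpart.
merged = batch_a.merge(batch_b, on="transaction_id", how="outer",
                       suffixes=("_a", "_b"), indicator=True)

only_in_a = merged.loc[merged["_merge"] == "left_only", "transaction_id"].tolist()
only_in_b = merged.loc[merged["_merge"] == "right_only", "transaction_id"].tolist()
print(only_in_a, only_in_b)  # -> [2] [4]
```

The `indicator=True` column is what makes the unmatched rows cheap to pull out in one pass; the same idea translates to a Spark outer join with null checks on each side's key.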
I've dealt with something similar in the past. There was an order system whose order items were supposed to be matched against corresponding products in another system that acted as the master and handled invoicing. As for unique considerations...
Hi, I would like to be able to do something like this...

create table if not exists table1
using parquet
location = '/mnt/somelocation'

The location needs to be a concatenation of a static and a code-generated string. Documentation suggests that location onl...
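One common workaround is to assemble the statement as a string and hand it to `spark.sql`, since SQL cells can't interpolate variables directly. A sketch, assuming a hypothetical static prefix and a generated date suffix:

```python
# Build the table location from a static prefix plus a code-generated suffix,
# then pass the full DDL string to spark.sql (names here are hypothetical).
base_path = "/mnt/somelocation"   # static part
run_date = "2022-01-15"           # code-generated part, e.g. today's date
location = f"{base_path}/{run_date}"

ddl = f"""
CREATE TABLE IF NOT EXISTS table1
USING PARQUET
LOCATION '{location}'
"""
# In a Databricks notebook: spark.sql(ddl)
print(ddl)
```

Since the whole statement is just a Python string, any concatenation logic works; only the final text reaching `spark.sql` has to be valid SQL.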
Hi @Brian Labrom​ ​, We haven’t heard from you since the last response from @Prasanth Mathesh​ and @Pat Sienkiewicz​, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community, as it can...
I am trying to read a CSV file stored in database tables of Databricks, but I am getting an error. It runs fine for DBFS, but the same format is not working for database tables.
Hi @Sayed Ali​, We haven't heard from you since my last response, and I was checking back to see if my suggestions helped you. If you have any solution, please do share it with the community, as it can be helpful to others. Also, pleas...
I know that I can get a list of all of the table names in a given 'database' (say it was named "scratch") by using: show tables from scratch. How do I get a list just like that, but that only includes the tables that I created?
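`SHOW TABLES` doesn't report who created a table, but `DESCRIBE TABLE EXTENDED` includes an "Owner" row that can be filtered on. A sketch of the filtering step; the sample rows below stand in for real query output, and the table and user names are hypothetical:

```python
# Filter tables by the "Owner" row that DESCRIBE TABLE EXTENDED returns.
def tables_owned_by(describe_rows_by_table, user):
    """describe_rows_by_table maps table name -> list of (col_name, value)
    pairs, as collected from DESCRIBE TABLE EXTENDED for each table."""
    owned = []
    for table, rows in describe_rows_by_table.items():
        for col_name, value in rows:
            if col_name == "Owner" and value == user:
                owned.append(table)
    return owned

# On Databricks this dict would be populated per table with something like:
#   spark.sql(f"DESCRIBE TABLE EXTENDED scratch.{t}").collect()
sample = {
    "my_table":    [("id", "int"), ("Owner", "barb")],
    "other_table": [("id", "int"), ("Owner", "someone_else")],
}
print(tables_owned_by(sample, "barb"))  # -> ['my_table']
```

Looping `DESCRIBE TABLE EXTENDED` over every table is an extra query per table, so on a large metastore it is worth caching the results.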
Hi @Barb Krienke​ Hope all is well! Just wanted to check in to see if you were able to resolve your issue, and if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thank...
I have data in a Spark Dataframe and I write it to an s3 location. It has some complex datatypes like structs etc. When I create the table on top of the s3 location by using CREATE TABLE IF NOT EXISTS table_name
USING DELTA
LOCATION 's3://.../...';Th...
Hi, I have Databricks running on AWS, and I'm looking for a way to know when is a good time to run OPTIMIZE on partitioned tables. Taking into account that it's an expensive process, especially on big tables, how could I know if it's a good time to run it ...
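One rough signal is the average data file size, since OPTIMIZE compacts many small files into large ones: `DESCRIBE DETAIL` on a Delta table returns `numFiles` and `sizeInBytes`, from which a small-file check is easy. A sketch; the 128 MB threshold is an assumption for illustration, not a Databricks recommendation:

```python
# Heuristic: OPTIMIZE is likely worthwhile when the average file size is far
# below the large-file target that compaction aims for. numFiles and
# sizeInBytes would come from DESCRIBE DETAIL on the Delta table.
def should_optimize(num_files: int, size_in_bytes: int,
                    min_avg_file_mb: float = 128.0) -> bool:
    if num_files == 0:
        return False
    avg_mb = size_in_bytes / num_files / (1024 * 1024)
    return avg_mb < min_avg_file_mb

# e.g. 10,000 files totalling 50 GB -> ~5 MB per file: worth compacting
print(should_optimize(10_000, 50 * 1024**3))  # -> True
# 50 files totalling 50 GB -> ~1 GB per file: already compact
print(should_optimize(50, 50 * 1024**3))      # -> False
```

Tracking this per partition (or per recently written partition) keeps the check cheap even on big tables.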
@Alejandro Martinez​ - If Jose's answer resolved your question, would you be happy to mark his answer as best? That helps other members find the answer more quickly.
Hello there, I currently have the problem of deleted files still being in the transaction log when trying to query a delta table. What I found was this statement:

%sql
FSCK REPAIR TABLE table_name [DRY RUN]

But using it returned the following error: Error in ...
I'm using Azure Databricks Python notebooks. We are preparing a front end to display the Databricks tables, using an API to query them. Is there a solution from Databricks for hosting callable APIs that query its tables and send the result as a response to the fro...
@Prabakar Ammeappin​ Thanks for the link. I was also wondering whether, for a web page front end, it would be more effective to query from a SQL database or from Azure Databricks tables. If from an Azure SQL database, is there any efficient way to sync the tables from Az...
Hi guys, I have a trial Databricks account. I realized that when I shut down the cluster, my databases and tables disappear. Is that correct, or is that because my account is a trial?
@William Scardua​ if it's an external Hive metastore or Glue catalog, you might be missing the configuration on the cluster: https://docs.databricks.com/data/metastores/index.html Also, as mentioned by @Hubert Dudek​, if it's a community edition then t...
Hi, I'm trying to create a table using the UI, but I keep getting the error "error creating table <table name> create a cluster first" even when I already have a cluster running. What is the problem?
Manifest files need to be re-created when partitions are added or altered. Since a VACUUM only deletes files belonging to historical versions, you shouldn't need to create an updated manifest file unless you are also running an OPTIMIZE.
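For reference, the manifest regeneration itself is a one-line Delta command; a sketch of issuing it from a notebook, with a hypothetical table name:

```python
# After an OPTIMIZE (or any operation that rewrites data files), the symlink
# manifest can be regenerated with Delta's GENERATE command. The table name
# is hypothetical; on Databricks the string would be passed to spark.sql(...).
table_name = "mydb.events"
stmt = f"GENERATE symlink_format_manifest FOR TABLE {table_name}"
print(stmt)
# spark.sql(stmt)
```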
When using SQL, I can use the Create Live Table command and the Create Incremental Live Table command to set the run type I want the table to use. But I don't seem to have that same syntax for Python. How can I set this table type while using Python?
The documentation at https://docs.databricks.com/data-engineering/delta-live-tables/delta-live-tables-user-guide.html#mixing-complete-tables-and-incremental-tables has an example where the first two functions load data incrementally and the last one loads...
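In the Python API, the complete/incremental distinction is carried by how the function reads its input rather than by a different decorator: `dlt.read` gives a complete (live) table, while `dlt.read_stream` gives an incremental (streaming live) table. A sketch, assuming upstream tables `raw_customers` and `raw_events` exist in the pipeline (this only runs inside a Delta Live Tables pipeline, not as a standalone script):

```python
import dlt
from pyspark.sql.functions import col

# Complete (live) table: dlt.read re-reads the full input on each update.
@dlt.table
def customers_complete():
    return dlt.read("raw_customers")

# Incremental (streaming live) table: dlt.read_stream processes only new data.
@dlt.table
def events_incremental():
    return dlt.read_stream("raw_events").where(col("event_type") == "click")
```

So the same `@dlt.table` decorator serves both run types; switching a table between complete and incremental is a matter of swapping `dlt.read` for `dlt.read_stream` in the function body.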