Data Engineering

Forum Posts

Nick_Hughes
by New Contributor III
  • 3287 Views
  • 3 replies
  • 1 kudos

Best way to generate fake data using underlying schema

Hi, we are trying to generate fake data to run our tests. For example, we have a pipeline that creates a gold-layer fact table from 6 underlying source tables in our silver layer. We want to generate the data in a way that recognises the relationships ...

Latest Reply
RonanStokes_DB
New Contributor III

Hi @Nick_Hughes, this may be late for your scenario, but hopefully others facing similar issues will find it useful. You can specify how data is generated in `dbldatagen` using rules in the data generation spec. If rules are specified for data generat...

2 More Replies
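As a rough illustration of the relationship-aware idea (plain Python rather than `dbldatagen`, with hypothetical customer/order tables standing in for the silver sources), child rows can sample their foreign keys from an already-generated parent, so every fact row points at a real dimension row:

```python
import random

def fake_customers(n, seed=42):
    """Generate a parent 'dimension' table of fake customers."""
    rng = random.Random(seed)
    return [{"customer_id": i, "region": rng.choice(["EU", "US", "APAC"])}
            for i in range(1, n + 1)]

def fake_orders(customers, n, seed=7):
    """Generate a child 'fact' table whose foreign keys are sampled from
    the parent table, preserving referential integrity."""
    rng = random.Random(seed)
    ids = [c["customer_id"] for c in customers]
    return [{"order_id": i, "customer_id": rng.choice(ids),
             "amount": round(rng.uniform(5.0, 500.0), 2)}
            for i in range(1, n + 1)]

customers = fake_customers(10)
orders = fake_orders(customers, 50)
```

The same shape carries over to `dbldatagen`, where the parent's key range is reused as the value range for the child's foreign-key column.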
AnuVat
by New Contributor III
  • 15661 Views
  • 7 replies
  • 12 kudos

How to read data from a table into a dataframe outside of Databricks environment?

Hi, I am working on an ML project and I need to access the data in tables hosted in my Databricks cluster from a notebook that I am running locally. This has been very easy while running the notebooks in Databricks, but I cannot figure out how to do ...

Latest Reply
chakri
New Contributor III

We can use APIs and pyodbc to achieve this. Going through the official Databricks documentation on connecting from outside the Databricks environment should also help.

6 More Replies
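A minimal sketch of the local-access approach using the `databricks-sql-connector` package; the hostname, HTTP path, token and table name are placeholders you would take from your own SQL warehouse's connection details:

```python
def build_select(table_fqn, limit=None):
    """Build the SELECT used to pull a table; pure helper, easy to test."""
    query = f"SELECT * FROM {table_fqn}"
    return query if limit is None else f"{query} LIMIT {int(limit)}"

def read_table(server_hostname, http_path, access_token, table_fqn, limit=None):
    """Fetch a table from a Databricks SQL warehouse into a list of row tuples.
    Requires `pip install databricks-sql-connector`; the connection values come
    from the warehouse's 'Connection details' tab."""
    from databricks import sql  # imported lazily so the helper above stays local
    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     access_token=access_token) as conn:
        with conn.cursor() as cur:
            cur.execute(build_select(table_fqn, limit))
            return cur.fetchall()
```

From there the rows can be wrapped in a pandas DataFrame for the local ML workflow.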
NimaiAhl
by New Contributor II
  • 669 Views
  • 1 replies
  • 0 kudos

External Tables - SQL

To create external tables we need to use the LOCATION keyword with the path to the storage location. In reference to that, does the user need to have permission on the storage location, and if not, will we use storage credentials to provide the ac...

Latest Reply
Shikamaru
New Contributor II

Hi Nimai, That's partially right. You can grant permissions directly on the storage credential, but Databricks recommends that you reference it in an external location and grant permissions to that instead. An external location combines a storage cre...

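A sketch of the recommended pattern in Unity Catalog SQL, with illustrative names for the credential, location, group and table:

```sql
-- Wrap the storage credential in an external location, then grant on the
-- location rather than on the credential directly.
CREATE EXTERNAL LOCATION IF NOT EXISTS sales_landing
  URL 'abfss://landing@mystorageacct.dfs.core.windows.net/sales'
  WITH (STORAGE CREDENTIAL my_credential);

GRANT CREATE EXTERNAL TABLE ON EXTERNAL LOCATION sales_landing TO `data_engineers`;

-- The user can now create the external table without direct storage permissions:
CREATE TABLE main.bronze.sales_raw
  LOCATION 'abfss://landing@mystorageacct.dfs.core.windows.net/sales/raw';
```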
DBX-Beginer
by New Contributor
  • 2706 Views
  • 2 replies
  • 0 kudos

Display count of records in all tables in the Hive metastore based on one of the column values.

I have a DB named Test in the Hive metastore of Databricks. This DB contains around 100 tables. Each table has a column named sourcesystem along with many other columns. Now I need to display the count of records in each table grouped by source sy...

Latest Reply
Anonymous
Not applicable

Hi @Krish K, hope all is well! Just wanted to check in to see if you were able to resolve your issue. Would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

1 More Replies
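One way to sketch this is to build a single UNION ALL statement over all table names (obtained e.g. from `SHOW TABLES IN Test`) and run it with `spark.sql`; the helper below is pure Python, so the generated SQL is easy to inspect before executing:

```python
def per_source_counts_sql(database, tables):
    """Build one UNION ALL query that returns, for every table, the record
    count grouped by the shared `sourcesystem` column."""
    selects = [
        (f"SELECT '{t}' AS table_name, sourcesystem, COUNT(*) AS record_count "
         f"FROM {database}.{t} GROUP BY sourcesystem")
        for t in tables
    ]
    return "\nUNION ALL\n".join(selects)

# In a notebook (assumed wiring, needs a Spark session):
# tables = [r.tableName for r in spark.sql("SHOW TABLES IN Test").collect()]
# display(spark.sql(per_source_counts_sql("Test", tables)))
```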
Upendra_Kumar
by New Contributor
  • 637 Views
  • 3 replies
  • 0 kudos

Not able to perform update in delta table in databricks using 3 tables

Hi, I am able to perform a merge from 2 tables, but I have a requirement to update a table based on 3 tables, like the following query:

update a set a.name = b.name
from table1 a
inner join table2 b on a.id = b.id
inner join table3 c on a.id = c.id

Thanks in advance.

Latest Reply
Anonymous
Not applicable

Hi @upendra kumar sharma, help us build a vibrant and resourceful community by recognizing and highlighting insightful contributions. Mark the best answers and show your appreciation! Thanks and regards.

2 More Replies
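Since `UPDATE ... FROM` with joins isn't supported in Databricks SQL, the three-table update above can usually be rewritten as a `MERGE` whose source is a join; a sketch using the question's table and column names:

```sql
MERGE INTO table1 a
USING (
  SELECT b.id, b.name
  FROM table2 b
  INNER JOIN table3 c ON b.id = c.id
) src
ON a.id = src.id
WHEN MATCHED THEN
  UPDATE SET a.name = src.name;
```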
RafaelGomez61
by New Contributor
  • 1881 Views
  • 2 replies
  • 0 kudos

Can't access Delta tables under a SQL Warehouse cluster. Getting an error while using path .../_delta_log/000000000.checkpoint

In our Databricks workspace, we have several Delta tables available in the hive_metastore catalog. We are able to access and query the data via Data Science & Engineering persona clusters with no issues. The clusters have credential passthrough en...

Latest Reply
Anonymous
Not applicable

Hi @Rafael Gomez, hope everything is going great. Just wanted to check in to see if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

1 More Replies
shiva12494
by New Contributor II
  • 1533 Views
  • 2 replies
  • 2 kudos

Issue with reading exported tables stored in parquet

Hi all, I exported all tables from a Postgres snapshot into S3 in Parquet format. I am trying to read the tables using Databricks and I am unable to do so. I get the following error: "Unable to infer schema for Parquet. It must be specified manually....

Latest Reply
Anonymous
Not applicable

Hi @shiva charan velichala, thank you for posting your question in our community! We are happy to assist you. To help us provide the most accurate information, could you please take a moment to review the responses and select the one that bes...

1 More Replies
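When inference fails, supplying the schema explicitly usually gets the read working; a sketch, assuming you know the exported columns (the names and types below are illustrative):

```python
def schema_ddl(columns):
    """Render (name, type) pairs as the DDL schema string Spark accepts."""
    return ", ".join(f"{name} {dtype.upper()}" for name, dtype in columns)

def read_parquet_with_schema(spark, path, columns):
    """Read Parquet with an explicit schema instead of relying on inference,
    which fails when the path holds no readable footers (empty directories,
    extra artifacts from the snapshot export, etc.)."""
    return spark.read.schema(schema_ddl(columns)).parquet(path)

# Assumed usage in a notebook:
# df = read_parquet_with_schema(spark, "s3://bucket/export/orders/",
#                               [("id", "bigint"), ("name", "string")])
```

It is also worth listing the S3 prefix first: snapshot exports often nest files one folder deeper than expected, which produces the same inference error.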
SRK
by Contributor III
  • 1457 Views
  • 5 replies
  • 0 kudos

Delta Live Tables data quality rules application.

I have a requirement where I need to apply an inverse DQ rule on a table to track the invalid data, for which I can use the following approach:

import dlt
rules = {}
quarantine_rules = {}
rules["valid_website"] = "(Website IS NOT NULL)"
rules["valid_locatio...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

You can get additional info from the DLT event log, which is stored in Delta format, so you can load it as a table: https://docs.databricks.com/workflows/delta-live-tables/delta-live-tables-event-log.html#data-quality

4 More Replies
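The inverse-rule ("quarantine") pattern can be sketched with a small helper that negates the conjunction of all rules; a row is quarantined when it fails at least one rule. The `@dlt.expect_all` usage shown in the comment is the assumed wiring inside a pipeline:

```python
def quarantine_expectation(rules):
    """Invert a dict of DQ rules: quarantine a row when it fails at least
    one rule, i.e. NOT(rule1 AND rule2 AND ...)."""
    return "NOT({})".format(" AND ".join(f"({expr})" for expr in rules.values()))

rules = {
    "valid_website": "(Website IS NOT NULL)",
    "valid_location": "(Location IS NOT NULL)",
}

# In a DLT pipeline this expression would feed an expectation, e.g.:
# @dlt.expect_all({"quarantined_rows": quarantine_expectation(rules)})
# def quarantine_table(): ...
```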
Databrickguy
by New Contributor II
  • 1384 Views
  • 2 replies
  • 2 kudos

How to check/list the tables which have CDF enabled?

How to list all the tables which have CDF enabled? I can check a single table to find out if CDF is enabled with the code below:

SHOW TBLPROPERTIES tableA (delta.enableChangeDataFeed)

The return key ...

Latest Reply
jose_gonzalez
Moderator

Hi @Tim zhang, just a friendly follow-up. Did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

1 More Replies
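A sketch of one way to do the listing: loop over the catalog, run `SHOW TBLPROPERTIES` per table, and test the property. The pure helper below is the only part that doesn't need a cluster; the loop in the comment is the assumed notebook wiring:

```python
def cdf_enabled(props):
    """Given a table's properties as a dict (e.g. parsed from
    SHOW TBLPROPERTIES), report whether Change Data Feed is enabled."""
    return props.get("delta.enableChangeDataFeed", "false").lower() == "true"

# Sketch of the notebook loop (requires a Spark session):
# for t in spark.catalog.listTables("my_db"):
#     rows = spark.sql(f"SHOW TBLPROPERTIES my_db.{t.name}").collect()
#     props = {r.key: r.value for r in rows}
#     if cdf_enabled(props):
#         print(t.name)
```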
Mahesh777k
by New Contributor
  • 1297 Views
  • 3 replies
  • 2 kudos

How to delete duplicate tables?

Hi everyone, I accidentally imported duplicate tables. Can you guide me on how to delete them using Databricks Community Edition?

Latest Reply
Kaniz
Community Manager

Hi @Mahesh Babu Uppala, we haven't heard from you since the last response from @Uma Maheswara Rao Desula and @Ratna Chaitanya Raju Bandaru, and I was checking back to see if their suggestions helped you. Or else, if you have any solu...

2 More Replies
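Assuming the duplicates are ordinary tables, dropping them is a plain SQL operation (table names below are illustrative); note that dropping a managed table also deletes its underlying data:

```sql
-- List candidates first:
SHOW TABLES IN default;

-- Then drop the accidental copies:
DROP TABLE IF EXISTS default.my_table_copy1;
DROP TABLE IF EXISTS default.my_table_copy2;
```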
Spauk
by New Contributor II
  • 7765 Views
  • 5 replies
  • 7 kudos

Resolved! Best Practices for naming Tables and Databases in Databricks

We moved to Databricks a few months ago; before that we were on SQL Server. So all our tables and databases follow the "camel case" rule. Apparently, in Databricks the rule is "lower case with underscores". Where can we find an official doc...

Latest Reply
LandanG
Honored Contributor

Hi @Salah KHALFALLAH, looking at the documentation, it appears that Databricks' preferred naming convention is lowercase with underscores, as you mentioned. The reason for this is most likely that Databricks uses the Hive metastore, which is case-insens...

4 More Replies
Viren123
by Contributor
  • 1876 Views
  • 6 replies
  • 4 kudos

API to write into Databricks tables

Hello experts, is there any API in Databricks that allows writing data into Databricks tables? I would like to send small-size log information to Databricks tables from another service. What are my options? Thank you very much.

Latest Reply
Kaniz
Community Manager

Hi @Viren Devi, it would mean a lot if you could select the "Best Answer" to help others find the correct answer faster. This makes the answer appear right after the question, so it's easier to find within a thread. It also helps us mark the question...

5 More Replies
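One option is the SQL Statement Execution API, which accepts a SQL statement over REST against a SQL warehouse. A sketch with pure helpers: the endpoint path follows the Databricks REST API docs, everything else (table, warehouse id, values) is illustrative and assumes trusted, non-user-supplied log values:

```python
import json

def insert_logs_sql(table, records):
    """Render small log records as a single INSERT statement (hypothetical
    helper; do not use string interpolation with untrusted input)."""
    rows = ", ".join(
        "({})".format(", ".join(f"'{v}'" if isinstance(v, str) else str(v)
                                for v in rec))
        for rec in records
    )
    return f"INSERT INTO {table} VALUES {rows}"

def statement_payload(warehouse_id, statement):
    """Request body for POST /api/2.0/sql/statements."""
    return json.dumps({"warehouse_id": warehouse_id, "statement": statement})

# Assumed wiring with any HTTP client, authenticated with a PAT bearer token:
# requests.post(f"https://{host}/api/2.0/sql/statements",
#               headers={"Authorization": f"Bearer {token}"},
#               data=statement_payload("abc123", insert_logs_sql("logs.app", rows)))
```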
Kopal
by New Contributor II
  • 2768 Views
  • 3 replies
  • 3 kudos

Resolved! Data Engineering - CTAS - External Tables - Limitations of CTAS for external tables - can or cannot use options and location

Data Engineering - CTAS - External Tables. Can someone help me understand why, in chapter 3.3, we cannot directly use CTAS with OPTIONS and LOCATION to specify the delimiter and location of a CSV? Or have I misunderstood? Details: In Data Engineering with Databri...

Latest Reply
Anonymous
Not applicable

The 2nd CTAS statement will not be able to parse the CSV in any manner, because it's just the FROM statement that points to a file. It's more of a traditional SQL statement with SELECT and FROM. It will create a Delta table. This just happens to b...

2 More Replies
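The two-step workaround usually taught for this is to declare the CSV parsing options on an external table first and then CTAS from it into Delta, since CTAS infers its schema from the query and accepts no parsing OPTIONS; a sketch with illustrative names:

```sql
-- Step 1: an external table that knows how to parse the delimited CSV
CREATE TABLE sales_csv
  USING CSV
  OPTIONS (header = "true", delimiter = "|")
  LOCATION '/path/to/csv';   -- illustrative path

-- Step 2: CTAS into a managed Delta table
CREATE TABLE sales AS
  SELECT * FROM sales_csv;
```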
same213
by New Contributor III
  • 2483 Views
  • 5 replies
  • 9 kudos

Is it possible to create a sqlite database and export it?

I am trying to create a SQLite database in Databricks and add a few tables to it. Ultimately, I want to export this using Azure. Is this possible?

Latest Reply
same213
New Contributor III

@Hubert Dudek We currently have a process in place that reads in a SQLite file. We recently transitioned to using Databricks, and we were hoping to be able to create a SQLite file so we didn't have to alter the current process we have in place.

4 More Replies
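Creating the SQLite file itself is plain Python on the driver; a minimal sketch (copying the resulting file to cloud storage afterwards, e.g. with `dbutils.fs.cp` to an Azure-backed path, is assumed and not shown):

```python
import os
import sqlite3
import tempfile

# Build a SQLite file on the driver's local disk; the path is illustrative.
db_path = os.path.join(tempfile.mkdtemp(), "export.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO events (id, name) VALUES (?, ?)",
                 [(1, "start"), (2, "stop")])
conn.commit()

# Read back to confirm the file is a valid, populated database.
rows = conn.execute("SELECT name FROM events ORDER BY id").fetchall()
conn.close()
```

In a notebook, Spark DataFrames would be written table by table via `toPandas().to_sql(...)` or an explicit insert loop, then the single `.db` file shipped out.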
xiangzhu
by Contributor
  • 2165 Views
  • 3 replies
  • 2 kudos

Could jobs do everything delta live tables do ?

Hello, I've read the posts: Jobs - Delta Live Tables difference (databricks.com) and Difference between Delta Live Tables and Multitask Jobs (databricks.com). My understanding is that Delta Live Tables are more like a DSL that simplifies the workflow defini...

Latest Reply
xiangzhu
Contributor

@Landan George Regarding "Jobs won't be able to do what DLT does": I read some blogs and watched some videos too, but I still cannot figure out the difference between jobs and DLT. Does it mean that without Databricks DLT, Databricks jobs cannot handle Delta table...

2 More Replies