Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

erigaud
by Honored Contributor
  • 2105 Views
  • 2 replies
  • 2 kudos

Resolved! DLT - Unity Catalog and volume - Dynamically access volume path

Hello, we're using a DLT pipeline with an autoloader that reads from a volume inside Unity Catalog. The path of the volume is /Volumes/<my-catalog>/... How can I dynamically access the catalog value of the DLT pipeline to use it in the code? I don't w...

Latest Reply
erigaud
Honored Contributor
  • 2 kudos

Works perfectly, thank you! It's a shame the documentation does not detail that use case.
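For reference, one documented way to do this is to pass the catalog name as a pipeline configuration value and read it with spark.conf.get. A minimal sketch, assuming a hypothetical pipeline setting named my_catalog and hypothetical schema/volume names:

import dlt

# Read the catalog name from the pipeline's configuration
# (e.g. set "my_catalog": "dev" in the pipeline's Advanced settings).
catalog = spark.conf.get("my_catalog")
volume_path = f"/Volumes/{catalog}/my_schema/landing"  # hypothetical schema and volume

@dlt.table
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load(volume_path)
    )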

1 More Replies
rockybhai
by New Contributor II
  • 1507 Views
  • 1 replies
  • 3 kudos

Need urgent help

I am bringing 13,000 GB of data from Redshift to Databricks by reading it through Spark and then writing it as a Delta table, so what is the best cluster configuration and worker nodes you can suggest... if I need this to be done in 1 hour?

Data Engineering
clusteconfiguration
Databricks
dataengineering
redhsift
spark
Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @rockybhai, transferring 13 TB of data from Amazon Redshift to Databricks and writing it as a Delta table within 1 hour is a significant task. Key considerations include network bandwidth and data transfer rate: to move 13 TB in 1 hour, you need a sustained d...
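For illustration, the read side of such a transfer is usually a partitioned JDBC (or Redshift connector) read so that every worker pulls data in parallel. A rough sketch, assuming the Redshift JDBC driver is available on the cluster; all connection values and the partition column are hypothetical:

# Partitioned JDBC read: numPartitions parallel queries split on a numeric column.
df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:redshift://my-cluster.example.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("dbtable", "public.big_table")
    .option("user", "my_user")
    .option("password", "my_password")
    .option("partitionColumn", "id")   # numeric, roughly evenly distributed
    .option("lowerBound", "1")
    .option("upperBound", "1000000000")
    .option("numPartitions", "256")    # tune to the total worker core count
    .load()
)

df.write.format("delta").mode("overwrite").saveAsTable("bronze.big_table")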

Adam_Runarsson
by New Contributor II
  • 1903 Views
  • 3 replies
  • 0 kudos

Autoloader: Backfill on millions of files

Hi all! So I've been using Autoloader with File Notification mode against Azure to great success. Once past all the setup, it's rather seamless to use. I did have some issues in the beginning, which are related to my question. The storage account I'm work...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

The docs are pretty sparse on the backfill process, but I think backfill won't just scan the directory; it will instead read the checkpoint file. That seems logical to me, anyway.
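For what it's worth, file-notification streams can also be told to periodically reconcile against a full directory listing via the documented cloudFiles.backfillInterval option. A minimal sketch with a hypothetical path:

df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")
    # Periodically list the source to pick up files the notifications missed.
    .option("cloudFiles.backfillInterval", "1 day")
    .load("abfss://container@account.dfs.core.windows.net/landing")  # hypothetical
)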

2 More Replies
peta
by New Contributor II
  • 4423 Views
  • 3 replies
  • 0 kudos

Slow read using Snowflake connector in Databricks.

Hi, I am trying to read a table from Snowflake with the Databricks native Snowflake JDBC connector. It goes well for a small amount of data (100 rows), but if I add more (even just 1,000 rows) the query does not finish. I was checking if the query fi...

Latest Reply
sri123
New Contributor II
  • 0 kudos

Hi @peta, has your issue been resolved? I'm facing a similar issue. If it has, can you please post the steps you followed to resolve it? Thanks & regards, Sri
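While the thread is unresolved, the usual shape of a read through the built-in Snowflake connector looks like the sketch below (all connection values are hypothetical placeholders); comparing it against the failing code can help narrow things down:

options = {
    "sfUrl": "myaccount.snowflakecomputing.com",
    "sfUser": "my_user",
    "sfPassword": "my_password",
    "sfDatabase": "MY_DB",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "MY_WH",  # an undersized or suspended warehouse can stall larger reads
}

df = (
    spark.read.format("snowflake")
    .options(**options)
    .option("dbtable", "MY_TABLE")
    .load()
)
df.limit(1000).display()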

2 More Replies
DJey
by New Contributor III
  • 26384 Views
  • 6 replies
  • 2 kudos

Resolved! MergeSchema Not Working

Hi all, I have a scenario where my existing Delta table looks like below. Now I have incremental data with an additional column, i.e. owner (DataFrame name: scdDF). Below is the code snippet to merge the incremental DataFrame into targetTable, but the new...

Latest Reply
Amin112
New Contributor II
  • 2 kudos

In Databricks Runtime 15.2 and above, you can specify schema evolution in a merge statement using SQL or Delta table APIs:
MERGE WITH SCHEMA EVOLUTION INTO target
USING source
ON source.key = target.key
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN I...
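Spelled out in full (runnable on DBR 15.2 and above; the target/source table names and key column here are hypothetical), the documented statement is:

spark.sql("""
    MERGE WITH SCHEMA EVOLUTION INTO target
    USING source
    ON source.key = target.key
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")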

5 More Replies
Brad
by Contributor II
  • 1119 Views
  • 2 replies
  • 0 kudos

Why driver memory is capped

Hi team, we are using a job cluster to run Spark with MERGE. Somehow it needs a lot of driver memory. We allocate a 128 GB + 16-core node for the driver, and specify spark.driver.memory=96000m. We can see it is 96000m in the Environment tab of the Spark UI. The config is like: "...

Latest Reply
Brad
Contributor II
  • 0 kudos

Thanks for the response. We are wondering why driver memory cannot be fully used (only 48 GB out of 128 GB is used for the driver). Is this related to repartitioning?
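As a quick sanity check, the effective driver heap can be inspected from the notebook itself; a small sketch using standard Spark/JVM calls (note the JVM reserves part of the node's memory for the OS and off-heap use, so the reported maximum is normally well below the node size):

print(spark.conf.get("spark.driver.memory"))  # the configured value, e.g. 96000m

# Maximum heap the driver JVM will actually use, inspected via py4j.
runtime = spark.sparkContext._jvm.java.lang.Runtime.getRuntime()
print(runtime.maxMemory() / 1024**3, "GiB max JVM heap")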

1 More Replies
leungi
by Contributor
  • 2964 Views
  • 5 replies
  • 1 kudos

Resolved! Unable to read Unity Catalog schema

Recently bumped into this (first-time) error, without a clear message as to the cause. Insights welcomed.

Latest Reply
leungi
Contributor
  • 1 kudos

@DivyaPasumarthi the issue still persists, but I found a workaround: go to the SQL Editor module, expand the Catalog panel on the left, highlight the desired table, then right-click > Open in Catalog Explorer.

4 More Replies
TamD
by Contributor
  • 6901 Views
  • 6 replies
  • 0 kudos

Resolved! SELECT from VIEW to CREATE a table or view

Hi; I'm new to Databricks, so apologies if this is a dumb question. I have a notebook with SQL cells that select data from various Delta tables into temporary views. Then I have a query that joins up the data from these temporary views. I'd lik...

Latest Reply
TamD
Contributor
  • 0 kudos

Thanks, FelixIvy. Just to clarify, the reason you can't use temporary views to load a materialized view is that materialized views (like regular views) must be created using a single query that is saved as part of the view definition. So the sol...
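Concretely, the temp-view definitions get folded into the one defining query as CTEs; a hedged sketch with hypothetical catalog/schema/table names (note that materialized views also require a serverless SQL warehouse or a DLT pipeline):

spark.sql("""
    CREATE OR REPLACE MATERIALIZED VIEW my_catalog.my_schema.combined AS
    WITH a AS (SELECT id, value FROM my_catalog.my_schema.table_a),
         b AS (SELECT id, other_value FROM my_catalog.my_schema.table_b)
    SELECT a.id, a.value, b.other_value
    FROM a JOIN b ON a.id = b.id
""")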

5 More Replies
Dave_Nithio
by Contributor II
  • 1128 Views
  • 1 replies
  • 1 kudos

OAuth U2M AWS Token Failure

I am attempting to generate a manual OAuth token using the instructions for AWS. When attempting to generate the account-level authentication code I run into a localhost error. I have confirmed that all variables and URLs are correct and that I am log...

Latest Reply
Dave_Nithio
Contributor II
  • 1 kudos

After investigating further, the localhost issue was because I was already logged in and did not need to log in again. The returned URL contained the authorization code. I was able to authenticate and run account-level API requests with the generated ...
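For anyone retracing these steps, the code-for-token exchange is roughly as follows, assuming the manual U2M flow from the AWS docs (the account ID, PKCE verifier, and authorization code are placeholders):

import requests

account_id = "<account-id>"  # placeholder
token_url = f"https://accounts.cloud.databricks.com/oidc/accounts/{account_id}/v1/token"

resp = requests.post(
    token_url,
    data={
        "client_id": "databricks-cli",            # public client used by the manual flow
        "grant_type": "authorization_code",
        "scope": "all-apis",
        "redirect_uri": "http://localhost:8020",
        "code_verifier": "<pkce-code-verifier>",  # placeholder
        "code": "<authorization-code>",           # placeholder, taken from the returned URL
    },
)
access_token = resp.json()["access_token"]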

vdeorios
by New Contributor II
  • 5121 Views
  • 5 replies
  • 2 kudos

Resolved! 404 on GET Billing usage data (API)

I'm trying to get my billing usage data from the Databricks API (documentation: https://docs.databricks.com/api/gcp/account/billableusage/download) but I keep getting a 404 error. Code:
import requests
import json
token = dbutils.notebook.entry_point.getDbu...

Latest Reply
Dave_Nithio
Contributor II
  • 2 kudos

Bumping this to see if there is a solution. Per Databricks, basic authentication is no longer allowed. I am unable to authenticate to get access to this endpoint (401 error). Does anyone have a solution for querying this endpoint?
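A hedged sketch of the same call with an account-level OAuth bearer token instead of basic auth (account ID, token, and months are placeholders; endpoint per the linked documentation):

import requests

account_id = "<account-id>"            # placeholder
access_token = "<oauth-access-token>"  # placeholder, e.g. from the U2M flow above
url = f"https://accounts.gcp.databricks.com/api/2.0/accounts/{account_id}/usage/download"

resp = requests.get(
    url,
    headers={"Authorization": f"Bearer {access_token}"},
    params={"start_month": "2024-01", "end_month": "2024-03"},
)
print(resp.status_code)
print(resp.text[:500])  # CSV content on success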

4 More Replies
richakamat130
by New Contributor
  • 1710 Views
  • 4 replies
  • 2 kudos

Change datetime format from one to another without changing datatype in Databricks SQL

Change datetime"2002-01-01T00:00:00.000" to 'MM/dd/yyyy HH:mm:ss' format without changing datatype/ having it in datetime data type

Latest Reply
filipniziol
Esteemed Contributor
  • 2 kudos

Hi @Mister-Dinky, as @szymon_dybczak said, if you have a datetime, then you have a datetime. What you see is just a display format defined in the Databricks UI. Other applications may display it differently depending on defaults, regional formats, etc. If you ...
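To make the distinction concrete, formatting produces a string column while the underlying value keeps its timestamp type; a small sketch with hypothetical column names:

from pyspark.sql import functions as F

df = spark.createDataFrame([("2002-01-01T00:00:00.000",)], ["ts_str"])
df = df.withColumn("ts", F.to_timestamp("ts_str"))                            # timestamp type
df = df.withColumn("ts_display", F.date_format("ts", "MM/dd/yyyy HH:mm:ss"))  # string type
df.printSchema()  # ts: timestamp, ts_display: string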

3 More Replies
ChrisLawford_n1
by Contributor
  • 3413 Views
  • 3 replies
  • 1 kudos

Autoloader configuration for multiple tables from the same directory

I would like to get a recommendation on how to structure ingestion of lots of tables of data. I am currently using Autoloader with the directory searching mode. I have concerns about performance in the future and have a requirement to ensure that data...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

There is an easier way to see what has been processed:
SELECT * FROM cloud_files_state('path/to/checkpoint');
https://docs.databricks.com/en/ingestion/cloud-object-storage/auto-loader/production.html

2 More Replies
KristiLogos
by Contributor
  • 1460 Views
  • 2 replies
  • 0 kudos

Autoloader not ingesting all file data into Delta Table from Azure Blob Container

I have done the following, i.e. created a Delta table where I plan to load the Azure Blob container files that are .json.gz files:
df = spark.read.option("multiline", "true").json(f"{container_location}/*.json.gz")
DeltaTable.create(spark) \
    .addCol...

Latest Reply
gchandra
Databricks Employee
  • 0 kudos

If it's streaming data, space it out with a 10-second trigger: .trigger(processingTime="10 seconds"). Do all the JSON files have the same schema? As your table creation is dynamic (df.schema), if all the JSON files don't have the same schema they may be skipp...
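Putting the reply's two suggestions together, a hedged Auto Loader sketch with an explicit schema and a spaced-out trigger (the path, checkpoint, schema, and table name are all hypothetical):

from pyspark.sql.types import StructType, StructField, StringType

container_location = "abfss://container@account.dfs.core.windows.net/data"  # hypothetical
schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StringType()),
])  # hypothetical; an explicit schema avoids rows being dropped on schema drift

(
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .schema(schema)
    .load(f"{container_location}/*.json.gz")
    .writeStream
    .option("checkpointLocation", f"{container_location}/_checkpoint")
    .trigger(processingTime="10 seconds")
    .toTable("my_catalog.my_schema.raw_json")
)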

1 More Replies
Brad
by Contributor II
  • 941 Views
  • 1 replies
  • 0 kudos

How to set file size for MERGE

Hi team, I use MERGE to merge a source into a target table. The source is read incrementally with a checkpoint on a Delta table. The target is a Delta table without any partitions. If the table is empty, with spark.databricks.delta.optimizeWrite.enabled it can create fil...

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @Brad, there are a couple of considerations here, the main ones being your runtime version and whether you are using Unity Catalog. Check this document: https://docs.databricks.com/en/delta/tune-file-size.html
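One of the knobs documented there is the per-table target file size; a minimal sketch with a hypothetical table name:

spark.sql("""
    ALTER TABLE my_catalog.my_schema.target_table
    SET TBLPROPERTIES ('delta.targetFileSize' = '128mb')
""")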

Brad
by Contributor II
  • 1437 Views
  • 3 replies
  • 0 kudos

Will MERGE incur a lot of driver memory?

Hi team, we have a job that runs MERGE on a target table with around 220 million rows. We found it needs a lot of driver memory (just for the MERGE itself). From the job metrics we can see the MERGE needs at least 46 GB of memory. Is there some special thing to mak...

Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @Brad, could you try to apply some very standard optimization practices and check the outcome? 1. If your runtime is greater than or equal to 15.2, could you implement liquid clustering on the source and target tables using the JOIN columns? ALTER TABLE <table_name> CL...
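A sketch of the liquid-clustering suggestion from the reply, with hypothetical table and column names (runtime requirement per the reply's note):

spark.sql("ALTER TABLE my_catalog.my_schema.target_table CLUSTER BY (join_key)")
# Rewrite existing data so it follows the new clustering keys.
spark.sql("OPTIMIZE my_catalog.my_schema.target_table")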

2 More Replies