Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

CM1
by New Contributor
  • 1331 Views
  • 1 replies
  • 0 kudos

Can you migrate me from Customer Academy to Partner Academy

Hello, I registered using my work email on the Customer Academy, but I should be on Partner Academy. Can you migrate my account as you have done on other posts, i.e. https://community.databricks.com/s/question/0D53f00001fcieKCAQ/cannot-sign-in-at-databrick...

Latest Reply
Chaitanya_Raju
Honored Contributor
  • 0 kudos

Hi @Chris M, for any issue with Academy learning or certifications, you can raise a ticket at the link below. Sharing it for your future reference as well: https://help.databricks.com/s/contact-us?ReqType=training Happy Learning!!

  • 0 kudos
Vladif1
by New Contributor II
  • 4752 Views
  • 4 replies
  • 1 kudos

Error when reading delta lake files with Auto Loader

Hi, when reading a Delta Lake file (created by Auto Loader) with this code: df = ( spark.readStream .format('cloudFiles') .option("cloudFiles.format", "delta") .option("cloudFiles.schemaLocation", f"{silver_path}/_checkpoint") .load(bronz...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Vlad Feigin, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so w...

  • 1 kudos
3 More Replies
RafaelGomez61
by New Contributor
  • 2263 Views
  • 2 replies
  • 0 kudos

Can't access delta tables under SQL Warehouse cluster. Getting Error while using path .../_delta_log/000000000.checkpoint

In our Databricks workspace, we have several Delta tables available in the hive_metastore catalog. We are able to access and query the data via Data Science & Engineering persona clusters with no issues. The clusters have credential passthrough en...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Rafael Gomez, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so ...

  • 0 kudos
1 More Replies
jerry-xu-sa
by New Contributor II
  • 1655 Views
  • 2 replies
  • 1 kudos

Order of a dataframe is not preserved after calling cache() and limit()

Here are the simple steps to reproduce it. Note that col "foo" and "bar" are just redundant cols to make sure the dataframe doesn't fit into a single partition. // generate a random df val rand = new scala.util.Random val df = (1 to 3000).map(i => (r...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Jerry Xu, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback wil...

  • 1 kudos
1 More Replies
wschoi
by New Contributor III
  • 1876 Views
  • 4 replies
  • 1 kudos

Resolved! How can I cluster-install a c-Python library (pyRFC)?

If possible, how can one go about installing a Python library with SDK dependencies like pyRFC (https://github.com/SAP/PyRFC)? The SDK dependencies depend on the type of OS, and since we're running Databricks out of AWS, I assume one would have to mat...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Wonseok Choi, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback...

  • 1 kudos
3 More Replies
ramz
by New Contributor II
  • 2211 Views
  • 4 replies
  • 1 kudos

High driver memory usage on loading parquet file

Hi, I am using PySpark to read a bunch of Parquet files and run a count on each of them. Driver memory shoots up about 6G to 8G. My setup: I have a cluster of 1 driver node and 2 worker nodes (all of them 16 core 128 GB RAM). This is th...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @ramz siva, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback wi...

  • 1 kudos
3 More Replies
pepe
by New Contributor II
  • 3804 Views
  • 2 replies
  • 1 kudos

Why can't I install Python libraries when I update cluster runtime from 10.1 to 12.1?

This same question was asked here 9 months ago without any answer: https://community.databricks.com/s/question/0D58Y000096VjKrSAK/managedlibraryinstallfailed-when-changing-databricks-runtime-version-from-91-to-110 I was using runtime 9.1, and then upgr...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @JOSE RODRIGUEZ, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us s...

  • 1 kudos
1 More Replies
Ondrej_Lostak
by New Contributor
  • 896 Views
  • 2 replies
  • 0 kudos

Visualization only from a sample of data

When I display dataframe and add visualization, I can see a preview from only a sample of data, and when I confirm it, it is counted from all of the data. Until now, everything is fine. However, when I change the dataframe, the visualization is incon...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Ondrej Lostak, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so...

  • 0 kudos
1 More Replies
thushar
by Contributor
  • 1933 Views
  • 4 replies
  • 0 kudos

Delta file partitions

We have one function that creates files with partitions; the partitions are created based on metadata (getPartitionColumns) that we maintain. In one table, two columns are specified as partition columns, say 'Team' and 'Speciality'. Wh...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Thushar R, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we ...

  • 0 kudos
3 More Replies
sedat
by New Contributor II
  • 3198 Views
  • 2 replies
  • 0 kudos

Rust support (?) in Databricks

Hi, for Kafka streams and integration, I have seen some presentations and documents suggesting that Rust is a good alternative to Spark. Is there native support for Rust in Databricks, or what is the best method to connect to Kafka resources within Databricks? Thanks fo...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Sedat EKSI, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

  • 0 kudos
1 More Replies
Anjum
by New Contributor II
  • 3206 Views
  • 6 replies
  • 1 kudos

PGP encryption and decryption using gnupg

Hi, we are using the python-gnupg==0.4.8 package for encryption and decryption. This worked as expected on Databricks Runtime 9.1 LTS, but when we upgraded our runtime to 12.1, it stopped working with the error "gnupghome should be a d...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Anjum Aara, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we...

  • 1 kudos
5 More Replies
Prasann_gupta
by New Contributor
  • 4801 Views
  • 3 replies
  • 0 kudos

SQL CONTAINS Function is not working on Databricks

I am trying to use the SQL CONTAINS function in my SQL query, but it throws the error below: AnalysisException: Undefined function: 'CONTAINS'. This function is neither a registered temporary function nor a permanent function registered in the databa...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Prasann Gupta, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Than...

  • 0 kudos
2 More Replies
Abhradwip
by New Contributor II
  • 2374 Views
  • 3 replies
  • 0 kudos

How to create Delta Live table from Json files using Custom schema? I am getting the below error for the attached code # Error org.apache.spark.sql.AnalysisException: Table has a user-specified schema that is incompatible with the schema

#### Code # Code Import DataType from pyspark.sql.types import StructType, StructField, TimestampType, IntegerType, StringType, FloatType, BooleanType, LongType # Define Custom Schema call_schema = StructType( [ StructField("RecordType", StringType(),...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Abhradwip Mukherjee, hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

  • 0 kudos
2 More Replies
Siebert_Looije
by Contributor
  • 1210 Views
  • 2 replies
  • 0 kudos

How to fix 'An error occurred while rendering this editor' in GitHub Databricks?

How do I fix the error 'An error occurred while rendering this editor.' in the GitHub UI from Databricks? Kind regards, Siebert Looije

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Siebert Looije, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedba...

  • 0 kudos
1 More Replies
najmead
by Contributor
  • 3398 Views
  • 2 replies
  • 1 kudos

Spark Settings in SQL Warehouse

I'm running a query, trying to parse a string into a map, and I get the following error: org.apache.spark.SparkRuntimeException: Duplicate map key was found, please check the input data. If you want to remove the duplicated keys, you can set "spark.s...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Nicholas Mead, thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedbac...

  • 1 kudos
1 More Replies