Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

NimaiAhl
by New Contributor II
  • 1427 Views
  • 1 replies
  • 0 kudos

External Tables - SQL

To create an external table, we use the LOCATION keyword with the path to the storage location. Does the user need permission on that storage location? If not, do we use storage credentials to provide the ac...

Latest Reply
Shikamaru
Databricks Employee
  • 0 kudos

Hi Nimai, That's partially right. You can grant permissions directly on the storage credential, but Databricks recommends that you reference it in an external location and grant permissions to that instead. An external location combines a storage cre...
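For readers who want to see the recommended flow spelled out, here is a minimal sketch of the statements involved. Every name in it (the location `sales_loc`, the credential `my_cred`, the bucket URL, the group `data-engineers`, the table `main.default.sales`) is a hypothetical placeholder, and creating the table may also require schema-level privileges that are not shown. The helper only builds the SQL text; in a notebook each statement would be passed to spark.sql.

```python
# Hypothetical names throughout: none of these identifiers come from the thread.

def external_location_statements(location, url, credential, principal, table):
    """Return, in order, the SQL statements for the external-location approach."""
    return [
        # 1. Combine a storage credential and a cloud path into an external location.
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {location} "
        f"URL '{url}' WITH (STORAGE CREDENTIAL {credential})",
        # 2. Grant privileges on the external location, not on the credential itself.
        f"GRANT CREATE EXTERNAL TABLE, READ FILES, WRITE FILES "
        f"ON EXTERNAL LOCATION {location} TO `{principal}`",
        # 3. The grantee can then create an external table under that path.
        f"CREATE TABLE IF NOT EXISTS {table} (id INT, name STRING) "
        f"LOCATION '{url}/sales'",
    ]


statements = external_location_statements(
    location="sales_loc",
    url="s3://example-bucket/external",
    credential="my_cred",
    principal="data-engineers",
    table="main.default.sales",
)
for statement in statements:
    print(statement + ";")
```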

Kotofosonline
by New Contributor III
  • 2662 Views
  • 1 replies
  • 2 kudos

Query with distinct sort and alias produces error column not found

I’m trying to run a SQL query on Azure Databricks with DISTINCT, a sort, and aliases:
SELECT DISTINCT album.ArtistId AS my_alias FROM album ORDER BY album.ArtistId
The problem is that if I add an alias, then I cannot use the non-aliased name in the ORDER BY cla...

Latest Reply
User16756723392
Databricks Employee
  • 2 kudos

SELECT album.ArtistId, DISTINCT album.ArtistId AS my_alias FROM album ORDER BY album.ArtistId
Can you try this?
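Another option that may be worth trying is to sort by the alias itself, since after DISTINCT the sort generally only sees the selected output columns. The sketch below uses Python's built-in sqlite3 purely as a stand-in so it runs anywhere; SQLite is not Databricks SQL and the two can differ in what they accept, and the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the album table; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album (AlbumId INTEGER, ArtistId INTEGER)")
conn.executemany(
    "INSERT INTO album VALUES (?, ?)",
    [(1, 3), (2, 1), (3, 3), (4, 2)],
)

# Order by the alias instead of the original column name.
query = "SELECT DISTINCT album.ArtistId AS my_alias FROM album ORDER BY my_alias"
rows = conn.execute(query).fetchall()
print(rows)  # [(1,), (2,), (3,)]
```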

UmaMahesh1
by Honored Contributor III
  • 2509 Views
  • 1 replies
  • 2 kudos

Checkpoint issue when loading data from confluent kafka

I have a streaming notebook which fetches messages from a Confluent Kafka topic and loads them into ADLS. It uses a continuous processing trigger. Before loading the message (which is in Avro format), I'm flattening out the...

Latest Reply
Avinash_94
New Contributor III
  • 2 kudos

The best approach is not to depend on Kafka’s commit mechanism! We can store the processing result and the message offset in an external data store in the same database transaction. So, if the database transaction fails, both the commit and the processing will fail and ...
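A rough sketch of that idea, using Python's built-in sqlite3 as the external store so it runs without a Kafka cluster. The table layout, the topic name "orders", and the simulated crash are all assumptions made for illustration; this is not how Structured Streaming checkpoints work internally. The point is only that the result and the next offset are written in one transaction, so a failure rolls back both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (msg_offset INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "CREATE TABLE offsets (topic TEXT, part INTEGER, next_offset INTEGER, "
    "PRIMARY KEY (topic, part))"
)
conn.commit()


def process_and_commit(conn, topic, part, offset, payload, fail=False):
    """Store the processed result and the next offset in one transaction."""
    with conn:  # commits on success, rolls back if the body raises
        conn.execute("INSERT INTO results VALUES (?, ?)", (offset, payload.upper()))
        if fail:
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "INSERT OR REPLACE INTO offsets VALUES (?, ?, ?)",
            (topic, part, offset + 1),
        )


def next_offset(conn, topic, part):
    """Return the offset to resume from, or 0 if nothing was committed yet."""
    row = conn.execute(
        "SELECT next_offset FROM offsets WHERE topic = ? AND part = ?",
        (topic, part),
    ).fetchone()
    return row[0] if row else 0


process_and_commit(conn, "orders", 0, 0, "first")
try:
    process_and_commit(conn, "orders", 0, 1, "second", fail=True)
except RuntimeError:
    pass  # the half-finished write for offset 1 has been rolled back

stored = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
resume_at = next_offset(conn, "orders", 0)
print(stored, resume_at)  # 1 1
```

On restart, a consumer would seek to the value returned by next_offset rather than relying on the offsets committed to Kafka.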

Himanshu1
by New Contributor II
  • 2790 Views
  • 1 replies
  • 3 kudos

How to read XML files in delta live tables?

Even after installing the Maven library using auto installation, spark.read.option("rowTag", "tag").xml("dbfs:/mnt/dev/bronze/xml/fileName.xml") is not working.

Latest Reply
DD_Sharma
New Contributor III
  • 3 kudos

At present, DLT does not support installing Maven libraries from a DLT pipeline. This feature should come in the future, so please wait and keep checking the Databricks Runtime release docs https://docs.databricks.com/release-note...

samruddhi
by New Contributor
  • 1862 Views
  • 1 replies
  • 0 kudos

Issue while creating Workspace in databricks using AWS

I am trying to configure Databricks with AWS. I have configured the cloud resources as described here: https://docs.databricks.com/administration-guide/account-api/iam-role.html#language-Databricks%C2%A0VPC I have selected "Your VPC Default" as the...

Latest Reply
Abishek
Databricks Employee
  • 0 kudos

@samruddhi Chitnis Can you please check the troubleshooting guide below: Credentials configuration error messages: Malformed request: Failed credential configuration validation checks. The list of permissions checks in the error message indicates the li...

sajith_appukutt
by Honored Contributor II
  • 2668 Views
  • 2 replies
  • 1 kudos
Latest Reply
AdrianRojas
New Contributor II
  • 1 kudos

A bit old, but I just faced the same issue. Specifying a custom EncryptionMaterialsProvider (as described in the previous post) did the trick for me, but I also had to specify my KMS endpoint because of my region: "fs.s3.cse.kms.endpoint" -> "km...

1 More Replies
Samit110978
by New Contributor II
  • 2732 Views
  • 3 replies
  • 1 kudos

Passing Parameter from SSRS to Databricks user defined function

I am trying to pass a parameter from SSRS to a user-defined function in Databricks, which in turn returns a table that is shown as output in the report. I tried calling the function from SSRS as below, but it looks like the parameter value is not passed. I have di...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 1 kudos

Can you share the full code and dataset so that we can also debug this?

2 More Replies
Raghu_Bindingan
by New Contributor III
  • 13634 Views
  • 2 replies
  • 0 kudos

Resolved! SQL Merge Statement not working

Hi, I am trying to use the SQL MERGE statement on Databricks:
MERGE INTO target
USING source
ON source.key = target.key
WHEN MATCHED UPDATE SET *
WHEN NOT MATCHED INSERT *
WHEN NOT MATCHED BY SOURCE DELETE
This is failing with the error [PARSE_SYNTAX_ERROR...

Latest Reply
Raghu_Bindingan
New Contributor III
  • 0 kudos

I was missing the THEN before UPDATE, INSERT and DELETE. This keyword is missing from the documentation on Databricks https://learn.microsoft.com/en-us/azure/databricks/delta/merge It now works. Thanks
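For anyone landing here with the same parse error, this is the statement from the question with THEN added to each WHEN clause, as the fix above describes. It is held in a Python string so that it could be passed to spark.sql in a notebook (assumed, not shown); the small check at the end only confirms that every WHEN clause carries a THEN. Support for the WHEN NOT MATCHED BY SOURCE clause depends on the runtime version, which is not covered here.

```python
# Table and column names (target, source, key) are the ones used in the question.
merge_sql = """
MERGE INTO target
USING source
ON source.key = target.key
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *
WHEN NOT MATCHED BY SOURCE THEN DELETE
"""

# Sanity check: each WHEN clause must contain THEN before its action.
clauses = [line for line in merge_sql.splitlines() if line.startswith("WHEN")]
all_have_then = all(" THEN " in clause for clause in clauses)
print(len(clauses), all_have_then)  # 3 True
```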

1 More Replies
Rexton
by New Contributor
  • 5967 Views
  • 3 replies
  • 2 kudos

AWS Databricks Pyspark - Unable to connect to Azure MySQL - Shows "SSL Connection is required"

Even after specifying SSL options, I am unable to connect to MySQL. What could have gone wrong? Has anyone experienced similar issues?
df_target_master = spark.read.format("jdbc")\
.option("driver", "com.mysql.jdbc.Driver")\
.option("url", host_url)\
.optio...

Latest Reply
a2barbosa
New Contributor II
  • 2 kudos

Hey, here is the solution: the correct option for SSL is "useSSL" and not just "ssl". The code below should work:
df_target_master = spark.read.format("jdbc")\
.option("driver", "com.mysql.jdbc.Driver")\
.option("url", host_url)\
.option("dbtable", supply_ma...
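As a possible alternative to setting the option on the reader, the same property can be carried in the JDBC URL itself. This is only a sketch: the host name and database are hypothetical, and while `useSSL` and `requireSSL` are MySQL Connector/J connection properties, newer driver versions favor `sslMode`, so check which one your driver version expects.

```python
from urllib.parse import urlencode


def mysql_jdbc_url(host, database, port=3306, **params):
    """Build a MySQL JDBC URL with connection properties as query parameters."""
    query = urlencode(params)
    return f"jdbc:mysql://{host}:{port}/{database}" + (f"?{query}" if query else "")


# Hypothetical server and database names.
host_url = mysql_jdbc_url(
    "example.mysql.database.azure.com",
    "supply",
    useSSL="true",
    requireSSL="true",
)
print(host_url)
# jdbc:mysql://example.mysql.database.azure.com:3306/supply?useSSL=true&requireSSL=true
```

The resulting host_url would then be passed to .option("url", host_url) as in the snippet above.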

2 More Replies
Punnu
by New Contributor II
  • 1963 Views
  • 1 replies
  • 1 kudos

Error while running spark.catalog.listDatabases()

I am running the steps mentioned in https://github.com/databrickslabs/splunk-integration/blob/master/notebooks/source/push_to_splunk.py When I run spark.catalog.listDatabases(), I get the error py4j.security.Py4JSecurityException: Method public java.l...

Latest Reply
pvignesh92
Honored Contributor
  • 1 kudos

Hi @Purnima Bhatia, I faced a similar error for a different command when I was using the wrong cluster access mode. You can try creating a different cluster with a different access mode and check. I might be wrong, but try and check this.

Manju1202
by New Contributor II
  • 3150 Views
  • 3 replies
  • 1 kudos

Saving Number field as String in Databricks

Do we see any risk in saving a Number field as a String? Will we lose any functionality/feature if we save it as a String? Will it have any impact on performance?

Latest Reply
pvignesh92
Honored Contributor
  • 1 kudos

Hi @Manju Chugani. Yes. In short, it is not really recommended to save the columns as strings if all the values are expected to be numbers. Here are some of the risks: Storage space: Storing numbers as strings can take up more storage space than storing the...
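To make two of the risks concrete, here is a small plain-Python illustration (not Spark, and the sample values are made up). It shows that strings sort and compare character by character rather than by value, and that a ten-digit number takes more bytes as text than as a 4-byte integer. Actual on-disk size in Delta or Parquet depends on encoding and compression, so treat the byte counts as indicative only.

```python
import struct

values = [9, 10, 2, 100]
as_strings = [str(v) for v in values]

# Sorting: numbers sort by value, strings sort character by character.
numeric_order = sorted(values)
string_order = sorted(as_strings)
print(numeric_order)  # [2, 9, 10, 100]
print(string_order)   # ['10', '100', '2', '9']

# Aggregates are affected the same way.
print(max(values), max(as_strings))  # 100 9

# Raw size: a 32-bit integer versus the same number as UTF-8 text.
int_bytes = len(struct.pack("<i", 1234567890))
str_bytes = len("1234567890".encode("utf-8"))
print(int_bytes, str_bytes)  # 4 10
```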

2 More Replies
nicole_wong
by Databricks Employee
  • 12710 Views
  • 10 replies
  • 7 kudos

Resolved! Can Terraform be used to set configurations in Admin / workspace settings?

I am posting this on behalf of my customer. They are currently working on the deployment & config of their workspace on AWS via Terraform. Is it possible to set some configs in the Admin/workspace settings via TF? According to the Terraform module, it...

Latest Reply
francly
New Contributor II
  • 7 kudos

Hi, can I get a full list of the latest supported workspace_conf settings that are configurable via Terraform? I can't find the list on the Terraform registry site.

9 More Replies
johnb1
by Contributor
  • 2834 Views
  • 3 replies
  • 0 kudos

Cluster Configuration for ML Model Training

Hi! I am training a Random Forest (pyspark.ml.classification.RandomForestClassifier) on Databricks with 1,000,000 training examples and 25 features. I employ a cluster with one driver (16 GB Memory, 4 Cores), 2-6 workers (32-96 GB Memory, 8-24 Cores),...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @John B, hope everything is going great. Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can...

2 More Replies
ALIDI
by New Contributor II
  • 2268 Views
  • 3 replies
  • 3 kudos

training_set.load_df().toPandas() fails with the new pandas version (2.0.0)

pandas 2.0.0 was released on April 3, 2023 and was pushed to my cluster on the same day. The day after, I tried using training_set.load_df().toPandas() and it failed. Reverting to pandas 1.5.3 fixed the problem.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Al IDI, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your q...

2 More Replies
RDD1
by New Contributor III
  • 1774 Views
  • 3 replies
  • 0 kudos

Hi, I have completed lakehouse fundamentals accreditation, but did not receive the badge yet, only have the certificate of completion.

Hi, I have completed lakehouse fundamentals accreditation, but did not receive the badge yet, only have the certificate of completion.

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @RD DO, apologies, we have an issue with our credentials app. We are working with the vendor to resolve it. We expect to be able to grant your badge soon. Thank you!

2 More Replies
