Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

theSoyf
by New Contributor II
  • 5819 Views
  • 2 replies
  • 1 kudos

How to write to Salesforce object using Spark Salesforce Library

Hi, I'm facing an issue when writing to a Salesforce object. I'm using the springml/spark-salesforce library. I have the above libraries installed as recommended based on my research. I try to write like this: (_sqldf .write .format("com.springml.spar...

Latest Reply
Gauthy
Databricks Partner
  • 1 kudos

I'm facing the same issue while trying to write to Salesforce. If you have found a resolution, could you please share it?

1 More Replies
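For anyone hitting this, a minimal sketch of how the springml writer's options are typically assembled. The helper function and all credential values below are illustrative placeholders, not the library's API; the one detail worth double-checking is that the library expects the Salesforce security token appended to the password, per its README.

```python
def salesforce_write_options(username, password, security_token, sf_object):
    # springml/spark-salesforce authenticates with the security token
    # appended to the password; forgetting this is a common cause of
    # write failures
    return {
        "username": username,
        "password": password + security_token,
        "sfObject": sf_object,
    }

# usage inside Databricks, with the library attached (sketch):
# (_sqldf.write
#     .format("com.springml.spark.salesforce")
#     .options(**salesforce_write_options(
#         "user@example.com", "pw", "sec-token", "Contact"))
#     .save())
```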
knawara
by Contributor
  • 5641 Views
  • 4 replies
  • 1 kudos

Delta Live Tables: reading from output

I'm trying to implement incremental ingestion logic in the following way: database tables have a DbUpdatedDate column. During the initial load I perform a full copy of the database table. During an incremental load I: scan the data already in the DLT to see what...

Latest Reply
fecavalc08
New Contributor III
  • 1 kudos

Hi @Chris Nawara, I had the same issue. I was trying to avoid apply_changes, but in the end I implemented it and I'm happier than I expected, hehe. And if you have any additional standardization columns that you need to implement, you can...

3 More Replies
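The upsert semantics the reply is recommending can be modeled in plain Python: dlt.apply_changes(target=..., source=..., keys=[...], sequence_by=...) keeps, for each key, the most recent row according to the sequencing column (DbUpdatedDate in this thread). The sketch below only emulates that behavior for illustration; it is not the DLT API itself, and the function name is mine.

```python
def latest_per_key(rows, key, sequence_by):
    # model of apply_changes upserts: keep, per key, the row with the
    # highest value of the sequencing column
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[sequence_by] > latest[k][sequence_by]:
            latest[k] = row
    return sorted(latest.values(), key=lambda r: r[key])
```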
Bie1234
by New Contributor III
  • 3605 Views
  • 2 replies
  • 3 kudos

Resolved! Accidentally deleted parquet file in DBFS

I accidentally deleted a parquet file manually in DBFS. How can I recover this file?

Latest Reply
Ajay-Pandey
Databricks MVP
  • 3 kudos

Hi @pansiri panaudom, there is no option to restore deleted files in Databricks.

1 More Replies
Mado
by Valued Contributor II
  • 3873 Views
  • 1 reply
  • 1 kudos

Resolved! How to query Databricks audit logs?

Hi, I would like to ask where the Databricks audit log files are stored on DBFS. And is there any way that I can query the log files? Thanks.

Latest Reply
Ajay-Pandey
Databricks MVP
  • 1 kudos

Hi @Mohammad Saber, I think you first need to configure audit logging in Databricks, and then you can use it. Please refer to the blog below; it will help you with this: Configure audit logging | Databricks on AWS

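Once audit log delivery is configured, the files arrive as newline-delimited JSON, so they can be loaded with spark.read.json on the delivery path and queried like any other DataFrame. A small stand-in parser, assuming the serviceName/actionName fields from the documented audit log schema (the helper name is mine):

```python
import json

def summarize_audit_events(lines):
    # audit logs are newline-delimited JSON; serviceName and actionName
    # identify which service did what in each record
    out = []
    for line in lines:
        rec = json.loads(line)
        out.append((rec.get("serviceName"), rec.get("actionName")))
    return out
```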
Vinayak_s
by New Contributor II
  • 3068 Views
  • 4 replies
  • 1 kudos

Need help understanding Databricks workspace service principal token expiry calculation

Hi Team, I need assistance understanding the Databricks workspace service principal token expiry calculation. Issue: when I create a token I set lifetime = 3600, but when I get the token I see an unexpected expiry number, and even when I ...

Latest Reply
Vinayak_s
New Contributor II
  • 1 kudos

Hi Team, please help with my issue. Is there any way to find the expiry of a token, i.e. how much time the token has left before it expires? creation_time - expiry_time is not giving me the exact output. Kindly let me know if there is any way to find this as soon as possible. T...

3 More Replies
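One likely source of the confusion: the Token API reports creation_time and expiry_time in epoch milliseconds, while lifetime_seconds is given in seconds, so a raw subtraction looks off by a factor of 1000. A sketch of the conversion (the helper name is mine):

```python
def token_seconds_remaining(expiry_time_ms, now_ms):
    # creation_time and expiry_time come back in epoch *milliseconds*;
    # divide by 1000 to compare against a lifetime given in seconds
    # (e.g. lifetime_seconds = 3600)
    return max(0, (expiry_time_ms - now_ms) // 1000)
```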
vinaykumar
by Databricks Partner
  • 6679 Views
  • 3 replies
  • 1 kudos

Resolved! Run a Databricks job instantly without waiting for the job cluster to become active

When we run a Databricks job, it takes some time for the job cluster to become active. I also created a pool and attached it to the job cluster, but it still takes time to attach the cluster and for the job cluster to become active and start the job run. Is there any way we can run d...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

If you want instant processing, you will have to have a cluster running all the time. As mentioned above, Databricks is testing serverless compute for data engineering workloads (comparable to serverless SQL). This fires up a cluster in a few seconds...

2 More Replies
tinendra
by New Contributor III
  • 6217 Views
  • 7 replies
  • 8 kudos

Can we use a pandas DataFrame inside Databricks?

Hi, I want to run df = pd.read_csv('/dbfs/FileStore/airlines1.csv'), but when trying to run it I get an error like FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/airlines1.csv'. Could you please help me out with how to use a pandas DataFrame in...

Latest Reply
Anonymous
Not applicable
  • 8 kudos

Hi @Tinendra Kumar, hope all is well! Just wanted to check in to see if you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

6 More Replies
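The usual cause of this FileNotFoundError is the path scheme: pandas is a local-file library, so on clusters where the FUSE mount is available it must read through /dbfs/..., while Spark APIs use dbfs:/.... A tiny illustrative helper for the translation (the function name is mine):

```python
def to_fuse_path(path):
    # Spark reads dbfs:/..., while local-file libraries like pandas
    # read the same file through the /dbfs FUSE mount
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):]
    return path

# e.g. pd.read_csv(to_fuse_path("dbfs:/FileStore/airlines1.csv"))
```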
self-employed
by Contributor
  • 7090 Views
  • 2 replies
  • 6 kudos

Resolved! Can anyone help me to understand one question in PracticeExam-DataEngineerAssociate?

It is the practice exam for Data Engineer Associate. The question is: A data engineering team has created a series of tables using Parquet data stored in an external system. The team is noticing that after appending new rows to the data in the external ...

Latest Reply
suny
New Contributor II
  • 6 kudos

Not an answer, just asking the Databricks folks to clarify: I would also like to understand this. If there is no event emitted from the external parquet table (push), and no active pulling or refreshing from the Delta table side (pull), how is the un...

1 More Replies
Jfoxyyc
by Valued Contributor
  • 9728 Views
  • 4 replies
  • 6 kudos

Disable dbutils.fs.put() writing "Wrote x bytes" to the console

Hey all, does anyone know how to suppress the output of dbutils.fs.put() ?

Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Jordan Fox, hope all is well! Just wanted to check in to see if you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

3 More Replies
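If the "Wrote x bytes." message is emitted through Python's stdout (an assumption; some notebook frontends render it via the display layer instead, in which case this won't help), a generic stdout-suppressing wrapper is one option:

```python
import contextlib
import io

def quiet(fn, *args, **kwargs):
    # run a call with anything it prints to stdout discarded,
    # returning the call's result unchanged
    with contextlib.redirect_stdout(io.StringIO()):
        return fn(*args, **kwargs)

# e.g. quiet(dbutils.fs.put, "/tmp/x.txt", "hello", True)
```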
jose_herazo
by New Contributor III
  • 5575 Views
  • 5 replies
  • 5 kudos

Databricks doesn't stop compute resources in GCP

I started using Databricks in Google Cloud, but I am seeing some unexpected costs. When I create a cluster I notice compute resources being created in GCP, but when I stop the cluster these resources stay up and never shut down. This issue res...

Latest Reply
antquinonez
New Contributor II
  • 5 kudos

The answer to the question about the Kubernetes cluster running regardless of dbx compute and DWH resources is provided in this thread: https://community.databricks.com/s/question/0D58Y00009TbWqtSAF/auto-termination-for-clusters-jobs-and-delta-live-t...

4 More Replies
thains
by New Contributor III
  • 5241 Views
  • 6 replies
  • 2 kudos

Setting up my first DLT Pipeline with 3rd party JSON data

I'm getting an error when I try to create a DLT pipeline from a bunch of third-party app-usage data we have. Here's the error message: Found invalid character(s) among ' ,;{}()\n\t=' in the column names of your schema. Please upgrade your Delta table ...

Latest Reply
thains
New Contributor III
  • 2 kudos

I found another forum thread that looks potentially useful, but I can’t figure out how to translate it to SQL to handle JSON, or how to get the pipeline I’m working with to interpret the Python. When I switch to Python, it complains about ...

5 More Replies
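The error message lists the exact characters Delta rejects in column names. One common workaround, independent of the SQL-vs-Python question in the thread, is to rename columns before writing; a sketch (the helper name is mine):

```python
import re

# the characters Delta rejects in column names, per the error message
_INVALID = re.compile(r"[ ,;{}()\n\t=]")

def sanitize_columns(names):
    # replace every rejected character with an underscore
    return [_INVALID.sub("_", name) for name in names]

# e.g. df = df.toDF(*sanitize_columns(df.columns))
```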
jonathan-dufaul
by Valued Contributor
  • 5113 Views
  • 4 replies
  • 0 kudos

What determines whether an experiment's "Rename/Permissions/Delete" context menu is active or grayed out in the Experiments page?

I have a couple of experiments in the machine learning workspace. Some I want to delete, since they are clutter or were created just to test out the platform. However, I can't, because the option to delete them is grayed out (see pictures below). I was wond...

Latest Reply
Harrison_S
Databricks Employee
  • 0 kudos

It looks like, from the documentation, the user should have the 'Can Manage' permission: https://docs.databricks.com/security/access-control/workspace-acl.html#mlflow-experiment-permissions-1

3 More Replies
Murthy1
by Contributor II
  • 9885 Views
  • 5 replies
  • 4 kudos

Send custom logs to AWS cloudwatch from Notebook

I would like to send some custom logs (in Python) from my Databricks notebook to AWS CloudWatch. For example: df = spark.read.json(".......................") followed by logger.info("Successfully ingested data from json"). Has someone succeeded in doing this before...

Latest Reply
Debayan
Databricks Employee
  • 4 kudos

Hi, you can integrate them; please refer to: https://aws.amazon.com/blogs/mt/how-to-monitor-databricks-with-amazon-cloudwatch/. You can also configure audit logging to S3 and redirect it to CloudWatch from AWS; refer to: https://aws.amazon.com/blogs/mt/how...

4 More Replies
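A lightweight pattern for this is a logging.Handler subclass that buffers records and hands batches to a pluggable sender. In a real setup the sender could wrap boto3's CloudWatch Logs put_log_events; that wiring is an assumption and omitted here so the sketch stays self-contained.

```python
import logging

class BatchForwardingHandler(logging.Handler):
    """Buffer formatted log records and pass batches to a sender callable."""

    def __init__(self, send_batch):
        super().__init__()
        self.send_batch = send_batch  # e.g. a wrapper around put_log_events
        self.buffer = []

    def emit(self, record):
        # collect the formatted message instead of writing it anywhere
        self.buffer.append(self.format(record))

    def flush(self):
        # hand the accumulated batch to the sender, then reset
        if self.buffer:
            self.send_batch(self.buffer)
            self.buffer = []
```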
Andrei_Radulesc
by Contributor III
  • 2065 Views
  • 1 reply
  • 2 kudos

Resolved! " Please migrate to `databricks_group_role` "

With Databricks Terraform connector version 1.2.0, I use the following to make the AWS instance profile available to all users in the workspace: // Create AWS instance profile resource "aws_iam_instance_profile" "this" { name = "${var.prefix}_instance_...

Latest Reply
TMD
Contributor
  • 2 kudos

Hello, I opened a support case a couple of months ago specifically about this. The answer I got was "Terraform team will revert the deprecated resource and update the document accordingly", which has not happened so far. Either provide documentation ...

Erik_L
by Contributor II
  • 3887 Views
  • 3 replies
  • 4 kudos

Resolved! Data size inflates massively while ingesting

Goal: Import and consolidate GBs/TBs of local data in 20 MB chunk parquet files into Databricks / Delta Lake / partitioned tables. What I've done: I took a small subset of data, roughly 72.5 GB, and ingested it using the streaming below. The data is already seq...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Erik Louie, hope all is well! Just wanted to check in to see if you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

2 More Replies