Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by MarcoData01 (New Contributor III)
  • 2797 Views
  • 6 replies
  • 4 kudos

Resolved! Is there a way to protect the init script folder on DBFS?

Hi everyone, we are looking for a way to protect the folder where the init script is hosted from editing. This is because we have implemented, inside the init script, a parameter that blocks file downloads from the R Studio app emulator, and we would like to avoid th...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Marco Data, thank you for sending in your question. It is awesome that you found a solution. Would you like to mark the answer as best so others can find the solution quickly? Cheers!

5 More Replies
by ChriChri (New Contributor II)
  • 4464 Views
  • 2 replies
  • 4 kudos

Azure Databricks Delta Live Tables tab is missing

In my Azure Databricks workspace UI I do not have the "Delta Live Tables" tab. The documentation says that there is a tab after clicking on Jobs in the main menu. I just created this Databricks resource in Azure, and from my understanding the DL...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Chr Jon, how are you doing? Thanks for posting your question. Just checking in to see if one of the answers helped; would you let us know?

1 More Replies
by Mark1 (New Contributor II)
  • 1972 Views
  • 2 replies
  • 2 kudos

Resolved! Using Delta Tables without Time Travel features?

Hi everyone / experts, is it possible to use Delta tables without the time travel features? We are primarily interested in using the DML features (delete, update, merge into, etc.). Thanks, Mark
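
A minimal sketch of the relevant knobs, assuming a hypothetical table named events: DML works on any Delta table, and while time travel cannot be switched off outright, its window can be shrunk via retention properties plus frequent VACUUM.

# Hypothetical table; DELETE/UPDATE/MERGE work regardless of these settings.
spark.sql("""
    ALTER TABLE events SET TBLPROPERTIES (
        'delta.logRetentionDuration' = 'interval 1 days',
        'delta.deletedFileRetentionDuration' = 'interval 1 days'
    )
""")
# Allow VACUUM below the default 7-day safety threshold, then purge old files.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")
spark.sql("VACUUM events RETAIN 24 HOURS")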

Latest Reply
Mark1
New Contributor II
  • 2 kudos

Thank you Hubert

  • 2 kudos
1 More Replies
by haseebkhan1421 (New Contributor)
  • 13480 Views
  • 2 replies
  • 1 kudos

Resolved! How can I access a Python variable in Spark SQL?

I have a Python variable created under %python in my Jupyter notebook file in Azure Databricks. How can I access the same variable to make comparisons under %sql? Below is the example:
%python RunID_Goal = sqlContext.sql("SELECT CONCAT(SUBSTRING(RunID,...

Latest Reply
Nirupam
New Contributor III
  • 1 kudos

You can use {} in spark.sql() of PySpark/Scala instead of making a SQL cell using %sql. This will result in a dataframe. If you want, you can create a view on top of it using createOrReplaceTempView(). Below is an example using a variable: # A variab...
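
Since the example above is cut off, here is a minimal sketch of the technique it describes; the table name and value are hypothetical.

# Substitute a Python variable into Spark SQL with an f-string.
RunID_Goal = "R100"  # hypothetical value
df = spark.sql(f"SELECT * FROM run_metrics WHERE RunID = '{RunID_Goal}'")
df.createOrReplaceTempView("run_goal")  # optionally expose the result to %sql cells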

1 More Replies
by Sugumar_Sriniva (New Contributor III)
  • 7328 Views
  • 11 replies
  • 5 kudos

Resolved! Databricks cluster creation is failing while running a cron job scheduling script through the init script method on Azure Databricks.

Dear connections, I'm unable to run a shell script which schedules a cron job through the init script method on Azure Databricks cluster nodes. Error from the Azure Databricks workspace: "databricks_error_message": "Cluster scoped init script dbfs:/...

Latest Reply
User16764241763
Honored Contributor
  • 5 kudos

Hello @Sugumar Srinivasan, could you please enable cluster log delivery and inspect the init script logs under dbfs:/cluster-logs/<clusterId>/init_scripts? https://docs.databricks.com/clusters/configure.html#cluster-log-delivery-1
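
A quick sketch of inspecting those logs once log delivery is enabled; the cluster ID here is a placeholder.

# List the delivered init script logs for a cluster.
for entry in dbutils.fs.ls("dbfs:/cluster-logs/0123-456789-abc123/init_scripts"):
    print(entry.path)  # drill into these paths for each node's stderr/stdout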

10 More Replies
by sannycse (New Contributor II)
  • 3688 Views
  • 4 replies
  • 6 kudos

Resolved! Read the CSV file as shown in the description

Project_Details.csv:
ProjectNo|ProjectName|EmployeeNo
100|analytics|1
100|analytics|2
101|machine learning|3
101|machine learning|1
101|machine learning|4
Find the employees working on each project, in the form of a list?
Output:
ProjectNo|employeeNo
100|[1,2]
101|...

Latest Reply
User16764241763
Honored Contributor
  • 6 kudos

@SANJEEV BANDRU You can simply do this. Just change the file path:
CREATE TEMPORARY VIEW readcsv USING CSV OPTIONS (
  path "dbfs:/docs/test.csv",
  header "true",
  delimiter "|",
  mode "FAILFAST");
select ProjectNo, collect_list(EmployeeNo) Employees
from re...
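
Because the reply is truncated, here is a self-contained sketch of the same technique in PySpark, using the file path from the reply.

from pyspark.sql import functions as F

df = (spark.read
      .option("header", "true")
      .option("delimiter", "|")
      .csv("dbfs:/docs/test.csv"))
# Collapse the employees of each project into a list.
df.groupBy("ProjectNo").agg(F.collect_list("EmployeeNo").alias("Employees")).show()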

3 More Replies
by weldermartins (Honored Contributor)
  • 3491 Views
  • 5 replies
  • 13 kudos

Hello everyone, I have a directory with 40 files. File names are divided into prefixes. I need to rename the prefix k3241 according to the name in the...

Hello everyone, I have a directory with 40 files. File names are divided into prefixes. I need to rename the prefix k3241 according to the name in the last prefix. I even managed to insert the csv extension at the end of the file, but renaming files ba...
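
One possible approach, sketched under the assumption that the new name is the last dot-separated component of the old file name; the directory and naming scheme are hypothetical.

# Rename every k3241-prefixed file after its last name component.
src_dir = "dbfs:/mnt/landing/"
for f in dbutils.fs.ls(src_dir):
    if f.name.startswith("k3241"):
        new_name = f.name.split(".")[-1] + ".csv"  # assumed file-name layout
        dbutils.fs.mv(f.path, src_dir + new_name)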

Latest Reply
Anonymous
Not applicable
  • 13 kudos

Hi @welder martins, how are you doing? Thank you for posting that question. We are glad you could resolve the issue. Would you want to mark an answer as the best solution? Cheers

4 More Replies
by cristianc (Contributor)
  • 2544 Views
  • 5 replies
  • 3 kudos

Is it required to run OPTIMIZE after doing GDPR DELETEs?

Greetings, I have been reading the excellent article at https://docs.databricks.com/security/privacy/gdpr-delta.html?_ga=2.130942095.1400636634.1649068106-1416403472.1644480995&_gac=1.24792648.1647880283.CjwKCAjwxOCRBhA8EiwA0X8hi4Jsx2PulVs_FGMBdByBk...

Latest Reply
cristianc
Contributor
  • 3 kudos

@Hubert Dudek thanks for the hint. Exactly as written in the article, VACUUM is required after the GDPR delete operation; however, do we need to OPTIMIZE ZORDER the table again, or is the ordering maintained?
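
For reference, a sketch of the full sequence under discussion; the table and column names are hypothetical.

# GDPR delete, purge the deleted files, then re-cluster the table.
spark.sql("DELETE FROM users WHERE user_id = 'subject-123'")
spark.sql("VACUUM users RETAIN 168 HOURS")
spark.sql("OPTIMIZE users ZORDER BY (user_id)")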

4 More Replies
by kdkoa (New Contributor III)
  • 2689 Views
  • 4 replies
  • 2 kudos

Resolved! Random SMTP authentication failures to Office 365 (Exchange)

Hey all, I have a Python script running in a Databricks notebook which uses smtplib to connect and send email via our Exchange Online server. At random times, it will start getting authentication failures and I can't figure out why. I've confirmed that ...
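
A stripped-down sketch of this kind of connection (host, addresses, and credentials are placeholders) that at least surfaces the server's SMTP reply on failure, which helps distinguish throttling from bad credentials.

import smtplib

try:
    with smtplib.SMTP("smtp.office365.com", 587) as server:
        server.starttls()
        server.login("sender@example.com", "app-password")
        server.sendmail("sender@example.com", ["to@example.com"],
                        "Subject: test\r\n\r\nbody")
except smtplib.SMTPAuthenticationError as e:
    print("Auth failed:", e.smtp_code, e.smtp_error)  # Exchange's actual response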

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

If the message is "bad username or password", my guess is that it is on the Exchange side.

3 More Replies
by athjain (New Contributor III)
  • 5180 Views
  • 5 replies
  • 7 kudos

Resolved! How to query Delta tables stored in S3 through a Databricks SQL endpoint?

The Delta tables produced by ETL are stored in S3 in CSV or Parquet format, so the question now is how to allow a Databricks SQL endpoint to run queries over the files saved in S3.
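
One common approach, sketched with a hypothetical bucket and table name: register the S3 location as a table so the SQL endpoint can query it by name.

# If the files are Delta; for plain Parquet, use USING PARQUET or CONVERT TO DELTA first.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_delta
    USING DELTA
    LOCATION 's3://my-bucket/etl-output/sales'
""")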

Latest Reply
Anonymous
Not applicable
  • 7 kudos

Hey @Athlestan Jain, how are you doing? Thanks for posting your question. Do you think you were able to resolve the issue? We'd love to hear from you.

4 More Replies
by _Orc (New Contributor)
  • 2760 Views
  • 2 replies
  • 1 kudos

Resolved! Checkpoint is getting created even though the microbatch append has failed

Use case: read data from a source table using Spark Structured Streaming (round the clock), apply transformation logic, etc., and finally merge the dataframe into the target table. If there is any failure during the transformation or merge, the Databricks job should...
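
A sketch of the merge-per-microbatch pattern being described (table names and the join key are hypothetical): an exception raised inside the batch function fails the batch before its offsets are committed to the checkpoint.

from delta.tables import DeltaTable

def upsert_batch(batch_df, batch_id):
    # Merge each microbatch into the target; any exception here fails the batch.
    (DeltaTable.forName(spark, "target_table").alias("t")
        .merge(batch_df.alias("s"), "t.id = s.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())

(spark.readStream.table("source_table")
    .writeStream
    .foreachBatch(upsert_batch)
    .option("checkpointLocation", "dbfs:/checkpoints/target_table")
    .start())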

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Om Singh, hope you are doing well. Just wanted to check in and see if you were able to find a solution to your question? Cheers

1 More Replies
by Databricks_7045 (New Contributor III)
  • 3537 Views
  • 2 replies
  • 4 kudos

Resolved! Connecting to Delta tables from external tools

Hi team, to access SQL tables we use tools like TOAD or SQL Server Management Studio (SSMS). Is there any tool to connect to and access Databricks Delta tables? Please let us know. Thank you
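
Besides JDBC/ODBC-based GUI clients (e.g. DBeaver with the Databricks driver), a small sketch using the databricks-sql-connector Python package; the hostname, HTTP path, token, and table name are placeholders.

from databricks import sql

with sql.connect(server_hostname="adb-1234567890.azuredatabricks.net",
                 http_path="/sql/1.0/warehouses/abc123def456",
                 access_token="dapiXXXXXXXX") as conn:
    with conn.cursor() as cursor:
        cursor.execute("SELECT * FROM my_delta_table LIMIT 10")
        print(cursor.fetchall())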

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @Rajesh Vinukonda, hope you are doing well. Thanks for sending in your question. Were you able to find a solution to your query?

1 More Replies
by Krishscientist (New Contributor III)
  • 2121 Views
  • 3 replies
  • 2 kudos

Resolved! PySpark vs. pandas code difference

Hi, can you help me with why the pandas code is not working but PySpark is?
import pandas as pd
pdf = pd.read_csv('/FileStore/tables/new.csv', sep=',')
Error: No such file exists... Below is what worked:
df = spark.read.csv("/FileStore/tables/new.csv", sep=",", ...

Latest Reply
RRO
Contributor
  • 2 kudos

It might have to do with the path, as @Hubert Dudek already mentioned: df = spark.read.csv("dbfs:/FileStore/tables/new.csv", sep=",", header='True')
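
The underlying rule, sketched with the file from the question: pandas reads through the local /dbfs FUSE mount, while Spark uses the dbfs:/ scheme.

import pandas as pd

pdf = pd.read_csv("/dbfs/FileStore/tables/new.csv", sep=",")                  # pandas: FUSE path
df = spark.read.csv("dbfs:/FileStore/tables/new.csv", sep=",", header=True)   # Spark: dbfs scheme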

2 More Replies
by RRO (Contributor)
  • 30299 Views
  • 6 replies
  • 7 kudos

Resolved! Performance of a PySpark dataframe is very slow after using a @pandas_udf

Hello, I am currently working on a time series forecast with FBProphet. Since I have data with many time series groups (~3000) I use a @pandas_udf to parallelize the training.
@pandas_udf(schema, PandasUDFType.GROUPED_MAP)
def forecast_netprofit(pr...

Latest Reply
RRO
Contributor
  • 7 kudos

Thank you for the answers. Unfortunately this did not solve the performance issue. What I did now is save the results into a table: results.write.mode("overwrite").saveAsTable("db.results"). This is probably not the best solution, but after I do that ...
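
For anyone landing here, a minimal sketch of the grouped-map pattern from the question using the newer applyInPandas API; the schema, column names, and the trivial stand-in "model" are placeholders for the Prophet fit.

import pandas as pd

df = spark.createDataFrame([("a", 1.0), ("a", 2.0), ("b", 3.0)], ["group_id", "y"])

def forecast(pdf: pd.DataFrame) -> pd.DataFrame:
    # Stand-in for a per-group Prophet fit; returns one row per group.
    return pd.DataFrame({"group_id": [pdf["group_id"].iloc[0]],
                         "yhat": [pdf["y"].mean()]})

results = df.groupBy("group_id").applyInPandas(
    forecast, schema="group_id string, yhat double")
results.write.mode("overwrite").saveAsTable("db.results")  # as in the reply above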

5 More Replies
