Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

user_b22ce5eeAl
by New Contributor II
  • 1736 Views
  • 2 replies
  • 0 kudos

pandas udf type grouped map fails

Hello, I am trying to get the SHAP values for my whole dataset using a pandas UDF for each category of a categorical variable. It runs well when I run it on a few categories, but when I run the function on the whole dataset, my job fails. I see ...

Latest Reply
Jackson
New Contributor II
  • 0 kudos

I want to use data.groupby.apply() to apply a function to each row of my PySpark DataFrame per group. I used the Grouped Map Pandas UDFs. However, I can't figure out how to add another argument to my function. I tried using the ar...

1 More Replies
StephanieAlba
by Databricks Employee
  • 2137 Views
  • 1 reply
  • 0 kudos

Is it possible to turn off the redaction of secrets? Is there a better way to solve this?

As part of our Azure Data Factory pipeline, we utilize Databricks to run some scripts that identify which files we need to load from a certain source. This list of files is then passed back into Azure Data Factory utilizing the Exit status from the n...

Latest Reply
StephanieAlba
Databricks Employee
  • 0 kudos

No, it is not possible to turn off redaction. No, there is not another way to return values from a notebook. 1) Using a native Databricks feature such as Auto Loader is suggested. 2) They could write the list of files to be processed to a Delta table an...
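
For context on the mechanism discussed here: a notebook hands a value back to its caller as a single string passed to `dbutils.notebook.exit`. Below is a minimal sketch, assuming the file list is serialized as JSON before being returned. The helper name `build_exit_payload` and the sample paths are hypothetical, and nothing here bypasses secret redaction.

```python
import json


def build_exit_payload(files):
    """Serialize a list of file paths as one compact JSON string.

    Hypothetical helper: a notebook exit value is a single string,
    so a list has to be encoded before it can be returned.
    """
    return json.dumps({"files": sorted(files)}, separators=(",", ":"))


# Sample paths are made up for illustration.
payload = build_exit_payload(["raw/b.csv", "raw/a.csv"])
print(payload)

# Inside a Databricks notebook, the string would then be returned with:
# dbutils.notebook.exit(payload)
```

The caller has to parse the JSON string again to recover the list. Any substring that matches a secret value would presumably still be redacted, which is why the reply points toward other approaches.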

guruv
by New Contributor III
  • 783 Views
  • 0 replies
  • 0 kudos

Transactional approach to write to Azure ADLS gen2 storage

Hi, what is the recommended way to read data from a Delta table and write it to ADLS Gen2 storage in Parquet format? In my case, I use a notebook to read data, do some processing, write it to storage, and update a Delta table with the details of the last written da...

MalachiBunn
by New Contributor II
  • 1618 Views
  • 0 replies
  • 0 kudos

Toggle titles to show by default for a user or notebook

I find titles to be useful in organizing my notebooks, but I don't like having to toggle the title display for each cell in order to add a title. Is there a way to toggle the UI to show titles by default for a user/notebook? This would be a good fea...

MohitAnchlia
by New Contributor II
  • 1495 Views
  • 0 replies
  • 0 kudos

Accessing Databricks from PrestoSQL

What's the best way to federate a query to Delta Lake or Databricks from PrestoSQL without having to create external tables? PrestoSQL doesn't have access to S3. Can PrestoSQL be configured with a JDBC driver or plugin?

User16826992666
by Valued Contributor
  • 1718 Views
  • 2 replies
  • 0 kudos

Resolved! Can I convert parquet files to Delta?

I am already storing my data as parquet files and have registered them as a table in Databricks. If I want to convert the table to be a Delta table, do I have to do a full read of the data and rewrite it in the Delta format?

Latest Reply
User16752244127
Contributor
  • 0 kudos

More details and programmatic options can be found in the Porting Guide.
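
On the question itself: Delta Lake provides a `CONVERT TO DELTA` command that converts a Parquet directory in place by building a transaction log over the existing files instead of rewriting them. Below is a minimal sketch of composing that statement, assuming a path-based table. The helper name `convert_to_delta_sql`, the path, and the partition column are hypothetical, and running the statement requires a Spark session with Delta Lake available.

```python
def convert_to_delta_sql(parquet_path, partition_schema=None):
    """Build a CONVERT TO DELTA statement for a Parquet directory.

    partition_schema is an optional string such as "event_date DATE";
    it is needed when the Parquet data is partitioned on disk.
    """
    statement = f"CONVERT TO DELTA parquet.`{parquet_path}`"
    if partition_schema:
        statement += f" PARTITIONED BY ({partition_schema})"
    return statement


# Path and partition column are made up for illustration.
statement = convert_to_delta_sql("/mnt/data/events", "event_date DATE")
print(statement)

# In a notebook with an active SparkSession named `spark`:
# spark.sql(statement)
```

It is worth testing the conversion on a copy of the data first, since the directory is treated as a Delta table once the transaction log has been written.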

1 More Replies
MoJaMa
by Databricks Employee
  • 2332 Views
  • 2 replies
  • 0 kudos
Latest Reply
User16752244127
Contributor
  • 0 kudos

Kinesis Streams is the Kinesis streaming service. Select this! Kinesis Firehose reads data from a Kinesis stream and writes it to a destination such as S3, Redshift, or Splunk (more details here).

1 More Replies
User16826994223
by Honored Contributor III
  • 1491 Views
  • 2 replies
  • 0 kudos

What is the differentiator between Delta Sharing and other cloud sharing platforms?

What is the differentiator between Delta Sharing and other cloud sharing platforms?

Latest Reply
User16752244127
Contributor
  • 0 kudos

Also, unlike other servers, Delta Sharing internally uses pre-signed URLs to S3, GCS, or ADLS, so data transfer from a client happens at the bandwidth of the underlying cloud object store. This way the Delta Sharing server scales extremely well and d...

1 More Replies
MatthewLau
by New Contributor
  • 887 Views
  • 0 replies
  • 0 kudos

Logging Lifetime Plot_history_alive as a model

Hi Databricks Community, I have followed the CLV Databricks accelerator (https://databricks.com/notebooks/CLV_Part_1_Customer_Lifetimes.html) to do an initial CLV analysis. Thank you for sharing this with the community. My question is that in the note...

NOOR_BASHASHAIK
by Contributor
  • 897 Views
  • 0 replies
  • 0 kudos

Read metadata through JDBC driver

Dear all, the Spark JDBC driver (SparkJDBC42.jar) is unable to capture certain information from the table structure below: (1) the table-level comment, (2) the TBLPROPERTIES key-value pairs, and (3) the PARTITION BY information. However, it captures the co...

