Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AEM
by New Contributor
  • 1468 Views
  • 0 replies
  • 0 kudos

How to set charset encoding in SQL view?

Hi! I have a SQL query with a where-clause that checks that a string attribute is not equal to, e.g., 'シミュレータに接続されていません' (Japanese). This works fine when running the query ad hoc in the SQL Editor, but when creating a view with the same query, the special cha...

Aviral-Bhardwaj
by Esteemed Contributor III
  • 8429 Views
  • 5 replies
  • 8 kudos

Resolved! MCQ of The Week (Data Engineer Associate Preparation)

A data engineer, User A, has promoted a new pipeline to production by using the REST API to programmatically create several jobs. A DataOps engineer, User B, has configured an external orchestration tool to trigger job runs through the REST API. Both...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 8 kudos

@Ajay Pandey I really appreciate your efforts, and you are right in terms of the UI, but when we read the question carefully we find: "Which statement describes the contents of the workspace audit logs concerning these events?" Audit logs are generated and...

4 More Replies
Siva3079
by New Contributor
  • 1112 Views
  • 0 replies
  • 0 kudos

Analytics workbench

How can I implement an analytics workbench in Databricks to access live Snowflake Marketplace datasets?

gdoron
by New Contributor
  • 1774 Views
  • 2 replies
  • 0 kudos

using pyspark can I write to an s3 path I don't have GetObject permission to?

After Spark finishes writing the DataFrame to S3, it seems to check the validity of the files it wrote with `getFileStatus`, which is `HeadObject` behind the scenes. What if I'm only granted write and list-objects permissions but not GetObject? I...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

It is not possible in my opinion.
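
A likely reason, as far as the AWS documentation goes, is that S3 authorizes `HeadObject` through the `s3:GetObject` permission, so a write path that ends with a HEAD call cannot succeed with write and list permissions alone. The sketch below makes that dependency explicit. The `REQUIRED_ACTION` table and the `missing_actions` helper are hypothetical names introduced for illustration, and the list of operations Spark actually issues is an assumption, not something taken from the thread.

```python
# Hypothetical lookup table: S3 API operation -> IAM action that authorizes it.
# Entries summarize the AWS documentation; verify them before relying on this.
REQUIRED_ACTION = {
    "PutObject": "s3:PutObject",
    "HeadObject": "s3:GetObject",  # HEAD requests are authorized by GetObject
    "GetObject": "s3:GetObject",
    "ListObjectsV2": "s3:ListBucket",
    "DeleteObject": "s3:DeleteObject",
}


def missing_actions(granted, operations):
    """Return the IAM actions needed by `operations` that are not in `granted`."""
    needed = {REQUIRED_ACTION[op] for op in operations}
    return sorted(needed - set(granted))


# Assumed scenario from the question: write and list are granted, GetObject is not.
granted = {"s3:PutObject", "s3:ListBucket"}
write_path_ops = ["PutObject", "ListObjectsV2", "HeadObject"]
print(missing_actions(granted, write_path_ops))
```

A real Spark job may issue further operations (for example deletes during commit), so the operation list would need to be extended to match the connector in use.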

1 More Replies
Veeru245
by New Contributor
  • 1254 Views
  • 0 replies
  • 0 kudos

Autoloader Solution for Binary files

We have implemented a solution for ingesting binary (.ZIP) files into Delta Lake. Currently we use the following steps within our pipeline: unzip the file and extract the XML file; parse the XML using Python libraries; flatten the nested XML columns...
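
The steps described above (unzip, parse, flatten) can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the poster's pipeline: `flatten` and `rows_from_zip` are hypothetical helpers, the sample archive is made-up data, and in a Spark job the ZIP bytes might instead come from the `content` column of the `binaryFile` source.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def flatten(elem, prefix=""):
    """Flatten nested XML elements into a dict keyed by dotted paths."""
    path = f"{prefix}.{elem.tag}" if prefix else elem.tag
    children = list(elem)
    if not children:
        return {path: (elem.text or "").strip()}
    row = {}
    for child in children:
        row.update(flatten(child, path))
    return row


def rows_from_zip(zip_bytes):
    """Yield one flattened dict per XML member of a ZIP archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".xml"):
                root = ET.fromstring(archive.read(name))
                yield flatten(root)


# Build a tiny sample archive in memory (made-up data, for illustration only).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr(
        "order.xml",
        "<order><id>7</id><customer><name>Ada</name></customer></order>",
    )

rows = list(rows_from_zip(buf.getvalue()))
print(rows)
```

Note that repeated sibling tags overwrite each other in this sketch; a real flattener would need to index or explode repeated elements.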

Sweetnesh
by New Contributor
  • 2225 Views
  • 2 replies
  • 0 kudos

Not able to read S3 object through AssumedRoleCredentialProvider

SparkSession spark = SparkSession.builder()
    .appName("SparkS3Example")
    .master("local[1]")
    .getOrCreate();
spark.sparkContext().hadoopConfiguration().set("fs.s3a.access.key", S3_ACCOUNT_KEY);
spark.sparkContext().hadoopConf...

Latest Reply
Vartika
Databricks Employee
  • 0 kudos

Hi @Sweetnesh Dholariya, Does @Debayan Mukherjee's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Thanks!

1 More Replies
NathanLaw
by New Contributor III
  • 2196 Views
  • 1 replies
  • 4 kudos

Workspace Managed Resource Group Storage Account GRS Configuration

The Azure Create Workspace function configures the Managed Resource Group Storage Account as GRS. How can I create the workspace with the Storage Account as locally redundant storage (LRS)? What is the downside if code is stored in GitHub and the...

Latest Reply
Tejal_116985
New Contributor II
  • 4 kudos

https://learn.microsoft.com/en-us/azure/databricks/administration-guide/workspace/workspace-storage-redundancy

vijaykumarbotla
by New Contributor III
  • 10827 Views
  • 4 replies
  • 0 kudos

Resolved! Failed to merge fields 'LIFNR' and 'LIFNR'. Failed to merge incompatible data types IntegerType and StringType

I have imported a CSV file using the spark.read method, with a custom schema that declares the column type as string. I have a Delta table, and the type of the column in the table is also string. I am getting "failed to merge fields" errors in sp...

Latest Reply
vijaykumarbotla
New Contributor III
  • 0 kudos

Hi all, the issue is resolved. I ran a column conversion, and from the next run the code works fine.
df = spark.read.format("delta").load("/mnt/dev/deltav2/X")
df = df.withColumn("LIFNR", df.LIFNR.cast("string"))
df.write.format('delta').option("o...
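
Before casting, it can help to see exactly which columns disagree between the incoming data and the existing table. The following is a small sketch in plain Python, assuming both schemas are available as lists of `(column, type)` pairs, which is the shape PySpark's `DataFrame.dtypes` returns. The `type_mismatches` helper and the sample schemas are hypothetical and only illustrate the comparison.

```python
def type_mismatches(source_dtypes, target_dtypes):
    """Compare two lists of (column, type) pairs and report columns whose types differ."""
    target = dict(target_dtypes)
    return {
        name: (dtype, target[name])
        for name, dtype in source_dtypes
        if name in target and target[name] != dtype
    }


# Made-up schemas in the shape returned by DataFrame.dtypes.
incoming = [("LIFNR", "string"), ("NAME", "string")]
existing = [("LIFNR", "int"), ("NAME", "string")]
print(type_mismatches(incoming, existing))
```

In a notebook, the two arguments might be `new_df.dtypes` and the `dtypes` of the DataFrame loaded from the Delta path; any column reported here is a candidate for an explicit cast like the one in the reply.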

3 More Replies
Dhana
by New Contributor
  • 778 Views
  • 0 replies
  • 0 kudos

Databricks and Kafka connectivity

I am trying to read data from Kafka, which is installed on my local system. I am using Databricks Community Edition with a cluster version of 12.2. However, I am unable to read data from Kafka. My use case is to read data from Kafka installed on my l...

NOOR_BASHASHAIK
by Contributor
  • 1783 Views
  • 1 replies
  • 1 kudos

Unity Catalog - addition of account groups/users to workspaces

Hi all, we have set up a metastore and were doing certain activities as part of an MVP. We realized that in a particular Databricks workspace enabled with UC, in the admin settings > "Add Groups" section, user groups from other platforms/projects which leve...

Latest Reply
Ismail1
New Contributor III
  • 1 kudos

Good question, I hadn't thought of it that way. From my understanding, UC uses users pushed from the account console and not from workspaces. One way to restrict this would be to restrict other workspaces from using said catalog and also control ACLs with the ...

del1000
by New Contributor III
  • 836 Views
  • 0 replies
  • 0 kudos

Problem with sparkContext.parallelize and volatile functions?

I have this code:
from time import sleep
from random import random
from operator import add

def f(a: int) -> float:
    sleep(0.1)
    return random()

rdd1 = sc.parallelize(range(20), 2)
rdd2 = sc.parallelize(range(20), 2)
rdd3 = sc.parallelize(rang...
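
The excerpt is cut off before the actual problem is stated, so the following is only one behavior worth ruling out, not a diagnosis. Spark transformations are lazy and are recomputed for each action unless the result is persisted, which means a non-deterministic function such as `f` can return different values every time the same RDD is evaluated. The toy `LazyMap` class below is a hypothetical stand-in written in plain Python (it is not a Spark API) that mimics this recompute-versus-cache difference.

```python
import random


class LazyMap:
    """Toy stand-in for a lazily evaluated dataset (not a Spark API)."""

    def __init__(self, func, data):
        self.func = func
        self.data = list(data)
        self._cached = None

    def cache(self):
        # Materialize once so later collect() calls reuse the same values.
        self._cached = [self.func(x) for x in self.data]
        return self

    def collect(self):
        if self._cached is not None:
            return list(self._cached)
        # No cache: the function runs again on every call.
        return [self.func(x) for x in self.data]


def f(a: int) -> float:
    return random.random()


lazy = LazyMap(f, range(5))
print(lazy.collect() == lazy.collect())      # values are recomputed per call

cached = LazyMap(f, range(5)).cache()
print(cached.collect() == cached.collect())  # values were materialized once
```

In Spark itself, caching is a hint rather than a guarantee, since evicted partitions are recomputed, so writing the results out or seeding the random generator per element is the more robust option when repeatability matters.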

MohamedThanveer
by New Contributor II
  • 1555 Views
  • 2 replies
  • 0 kudos

Databricks Certified Associate Developer for Apache Spark 3.0 - Python refund

I had scheduled an examination for 1st June 2023 and, for personal reasons, cancelled it on 26th May 2023 (more than 72 hours in advance), but I am yet to receive the refund. The auto-generated mail mentions that the refund ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Mohamed Thanveer, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

1 More Replies
yhyhy3
by New Contributor III
  • 18296 Views
  • 1 replies
  • 2 kudos

Best way to make a scrollable DataFrame in a ipywidgets.Output?

I have a pretty complex Jupyter widgets UI in a Databricks notebook that has a DataFrame that
1. will be modified by some Jupyter widget callbacks
2. needs to be displayed to the user and updated as it is modified
3. is large and needs to support vertic...
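
One possible approach, sketched here under assumptions and untested in a Databricks notebook, is to render the data as an HTML table inside a `div` with a fixed maximum height and `overflow:auto`, then show that string in a widget. The `scrollable_table` helper below is hypothetical and uses only the standard library; in practice the inner table could come from pandas `DataFrame.to_html()` and the result could be assigned to the `value` of an `ipywidgets.HTML` widget from each callback.

```python
from html import escape


def scrollable_table(columns, rows, max_height="300px"):
    """Render rows as an HTML table inside a div that scrolls in both directions."""
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return (
        f'<div style="max-height:{max_height}; overflow:auto;">'
        f"<table><thead><tr>{head}</tr></thead><tbody>{body}</tbody></table>"
        "</div>"
    )


# Made-up sample data; cell values are HTML-escaped before rendering.
html_text = scrollable_table(["id", "name"], [(1, "a<b"), (2, "c")])
print(html_text)
```

Re-assigning the widget's `value` after each modification replaces the rendered table in place, which avoids clearing and redrawing an `Output` widget.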

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Yushi Homma, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers y...

