Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

bradm0
by New Contributor III
  • 2895 Views
  • 3 replies
  • 3 kudos

Resolved! Use of badRecordsPath in COPY INTO SQL command

I'm trying to use the badRecordsPath option to catch improperly formed records in a CSV file and continue loading the remainder of the file. I can get the option to work using Python like this: df = spark.read\ .format("csv")\ .option("header","true")\ .op...

Latest Reply
bradm0
New Contributor III

Thanks. It was the inferSchema setting. I tried it with and without the SELECT, and it worked both ways once I added inferSchema. Both of these worked: drop table my_db.t2; create table my_db.t2 (col1 int, col2 int); copy into my_db.t2 from (SELECT cast(...

2 More Replies
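A minimal sketch of the two approaches confirmed in this thread, assuming a Databricks notebook where `spark` is predefined; the paths and table names are hypothetical:

```python
# Reader-side option: malformed CSV rows are diverted to badRecordsPath
# and the rest of the file keeps loading.
df = (spark.read
      .format("csv")
      .option("header", "true")
      .option("badRecordsPath", "/tmp/bad_records")  # hypothetical path
      .load("/tmp/input.csv"))

# COPY INTO with inferSchema enabled, which is what resolved the thread.
spark.sql("""
    COPY INTO my_db.t2
    FROM '/tmp/input.csv'
    FILEFORMAT = CSV
    FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
""")
```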
Meghala
by Valued Contributor II
  • 1676 Views
  • 2 replies
  • 2 kudos
Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III

Hi @S Meghala​, please go through this GitHub link; you will find a good amount of material there and can learn more: https://github.com/AlexIoannides/pyspark-example-project. Please select my answer as best answer if it resolves your query. Thanks, Avira...

1 More Replies
hello_world
by New Contributor III
  • 3804 Views
  • 7 replies
  • 3 kudos

What happens if I have both DLTs and normal tables in a single notebook?

I've just learned Delta Live Tables on Databricks Academy and have no environment to try it out. I'm wondering what happens to the pipeline if the notebook consists of both normal tables and DLTs. For example: Table A; DLT A that reads and cleans Table A; T...

Latest Reply
Rishabh-Pandey
Esteemed Contributor

Hey @S L​, as you describe it, you have a normal table (Table A) and a DLT table (Table B), so it will throw an error that your upstream table is not a streaming live table; you need to create Table A as a streaming live table if you want to use the ou...

6 More Replies
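A hypothetical sketch of the fix that reply points to: declare both datasets through DLT so they belong to the same pipeline graph. Names and paths are illustrative, not from the thread:

```python
import dlt
from pyspark.sql import functions as F

# Inside a DLT pipeline, datasets must be declared via dlt decorators;
# a plain CREATE TABLE cell in the same notebook is not part of the
# pipeline's dependency graph.
@dlt.table(name="table_a")
def table_a():
    return spark.read.format("json").load("/data/raw/a")  # hypothetical path

@dlt.table(name="table_a_clean")
def table_a_clean():
    # dlt.read resolves the dependency on table_a within the pipeline
    return dlt.read("table_a").where(F.col("id").isNotNull())
```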
THIAM_HUATTAN
by Valued Contributor
  • 2226 Views
  • 2 replies
  • 2 kudos

Subquery does not work in Databricks Community version?

I am testing some SQL code based on the book SQL Cookbook, Second Edition, available from https://downloads.yugabyte.com/marketing-assets/O-Reilly-SQL-Cookbook-2nd-Edition-Final.pdf. Based on page 43, I am OK with the left join, as shown here. However, w...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III

It must have a GitHub link; check there. Or you can share your code and data and we can help you.

1 More Replies
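Since the screenshots did not survive, here is a generic sanity check showing that subqueries do run in Spark SQL on the Community Edition; the emp/dept tables are the SQL Cookbook's sample schema and are assumed to already exist:

```python
# Anti-join written as a NOT IN subquery (SQL Cookbook style); the
# equivalent LEFT JOIN is the form the poster already had working.
spark.sql("""
    SELECT d.deptno
    FROM dept d
    WHERE d.deptno NOT IN (SELECT e.deptno FROM emp e)
""").show()
```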
MC006
by New Contributor III
  • 6874 Views
  • 4 replies
  • 2 kudos

Resolved! java.lang.NoSuchMethodError after upgrade to Databricks Runtime 11.3 LTS

Hi, I am using Databricks and want to upgrade to Databricks Runtime 11.3 LTS, which now uses Spark 3.3. Current system environment: Operating System: Ubuntu 20.04.4 LTS; Java: Zulu 8.56.0.21-CA-linux64; Python: 3.8.10; Delta Lake: 1.1.0. Target system ...

Latest Reply
Meghala
Valued Contributor II

Hi everyone, this information helped me. Thanks!

3 More Replies
uzadude
by New Contributor III
  • 11307 Views
  • 5 replies
  • 3 kudos

Adding to PYTHONPATH in interactive Notebooks

I'm trying to set the PYTHONPATH env variable in the cluster configuration: `PYTHONPATH=/dbfs/user/blah`. But in the driver and executor environments it is probably getting overridden, and I don't see it. `%sh echo $PYTHONPATH` outputs: `PYTHONPATH=/databricks/spar...

Latest Reply
uzadude
New Contributor III

Update: at last I found a (hacky) solution! In the driver I can dynamically set the sys.path on the workers with: `spark._sc._python_includes.append("/dbfs/user/blah/")`. Combine that with, in the driver: ```%load_ext autoreload %autoreload 2``` and setting: `...

4 More Replies
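Pulling the pieces of that reply together into one runnable driver cell, assuming the module directory /dbfs/user/blah/ exists:

```python
import sys

# Workers: _python_includes is an internal SparkContext attribute that
# lists paths shipped to executor Python processes (hence "hacky").
spark._sc._python_includes.append("/dbfs/user/blah/")

# Driver: extend sys.path so the same modules import locally too.
sys.path.append("/dbfs/user/blah/")

# In a separate notebook cell, enable hot-reloading of edited modules:
# %load_ext autoreload
# %autoreload 2
```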
rocky5
by New Contributor III
  • 2186 Views
  • 2 replies
  • 2 kudos

DLT UDF and C#

Hello, can I create a Spark function in .NET and use it in a DLT table? I would like to encrypt some data. In the documentation, Scala code is used as an example, but would it be possible to write a decryption/encryption function using C# and use it withi...

Latest Reply
Meghala
Valued Contributor II

It's not possible. (SQL Server 2008 ships a SQL CLR runtime that runs .NET languages; Spark and DLT have no equivalent.)

1 More Replies
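Since .NET is not an option, a Python UDF is one supported route for the encryption use case; a hypothetical sketch using the cryptography package, which is assumed to be installed on the cluster:

```python
from cryptography.fernet import Fernet
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# In practice, load the key from a Databricks secret scope instead of
# generating a fresh one per run.
key = Fernet.generate_key()

@udf(returnType=StringType())
def encrypt(value):
    # Fernet-encrypt the string value; pass nulls through unchanged.
    return Fernet(key).encrypt(value.encode()).decode() if value else None

# Usage inside a table definition, assuming df has a "ssn" column:
# df.withColumn("ssn_enc", encrypt("ssn"))
```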
Aravind_P04
by New Contributor II
  • 3607 Views
  • 3 replies
  • 4 kudos

Clarification on merging multiple notebooks and other questions

1. Do we have any feature to merge the cells from one or more notebooks into another notebook? 2. Do we have any feature where multiple cells copied from Excel land in multiple cells of a notebook? Generally, all Excel data is copied into one cel...

Latest Reply
youssefmrini
Databricks Employee

1) We can't merge cells right now.
2) We don't have this feature either.
3) We don't have multiple editing right now.
4) You will know only if you face an error; a notification will pop up.
5) You can't keep running the execution because the cells can be linke...

2 More Replies
Aviral-Bhardwaj
by Esteemed Contributor III
  • 2944 Views
  • 6 replies
  • 30 kudos

DLT Pipeline Understanding

Hey guys, I hope you are doing well. Today I was going through some Databricks documentation and found the DLT documentation, but when I try to implement it, it is not working very well. Can anyone share the whole code with me, step by step, and...

Latest Reply
Meghala
Valued Contributor II

I'm also going through some Databricks documentation.

5 More Replies
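For anyone else looking for step-by-step code, a minimal DLT pipeline sketch under assumed paths; attach the notebook to a DLT pipeline rather than running it interactively:

```python
import dlt

@dlt.table(comment="Raw JSON events ingested with Auto Loader")
def events_raw():
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/data/events"))  # hypothetical landing path

@dlt.table(comment="Events that passed basic quality checks")
@dlt.expect_or_drop("valid_id", "id IS NOT NULL")  # drop rows failing the rule
def events_clean():
    return dlt.read_stream("events_raw")
```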
hello_world
by New Contributor III
  • 3703 Views
  • 3 replies
  • 2 kudos

What exact difference does Auto Loader make?

New to Databricks, and here is one thing that confuses me: since Spark Structured Streaming is already capable of incremental loading via checkpointing, what difference does enabling Auto Loader make?

Latest Reply
Meghala
Valued Contributor II

Auto Loader provides a Structured Streaming source called cloudFiles. Given an input directory path on the cloud file storage, the cloudFiles source automatically processes new files as they arrive, with the option of also processing existing files i...

2 More Replies
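To make the cloudFiles description concrete, a short sketch with hypothetical paths; unlike a plain file-source stream, Auto Loader also tracks discovered files and can infer and evolve the schema:

```python
stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "csv")
          .option("cloudFiles.schemaLocation", "/tmp/schema")  # schema tracking
          .load("/data/landing"))

(stream.writeStream
       .option("checkpointLocation", "/tmp/checkpoint")
       .trigger(availableNow=True)  # process the backlog, then stop
       .toTable("bronze.events"))
```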
KuldeepChitraka
by New Contributor III
  • 7869 Views
  • 4 replies
  • 6 kudos

Error handling/exception handling in notebooks

What is a common practice for writing notebooks that include error handling/exception handling? Is there an example showing how a notebook should be written to include error handling, etc.?

Latest Reply
Meghala
Valued Contributor II

The runtime looks for handlers (try/except blocks) that are registered to handle such exceptions.

3 More Replies
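One common shape for the pattern that reply mentions, in notebook Python; the table names are hypothetical:

```python
from pyspark.sql.utils import AnalysisException

try:
    df = spark.table("my_db.events")
    df.write.mode("append").saveAsTable("my_db.events_copy")
except AnalysisException as e:
    # e.g. table not found; re-raise so the job run is marked as failed
    print(f"Spark analysis error: {e}")
    raise
```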
Aviral-Bhardwaj
by Esteemed Contributor III
  • 10511 Views
  • 3 replies
  • 25 kudos

Understanding Joins in PySpark/Databricks

In PySpark, a `join` operation combines rows from two or more datasets based on a common key. It allows you to merge data from different sources into a single dataset and potentially perform transformations on...

Latest Reply
Meghala
Valued Contributor II

Very informative.

2 More Replies
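A tiny runnable illustration of the join described in the post, using made-up DataFrames:

```python
orders = spark.createDataFrame(
    [(1, 100), (2, 200)], ["customer_id", "amount"])
customers = spark.createDataFrame(
    [(1, "Ana"), (3, "Bo")], ["customer_id", "name"])

# An inner join keeps only rows whose customer_id appears on both sides;
# pass how="left", "right", or "full" for the other join types.
orders.join(customers, on="customer_id", how="inner").show()
```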
FranPérez
by New Contributor III
  • 11205 Views
  • 7 replies
  • 4 kudos

set PYTHONPATH when executing workflows

I set up a workflow using 2 tasks. Just for demo purposes, I'm using an interactive cluster for running the workflow: { "task_key": "prepare", "spark_python_task": { "python_file": "file...

Latest Reply
jose_gonzalez
Databricks Employee

Hi @Fran Pérez​, just a friendly follow-up: did any of the responses help you resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

6 More Replies
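One workaround for a spark_python_task like the one above is to extend sys.path at the top of the task's python_file, before any project imports; the directory and module here are hypothetical:

```python
import sys

# Do this before importing project modules, so the job finds them even
# if the cluster-level PYTHONPATH was overridden.
sys.path.append("/dbfs/my_project/libs")

from mymodule import main  # hypothetical project module
main()
```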
