Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

ossinova
by Contributor II
  • 2070 Views
  • 1 replies
  • 1 kudos

Jobs failing with repl error

Recently my Databricks jobs have been failing with the error message: Failure starting repl. Try detaching and re-attaching the notebook. java.lang.Exception: Python repl did not start in 30 seconds. at com.databricks.backend.daemon.driver.Ipyker...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 1 kudos

Yes, you can use a retry. If it's still not resolved, raise a support ticket with Databricks.

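If the failure is transient, the retry advice above can also be baked into the job itself. A minimal sketch of the retry-related fields of a Databricks Jobs API task definition (the specific values here are illustrative assumptions, not recommendations):

```python
# Retry-related fields of a Databricks Jobs task definition.
# The values below are illustrative assumptions.
task_retry_settings = {
    "max_retries": 3,                     # re-run the task up to 3 times on failure
    "min_retry_interval_millis": 60_000,  # wait one minute between attempts
    "retry_on_timeout": False,            # retry on failure only, not on timeout
}
```

These fields sit alongside the task's notebook and cluster settings in the job JSON.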
User16826992666
by Valued Contributor
  • 19547 Views
  • 2 replies
  • 2 kudos

Can I query my Delta tables with PowerBI?

I would like to connect to the Delta tables I have created with PowerBI to use for reporting. Is it possible to do this with Databricks or do I have to write my data to some other serving layer?

Latest Reply
gbrueckl
Contributor II
  • 2 kudos

If you want to read your Delta Lake table directly from storage, without needing a Databricks cluster up and running, you can also use the official Power BI connector for Delta Lake: https://github.com/delta-io/connectors/tree/m...

1 More Replies
KVNARK
by Honored Contributor II
  • 1329 Views
  • 1 replies
  • 5 kudos

Resolved! Trigger another .py file by using 2 .py files.

Hi, I have 3 .py files: a.py, b.py & c.py. After joining a.py & b.py, I need to trigger c.py based on the output I get.

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 5 kudos

Hi @KVNARK, please refer to the link below; it should help with this: Link

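The flow described above can be sketched in plain Python: run a.py and b.py, combine their output, and launch c.py only when the combined output meets a condition. The script names and the "READY" trigger condition are assumptions for illustration.

```python
import subprocess
import sys

def run_script(path):
    """Run a .py file with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

def pipeline(a_path, b_path, c_path):
    """Join the outputs of a.py and b.py; trigger c.py if the condition holds."""
    combined = run_script(a_path) + run_script(b_path)
    if "READY" in combined:  # hypothetical condition on the joined output
        return run_script(c_path)
    return None
```

On Databricks, the same pattern is often expressed with `dbutils.notebook.run` or as a multi-task job with conditional logic, rather than raw subprocesses.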
dulu
by New Contributor III
  • 4131 Views
  • 2 replies
  • 6 kudos

Is there a function similar to split_part, json_extract_scalar?

I am using Spark SQL version 3.2.1. Is there a function that can replace split_part and json_extract_scalar, which are not available?

Latest Reply
Ankush
New Contributor II
  • 6 kudos

pyspark.sql.functions.get_json_object(col, path) extracts a JSON object from a JSON string based on the specified JSON path, and returns a JSON string of the extracted object. It will return null if the input JSON string is invalid.

1 More Replies
Rey
by New Contributor
  • 1218 Views
  • 1 replies
  • 0 kudos

Hi Nadia, Avail Free Exam Vouchers

Hi Nadia, I am preparing for multiple Databricks certifications. Could you please send any event links to my email address "databrickscertificates.2022.23@gmail.com" so that I can register for the event and avail any FREE vouchers for exams.

Latest Reply
Nadia1
Databricks Employee
  • 0 kudos

Hello Rey, there are currently no events running that offer free vouchers; we are offering 75% vouchers. Please check out our events page for future events: https://www.databricks.com/learn/training/home Thank you!

avenu
by New Contributor
  • 2446 Views
  • 1 replies
  • 0 kudos

AutoLoader - process multiple files

I need to process files of different schemas arriving in different folders in ADLS using Auto Loader. Do I need to start a separate read stream for each file type / folder, or can this be handled using a single stream? When I tried using a single stream, ...

Latest Reply
Wassim
New Contributor III
  • 0 kudos

As you are talking about different schemas, perhaps schemaEvolutionMode, inferColumnTypes, or schemaHints may help? Check this out (32 min onward): https://youtu.be/8a38Fv9cpd8 Hope it helps; do let us know how you solve it if you can.

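A common pattern for the question above is one Auto Loader stream per folder, each with its own schema location, so the schemas are never merged into one. A sketch (the container paths, folder names, and option values are assumptions; the cloudFiles options mirror the ones mentioned in the reply):

```python
# One Auto Loader stream per folder, since each folder has its own schema.
# Paths and folder names below are illustrative assumptions.
SOURCES = {
    "orders":    {"path": "abfss://landing@acct.dfs.core.windows.net/orders/",    "format": "json"},
    "customers": {"path": "abfss://landing@acct.dfs.core.windows.net/customers/", "format": "csv"},
}

def autoloader_options(fmt, schema_location):
    """Common cloudFiles options; schema inference/evolution is tracked per stream."""
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.inferColumnTypes": "true",
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }

def start_streams(spark):
    # One readStream per folder: Auto Loader tracks one schema per schemaLocation,
    # so mixing schemas in a single stream would force everything into one schema.
    for name, src in SOURCES.items():
        (spark.readStream.format("cloudFiles")
            .options(**autoloader_options(src["format"], f"/tmp/_schemas/{name}"))
            .load(src["path"]))
```

start_streams only runs inside a Databricks runtime, where the cloudFiles source is available.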
Wassim
by New Contributor III
  • 2923 Views
  • 2 replies
  • 1 kudos

Resolved! Cancelling the exam - need to know the policy if the exam was scheduled with a voucher

I have my exam scheduled for next month, but I am going to cancel it (I registered for this exam using a voucher). In the future I may schedule another exam; would I be able to utilize the voucher that I used for the exam I am going to cancel? I mean, could tha...

Latest Reply
Harun
Honored Contributor
  • 1 kudos

No, once redeemed, you cannot use the voucher again; better to reschedule the exam now instead.

1 More Replies
Sujitha
by Databricks Employee
  • 1567 Views
  • 1 replies
  • 4 kudos

Documentation Update

Documentation Update Databricks documentation provides how-to guidance and reference information for data analysts, data scientists, and data engineers working in the Databricks Data Science & Engineering, Databricks Machine Learning, and Databricks ...

Latest Reply
Harun
Honored Contributor
  • 4 kudos

Thanks for sharing @Sujitha Ramamoorthy​ 

bernardocouto
by New Contributor II
  • 1743 Views
  • 1 replies
  • 4 kudos

Resolved! Databricks SQL Connector Abstraction for Python

Databricks SQL framework: easy to learn, fast to code, ready for production. I built an abstraction of the databricks-sql-connector in order to follow a pattern closer to the concepts of ORM tools, in addition to facilitating the adoption of the data ...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 4 kudos

Sure, I will try it and provide feedback.

kskistad
by New Contributor III
  • 3192 Views
  • 1 replies
  • 2 kudos

Resolved! Identity column in DLT using Python

How would I implement the identity column in Delta Live Tables using Python syntax?

GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( [ START WITH start ] [ INCREMENT BY step ] ) ]

Latest Reply
LaurentLeturgez
Databricks Employee
  • 2 kudos

Hi @Kory Skistad, please find below the table schema definition to use in a Python DLT pipeline. You can see it mentions the identity column definition: @dlt.table( comment="Raw data on sales", schema=""" customer_id STRING, customer_name STR...

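Building on the reply above: the schema string passed to @dlt.table accepts the same GENERATED ... AS IDENTITY clause as SQL DDL. A sketch with assumed table and column names (the @dlt usage is shown in comments, since the dlt module is only importable inside a pipeline):

```python
# Assumed column names for illustration; the key piece is the identity clause
# inside the schema string handed to @dlt.table.
SALES_SCHEMA = """
    sale_id BIGINT GENERATED ALWAYS AS IDENTITY,
    customer_id STRING,
    customer_name STRING,
    amount DOUBLE
"""

# Inside a DLT pipeline notebook this would be used as:
# import dlt
# @dlt.table(comment="Raw data on sales", schema=SALES_SCHEMA)
# def sales_raw():
#     return spark.read.format("json").load("/path/to/sales")
```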
Bartek
by Contributor
  • 3033 Views
  • 2 replies
  • 1 kudos

Resolved! Spark UI simulator is not available online

About 2 weeks ago I started course on "Optimizing Apache Spark on Databricks" from official Databricks academy. It is heavily based on Spark UI simulator experiments that were available here: https://www.databricks.training/spark-ui-simulator and for...

Latest Reply
LandanG
Databricks Employee
  • 1 kudos

Hi @Bartosz Maciejewski, can you try loading the website without https and instead just http, like http://www.databricks.training/spark-ui-simulator/ ?

1 More Replies
Gilg
by Contributor II
  • 6097 Views
  • 4 replies
  • 5 kudos

Avro Deserialization from Event Hub capture and Autoloader

Hi all, I am getting data from Event Hub capture in Avro format and using Auto Loader to process it. I got to the point where I can read the Avro by casting the Body into a string. Now I want to deserialize the Body column so it will be in table forma...

Latest Reply
UmaMahesh1
Honored Contributor III
  • 5 kudos

If you still want to go with the above approach and don't want to provide the schema manually, you can fetch a tiny batch with 1 record and build the schema into a variable using the .schema option. Once done, you can add a new Body column by providin...

3 More Replies
kskistad
by New Contributor III
  • 2117 Views
  • 0 replies
  • 1 kudos

Set and use variables in DLT pipeline notebooks

Using DLT, I have two streaming sources coming from Auto Loader. Source1 contains a single row of data in the file and Source2 has thousands of rows. There is a common key column between the two sources to join them together. So far, so good. I have a ...

mikaellognseth
by New Contributor III
  • 13608 Views
  • 7 replies
  • 0 kudos

Resolved! Databricks cluster start-up: Self Bootstrap Failure

When attempting to deploy/start an Azure Databricks cluster through the UI, the following error consistently occurs: { "reason": { "code": "SELF_BOOTSTRAP_FAILURE", "parameters": { "databricks_error_message": "Self-bootstrap failure d...

Latest Reply
mikaellognseth
New Contributor III
  • 0 kudos

Hi, in our case the issue turned out to be DNS, as the DNS servers set on the Databricks workspace VNet are only available when peering the "management" VNet in our setup. Took a while to figure out, as the error didn't exactly give a lot of clarity...

6 More Replies
