Hi all, using databricks-connect 11.3.19, I get a "java.lang.ClassCastException" when attempting to time travel. The exact same statement works fine when executed in the Databricks GUI directly. Any ideas on what's going on? Is this a known limitation...
I have a Spark application (using the Java library) which needs to replicate data from one blob storage to another. I have created a readStream() within it which listens continuously to a Kafka topic for incoming events. The corresponding writeStre...
The problem was indeed the way the ClassLoader was being set in the ForkJoinPool thread (the common pool was used). Spark's SparkClassUtils uses Thread.currentThread().getContextClassLoader, which might behave differently in another thread. To solve it I cre...
Hello everyone, is there a way to execute a delete query from an Azure notebook on an Azure Synapse database? I tried using the read API method with the "query" option, but I get an error saying the JDBC connector is not able to handle the code. Can anyone suggest how we can de...
Just launched: The Big Book of Data Warehousing and BI, a new hands-on guide focused on real-world use cases from governance, transformation, analytics and AI. As the demand for data becomes insatiable in every company, the data infrastructure has bec...
Is it possible to create new notebooks from a notebook in Databricks? I have tried this code, but all of the files it creates are generic files, not notebooks. notebook_str = """# Databricks notebook source
import pyspark.sql.functions as F
import numpy as np
# CO...
Unfortunately, %run does not help me since I can't %run a .py file. I still need my code in notebooks. I am transpiling proprietary code to Python using Jinja templates. I would like to have the output as notebooks since those are most convenient to edit...
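For the templating case described above, here is a minimal sketch of assembling notebook source text from a list of code cells. It relies on the "# Databricks notebook source" header shown in the question and the "# COMMAND ----------" cell separator used in exported Python notebooks. The function name build_notebook_source and the sample cells are hypothetical. Whether the result is treated as a notebook rather than a generic file likely depends on how it reaches the workspace (for example, an import in source format rather than a plain file write); that part is an assumption and is not shown here.

```python
# Header and cell separator used by Databricks for Python source-format notebooks.
HEADER = "# Databricks notebook source"
CELL_SEPARATOR = "# COMMAND ----------"


def build_notebook_source(cells):
    """Join code cells into Databricks Python source-format notebook text.

    `cells` is a list of strings, one per notebook cell (hypothetical helper).
    """
    body = f"\n\n{CELL_SEPARATOR}\n\n".join(cell.strip("\n") for cell in cells)
    return f"{HEADER}\n{body}\n"


# Example cells; in the templating scenario these would be rendered template output.
cells = [
    "import pyspark.sql.functions as F\nimport numpy as np",
    "print('generated cell')",
]
notebook_source = build_notebook_source(cells)
print(notebook_source)
```

The rendered string can then be handed to whatever mechanism creates the notebook in the workspace.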
Hi, this is my first Databricks project. I am loading data from a UC external volume in ADLS into tables and then splitting one of the tables into two tables based on a column. When I create a pipeline, the tables don't have any dependencies and this is...
While re-implementing my pipeline to publish to dev/test/prod instead of bronze/silver/gold, I think I found the answer. The downstream tables need to use the LIVE schema.
I am loading data from CSV into live tables. I have a live delta table with data like this:
WaterMeterID, ReadingDateTime1, ReadingValue1, ReadingDateTime2, ReadingValue2
It needs to be unpivoted into this:
WaterMeterID, ReadingDateTime1, ReadingValue1...
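The target layout in the question is cut off, so the following is only a sketch under an assumption: that the goal is one output row per (ReadingDateTime, ReadingValue) pair for each meter. It builds a Spark SQL stack() expression as a plain string; the helper name build_stack_expr, the output column names ReadingDateTime and ReadingValue, and the DataFrame name wide_df in the comment are all hypothetical.

```python
def build_stack_expr(pairs, out_cols=("ReadingDateTime", "ReadingValue")):
    """Build a Spark SQL stack() expression that turns N column pairs into N rows.

    `pairs` is a list of (datetime_column, value_column) names from the wide table.
    """
    flat = ", ".join(f"{dt_col}, {val_col}" for dt_col, val_col in pairs)
    return f"stack({len(pairs)}, {flat}) as ({', '.join(out_cols)})"


# Column pairs taken from the wide layout in the question.
pairs = [
    ("ReadingDateTime1", "ReadingValue1"),
    ("ReadingDateTime2", "ReadingValue2"),
]
stack_expr = build_stack_expr(pairs)
print(stack_expr)

# Assumed usage with a wide DataFrame (not run here, requires a Spark session):
# long_df = wide_df.selectExpr("WaterMeterID", stack_expr)
```

Generating the expression from a list keeps the approach workable if the table has more reading pairs than the two shown.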
Hi @Kroy, when it comes to shared compute resources in Databricks, there are some best practices and options you can consider:
Shared Access Mode for Clusters:
Databricks allows you to create clusters in shared access mode. This means that multipl...
Hi all! I am having the following issue with a couple of PySpark streams. I have several notebooks, each running an independent file-based structured stream that reads a Delta bronze table (gzip Parquet files) dumped from Kinesis to S3 in a previous job....
Hi @patojo94, You're encountering an issue with malformed records in your PySpark streams.
Let's explore some potential solutions:
Malformed Record Handling:
The error message indicates that there are malformed records during parsing. By default...
Hi @Cert-Team, my Databricks exam got suspended on December 9, 2023, at 11:30, and it is still in the suspended state. During the exam, it was initially paused due to poor lighting, but after addressing that, it worked fine. However, after some time, ...
Hi @Jay_adb, I'm sorry to hear you had this issue. Thanks for filing a ticket with the support team. I have sent them a message asking them to look into your ticket and resolve it as soon as possible.
Is there a way to perform a dry-run with "bundle deploy" in order to see the job configuration changes for an environment without actually deploying the changes?
Hello Community Members,
We value your experience and want to make it even better! Help us shape the future by sharing your thoughts through our quick Survey.
Ready to have your voice heard? Click here and take a few moments to complete the surv...
Hi Databricks Gurus! I am trying to run a very simple snippet:
data_emp = [["1", "sarvan", "1"], ["2", "John", "2"], ["3", "Jose", "1"]]
emp_columns = ["EmpId", "Name", "Dept"]
df = spark.createDataFrame(data=data_emp, schema=emp_columns)
df.show()
--------
Based on a g...
I want to express my gratitude for your effort in selecting the most suitable solution. It's great to hear that your query has been successfully resolved. Thank you for your contribution.
We are trying to capture the query executed by Spark. We are using df.queryExecution.redactedSql to get the SQL from the query execution, but it is not working in sqlListener.