Data Engineering

Forum Posts

117074
by New Contributor III
  • 759 Views
  • 3 replies
  • 0 kudos

Notebook Visualisations suddenly not working

Hi all, I have a Python script which runs SQL code against our Delta Live Tables and returns a pandas dataframe. I do this multiple times and then use 'display(pandas_dataframe)'. Once this displays I then create a visualization from the UI which is t...

Latest Reply
117074
New Contributor III
  • 0 kudos

Thank you for the detailed response Kaniz, I appreciate it! I do think it may have been cache issues, since there was no Spark computation running when the error occurred. It did lead me down a train of thought... is it possible to extract t...

2 More Replies
RaccoonRadio
by New Contributor
  • 2388 Views
  • 2 replies
  • 0 kudos

Resolved! Difference @dlt.table and @dlt.create_table decorator

Hi! I'm currently trying to stream data from an Azure Event Hub (Kafka) using DLT. The provided example (https://learn.microsoft.com/en-us/azure/databricks/delta-live-tables/event-hubs) works well. I saw in different examples the usage of two different...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @RaccoonRadio ,  In Databricks Delta Live Tables (DLT), both @dlt.table and @dlt.create_table decorators are used, but they serve slightly different purposes. Here's the distinction: @dlt.table: This decorator is used to define a Delta ...

1 More Replies
Coders
by New Contributor II
  • 962 Views
  • 4 replies
  • 0 kudos

Feedback on the data quality and consistency checks in Spark

I'm seeking validation from experts regarding the data quality and consistency checks we're implementing as part of a data migration using Spark and Databricks. Our migration involves transferring data from Azure Data Lake to a different data lake. As...

Latest Reply
Coders
New Contributor II
  • 0 kudos

Hi, thank you for the response. When we say we are copying, it's a data migration from one data lake to another. We are not performing any kind of DDL or DML queries using Spark SQL on top of it. It's a straightforward merge from one data lake to another using ...

3 More Replies
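The count-plus-checksum idea behind such migration validation can be sketched in plain Python (in practice you would compute these with Spark aggregations over both lakes; the record sets and function names below are hypothetical stand-ins):

```python
import hashlib

# Hypothetical record sets standing in for the source and target lakes.
source_rows = [("id1", "alice", 10), ("id2", "bob", 20)]
target_rows = [("id2", "bob", 20), ("id1", "alice", 10)]

def fingerprint(rows):
    """Order-independent checksum: hash each row, XOR the digests together."""
    acc = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode()).digest()
        acc ^= int.from_bytes(digest, "big")
    return acc

# Two cheap consistency checks: equal row counts, equal content fingerprints.
counts_match = len(source_rows) == len(target_rows)
content_match = fingerprint(source_rows) == fingerprint(target_rows)
```

Because the XOR combination is order-independent, the check passes even when the two lakes return rows in different orders, which is the usual situation after a distributed copy.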
AshR
by New Contributor III
  • 518 Views
  • 2 replies
  • 0 kudos

Lakehouse Fundamentals Certificate/Badge not received/appearing.

Hello @Nadia1, Ref: request (00445798). I raised a ticket for the Lakehouse Fundamentals Certificate/Badge not appearing. I got a response that an email had been sent. Still, I have not received the badge email, and it is not reflecting in the academy page (ht...

Latest Reply
Nadia1
Honored Contributor
  • 0 kudos

Hello, Please submit a support ticket and our team will help you asap. Thank you

1 More Replies
MichTalebzadeh
by Contributor
  • 554 Views
  • 2 replies
  • 1 kudos

Working with a text file compressed by bz2 and then zip in PySpark

I have downloaded Amazon reviews for sentiment analysis from here. The file is not particularly large (just over 500MB) but comes in the following format: test.ft.txt.bz2.zip. So it is a text file that is compressed by bz2 followed by zip. Now I like t...

Data Engineering
bz2
pyspark
zip
Latest Reply
MichTalebzadeh
Contributor
  • 1 kudos

Thanks for your reply @Kaniz. On the face of it Spark can handle both .bz2 and .zip. In practice it does not work with both at the same time. You end up with illegible characters as text. I suspect it handles decompression of the outer layer (in this ca...

1 More Replies
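The double-compression issue described in this thread can be worked around by peeling the outer zip layer yourself before handing the file to Spark. A minimal stdlib-only sketch, building a small stand-in archive in memory to show the layering (the file name mirrors the post; everything else is hypothetical):

```python
import bz2
import io
import zipfile

# Simulate the downloaded file: text compressed with bz2, then zipped.
original_text = b"__label__2 Great product, works as described\n"
bz2_bytes = bz2.compress(original_text)          # inner layer: .bz2

zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as zf:     # outer layer: .zip
    zf.writestr("test.ft.txt.bz2", bz2_bytes)

# Spark only decompresses one codec it recognises from the file extension,
# so extract the zip manually first; the resulting .bz2 file can then be
# read natively, e.g. spark.read.text("/path/test.ft.txt.bz2").
with zipfile.ZipFile(io.BytesIO(zip_buffer.getvalue())) as zf:
    inner_bz2 = zf.read("test.ft.txt.bz2")

recovered = bz2.decompress(inner_bz2)
```

In a real pipeline you would extract the zip once to cloud storage (or DBFS) and let Spark's built-in bz2 codec handle the inner layer on read.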
spark_user1
by New Contributor
  • 208 Views
  • 1 reply
  • 0 kudos

Whitelisting GraphFrame Jar files does not work for shared compute.

Hello, I'm encountering a Py4JSecurityException while using the GraphFrames jar library in a job task with shared compute. Despite following all documentation to whitelist my jar libraries in Volumes and ensuring compatibility with my Spark and Scala ...

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @spark_user1, I understand that you’re facing a Py4JSecurityException while working with the GraphFrames jar library in a job task with shared compute. Let’s tackle this issue step by step: Whitelisting JAR Libraries: You mentioned that you’v...

Miro_ta
by New Contributor III
  • 6389 Views
  • 9 replies
  • 4 kudos

Resolved! Can't query delta tables, token missing required scope

Hello, I've correctly set up a stream from Kinesis, but I can't read anything from my delta table. I'm actually reproducing the demo from Frank Munz: https://github.com/fmunz/delta-live-tables-notebooks/tree/main/motion-demo and I'm running the following...

Latest Reply
holly
New Contributor III
  • 4 kudos

Hello, I also had this issue. It was because I was trying to read a DLT table with a Machine Learning Runtime. At time of writing, Machine Learning Runtimes are not compatible with shared access mode, so I ended up setting up two clusters, one MLR as...

8 More Replies
SamDataWalk
by New Contributor III
  • 679 Views
  • 3 replies
  • 1 kudos

Resolved! Databricks bug with show tblproperties - redacted - Azure databricks

I am struggling to report what is a fairly fundamental bug. Can anyone help? Ideally someone from Databricks themselves, or others who can confirm they can replicate it. There is a bug where Databricks seems to be hiding "any" properties which have th...

Latest Reply
SamDataWalk
New Contributor III
  • 1 kudos

I managed to get a response back from support at Databricks. Admittedly it is a bit nuclear, but there is a way of switching it off: spark.conf.set("spark.databricks.behaviorChange.SC102534CommandRedactProperties.enabled", False). So, I have managed to u...

2 More Replies
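The workaround quoted in the reply is a session-scoped Spark configuration setting. Formatted for readability (this assumes an active `spark` session as in a Databricks notebook, and that your workspace policy allows overriding the flag; behavior-change flags like this may be removed in future runtimes):

```python
# Session-scoped workaround from Databricks support, quoted in the reply above:
# disables the behavior change that redacts tblproperties values.
spark.conf.set(
    "spark.databricks.behaviorChange.SC102534CommandRedactProperties.enabled",
    False,
)
```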
Ela
by New Contributor III
  • 639 Views
  • 1 reply
  • 1 kudos

Checking for availability of dynamic data masking functionality in SQL.

I am looking for functionality similar to Snowflake's, which allows attaching a mask to an existing column. The documents I found relate to masking with encryption, but my use case is on an existing table. Solutions using views along with Dynamic Vie...

Latest Reply
sivankumar86
New Contributor II
  • 1 kudos

Unity Catalog provides a similar feature: https://docs.databricks.com/en/data-governance/unity-catalog/row-and-column-filters.html

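For orientation, a Unity Catalog column mask is a SQL function attached to an existing column that returns either the raw value or a redacted form depending on the caller's group membership. As a language-neutral illustration of that logic only (all names here are hypothetical, and the real mechanism is a SQL UDF, not Python):

```python
# Mirrors the shape of a Unity Catalog column mask:
# CASE WHEN is_account_group_member('hr_admins') THEN ssn ELSE redacted END
def mask_ssn(value: str, caller_groups: set, allowed_group: str = "hr_admins") -> str:
    """Return the raw value for members of allowed_group, else a redacted form."""
    if allowed_group in caller_groups:
        return value
    return "***-**-" + value[-4:]

masked = mask_ssn("123-45-6789", {"analysts"})
unmasked = mask_ssn("123-45-6789", {"hr_admins", "analysts"})
```

The key point for the original question: the mask attaches to the existing table's column via ALTER TABLE, so no view layer is needed.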
thethirtyfour
by New Contributor III
  • 815 Views
  • 3 replies
  • 1 kudos

Resolved! error installing the igraph and networkD3 library

Hi! I am trying to install the igraph and networkD3 CRAN packages for use within a notebook. However, I am receiving the below installation error when attempting to do so. Could someone please assist? Thank you! * installing *source* package ‘igraph’ ...

Latest Reply
haleight-dc
New Contributor III
  • 1 kudos

Hi! I just figured this out myself. I'm not sure why this is suddenly occurring, since igraph has always loaded fine for me in Databricks but didn't this week. I found that the following solution worked. In your notebook, before installing your R libra...

2 More Replies
159312
by New Contributor III
  • 905 Views
  • 3 replies
  • 2 kudos

How to write log entries from a Delta Live Table pipeline.

From a notebook I can import the log4j logger from sc and write to a log like so: log4jLogger = sc._jvm.org.apache.log4j; LOGGER = log4jLogger.LogManager.getLogger(__name__); LOGGER.info("pyspark script logger initialized"). But this does not work in a Delt...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Ben Bogart, the event log for each pipeline is stored in a Delta table in DBFS. You can view event log entries in the Delta Live Tables user interface, the Delta Live Tables API, or by directly querying the Delta table. This article focuses on q...

2 More Replies
addy
by New Contributor III
  • 750 Views
  • 3 replies
  • 2 kudos

Reading a table from a catalog that is in a different/external workspace

I am trying to read a table that is hosted on a different workspace. We have been told to establish a connection to said workspace using a token and consume the table. The code I am using is: from databricks import sql; connection = sql.connect(server_hostnam...

Data Engineering
catalog
Databricks
sql
Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hey there! Thanks a bunch for being part of our awesome community!  We love having you around and appreciate all your questions. Take a moment to check out the responses – you'll find some great info. Your input is valuable, so pick the best solution...

2 More Replies
Data_Engineer3
by Contributor II
  • 861 Views
  • 3 replies
  • 0 kudos

Live Spark driver log analysis

In Databricks, if we want to see the live log of the execution, we can see it on the driver log page of the cluster. But there we can't search by keyword; instead we need to download each hourly log file, and live logs are ...

Latest Reply
Data_Engineer3
Contributor II
  • 0 kudos

Hi @shan_chandra, that is like putting our driver log into another cloud platform. But here I want to check the live log with local machine tools; is this possible?

2 More Replies
akhileshp
by New Contributor III
  • 741 Views
  • 6 replies
  • 0 kudos

Query Serverless SQL Warehouse from Spark Submit Job

I am trying to load data from a table in SQL warehouse using spark.sql("SELECT * FROM <table>") in a spark submit job, but the job is failing with [TABLE_OR_VIEW_NOT_FOUND] The table or view . The same statement is working in a notebook but not in a jo...

Latest Reply
Wojciech_BUK
Contributor III
  • 0 kudos

- When you query the table manually and when the job runs, do both actions happen in the same Databricks workspace? - What is the job configuration: who is the job Owner or Run As account, and does that principal/persona have access to the table?

5 More Replies
User16826987838
by Contributor
  • 1103 Views
  • 2 replies
  • 0 kudos

Convert PDFs into structured data

Is there anything on Databricks to help read PDF (payment invoices and receipts for example) and convert it to structured data?

Latest Reply
SoniaFoster
New Contributor II
  • 0 kudos

Thanks! Converting PDF format is sometimes a difficult task as not all converters provide accuracy. I want to share with you one interesting tool I recently discovered that can make your work even more efficient. I recently came across an amazing onl...

1 More Replies