Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.
Hi all, my question is about how data in Salesforce Data Cloud gets unified based on client profiles. Can a similar action be performed on data in Databricks? I believe Unity Catalog just provides a unified layer for security and governance. Is there a ...
You want to identify actual persons based on one or more profiles (based on e-mail address, etc.). That is not available out of the box in Databricks. The 'unified' in Databricks means you have a single platform for several data top...
Why does something like df.style.hide_index() turn out so ugly in Databricks? That command should show the DataFrame nicely, as always, just with the index column hidden. Instead, here's an image of what happens (displayi...
I am trying to move a file from a repo's local directory to Volumes, but I am getting a "directory not found" error. Can someone guide me? I tried using dbfs (dbfs/:Volumes/folder/ , /dbfs/Volumes/folder/) and without dbfs (/Volumes/folder/). None worked. @Reti...
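For the volume copy question above, one thing worth checking is the shape of the destination path: Unity Catalog volume paths take the form `/Volumes/<catalog>/<schema>/<volume>/...`, with no `dbfs` prefix. The sketch below is a minimal illustration using only the Python standard library. The helper names `looks_like_volume_path` and `copy_into_dir` are hypothetical, and it assumes the volume is exposed to the cluster as an ordinary filesystem path.

```python
import os
import shutil


def looks_like_volume_path(path: str) -> bool:
    """Return True if path has the /Volumes/<catalog>/<schema>/<volume>[/...] shape."""
    parts = [p for p in path.split("/") if p]
    return path.startswith("/Volumes/") and len(parts) >= 4


def copy_into_dir(src_file: str, dest_dir: str) -> str:
    """Copy src_file into dest_dir, creating dest_dir if needed; return the new path."""
    if not os.path.isfile(src_file):
        raise FileNotFoundError(f"source file not found: {src_file}")
    # Assumption: dest_dir is a folder inside an existing volume, so creating it is allowed.
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy(src_file, dest_dir)
```

With this check, a destination such as `/Volumes/folder/` is flagged as incomplete because it lacks the catalog, schema, and volume components, and the `dbfs`-prefixed variants are flagged as well.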
I have become really frustrated because I can't copy and paste cells in a Databricks notebook. The keyboard shortcuts Command + C and Command + V don't seem to work. I couldn't find a way to change the keyboard shortcuts either. As a data scientist ...
Hi all, I hope you can help me figure out what I am missing. I'm trying to do a simple thing: read the data from the data ingestion zone (CSV files saved to an Azure Storage Account) using a Delta Live Tables pipeline and share the resulting tab...
I'm curious if Databricks plans to address this. We use Delta Live Tables streaming tables extensively and also planned on using Delta Sharing to get our data from our production Unity Catalog (different region). Duplicating the data as a workaround is no...
How to create Delta Live Tables in the Silver layer. Hi DB experts, I have some basic questions. I am working on a Medallion architecture (Bronze, Silver, Gold layers). In Bronze I am getting Delta files (Parquet format) with log folders. One folder per table, multiple files are ge...
Dear Kaniz, thank you for addressing my question. I am getting the following error if I follow the above: pyspark.errors.exceptions.captured.IllegalArgumentException: Reading from a Delta table is not supported with this syntax. If you would like to consume data...
Thanks to every one of you, the Databricks Community has reached an incredible milestone: 100,000 members and over 50,000 posts! Your dedication, expertise and passion have made this possible. Whether you're a seasoned data professional, a coding en...
I am trying to follow along with a training course, but I am consistently running into an error loading a CSV with Spark from DBFS. Specifically, I keep getting an "Invalid format detected error". Has anyone else encountered this and found a soluti...
Well your error message is telling you that Spark is encountering a Delta table conflict while trying to read a CSV file. The file path dbfs:/mnt/dbacademy... points to a CSV file. This is where the fun begins. Spark detects a Delta transaction log d...
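The conflict described in the reply above can be checked for directly: a Delta table keeps its transaction log in a `_delta_log` directory at the table root, so a CSV path that sits under such a directory may be treated as part of a Delta table. The following is a minimal sketch, assuming the storage is reachable as a local filesystem path (on Databricks, listing the parent folders with `dbutils.fs.ls` would be the analogous check). `find_delta_root` is a hypothetical helper name.

```python
import os
from typing import Optional


def find_delta_root(path: str) -> Optional[str]:
    """Walk upward from path; return the first directory containing _delta_log, else None."""
    current = os.path.abspath(path)
    if os.path.isfile(current):
        current = os.path.dirname(current)
    while True:
        if os.path.isdir(os.path.join(current, "_delta_log")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent
```

If the function returns a directory, the CSV lives inside (or next to the log of) a Delta table, which would be consistent with the error in the question.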
Hi, this is my sample JSON data, which is generated from an API response and all comes in a single row. I want to split it into multiple rows and store it in a DataFrame. [{"transaction_id":"F6001EC5-528196D1","corrects_transaction_id":null,"transac...
Yes indeed, it was a datatype issue. After changing it to LongType in the schema definition, it is working now. Thanks once again for all your inputs and time. Much appreciated!
When I use show tblproperties on a view/table to see the metadata, it redacts any value that has "userid" anywhere in it. And it is not just through the visual interface; when I query it through Python directly, it contains the redacted va...
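As an illustration of the JSON question in this thread, the sketch below splits an API response that arrives as one JSON array string into one record per row, using only the Python standard library. Only the two fields visible in the post are used, and the second record is invented for illustration. In Spark, the same idea is usually expressed with `from_json` and `explode` from `pyspark.sql.functions`, together with an explicit schema.

```python
import json
from typing import Any, Dict, List


def split_json_array(raw: str) -> List[Dict[str, Any]]:
    """Parse a JSON array string and return one dict per element (one row each)."""
    parsed = json.loads(raw)
    if not isinstance(parsed, list):
        raise ValueError("expected a JSON array at the top level")
    return parsed


# First record uses fields visible in the post; the second is a made-up example.
raw_response = (
    '[{"transaction_id": "F6001EC5-528196D1", "corrects_transaction_id": null},'
    ' {"transaction_id": "EXAMPLE-0002", "corrects_transaction_id": null}]'
)

rows = split_json_array(raw_response)
for row in rows:
    print(row["transaction_id"], row["corrects_transaction_id"])
```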
I understand that yours is a View. For my case, it's a Table so I could use `desc detail <schema_name>.<table_name>` to get the table properties info that are not redacted in the `properties` column from the `desc detail` output.
Hello Team, I had a pathetic experience while attempting my Databricks Data Engineer certification. Abruptly, the proctor asked me to show my desk; after I showed it, he/she asked multiple more times, wasted my time, and then suspended my exam. I want to file ...
Hello, @sirishavemula20. It's a general practice for a proctor to ask the test taker to pan the room (as part of security measures), and it's the responsibility of the test taker to make sure the surroundings are clear of any other objects whilst attempt...
Hello, I've been trying to serve registered MLflow models on a GPU Model Serving endpoint, which works except for models using the bitsandbytes library. The library is used to quantise LLMs into 4-bit/8-bit (e.g. Mistral-7B); however, it runs...
Hi Databricks Community, I am looking for a formula/way to calculate the estimated cost for a job run, for which I have a few questions: 1. Is there any formula to calculate the cost of any job, like [(EC2 per hr cost) * (total time job ran)], and when...
Hi, I am trying to read a CSV file into a Spark DataFrame using sparklyr::spark_read_csv. I am receiving a 403 access denied error. I have stored my AWS credentials as environment variables, and can successfully read the file as an R data frame using ar...
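The bracketed formula in the question can be written out as a small calculation. This is a rough sketch, not official pricing guidance: it assumes the cost of a run is the cloud instance charge plus the Databricks DBU charge, both proportional to run time and node count. All rates below are placeholder numbers, and `estimate_job_cost` is a hypothetical helper.

```python
def estimate_job_cost(
    hours: float,
    num_nodes: int,
    instance_price_per_hour: float,
    dbu_per_node_hour: float,
    price_per_dbu: float,
) -> float:
    """Estimate run cost as infrastructure cost plus DBU cost (assumed model)."""
    infra_cost = hours * num_nodes * instance_price_per_hour
    dbu_cost = hours * num_nodes * dbu_per_node_hour * price_per_dbu
    return infra_cost + dbu_cost


# Placeholder rates for illustration only; look up real prices for your account.
cost = estimate_job_cost(
    hours=2.0,
    num_nodes=4,
    instance_price_per_hour=0.5,
    dbu_per_node_hour=1.0,
    price_per_dbu=0.25,
)
print(f"Estimated cost: ${cost:.2f}")
```

The instance-only formula from the question corresponds to the `infra_cost` term; the `dbu_cost` term is the part it leaves out under this assumed model.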
In our Delta Live Tables pipeline I am simply joining two streaming tables into a new streaming table. We use the following code:
@dlt.create_table()
def fact_event_faults():
    events = dlt.read_stream('event_list').withWatermark('TimeStamp', '4 hours'...