Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Srikanth_Gupta_
by Databricks Employee
  • 5307 Views
  • 2 replies
  • 0 kudos

How to process images and video through structured streaming using Delta Lake?

Can we scan through videos and identify and alert in real time if something goes wrong? What are best practices for this kind of use case?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Maybe I'm a little off topic, but can you recommend companies that are engaged in video production? I want to make an explanatory video for my site.

1 More Replies
Loki
by New Contributor III
  • 8888 Views
  • 10 replies
  • 3 kudos

Apache Log4J Vulnerability

Hi Community, We got an email from our IT Team regarding the Apache Log4j vulnerability. Just wanted to understand if our implementation will be affected by this or not. We are using the following library or package in our notebooks: import org.apache.log4...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

On most Databricks distributions, the log4j version is 1.2.17.

9 More Replies
Mariusz_Cyp
by New Contributor II
  • 8086 Views
  • 3 replies
  • 11 kudos

When the billing time starts for the cluster?

Hi All, I'm just wondering when exactly the billing time starts for the Databricks cluster? Is starting time included? If cluster creation takes 3 minutes and query execution only 2, will I pay for 2 or 5? Thanks in advance! MC

Latest Reply
franco_patano
Databricks Employee
  • 11 kudos

Billing for Databricks DBUs starts when the Spark context becomes available. Billing for the cloud provider starts when the request for compute is received and the VMs are starting up.

2 More Replies
Soma
by Valued Contributor
  • 3740 Views
  • 2 replies
  • 1 kudos

Resolved! AutoLoader with Custom Queue

Hi everyone, can someone help with creating a custom queue for Auto Loader as given here? The default FlushWithClose event is not getting created when my data is uploaded to blob, as given in the Azure Databricks docs: cloudFiles.queueName, the name of the Azure queue. If...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

You need to set up a notification service for blob/ADLS as shown here: https://docs.databricks.com/spark/latest/structured-streaming/auto-loader-gen2.html#cloud-resource-management. setUpNotificationServices will return a queue name which can later be used in au...

1 More Replies
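To make the reply above concrete, here is a minimal sketch of passing a custom queue name to Auto Loader. The queue name, source format, and path are hypothetical, and the actual stream (commented out) assumes a Databricks notebook where `spark` is defined.

```python
# Options for Auto Loader file-notification mode; the queue name would be
# the one returned by setUpNotificationServices, or your own pre-created queue.
autoloader_options = {
    "cloudFiles.format": "json",                # hypothetical source format
    "cloudFiles.useNotifications": "true",      # read events from the queue instead of listing
    "cloudFiles.queueName": "my-custom-queue",  # hypothetical custom Azure queue
}

# In a Databricks notebook (assumed environment):
# df = (spark.readStream.format("cloudFiles")
#         .options(**autoloader_options)
#         .load("abfss://container@account.dfs.core.windows.net/input"))

print(autoloader_options["cloudFiles.queueName"])
```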
mrvi2310
by New Contributor II
  • 6890 Views
  • 4 replies
  • 3 kudos

what is the difference between weekday and dayofweek function in spark SQL?

dayofweek: https://docs.databricks.com/sql/language-manual/functions/dayofweek.html; weekday: https://docs.databricks.com/sql/language-manual/functions/weekday.html. According to the documentation, they both are synonym functions. But when I use it I n...

weekday vs dayofweek
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

That's correct: for weekday, Monday = 0; for dayofweek, Sunday = 1. You can also look at the documentation here: https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.functions.dayofweek.html https://spark.apache.org/docs/latest/api/sql/index...

3 More Replies
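The difference described in the reply above can be checked without Spark: plain Python's datetime uses the same Monday = 0 convention as Spark SQL's weekday, while Spark SQL's dayofweek counts from Sunday = 1. A small sketch using the example date from the Spark docs:

```python
from datetime import date

d = date(2009, 7, 30)  # a Thursday

# Spark SQL weekday(): Monday = 0 ... Sunday = 6 (same as Python's weekday())
weekday = d.weekday()

# Spark SQL dayofweek(): Sunday = 1 ... Saturday = 7
dayofweek = d.isoweekday() % 7 + 1

print(weekday, dayofweek)  # 3 5
```

So the two functions are not synonyms: for the same Thursday, weekday returns 3 while dayofweek returns 5.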
herry
by New Contributor III
  • 4752 Views
  • 6 replies
  • 4 kudos

CVE-2021-44228

Hi, is there any effect of the CVE-2021-44228 problem on the Databricks platform? Is there any action that needs to be taken by Databricks customers related to CVE-2021-44228?

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 4 kudos

On most Databricks distributions, the log4j version is 1.2.17.

5 More Replies
Mohit_m
by Valued Contributor II
  • 3073 Views
  • 1 reply
  • 5 kudos

How to find out the users who accessed Databricks and from which location

How to find out the users who accessed Databricks and from which location

Latest Reply
Mohit_m
Valued Contributor II
  • 5 kudos

You can use Audit logs to fetch this data. Query:

%sql
SELECT DISTINCT userIdentity.email, sourceIPAddress
FROM audit_logs
WHERE serviceName = "accounts" AND actionName LIKE "%login%"

Please find below the docs to analyse the Audit logs: https://docs.databric...

-werners-
by Esteemed Contributor III
  • 2996 Views
  • 3 replies
  • 14 kudos

Notebook fails in job but not in interactive mode

I have this notebook which is scheduled by Data Factory on a daily basis. It worked fine up to today. All of a sudden I keep on getting a NullPointerException when writing the data. After some searching online, I disabled AQE. But this does not help. Th...

Latest Reply
-werners-
Esteemed Contributor III
  • 14 kudos

After some tests it seems that if I run the notebook on an interactive cluster, I only get 80% of load (Ganglia metrics). If I run the same notebook on a job cluster with the same VM types etc. (so the only difference is interactive vs job), I get over...

2 More Replies
pjp94
by Contributor
  • 2189 Views
  • 4 replies
  • 9 kudos

Databrick Job - Notebook Execution

Question - When you set a recurring job to simply update a notebook, does Databricks clear the state of the notebook prior to executing the notebook? If not, can I configure it to make sure it clears the state before running?

Latest Reply
Anonymous
Not applicable
  • 9 kudos

@Paras Patel - Would you be happy to mark Hubert's answer as best so that other members can find the solution more easily? Thanks!

3 More Replies
morganmazouchi
by Databricks Employee
  • 7313 Views
  • 7 replies
  • 2 kudos

Resolved! Incremental updates in Delta Live Tables

What happens if we change the logic for the Delta Live Tables and do an incremental update? Does the table get reset (refreshed) automatically, or would it only apply the logic to new incoming data? Would we have to trigger a reset in this case?

Latest Reply
morganmazouchi
Databricks Employee
  • 2 kudos

Here is my finding on when to refresh (reset) the table: if it is a complete table, all the changes are applied automatically. If it is an incremental table, you need to do a manual reset (full refresh).

6 More Replies
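When a manual reset is needed, it can be triggered from the pipeline UI ("Full refresh") or via the pipeline-updates REST endpoint, which accepts a full_refresh flag. A sketch of building that request; the workspace URL, pipeline id, and token are hypothetical placeholders, and the actual HTTP call is left commented out.

```python
import json

workspace = "https://example.cloud.databricks.com"  # hypothetical workspace URL
pipeline_id = "1234-abcd"                            # hypothetical pipeline id

# The start-update request body; full_refresh=True resets the pipeline's tables.
body = json.dumps({"full_refresh": True})

# import requests  # then (assumed `token` variable holding a PAT):
# requests.post(f"{workspace}/api/2.0/pipelines/{pipeline_id}/updates",
#               headers={"Authorization": f"Bearer {token}"}, data=body)

print(body)
```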
Kody_Devl
by New Contributor II
  • 5128 Views
  • 3 replies
  • 2 kudos

%SQL Append null values into a SQL Table

Hi All, I am new to Databricks and am writing my first program. Note: code shown below. I am creating a table with 3 columns to store data. 2 of the columns will be appended in from data that I have in another table. When I run my append query into the...

Latest Reply
Kody_Devl
New Contributor II
  • 2 kudos

Hi Hubert, your answer moves me closer to being able to update pieces of a 26-field MMR_Restated table as the correct field values are calculated through the process. I have been looking for a way to be able to update in "pieces"...... 2 fie...

2 More Replies
RiyazAli
by Valued Contributor II
  • 12214 Views
  • 7 replies
  • 4 kudos

Issue while trying to read a text file in databricks using Local File API's instead of Spark API.

I'm trying to read a small txt file which is added as a table to the default db on Databricks. While trying to read the file via the Local File API, I get a `FileNotFoundError`, but I'm able to read the same file as a Spark RDD using SparkContext. Please fi...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

can you try with /dbfs/Filestore/tables/boringwords.txt?

6 More Replies
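The fix suggested in the reply above comes down to the path prefix: Spark APIs understand dbfs:/ URIs, while local-file APIs (open, pandas, etc.) go through the /dbfs FUSE mount. A small sketch; the file path is the one from the thread and would only exist on a Databricks cluster, so the read itself is commented out.

```python
spark_path = "dbfs:/FileStore/tables/boringwords.txt"   # for Spark readers
local_path = "/dbfs/FileStore/tables/boringwords.txt"   # for open(), pandas, etc.

# On a Databricks cluster the local-file form works where the dbfs:/ form
# raises FileNotFoundError:
# with open(local_path) as f:
#     text = f.read()

# The two forms differ only by the FUSE-mount prefix:
converted = spark_path.replace("dbfs:", "/dbfs", 1)
print(converted)
```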
ak09
by New Contributor
  • 865 Views
  • 0 replies
  • 0 kudos

Triggering Notebook in Azure Repos via Azure DevOps

I have been using Databricks workspace for all my data science projects in my firm. In my current project, I have built a CI pipeline using databricks-cli & Azure DevOps. Using databricks-cli I can trigger the Notebook which is present in my workspa...

tarente
by New Contributor III
  • 3625 Views
  • 3 replies
  • 3 kudos

Partitioned parquet table (folder) with different structure

Hi, we have a parquet table (folder) in an Azure Storage Account. The table is partitioned by column PeriodId (representing a day in the format YYYYMMDD) and has data from 20181001 until 20211121 (yesterday). We have a new development that adds a new column ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

I think the problem is in the overwrite: when you overwrite, it overwrites all folders. The solution is to mix append with dynamic overwrite so it will overwrite only folders which have data and doesn't affect old partitions: spark.conf.set("spark.sql.sources.pa...

2 More Replies
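A sketch of the dynamic-overwrite setup the reply describes, assuming a Databricks notebook where `spark` is defined and a DataFrame `df` holding only the new PeriodId days; the storage path is hypothetical.

```python
# Setting that makes mode("overwrite") replace only the partitions present
# in the incoming DataFrame instead of wiping the whole table folder.
overwrite_conf = ("spark.sql.sources.partitionOverwriteMode", "dynamic")

# In a notebook (assumed environment):
# spark.conf.set(*overwrite_conf)
# (df.write.mode("overwrite")
#    .partitionBy("PeriodId")
#    .format("parquet")
#    .save("abfss://container@account.dfs.core.windows.net/table"))  # hypothetical path

print(overwrite_conf)
```

With this setting, partitions from 20181001 onward that receive no new data are left untouched, while the days present in `df` are rewritten with the new schema.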
Khaled
by New Contributor III
  • 3770 Views
  • 4 replies
  • 2 kudos

Uploading CSV to Databricks community edition

When I upload a CSV file of size 1 GB from my PC in the upload dialog, it uploads until the file reaches some point and then disappears; for example, it reaches 600 MB and then disappears from that place.

Latest Reply
jose_gonzalez
Databricks Employee
  • 2 kudos

Hi @Khaled ALZHARANI, I would also recommend splitting up your CSV files into smaller files.

3 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group