Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

brickster
by New Contributor II
  • 9530 Views
  • 3 replies
  • 0 kudos

How to trigger workflow job tasks from Autoloader

I have configured a File Notification Autoloader that monitors an S3 bucket for binary files. I want to integrate autoloader with a workflow job so that whenever a file is placed in the S3 bucket, the pipeline job notebook tasks can pick up the new file and start...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Saravanan Ponnaiah, hope everything is going great. Does @odoll's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

2 More Replies
bradlindblad
by New Contributor II
  • 3509 Views
  • 2 replies
  • 1 kudos

Resolved! Font in Databricks Notebook is Greyed Out - Glitchy

The monospaced/code font in my Databricks notebooks is greyed out, in both the light and dark themes. I've tried playing with all the notebook settings, etc., and nothing will make the font 'normal'. I've tried Chrome and Edge, and the results are the same...

Latest Reply
klaapbakken
New Contributor III
  • 1 kudos

I was having this exact same issue. I fixed it by uninstalling the Source Code Pro font from my Windows machine.

1 More Replies
Gk
by New Contributor III
  • 5845 Views
  • 10 replies
  • 1 kudos

DataBricks

How to find Mountpoints definitions

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hi @Govardhana Reddy, glad to hear! Please mark the answer as best; it would be highly appreciated. Have a great day! Regards

9 More Replies
sanjay
by Valued Contributor II
  • 4582 Views
  • 4 replies
  • 1 kudos

Resolved! How can I get date when autoloader processes the file

Hi, I am running autoloader continuously, and it checks for new files every minute. I need to store when each file was received/processed, but it's giving me the date when autoloader started. Here is my code. df = (spark   .readStream   .format("clo...

Latest Reply
Lakshay
Databricks Employee
  • 1 kudos

Hi @Sanjay Jain, you can use the file metadata column functionality to collect that information. Reference doc: https://docs.databricks.com/ingestion/file-metadata-column.html
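
The suggestion in this reply could be sketched roughly as follows. This is a minimal sketch, not the poster's code: it assumes the stream is read from a file-based source (such as Auto Loader) on a runtime that exposes the hidden `_metadata` column, and the helper name `add_file_audit_columns` and the output column names are made up for illustration. It uses `selectExpr` with plain strings, so the helper itself needs no imports.

```python
def add_file_audit_columns(df):
    """Return df with per-file audit columns taken from the hidden _metadata column.

    Assumption: `df` is a (streaming) DataFrame read from a file-based source
    such as Auto Loader. The output column names are illustrative only.
    """
    return df.selectExpr(
        "*",
        # Full path of the file each row came from.
        "_metadata.file_path AS source_file",
        # Last-modified time of that file in storage (when it was written).
        "_metadata.file_modification_time AS file_modified_at",
        # Timestamp evaluated by the query itself, not when the notebook started.
        "current_timestamp() AS processed_at",
    )
```

It would be applied to the DataFrame returned by `spark.readStream...load(...)` before writing the stream. `file_modification_time` reflects when the file landed in storage, which is usually the closest available proxy for "received".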

3 More Replies
u2dragon
by New Contributor III
  • 20052 Views
  • 5 replies
  • 0 kudos

Resolved! Can't install python library

I'm trying to install a Python library but I'm not able to; the status won't change from "pending". I get this message when I click on the library under the cluster's Libraries tab: "Library installation has been attempted on the driver node but has not...

Latest Reply
u2dragon
New Contributor III
  • 0 kudos

OK, it looks like I was able to solve my problem. First, I needed to install all the required libraries one by one. These are the following: pandas, six, requests, pyspnego, cryptography, krb5, requests-kerberos. After that I was able to install the webAPI library.

4 More Replies
Merchiv
by New Contributor III
  • 20231 Views
  • 4 replies
  • 3 kudos

Resolved! How can I add a duration in milliseconds to a timestamp?

Let's say I have a DataFrame with a timestamp column and an offset column in milliseconds, in the timestamp and long formats respectively. E.g. from datetime import datetime df = spark.createDataFrame( [ (datetime(2021, 1, 1), 1500, ), (dat...

Latest Reply
Merchiv
New Contributor III
  • 3 kudos

Although @Lakshay Goel's solution works, we've been using an alternative approach that we found to be a bit more readable: from pyspark.sql import Column, functions as f     def make_dt_interval_sec(col: Column): return f.expr(f"make_dt_interval...
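
The excerpt above is cut off, so the following is only a sketch of the general idea, not a reconstruction of the poster's helper. `add_millis` shows the intended arithmetic in plain Python, and `add_millis_sql` builds a Spark SQL expression string that could be passed to `pyspark.sql.functions.expr`. Both function names and the column names are hypothetical, and the expression assumes a Spark version that provides `make_dt_interval(days, hours, mins, secs)`.

```python
from datetime import datetime, timedelta


def add_millis(ts: datetime, offset_ms: int) -> datetime:
    """Plain-Python reference: shift a timestamp by a millisecond offset."""
    return ts + timedelta(milliseconds=offset_ms)


def add_millis_sql(ts_col: str, offset_col: str) -> str:
    """Build a Spark SQL expression adding a millisecond offset to a timestamp.

    Assumption: make_dt_interval takes (days, hours, mins, secs), with secs
    allowed to be fractional. The explicit cast avoids relying on an implicit
    double-to-decimal conversion.
    """
    return (
        f"{ts_col} + make_dt_interval(0, 0, 0, "
        f"CAST({offset_col} / 1000 AS DECIMAL(18, 6)))"
    )


# Reference result for the first row in the question: 2021-01-01 plus 1500 ms.
shifted = add_millis(datetime(2021, 1, 1), 1500)
```

In a notebook, the string could be used as `df.withColumn("shifted", f.expr(add_millis_sql("timestamp", "offset")))`, where `timestamp` and `offset` stand in for the real column names.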

3 More Replies
bchaubey
by Contributor II
  • 3007 Views
  • 2 replies
  • 0 kudos

voucher

Did you receive your voucher?

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Kashish Khetarpaul, thank you for reaching out! Please submit a ticket to our Training Team here: https://help.databricks.com/s/contact-us?ReqType=training and our team will get back to you shortly.

1 More Replies
Databrick_begin
by New Contributor
  • 5478 Views
  • 1 reply
  • 0 kudos

Databricks notebook to Azure SQL Server connection using a private IP, because public access is denied on the Azure SQL database; Databricks and Azure SQL are in the same subscription but different virtual networks

We have created a private endpoint for the Azure SQL database, which has a private IP. By making a hosts file entry on my system, I am able to resolve the IP for the Azure SQL server from my system and connect to the server, but I am unable to connect from an Azure Databricks not...

Latest Reply
Ryoma
New Contributor II
  • 0 kudos

If VNet injection is not used, the connection could be established by setting up an init script with Azure Private Resolver as the nameserver:
#!/bin/bash
mv /etc/resolv.conf /etc/resolv.conf.orig
echo nameserver <your dns server ip> | sudo tee --append /e...

THIAM_HUATTAN
by Valued Contributor
  • 13834 Views
  • 5 replies
  • 6 kudos

Is catalog a feature in the community version?

%sql create catalog if not exists catalog1
I tried the above, but it gives me the error below: com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Catalog namespace is not supported. at com.d...

Latest Reply
BradSheridan
Valued Contributor
  • 6 kudos

For me, selecting Runtime 11.1 did not work (i.e. 'unity catalog' didn't show up on the right-hand side under Summary). But when I selected Runtime 11.2, it popped up. Going to start playing with it now

4 More Replies
Dataengineer_mm
by New Contributor
  • 2772 Views
  • 2 replies
  • 0 kudos

Databricks workflow migration to higher environments

How do we migrate Databricks workflows to a higher environment? I do see an option for calling the tasks (notebooks, Python) from GitHub repositories. But how do we migrate the entire workflow jobs to another environment?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Menaka Murugesan, just a friendly follow-up. Did any of the responses help you to resolve your question? If so, please mark it as best. Otherwise, please let us know if you still need help.

1 More Replies
rbricks
by New Contributor
  • 1941 Views
  • 2 replies
  • 0 kudos

numSourceRows greater than expected

Hey, I am doing an upsert of a source DataFrame into a target table. Before the upsert, I print out the source DataFrame's row count, which is a bit smaller than what `numSourceRows` says after the operation completes and I check the operationMetrics....

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you share your code snippet, please? Also share the expected output.

1 More Replies
771407
by New Contributor II
  • 3151 Views
  • 3 replies
  • 3 kudos

Resolved! R code that works perfectly on RStudio does not run here

Hi, I have a "simple" R script that I need to import into Databricks and am running into errors. For example:
TipoB <- Techtb %>% dplyr::filter(grepl('being evaluated', Comentarios))
#TipoB$yearsSpec <- NA
TipoB$yearsSpec <- str_replace(TipoB$Comentarios,...

Latest Reply
771407
New Contributor II
  • 3 kudos

RStudio version 2022.12.0. R: latest version available on 08/FEB/2023. I don't know where to find the DBR version and configuration. Can you direct me?

2 More Replies
MikeJohnsonZa
by New Contributor
  • 3620 Views
  • 3 replies
  • 0 kudos

Resolved! Importing irregularly formatted json files

Hi, I'm importing a large collection of JSON files. The problem is that they are not what I would expect a well-formatted JSON file to be (although probably still valid): each file consists of only a single record that looks something like this (this i...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Michael Johnson, I would like to share the following notebook, which contains examples of how to process complex data types like JSON. Please check the following link and let us know if you still need help: https://docs.databricks.com/optimization...

2 More Replies
youssefmrini
by Databricks Employee
  • 3216 Views
  • 1 reply
  • 4 kudos
Latest Reply
youssefmrini
Databricks Employee
  • 4 kudos

You can now use cluster policies to restrict the number of clusters a user can create. For more information, see https://docs.databricks.com/administration-guide/clusters/policies.html#cluster-limit
