Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

manish_tanwar
by New Contributor III
  • 3588 Views
  • 5 replies
  • 4 kudos

Databricks streamlit app for data ingestion in a table

I am using this code in a notebook to save a data row to a table, and it works perfectly. Now I am using the same function to save data from a chatbot in a Streamlit chatbot application on Databricks, and I am getting an error: ERROR ##############...

Latest Reply
pradeepvatsvk
New Contributor III
  • 4 kudos

Hi @manish_tanwar, how can we work with Streamlit apps in Databricks? I have a use case where I want to ingest data from different CSV files into Delta tables.
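For the CSV-to-Delta use case, one common pattern is an incremental `COPY INTO` per source folder, executed with `spark.sql` from the notebook or app backend. A minimal sketch (the table and path names below are hypothetical, not from this thread):

```python
def copy_into_sql(target_table: str, source_path: str) -> str:
    """Build a Databricks COPY INTO statement that incrementally loads
    CSV files from source_path into an existing Delta table.
    COPY INTO tracks already-loaded files, so reruns only pick up new ones."""
    return (
        f"COPY INTO {target_table} "
        f"FROM '{source_path}' "
        "FILEFORMAT = CSV "
        "FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')"
    )

# Hypothetical table and path; in a notebook you would run:
# spark.sql(copy_into_sql("main.sales.orders", "/Volumes/main/raw/orders/"))
sql = copy_into_sql("main.sales.orders", "/Volumes/main/raw/orders/")
```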

4 More Replies
harman
by New Contributor II
  • 1910 Views
  • 3 replies
  • 0 kudos

Serverless Compute

Hi Team, we are using Azure Databricks Serverless Compute to execute workflows and notebooks. My question is: does serverless compute support Maven library installations? I appreciate any insights or suggestions you might have. Thanks in advance for yo...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

So, it appears that there is conflicting documentation on this topic. I checked our internal documentation, and what I found is that you CANNOT install JDBC or ODBC drivers on Serverless. See limitations here: https://docs.databricks.com/aws...

2 More Replies
annagriv
by New Contributor II
  • 6201 Views
  • 6 replies
  • 5 kudos

Resolved! How to get git commit ID of the repository the script runs on?

I have a script in a repository on Databricks. The script should log the current git commit ID of the repository. How can that be implemented? I tried various commands, for example: result = subprocess.run(['git', 'rev-parse', 'HEAD'], stdout=subproce...

Latest Reply
bestekov
New Contributor II
  • 5 kudos

Here is a version of @vr's solution that can be run from any folder within the repo. It uses regex to extract the root from the path in the form of \Repos\<username>\<some-repo: import os import re from databricks.sdk import WorkspaceClient w = Worksp...
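The regex step described above can be sketched as pure path logic (the `WorkspaceClient` call that fetches the notebook path is omitted, the example path is made up, and forward-slash workspace paths are assumed):

```python
import re

def repo_root(path: str) -> str:
    """Extract the /Repos/<username>/<repo> root from any path inside the repo,
    so `git rev-parse HEAD` can be run with cwd set to the repo root."""
    m = re.match(r"(/Repos/[^/]+/[^/]+)", path)
    if not m:
        raise ValueError(f"Not inside a /Repos checkout: {path}")
    return m.group(1)

# Hypothetical path; with the root in hand you could run:
# subprocess.run(['git', 'rev-parse', 'HEAD'], cwd=repo_root(path), stdout=subprocess.PIPE)
root = repo_root("/Repos/alice@example.com/my-repo/notebooks/etl/job")
```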

5 More Replies
Vasu_Kumar_T
by Databricks Partner
  • 1204 Views
  • 3 replies
  • 0 kudos

Default Code generated by Bladebridge converter

Hello all, 1. What is the default code generated by the Bladebridge converter? For example: when we migrate Teradata or Oracle to Databricks using Bladebridge, what is the default code base? 2. If the generated code is PySpark, do I have any control over the generate...

Latest Reply
RiyazAliM
Honored Contributor
  • 0 kudos

Hello @Vasu_Kumar_T - We've used Bladebridge to convert from Redshift to Databricks. Bladebridge can definitely convert to Spark SQL; not sure about Scala Spark, though.

2 More Replies
AsgerLarsen
by New Contributor III
  • 2296 Views
  • 7 replies
  • 0 kudos

Using yml variables as table owner through SQL

I'm trying to change the ownership of a table in the Unity Catalog created through a SQL script. I want to do this through code. I'm using a standard Databricks bundle setup, which uses three workspaces: dev, test and prod. I have created a variable in ...
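One way to wire a bundle variable into the ownership change is to build the `ALTER TABLE ... OWNER TO` statement in code and execute it with `spark.sql`. A sketch (the table and owner names are hypothetical; in a bundle the owner would come from something like `${var.table_owner}`):

```python
def set_owner_sql(table: str, owner: str) -> str:
    """Build an ALTER TABLE ... OWNER TO statement for a Unity Catalog table.
    `owner` is a principal (user or group); backticks allow special characters."""
    return f"ALTER TABLE {table} OWNER TO `{owner}`"

# Hypothetical names; in the job you would run spark.sql(set_owner_sql(...))
stmt = set_owner_sql("dev_catalog.finance.transactions", "data-engineers")
```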

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

I guess that is a safe bet. Good luck!

6 More Replies
Aatma
by New Contributor
  • 5523 Views
  • 3 replies
  • 1 kudos

Resolved! DABs require library dependencies from GitHub private repository.

Developing a Python wheel file using DABs which requires library dependencies from a GitHub private repository. Please help me understand how to set up the git user and token in the resource.yml file and how to authenticate the GitHub package. pip install...

Latest Reply
sandy311
New Contributor III
  • 1 kudos

Could you please give a detailed example? How to define env variables? BUNDLE_VAR?
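On the env-variable question: Databricks Asset Bundles can read a variable's value from an environment variable named `BUNDLE_VAR_<variable_name>`, which overrides the default declared in `databricks.yml`. A sketch of that resolution convention (the `git_token` variable name and value are made up for illustration):

```python
import os
from typing import Optional

def resolve_bundle_var(name: str, default: Optional[str] = None) -> str:
    """Mimic the BUNDLE_VAR_<name> convention used by Databricks Asset Bundles:
    an environment variable overrides the default declared in databricks.yml."""
    value = os.environ.get(f"BUNDLE_VAR_{name}", default)
    if value is None:
        raise KeyError(f"Bundle variable '{name}' is not set")
    return value

# e.g. export BUNDLE_VAR_git_token=... before `databricks bundle deploy`
os.environ["BUNDLE_VAR_git_token"] = "example-token"  # hypothetical value
token = resolve_bundle_var("git_token")
```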

2 More Replies
minhhung0507
by Valued Contributor
  • 895 Views
  • 1 reply
  • 0 kudos

Handling Hanging Pipelines in Real-Time Environments: Leveraging Databricks’ Idle Event Monitoring

Hi everyone, I'm running multiple real-time pipelines on Databricks using a single job that submits them via a thread pool. While most pipelines run smoothly, I've noticed that a few of them occasionally get "stuck" or hang for several hours w...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

May I ask why you use thread pools? With jobs you can define multiple tasks which do the same. I'm asking because thread pools and Spark resource management can interfere with each other.

rodrigocms
by New Contributor
  • 3541 Views
  • 1 reply
  • 0 kudos

Get information from Power BI via XMLA

Hello everyone, I am trying to get information from Power BI semantic models via the XMLA endpoint using PySpark in Databricks. Can someone help me with that? Thanks

Latest Reply
CacheMeOutside
New Contributor II
  • 0 kudos

I would like to see this too. 

PunithRaj
by New Contributor
  • 7101 Views
  • 2 replies
  • 2 kudos

How to read a PDF file from Azure Datalake blob storage to Databricks

I have a scenario where I need to read a PDF file from "Azure Datalake blob storage to Databricks", where the connection is done through AD access. Generating the SAS token has been restricted in our environment due to security issues. The below script ca...

Latest Reply
Mykola_Melnyk
New Contributor III
  • 2 kudos

@PunithRaj You can try to use PDF DataSource for Apache Spark to read PDF files directly into a DataFrame. You will get the extracted text and the rendered page as an image in the output. More details here: https://stabrise.com/spark-pdf/ df = spark.read.forma...

1 More Replies
Kamal2
by Databricks Partner
  • 27956 Views
  • 5 replies
  • 7 kudos

Resolved! PDF Parsing in Notebook

I have PDF files stored in Azure ADLS. I want to parse the PDF files into PySpark DataFrames. How can I do that?

Latest Reply
Mykola_Melnyk
New Contributor III
  • 7 kudos

PDF Data Source now works on Databricks. Instructions with an example: https://stabrise.com/blog/spark-pdf-on-databricks/

4 More Replies
isaac_gritz
by Databricks Employee
  • 29453 Views
  • 7 replies
  • 7 kudos

Local Development on Databricks

How to Develop Locally on Databricks with your Favorite IDE: dbx is a Databricks Labs project that allows you to develop code locally and then submit against Databricks interactive and job compute clusters from your favorite local IDE (AWS | Azure | GC...

Latest Reply
kmodelew
New Contributor III
  • 7 kudos

Hi, you can use any existing IDE. I'm using PyCharm. I have created my own utils to run code on Databricks. In a .env file I have environment variables, and using the SDK I create a SparkSession object and a WorkspaceClient that you can use to read cre...

6 More Replies
ADuma
by New Contributor III
  • 4915 Views
  • 2 replies
  • 0 kudos

Job sometimes failing due to library installation error of Pypi library

I am running a job on a cluster from a compute pool that installs a package from our Azure Artifacts feed. My task is supposed to run a wheel task from our library, which has about a dozen dependencies. For more than 95% of the runs this job works...

Latest Reply
ADuma
New Contributor III
  • 0 kudos

Hi Brahma, thanks a lot for the help. I'm trying to install my libraries with an init script right now. Unfortunately the error does not occur very regularly, so I'll have to observe for a few days. I'm not 100% happy with the solution though. We are...
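Since the failure is intermittent, a retry wrapper around the install step in the init script can paper over transient feed errors. A sketch, with a dummy flaky function standing in for the actual install call (a real version would wrap a subprocess `pip install --index-url <feed-url> <package>`; the feed URL is hypothetical):

```python
import time

def retry(fn, attempts=3, delay_s=0.1):
    """Call fn until it succeeds or attempts are exhausted;
    re-raise the last error if every attempt fails."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay_s)
    raise last_exc

# Dummy flaky install: fails twice, then succeeds
calls = {"n": 0}
def flaky_install():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient feed error")
    return "installed"

result = retry(flaky_install)
```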

1 More Replies
Dnirmania
by Contributor
  • 3602 Views
  • 4 replies
  • 0 kudos

Read file from AWS S3 using Azure Databricks

Hi Team, I am currently working on a project to read CSV files from an AWS S3 bucket using an Azure Databricks notebook. My ultimate goal is to set up Auto Loader in Azure Databricks so that it reads new files from S3 and loads the data incrementally. Howe...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 0 kudos

No, it is very easy. Follow this guide and it will work: https://github.com/aviral-bhardwaj/MyPoCs/blob/main/SparkPOC/ETLProjectsAWS-S3toDatabricks.ipynb
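For cross-cloud access like this, the usual pattern is to set the S3A credentials on the Spark configuration. A sketch of the configuration entries involved (the key values here are placeholders; in practice you would pull them from a secret scope with `dbutils.secrets.get` rather than hard-code them):

```python
def s3_conf(access_key: str, secret_key: str) -> dict:
    """Hadoop configuration entries that let Spark's s3a:// connector
    authenticate to AWS with static keys from an Azure Databricks cluster."""
    return {
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        "fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider",
    }

# In a notebook (hypothetical secret scope and keys):
# for k, v in s3_conf(dbutils.secrets.get("aws", "access-key"),
#                     dbutils.secrets.get("aws", "secret-key")).items():
#     spark.conf.set(k, v)
conf = s3_conf("AKIA_PLACEHOLDER", "SECRET_PLACEHOLDER")
```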

3 More Replies
William_Scardua
by Valued Contributor
  • 12666 Views
  • 4 replies
  • 1 kudos

How to read data from Azure Log Analytics?

Hi guys, I need to read data from an Azure Log Analytics workspace directly. Any ideas? Thank you

Latest Reply
alexott
Databricks Employee
  • 1 kudos

You can use the Kusto Spark connector for that: https://github.com/Azure/azure-kusto-spark/blob/master/docs/KustoSource.md#source-read-command It heavily depends on how you access the data; you may need an ADX cluster for it: https://learn.mi...

3 More Replies
KristiLogos
by Contributor
  • 2933 Views
  • 2 replies
  • 0 kudos

Resolved! GCS Error getting access token from metadata server at: http://169.254.169.254/computeMetadata/v1/in

I’m running Databricks on Azure and trying to read a CSV file from Google Cloud Storage (GCS) bucket using Spark. However, despite configuring Spark with a Google service account key, I’m encountering the following error:Error getting access token fr...

Latest Reply
ShivangiB
New Contributor III
  • 0 kudos

Hey @KristiLogos, can you please tell us in what format the key was stored in gsa_private_key? We are using a Key Vault-backed secret scope.

1 More Replies