Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

by KrzysztofPrzyso, New Contributor III
  • 1631 Views
  • 3 replies
  • 1 kudos

databricks-connect, dbutils, abfss path, URISyntaxException

When trying to use `dbutils.fs.cp` in the #databricks-connect context to upload files to Azure Data Lake Gen2, I get a malformed URI error. I have used the code provided here: https://learn.microsoft.com/en-gb/azure/databricks/dev-tool...

Labels: Data Engineering, abfss, databricks-connect
Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @KrzysztofPrzyso, It appears that you’re encountering an issue with relative paths in absolute URIs when using dbutils.fs.cp in the context of Databricks Connect to upload files to Azure Data Lake Gen2. Let’s break down the problem and explore po...
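To make the "relative path in absolute URI" failure concrete, here is a small sketch of building a fully qualified ADLS Gen2 URI before handing it to `dbutils.fs.cp`. The container, account, and file names are hypothetical placeholders; only the `abfss://<container>@<account>.dfs.core.windows.net/<path>` shape is the point.

```python
from urllib.parse import urlparse

def abfss_uri(container: str, account: str, path: str) -> str:
    """Build a fully qualified ADLS Gen2 URI; a relative destination path
    is a typical cause of URISyntaxException in dbutils.fs.cp."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path.lstrip('/')}"

dest = abfss_uri("mycontainer", "mystorageacct", "landing/data.csv")

# Inside a Databricks Connect session this would then be (not run here):
# dbutils.fs.cp("file:/local/path/data.csv", dest)

parsed = urlparse(dest)
assert parsed.scheme == "abfss" and parsed.path.startswith("/")
```

The `urlparse` check at the end is a cheap way to catch a malformed URI locally before the copy call fails remotely.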

2 More Replies
by 564824, New Contributor II
  • 3671 Views
  • 6 replies
  • 0 kudos

Resolved! Why is Photon increasing DBU used per hour?

I noticed that enabling Photon acceleration increases the number of DBUs used per hour, which in turn increases our cost. In light of this, I am interested in gaining clarity on the costing of Photon acceleration, as I was led to believe that Pho...

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

Well, that depends on what kinds of tests you do. In data warehousing there are different kinds of loads. What have you tested, data transformations or analytical queries? Because for the latter, Databricks SQL is a better choice than a common Spark ...

5 More Replies
by high-energy, New Contributor III
  • 460 Views
  • 1 reply
  • 0 kudos

Resolved! Accessing a series in a DataFrame

Frequently I see this syntax to access a series in DBX: `df['column_name']`. However, I get this as my output from that: `Column<'derived_value'>`. What's the correct way to access a series?

Latest Reply
high-energy
New Contributor III
  • 0 kudos

I realized I was looking at the wrong DataFrame type. I needed a pandas DataFrame, not a Databricks DataFrame.
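The distinction behind this resolution can be shown in a short sketch (assumes pandas is installed; column name taken from the question): in PySpark, `df["col"]` is a lazy Column expression, while a pandas DataFrame returns an actual Series of values.

```python
import pandas as pd

pdf = pd.DataFrame({"derived_value": [1, 2, 3]})

series = pdf["derived_value"]        # pandas: a real Series holding the values
assert isinstance(series, pd.Series)
assert series.tolist() == [1, 2, 3]

# With a Spark DataFrame, df["derived_value"] is only a Column expression;
# to get the values as a Series, convert first:
# spark_df.toPandas()["derived_value"]
```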

by akshayauser, New Contributor
  • 490 Views
  • 2 replies
  • 1 kudos

Create a table name without back tick when using set variable

When I tried to create a table name with a variable like this:
-- Set a string variable
SET table_suffix = 'suffix';
-- Use dynamic SQL to create a table with the variable as a suffix in the table name
CREATE TABLE IF NOT EXISTS <dbname>.my_table_${table_su...

Latest Reply
brockb
Valued Contributor
  • 1 kudos

Hi, it's possible that the `identifier` clause is what you're looking for (https://docs.databricks.com/en/sql/language-manual/sql-ref-names-identifier-clause.html#identifier-clause). If so, this basic example should work: DECLARE mytab = '`mycatalog`....
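Because the IDENTIFIER clause only runs inside Databricks SQL, it can't be demonstrated standalone; the sketch below validates the suffix in plain Python before splicing it into a statement you would pass to `spark.sql`. The database and table names are hypothetical, and the regex guard is an added precaution, not part of the original reply.

```python
import re

def safe_suffix(suffix: str) -> str:
    # Allow only identifier-safe characters, so the generated
    # table name needs no backticks and cannot smuggle in SQL
    if not re.fullmatch(r"[A-Za-z0-9_]+", suffix):
        raise ValueError(f"unsafe table suffix: {suffix!r}")
    return suffix

suffix = safe_suffix("suffix")
stmt = (
    "CREATE TABLE IF NOT EXISTS "
    f"IDENTIFIER('mydb.my_table_' || '{suffix}') (id INT)"
)
# Inside Databricks: spark.sql(stmt)
```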

1 More Replies
by nehaa, New Contributor II
  • 374 Views
  • 1 reply
  • 0 kudos

Filter in DBX dashboards

How do I add a column from Table1 as a filter to Table2 (also called an on-click action filter) in Databricks Dashboards? Both tables get their data through a SQL query.

Latest Reply
Walter_C
Honored Contributor
  • 0 kudos

To add a column from Table1 as a filter to Table2 in Databricks Dashboards, you can use the dashboard parameters feature. Here are the steps: Create a visualization for each of your SQL queries. You can do this by clicking the '+' next to the Resul...

by high-energy, New Contributor III
  • 970 Views
  • 3 replies
  • 2 kudos

Resolved! Union and Column data types

I have three data frames that I create in Python. I want to write all three of these to the same Delta table. In code I bring the three of them together using the union operation. When I do this, the data in the columns is no longer aligned correctly. I...

Latest Reply
high-energy
New Contributor III
  • 2 kudos

Aligning the data types and column order across all three data frames before attempting to union them together solved the problem. The below snippet highlights what was happening: data = [[2021, "test", "Albany", "M", 42]] df1 = spark.createDataFrame...
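The alignment fix can be sketched without Spark. This pure-Python sketch projects each row onto one agreed column order before concatenating, which is analogous to calling `df.select(*cols)` on each Spark DataFrame (or using `unionByName`) before `union`; the column names and rows are illustrative.

```python
COLS = ["year", "name", "city", "sex", "count"]

def align(rows, columns=COLS):
    """Project every row dict onto the same column order before unioning."""
    return [tuple(row[c] for c in columns) for row in rows]

df1_rows = [{"year": 2021, "name": "test", "city": "Albany", "sex": "M", "count": 42}]
# Same fields, different key order - the situation that misaligned the union
df2_rows = [{"count": 7, "city": "Troy", "sex": "F", "name": "demo", "year": 2020}]

combined = align(df1_rows) + align(df2_rows)   # columns now line up positionally
assert combined[1] == (2020, "demo", "Troy", "F", 7)
```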

2 More Replies
by semsim, Contributor
  • 917 Views
  • 4 replies
  • 0 kudos

Resolved! Installing LibreOffice on Databricks

Hi, I need to install LibreOffice to do a document conversion from .docx to .pdf. The requirement is no use of containers. Any idea on how I should go about this? Environment: Databricks 13.3 LTS. Thanks, Sem

Latest Reply
Yeshwanth
Honored Contributor
  • 0 kudos

Hi @semsim Good day! I just wanted to check if you have tried the following commands already.
%sh
sudo apt-get install -y libreoffice
sudo apt-get install -y unoconv
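Assuming the `apt-get` install above succeeds, the conversion itself can be driven from a notebook with headless LibreOffice. This is a sketch only: the file paths are placeholders and the `subprocess.run` call is commented out since `soffice` may not be on the driver until the install has run.

```python
import subprocess

def soffice_convert_cmd(docx_path: str, out_dir: str) -> list[str]:
    # Headless LibreOffice conversion; assumes `soffice` is on the driver's PATH
    return ["soffice", "--headless", "--convert-to", "pdf",
            "--outdir", out_dir, docx_path]

cmd = soffice_convert_cmd("/dbfs/tmp/report.docx", "/dbfs/tmp")
# subprocess.run(cmd, check=True)   # run on the driver after the %sh install
```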

3 More Replies
by nistrate, New Contributor III
  • 6406 Views
  • 2 replies
  • 5 kudos

Resolved! Restricting Workflow Creation and Implementing Approval Mechanism in Databricks

Hello Databricks Community, I am seeking assistance in understanding the possibility and procedure of implementing a workflow restriction mechanism in Databricks. Our aim is to promote better workflow management and ensure the quality of the notebooks ...

Latest Reply
Avvar2022
Contributor
  • 5 kudos

I believe this has to happen in two steps. Step 1: Currently, admins can't restrict workflow creation in Databricks; any user with workspace access can create workflows. Admins should be able to restrict workflow creation. Databricks doesn't have...

1 More Replies
by CaptainJack, New Contributor II
  • 212 Views
  • 1 reply
  • 0 kudos

Giving a coworker "run" permission on a workflow without allowing him access to notebooks

I noticed that there is a can_manage_run permission at the workflow level, and someone can run a workflow with only this permission (without needing can_run permission at the notebook level). The problem is that the coworker can go to the run details and then click on ta...

Latest Reply
Ravivarma
New Contributor III
  • 0 kudos

Hello @CaptainJack , In Databricks, the can_manage_run permission lets a user manage workflow executions but does not hide the code in the tasks. If someone has this permission, they can see the details and code of the workflow runs. At...

by AhsanKhawaja, New Contributor
  • 3459 Views
  • 4 replies
  • 0 kudos

using databricks sql warehouse as web app backend

Hi, I wanted to ask if anyone is using Databricks SQL Warehouse as the backend for a small- to large-scale web application? What are your thoughts about it, and especially, what does the Databricks team think of it? Kind regards, A

Latest Reply
Robert-Scott
New Contributor II
  • 0 kudos

Using Databricks SQL Warehouse as a backend for a web application involves integrating Databricks with your web app to handle data processing, querying, and analytics. Here are the steps to achieve this: 1. Set up a Databricks SQL Warehouse: create a Data...
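One common integration path from application code, sketched here with the `databricks-sql-connector` package (`pip install databricks-sql-connector`): the hostname, HTTP path, and token below are placeholders, not real credentials, and the import is deferred so the sketch loads without the package installed.

```python
def fetch_rows(server_hostname: str, http_path: str, access_token: str, query: str):
    """Run a query against a Databricks SQL Warehouse and return all rows.

    Sketch only: requires the databricks-sql-connector package and real
    workspace credentials; every connection value here is a placeholder.
    """
    from databricks import sql  # deferred so this file imports without the package

    with sql.connect(server_hostname=server_hostname,
                     http_path=http_path,
                     access_token=access_token) as conn:
        with conn.cursor() as cur:
            cur.execute(query)
            return cur.fetchall()

# Example call shape (placeholders):
# rows = fetch_rows("adb-123.azuredatabricks.net",
#                   "/sql/1.0/warehouses/abc123", "dapi...",
#                   "SELECT * FROM mydb.events LIMIT 10")
```

For a web backend you would typically keep a connection pool and parameterize queries rather than interpolating user input.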

3 More Replies
by yj940525, New Contributor II
  • 332 Views
  • 0 replies
  • 0 kudos

question of changing cluster key in liquid cluster

If I already have cluster key1 for an existing table and I want to change the cluster key to key2 using ALTER TABLE table CLUSTER BY (key2) and then run OPTIMIZE table: based on the Databricks documentation, existing files will not be rewritten (verified by my test as w...

by yatharth, New Contributor III
  • 4113 Views
  • 1 reply
  • 1 kudos

AWS CLI Commands

I wish to run an AWS CLI command in Databricks. Is there a way I can achieve this? To be more specific, I would like to run: aws cloudwatch get-metric-statistics --metric-name BucketSizeBytes --namespace AWS/S3 --start-time 2017-03-06T00:00:00Z --end-...

Latest Reply
Yeshwanth
Honored Contributor
  • 1 kudos

@yatharth please check this: https://docs.databricks.com/en/compute/access-mode-limitations.html#network-and-file-system-access-limitations-for-unity-catalog-shared-access-mode:~:text=Cannot%20connect%20to%20the%20instance%20metadata%20service%20(IMD...
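Where the IMDS limitation linked above blocks the CLI, a common alternative is boto3 with explicit credentials. This sketch mirrors the CLI flags from the question as `get_metric_statistics` keyword arguments; the bucket name, period, statistic, and storage-type dimension are illustrative assumptions (the original command was truncated), and the boto3 call itself is commented out since it needs real AWS credentials.

```python
from datetime import datetime

def bucket_size_params(bucket: str, start: datetime, end: datetime) -> dict:
    """Mirror the question's CLI flags as boto3 get_metric_statistics kwargs."""
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "StartTime": start,
        "EndTime": end,
        "Period": 86400,                 # assumed: one daily datapoint
        "Statistics": ["Average"],       # assumed statistic
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": "StandardStorage"},
        ],
    }

params = bucket_size_params("my-bucket", datetime(2017, 3, 6), datetime(2017, 3, 7))

# With boto3 and explicit credentials (since IMDS is blocked on shared access mode):
# import boto3
# cw = boto3.client("cloudwatch", region_name="us-east-1",
#                   aws_access_key_id="...", aws_secret_access_key="...")
# stats = cw.get_metric_statistics(**params)
```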

