Data Engineering
Forum Posts

vanepet
by New Contributor II
  • 9560 Views
  • 5 replies
  • 2 kudos

Is it possible to use multiprocessing or threads to submit multiple queries to a database from Databricks in parallel?

We are trying to improve our overall runtime by running queries in parallel using either multiprocessing or threads. What I am seeing, though, is that when the function that runs this code is run on a separate process, it doesn't return a DataFrame with...
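
A minimal sketch of the thread-based approach the question describes, assuming the work is I/O-bound (waiting on the database). Threads share the driver process, whereas a separate process does not share the notebook's Spark session, which may explain the missing DataFrame. `run_query` and `run_queries_in_parallel` are hypothetical names; in a notebook, `run_query` might wrap `spark.sql(...)` or a JDBC read.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_query(query):
    # Hypothetical stand-in for the real call (for example spark.sql(query)).
    # It only echoes the query so the sketch runs anywhere.
    return f"result of: {query}"


def run_queries_in_parallel(queries, max_workers=4):
    """Submit every query to a thread pool and collect results keyed by query."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to the query that produced it.
        futures = {pool.submit(run_query, q): q for q in queries}
        for future in as_completed(futures):
            # result() re-raises any exception thrown inside the worker thread.
            results[futures[future]] = future.result()
    return results


queries = ["SELECT 1", "SELECT 2", "SELECT 3"]
results = run_queries_in_parallel(queries)
```

Results are keyed by query because `as_completed` yields futures in completion order, not submission order.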

Latest Reply
BapsDBS
New Contributor II
  • 2 kudos

Thanks for the links mentioned above. But both of them use raw Python to achieve parallelism. Does this mean Spark (read: PySpark) doesn't exactly have provisions for parallel execution of functions or even notebooks? We used a wrapper notebook with ThreadP...

4 More Replies
VVM
by New Contributor III
  • 7933 Views
  • 13 replies
  • 3 kudos

Resolved! Databricks SQL - Unable to Escape Dollar Sign ($) in Column Name

It seems that due to how Databricks processes SQL cells, it's impossible to escape the $ when it comes to a column name. I would expect the following to work: %sql SELECT 'hi' `$id` The backticks ought to escape everything. And indeed that's exactly wha...

Latest Reply
Casper-Bang
New Contributor II
  • 3 kudos

What is the status of this bug report? It's been over a year now.

12 More Replies
noimeta
by Contributor II
  • 2336 Views
  • 7 replies
  • 4 kudos

Resolved! Databricks SQL: catalog of each query

Currently, we are migrating from Hive metastore to UC. We have several dashboards and a huge number of queries whose catalogs have been set to hive_metastore and that use the <db>.<table> access pattern. I'm just wondering if there's a way to switch catalogs...

Latest Reply
abdulrahim
New Contributor II
  • 4 kudos

Absolutely accurate. In order to grow your business, you need to create an image of your brand such that it is the first thing that comes to customers' minds when they think about a certain product or service; that's where social media marketing agencies com...

6 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 675 Views
  • 3 replies
  • 7 kudos

docs.databricks.com

Rename and drop columns with Delta Lake column mapping. Hi all, Databricks now supports column rename and drop. Column mapping requires the following Delta protocols: reader version 2 or above and writer version 5 or above. Blog URL ## Available in D...

Latest Reply
Poovarasan
New Contributor II
  • 7 kudos

The above-mentioned feature does not work in a DLT pipeline if the script has more than 4 columns.

2 More Replies
DataGirl
by New Contributor
  • 3405 Views
  • 5 replies
  • 2 kudos

Multi value parameter on Power BI Paginated / SSRS connected to databricks using ODBC

Hi all, I'm wondering if anyone has had any luck setting up multi-valued parameters on SSRS using an ODBC connection to Databricks? I'm getting a "Cannot add multi value query parameter" error every time I change my parameter to multi-value. In the query s...

Latest Reply
TechMG
New Contributor II
  • 2 kudos

Hello, I am facing a similar issue. I am working on a Power BI paginated report, and Databricks is the source for the report. I was trying to pass the parameter by passing the query in the expression builder as mentioned above. However, I have ended up w...

4 More Replies
Anonymous
by Not applicable
  • 3224 Views
  • 8 replies
  • 2 kudos
Latest Reply
djhs
New Contributor III
  • 2 kudos

I also tried to leverage this endpoint (inferred from devtools): https://<workspace_id>.cloud.databricks.com/sql/api/dashboards/import with the exported dashboard (the dbdash file) in the request payload. It returns a 200 but nothing happens. Maybe s...

7 More Replies
-werners-
by Esteemed Contributor III
  • 3741 Views
  • 6 replies
  • 12 kudos

Resolved! SSRS (on-prem) on Databricks SQL

Has anybody succeeded in querying Databricks SQL with an on-prem SSRS (so an on-prem Report Server and Report Builder)? I managed to create a connection that works (according to the connection test anyway), but the moment I try to create a dataset on t...

Latest Reply
Haider93
New Contributor III
  • 12 kudos

Hi @-werners-, I am able to build a connection between Microsoft Visual Studio and Databricks using the Simba Spark ODBC driver. I can query Delta tables sitting in Databricks from Microsoft Visual Studio (SSRS). However, when I am deploying the report t...

5 More Replies
mickniz
by Contributor
  • 13803 Views
  • 7 replies
  • 18 kudos

cannot import name 'sql' from 'databricks'

I am working on a Databricks version 10.4 premium cluster, and while importing sql from the databricks module I am getting the error below: cannot import name 'sql' from 'databricks' (/databricks/python/lib/python3.8/site-packages/databricks/__init__.py). Trying...

Latest Reply
wallystart
New Contributor II
  • 18 kudos

I resolved the same error by installing the library from the cluster interface (UI).

6 More Replies
SRK
by Contributor III
  • 2998 Views
  • 6 replies
  • 5 kudos

Resolved! How to deploy Databricks SQL queries and SQL Alerts from lower environment to higher environment?

We are using Databricks SQL Alerts to handle one scenario. We have written the queries for it and created the SQL Alert. However, I was looking for the best way to deploy it to higher environments like Pre-Production and Production. I ...

Latest Reply
valeryuaba
New Contributor III
  • 5 kudos

Thanks!

5 More Replies
alexisjohnson
by New Contributor III
  • 4412 Views
  • 7 replies
  • 6 kudos

Resolved! Window function using last/last_value with PARTITION BY/ORDER BY has unexpected results

Hi, I'm wondering if this is the expected behavior when using last or last_value in a window function? I've written a query like this: select col1, col2, last_value(col2) over (partition by col1 order by col2) as column2_last from values ...

Latest Reply
Carv
Visitor II
  • 6 kudos

For those stumbling across this: it seems LAST_VALUE emulates the same functionality as it does in SQL Server, which does not, in most people's minds, have a proper row/range frame for the window. You can adjust it with the below syntax. I understand l...
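
The syntax this reply refers to is cut off in the excerpt. As an illustration only, the sketch below shows one standard way to widen the window frame. It assumes SQLite (via Python's `sqlite3`, version 3.25 or later for window functions) as a stand-in engine rather than Databricks SQL, and the table `t` and its rows are made up. With an ORDER BY and no explicit frame, the window ends at the current row, so `last_value` returns the current row's own value; an explicit frame reaching `UNBOUNDED FOLLOWING` returns the partition's last value.

```python
import sqlite3

# In-memory stand-in table; the names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("b", 10), ("b", 20)],
)

# Default frame: ends at the current row, so last_value echoes col2.
default_frame = conn.execute(
    """
    SELECT col1, col2,
           last_value(col2) OVER (PARTITION BY col1 ORDER BY col2) AS column2_last
    FROM t
    ORDER BY col1, col2
    """
).fetchall()

# Explicit frame: spans the whole partition, so last_value is the partition maximum here.
full_frame = conn.execute(
    """
    SELECT col1, col2,
           last_value(col2) OVER (
               PARTITION BY col1 ORDER BY col2
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS column2_last
    FROM t
    ORDER BY col1, col2
    """
).fetchall()

print(default_frame)
print(full_frame)
```

The same `ROWS BETWEEN ... UNBOUNDED FOLLOWING` clause is standard SQL and should carry over to Spark SQL, though that was not run here.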

6 More Replies
DJey
by New Contributor III
  • 3009 Views
  • 4 replies
  • 3 kudos

Databricks CI/CD Azure DevOps

Hi all. I have a scenario where there are a few .sql scripts in my repo. Is there any way we can execute those SQL scripts on Databricks via an Azure DevOps CI/CD pipeline? Please help.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Divyansh Jain, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers...

3 More Replies
dukebaslangic
by New Contributor II
  • 927 Views
  • 3 replies
  • 3 kudos

Resolved! Databricks performance related documentation/books

Hi, do you know any good resources about Databricks performance improvements (like improving query performance, monitoring and resolving performance bottlenecks, etc.)? Thanks

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Ömer Özsakarya, we haven't heard from you since the last response from @Lakshay Goel, and I was checking back to see if her suggestions helped you. Or else, if you have any solution, please share it with the community, as it can be helpful to ...

2 More Replies
HariharaSam
by Contributor
  • 14094 Views
  • 10 replies
  • 4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables, A and B:
qry = """INSERT INTO Table A Select * from Table B where Id is null"""
spark.sql(qry)
I need to get the number of records inserted after running this in Databricks.

Latest Reply
GRCL
New Contributor III
  • 4 kudos

Almost the same advice as Hubert: I use the history of the Delta table: df_history.select(F.col('operationMetrics')).collect()[0].operationMetrics['numOutputRows']. You can also find other 'operationMetrics' values, like 'numTargetRowsDeleted'.
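
A small sketch of how the value read from the history might be handled, assuming the metrics map holds string values (as Delta's history output does, to my knowledge), so the count needs an explicit cast. `rows_written` and `sample_metrics` are hypothetical; in a notebook the map would come from the first history row, for example via the expression in the reply above.

```python
def rows_written(operation_metrics):
    """Return numOutputRows as an int, or None if the metric is absent."""
    value = operation_metrics.get("numOutputRows")
    return int(value) if value is not None else None


# Made-up stand-in for the operationMetrics map of the latest history entry.
sample_metrics = {"numFiles": "1", "numOutputRows": "42", "numOutputBytes": "1234"}

inserted = rows_written(sample_metrics)
missing = rows_written({"numTargetRowsDeleted": "3"})
print(inserted, missing)
```

Returning None for a missing key keeps the helper usable for operations that do not report `numOutputRows`.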

9 More Replies
Merchiv
by New Contributor III
  • 5574 Views
  • 8 replies
  • 2 kudos

Resolved! AnalysisException when running SQL queries

When running some SQL queries using spark.sql(...), we sometimes get a variant of the following error: AnalysisException: Undefined function: current_timestamp. This function is neither a built-in/temporary function, nor a persistent function that is ...

Latest Reply
ashish1
New Contributor III
  • 2 kudos

This is most likely a conflict in the lib code, you can uninstall some libs on your cluster and try to narrow it down to the problematic one.

7 More Replies
ros
by New Contributor III
  • 1476 Views
  • 2 replies
  • 3 kudos

Apache Hudi Table creation using hudi maven library

I installed the Hudi Maven library org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.0 on Databricks Runtime 12.2 LTS (includes Apache Spark 3.3.2, Scala 2.12) with the Spark config: spark.sql.catalog.spark_catalog org.apache.spark.sql.hudi.catalog.HoodieCat...

Latest Reply
ros
New Contributor III
  • 3 kudos

@Shanmugavel Chandrakasu %sql create table hudi_cow_pt_tbl (id bigint, name string, ts bigint, dt string, hh string) using hudi tblproperties (type = 'cow', primaryKey = 'id', preCombineField = 'ts') partitioned by (dt, hh) location '/mnt/data/h...

1 More Replies