by
KVNARK
• Honored Contributor II
- 945 Views
- 3 replies
- 11 kudos
Is there any limit on the number of SQL queries that can be run in a Databricks SQL workspace?
Latest Reply
The documented default is 1000, though I have never verified it.
2 More Replies
- 885 Views
- 2 replies
- 9 kudos
Hi all, I want to integrate Kafka with Databricks. If anyone can share any doc or code, it will help me a lot. Thanks in advance.
Latest Reply
This is the code that I am using to read from Kafka:
inputDF = (spark
.readStream
.format("kafka")
.option("kafka.bootstrap.servers", host)
.option("kafka.ssl.endpoint.identification.algorithm", "https")
.option("kafka.sasl.mechanism", "PLAIN")
.option("ka...
1 More Replies
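For anyone landing on this thread, here is a minimal sketch of the same idea in Python. It is not the poster's full code (that snippet is truncated above): the helper names and the `events` topic are hypothetical, and the security options from the reply are left out. It only assumes the standard Spark Kafka source options `kafka.bootstrap.servers`, `subscribe`, and `startingOffsets`.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Build the option map for Spark's Kafka source (hypothetical helper)."""
    if not bootstrap_servers or not topic:
        raise ValueError("bootstrap_servers and topic are required")
    return {
        "kafka.bootstrap.servers": bootstrap_servers,  # host:port list
        "subscribe": topic,                            # topic(s) to read
        "startingOffsets": starting_offsets,           # "latest" or "earliest"
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame; `spark` is the notebook's SparkSession."""
    return spark.readStream.format("kafka").options(**options).load()


# Assumed values for illustration only.
opts = kafka_source_options("broker-1:9092", "events", "earliest")
```

In a notebook you would then call `read_kafka_stream(spark, opts)` and add whatever SASL/SSL options your cluster needs to the dictionary first.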
- 4668 Views
- 8 replies
- 6 kudos
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 458.0 failed 4 times, most recent failure: Lost task 0.3 in stage 458.0 (TID 2247) (172.18.102.75 executor 1): com.databricks.sql.io.FileReadException: Error while rea...
Latest Reply
Hi @Rupesh gupta, hope you are well. Just wanted to see if you were able to find an answer to your question, and whether you would like to mark an answer as best? It would be really helpful for the other members too. Cheers!
7 More Replies
- 2399 Views
- 2 replies
- 4 kudos
We are trying to connect to an Azure SQL Server from Azure Databricks using JDBC, but have faced issues because our firewall blocks everything. We decided to whitelist IPs from the SQL Server side and add a public subnet to make the connection work. ...
Latest Reply
Using subnets for Databricks connectivity is the correct thing to do. This way you ensure the resources (clusters) can connect to the SQL Database. We also recommend using NPIP (No Public IPs) so that there won't be any public IP associated with the...
1 More Replies
- 5105 Views
- 4 replies
- 2 kudos
Hello, we have a business request to compare the evolution of a certain Delta table. We would like to compare the latest version of the table with the previous one using Delta time travel. The main issue we are facing is to retrieve programmatically us...
Latest Reply
In the docs it says, "Neither timestamp_expression nor version can be subqueries." So it does sound challenging. I also tried playing with widgets to see if the version could be populated using SQL, but didn't succeed. With Python it's really easy to do.
3 More Replies
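A rough sketch of the Python route mentioned in the reply, assuming the goal is to diff the latest table version against the one before it (the question is truncated, so that is a guess). The function names and the `events` table are hypothetical; the sketch relies only on `DESCRIBE HISTORY` returning a `version` column and on the `VERSION AS OF` syntax.

```python
def latest_two_versions(history_rows):
    """Pick (latest, previous) from rows that carry a 'version' field."""
    versions = sorted({int(row["version"]) for row in history_rows}, reverse=True)
    if len(versions) < 2:
        raise ValueError("table needs at least two versions to compare")
    return versions[0], versions[1]


def version_query(table_name, version):
    """Build a time-travel query; the version is inlined as a literal."""
    return f"SELECT * FROM {table_name} VERSION AS OF {int(version)}"


def compare_latest_with_previous(spark, table_name):
    """Return (rows added, rows removed) between the last two versions."""
    history = spark.sql(f"DESCRIBE HISTORY {table_name}").select("version").collect()
    latest, previous = latest_two_versions([row.asDict() for row in history])
    current_df = spark.sql(version_query(table_name, latest))
    previous_df = spark.sql(version_query(table_name, previous))
    return current_df.exceptAll(previous_df), previous_df.exceptAll(current_df)


# Illustration with made-up history rows.
example_pair = latest_two_versions([{"version": 0}, {"version": 2}, {"version": 1}])
```

Because the version number is resolved in Python and pasted into the SQL text, this sidesteps the "no subqueries" restriction quoted above. Note that `exceptAll` assumes both versions share the same schema.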
- 1038 Views
- 3 replies
- 4 kudos
I have a notebook that sets up parameters for the run based on some job parameters set by the user as well as the current date of the run. I want to supersede some of this logic and just use the manual values if kicked off manually. Is there a way to...
Latest Reply
You can create a widget with dbutils.widgets.text("widgetName", ""). To get the value of that widget, use dbutils.widgets.get("widgetName"). So by using this you can manually create widgets (variables) and can run the process by giving desired valu...
2 More Replies
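One possible way to combine the widget calls above with the "use the manual value if given" logic from the question. This is only a sketch: `resolve_parameter`, `run_date_for_notebook`, and the `run_date` widget name are made up for illustration, and it assumes an empty widget means "no manual override".

```python
from datetime import date


def resolve_parameter(manual_value, computed_value):
    """Prefer a non-empty manual value; otherwise fall back to the computed one."""
    if manual_value is not None and manual_value.strip():
        return manual_value.strip()
    return computed_value


def run_date_for_notebook(dbutils, today=None):
    """Read the hypothetical 'run_date' widget and apply the fallback."""
    dbutils.widgets.text("run_date", "")      # empty default = no override
    manual = dbutils.widgets.get("run_date")
    computed = (today or date.today()).isoformat()
    return resolve_parameter(manual, computed)


# Illustration without a notebook: blank input falls back, real input wins.
fallback = resolve_parameter("", "2023-01-01")
override = resolve_parameter(" 2022-12-31 ", "2023-01-01")
```

Job parameters reach the notebook through the same widgets, so a job that sets `run_date` would also count as an override here.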
- 715 Views
- 1 replies
- 0 kudos
In a Scala notebook, how do I read input arguments (e.g. those provided by a job that runs a Scala notebook)? In Python, dbutils.notebook.entry_point.getCurrentBindings() works. How about for Scala?
Latest Reply
Hi @Robert Russell, you can use dbutils.notebook.getContext.currentRunId in Scala notebooks. Other methods are also available, like dbutils.notebook.getContext.jobGroup, dbutils.notebook.getContext.rootRunId, dbutils.notebook.getContext.tags, etc... You ...
by
Snuki
• New Contributor II
- 327 Views
- 0 replies
- 0 kudos
Hi folks, could you please tell me why my points are not reflected in the reward store? It is showing 0.
- 2001 Views
- 2 replies
- 0 kudos
I want to read a password-protected Excel file and load the data into a Delta table. Can you please let me know how this can be achieved in Databricks?
Latest Reply
df = spark.read.format("com.crealytics.spark.excel") \
    .option("dataAddress", "'Base'!A1") \
    .option("header", "true") \
    .option("workbookPassword", "test") \
    .load("test.xlsx")
display(df)
1 More Replies
by
fury88
• New Contributor II
- 945 Views
- 1 replies
- 1 kudos
I'm trying to cache data/queries that we normally have as temporary views that get replaced when the code is run based on dynamic python. What I'd like to know is will CACHE TABLE get overwritten each time you run it? Is it smart enough to recognize ...
Latest Reply
Hi @Matt Fury, yes... I guess the cache is overwritten each time you run it, because for me it took nearly the same amount of time for 1 million records to be cached. However, you can check whether the table is cached or not using the .storageLevel method. E.g. I have...
- 2585 Views
- 4 replies
- 4 kudos
My Azure Databricks workspace's default DNS is 168.63.129.16. This DNS doesn't seem to resolve Azure storage accounts that were created a year ago; after tweaking the cluster to use 8.8.8.8, I am able to resolve the desired storage accounts. Is there a d...
Latest Reply
IP address 168.63.129.16 is a virtual public IP address that is used to facilitate a communication channel to Azure platform resources. Customers can define any address space for their private virtual network in Azure. Therefore, the Azure platform...
3 More Replies
by
200723
• New Contributor II
- 1338 Views
- 4 replies
- 4 kudos
My Mongo Atlas connection URL is like mongodb+srv://<srv_hostname>. I don't want to use a direct URL like mongodb://<hostname1, hostname2, hostname3....> because our Mongo Atlas global clusters have many hosts. It would be hard to maintain. Our Java programs...
Latest Reply
Hi @Raymond Lai, the issue looks to be in the MongoDB connector. The connection is created and maintained by the mongo-spark connector. You can try using the direct MongoDB hosts in the connection string instead of SRV to avoid doing DNS lookups or...
3 More Replies
by
Dicer
• Valued Contributor
- 4812 Views
- 5 replies
- 7 kudos
I only have 1000 columns. Each column has 252 rows, so there are only 252,000 data points. How can it take 7 hours to route tasks for the best cached locality?
Latest Reply
Hi @Cheuk Hin Christophe Poon, have you optimized your table at any time since its creation? If not, then OPTIMIZE may take some time depending on the number of underlying files. Please try to run OPTIMIZE manually as described in the document below: https://docs....
4 More Replies