- 2570 Views
- 3 replies
- 0 kudos
Hello, recently I tried to upgrade my runtime environment to 13.3 LTS ML and found that it breaks my workload: my job started to hang during `applyInPandas` execution. A thread dump shows that it hangs on direct memory allocation: ...
Latest Reply
Having a near-identical issue just materializing a DataFrame with `.toPandas()`: an operation that now (on 14.3) takes 5 minutes used to take ~30s on 10.4.
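For context, a minimal sketch of the `applyInPandas` pattern this thread concerns (the column names and the `center` function are hypothetical; running the Spark part requires a cluster, while the per-group pandas function works standalone):

```python
import pandas as pd

def center(pdf: pd.DataFrame) -> pd.DataFrame:
    # Called once per group, with that group's rows as a pandas DataFrame;
    # here it subtracts the group mean from each value.
    pdf["v"] = pdf["v"] - pdf["v"].mean()
    return pdf

def centered(df):
    # df is a Spark DataFrame with columns `id` and `v` (hypothetical names).
    return df.groupBy("id").applyInPandas(center, schema="id long, v double")
```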
by RobinK • New Contributor III
- 527 Views
- 10 replies
- 7 kudos
Hello, since last night none of our ETL jobs in Databricks are running anymore, although we have not made any code changes. The identical jobs (deployed with Databricks asset bundles) run on an all-purpose cluster, but fail on a job cluster. We have no...
Latest Reply
Hello, we are also experiencing the same error message: [NOT_COLUMN] Argument `col` should be a Column, got Column. This occurs when a workflow is run as a task from another workflow, but not when said workflow is run on its own, that is, not triggered by...
- 46 Views
- 0 replies
- 0 kudos
I'm using the docs here: https://docs.databricks.com/en/data-sharing/read-data-open.html#store-creds However, I am unable to read the stored file, which is successfully created with the following code: %scala
dbutils.fs.put("dbfs:/FileStore/extraction/con...
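The docs page referenced stores an open-sharing credential file, which is a small JSON profile. A rough sketch of the round trip with plain file I/O (the field names follow the Delta Sharing open protocol; the endpoint and token are placeholders, and on Databricks the write would go through `dbutils.fs.put` as in the post):

```python
import json, os, tempfile

profile = {
    "shareCredentialsVersion": 1,                      # protocol field
    "endpoint": "https://example.com/delta-sharing/",  # placeholder
    "bearerToken": "<redacted>",                       # placeholder
}

# Write the profile, then read it back to verify it parses as JSON.
path = os.path.join(tempfile.mkdtemp(), "config.share")
with open(path, "w") as f:
    json.dump(profile, f)

with open(path) as f:
    loaded = json.load(f)
```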
- 34 Views
- 0 replies
- 0 kudos
Each merge/update on a table with liquid clustering forces the stream to read the whole table. Databricks Runtime: 14.3 LTS. Below I prepare a simple script to reproduce the issue. Create the schema: %sql
CREATE SCHEMA IF NOT EXISTS test; Create a table with si...
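The reproduction script is truncated above; as a hedged sketch of the setup it describes (schema, table, and column names hypothetical), liquid clustering is declared with CLUSTER BY rather than PARTITIONED BY:

```python
def create_table_ddl(schema: str, table: str, cluster_col: str) -> str:
    # Liquid clustering replaces hive-style partitioning with CLUSTER BY.
    return (
        f"CREATE TABLE IF NOT EXISTS {schema}.{table} "
        f"(id BIGINT, value STRING) CLUSTER BY ({cluster_col})"
    )

# On a cluster: spark.sql(create_table_ddl("test", "events", "id"))
```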
- 59 Views
- 0 replies
- 0 kudos
Hi, using the Databricks CLI, I exported the jobs in JSON format from a workspace on Azure. Using the same JSON to create a new job in a workspace on AWS, the error below occurs. To create a job via the Databricks CLI on AWS, do you need to change ...
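One likely culprit (an assumption, not confirmed by the thread): `databricks jobs get` wraps the job spec in read-only metadata, and cloud-specific settings such as node types differ between Azure and AWS. A sketch of trimming the export down to a creatable payload (field names taken from the Jobs API get response; `run_as_user_name` may or may not be present):

```python
import json

def to_create_payload(exported: dict) -> dict:
    # The get response wraps the job spec in metadata; only the `settings`
    # object is valid input for `databricks jobs create`.
    payload = dict(exported.get("settings", exported))
    # Drop read-only metadata if present; cloud-specific bits such as
    # Azure node types or instance pool ids must still be remapped for AWS.
    for key in ("job_id", "created_time", "creator_user_name", "run_as_user_name"):
        payload.pop(key, None)
    return payload

exported = {"job_id": 123, "created_time": 1, "settings": {"name": "etl"}}
# json.dumps(to_create_payload(exported)) -> feed to `databricks jobs create --json`
```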
by Dicer • Valued Contributor
- 54 Views
- 0 replies
- 0 kudos
I am using distributed pandas on Spark, not single-node pandas. But when I try to run the following code to transform a data frame with 652 x 729803 data points, `df_ps_pct = df.pandas_api().pct_change().to_spark()`, it returns this error: ...
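For reference, `pct_change` on single-node pandas computes `(x[i] - x[i-1]) / x[i-1]` with a NaN in the first position; the pandas-on-Spark version follows the same semantics, but must impose an ordering across the distributed frame:

```python
import pandas as pd

s = pd.Series([100.0, 102.0, 99.96])
# (102 - 100) / 100 = 0.02 and (99.96 - 102) / 102 = -0.02; the first
# element has no predecessor and becomes NaN.
changes = s.pct_change()
```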
- 51 Views
- 0 replies
- 0 kudos
We've been running Delta Live Tables for some time with Unity Catalog and it's as slow as a sloth on a Hawaiian vacation. Anyway, DLT had three consecutive failures (due to the data source being unreliable) and then the logs printed: "MaxRetryThreshol...
- 111 Views
- 1 replies
- 0 kudos
We have structured streaming that reads from an external Delta table, defined in the following way:
try:
    df_silver = (
        spark.readStream
        .format("delta")
        .option("skipChangeCommits", True)
        .table(src_location)
...
Latest Reply
Hi, I see you are using `Trigger.AvailableNow`. Is this intended to be a continuous stream, or an incremental batch triggered at an interval with Databricks Workflows? From the docs (https://docs.databricks.com/en/structured-streaming/triggers.html#config...
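The distinction the reply draws can be sketched as follows (function and argument names hypothetical; actually starting the stream requires a cluster): `trigger(availableNow=True)` processes everything currently available and then stops, which suits a scheduled Workflows job, whereas omitting the trigger gives a continuously running stream.

```python
def start_incremental(df, checkpoint: str, target: str):
    # Incremental-batch style: drain all available input, then stop.
    return (
        df.writeStream
        .option("checkpointLocation", checkpoint)
        .trigger(availableNow=True)
        .toTable(target)
    )
```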
- 198 Views
- 1 replies
- 0 kudos
Recent changes to the workspace UI (and the introduction of Unity Catalog) seem to have quietly sunset the ability to upload data directly to DBFS from the local filesystem using the UI (NOT the CLI). I want to be able to load a raw file (no matter the ...
Latest Reply
Have you tried using Volumes? https://docs.databricks.com/en/connect/unity-catalog/volumes.html You can do it through the UI, via Catalog Explorer > Add Data. Also, you could double-check whether your workspace admin has disabled DBFS access, b...
by Erik • Valued Contributor II
- 3202 Views
- 8 replies
- 10 kudos
Databricks Connect is a program that allows you to run Spark code locally, while the actual execution happens on a Spark cluster. Notably, it allows you to debug and step through the code locally in your own IDE. Quite useful. But it is now being...
Latest Reply
Thank you all for the interesting and useful information
by TWib • New Contributor III
- 116 Views
- 0 replies
- 1 kudos
This code fails with the exception [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
File <command-4420517954891674>, line 7
      4 spark = DatabricksSession.builder.getOrCreate()
      6 df = spark.read.table("samples.nyctaxi.trips")
---->...
by mjar • New Contributor
- 111 Views
- 0 replies
- 0 kudos
Recently we ran into an issue using foreachBatch after upgrading our Databricks cluster on Azure to runtime version 14 (Spark 3.5) with shared access mode and Unity Catalog. The issue manifested as a ModuleNotFoundError being throw...
- 96 Views
- 1 replies
- 0 kudos
Hi all. Introduction: I am trying to register my model on Databricks so that I can serve it as an endpoint. The packages that I need are "torch", "mlflow", "torchvision", "numpy" and "git+https://github.com/facebookresearch/detectron2.git". For this, ...
Latest Reply
Found an answer! Basically, pip somehow installed the dependencies from the git repo first and was not following the given order, so to solve this I added the libraries for conda to install:
```
conda_env = {
    "channels": [
        "defa...
```
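A sketch of the fix described, using the packages from the question (channel names and Python version are illustrative): moving torch and torchvision into conda's own dependency list guarantees they are installed before the pip step runs, so the git build of detectron2 finds torch already present.

```python
conda_env = {
    "channels": ["defaults", "pytorch"],
    "dependencies": [
        "python=3.10",
        # Installed by conda before any pip requirement is processed.
        "pytorch",
        "torchvision",
        "numpy",
        "pip",
        {"pip": [
            "mlflow",
            "git+https://github.com/facebookresearch/detectron2.git",
        ]},
    ],
}
```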
- 10455 Views
- 4 replies
- 6 kudos
I am trying to create a table but I get this error: AnalysisException: Cannot create table ('`spark_catalog`.`default`.`citation_all_tenants`'). The associated location ('dbfs:/user/hive/warehouse/citation_all_tenants') is not empty but it's not a Delta t...
Latest Reply
Hi team, I am facing the same issue. When we try to load data into a table in a production batch, we get an error that the table is not in Delta format. There has been no recent change to the table, and we are not running any CREATE OR REPLACE TABLE. This is an existing table in pr...
by WWoman • New Contributor III
- 261 Views
- 1 replies
- 0 kudos
Hello, I am looking for a way to persist query-history data. I do not have direct access to the system tables, but I do have access to a query_history view created by selecting from the system.query.history and system.access.audit system tables. I want ...
Latest Reply
Hi @WWoman, you can set up a simple append-only DLT pipeline, or just schedule a job that runs an insert-only MERGE statement. In either case you will get a table that accumulates the history of additions to your view over time, and in both cases it w...
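The scheduled-job variant of the reply might look like this (table, view, and column names hypothetical; check the actual columns of your query_history view before using a timestamp filter like this):

```python
def snapshot_query_history_sql(source_view: str, target_table: str) -> str:
    # Append only rows newer than what is already captured, so a scheduled
    # run accumulates history beyond the system table's retention window.
    return (
        f"INSERT INTO {target_table} "
        f"SELECT * FROM {source_view} s "
        f"WHERE s.end_time > (SELECT coalesce(max(end_time), timestamp'1970-01-01') "
        f"FROM {target_table})"
    )

# On a cluster: spark.sql(snapshot_query_history_sql("query_history", "main.audit.query_history_copy"))
```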