- 4 Views
- 0 replies
- 0 kudos
Hi, I implemented a job that should incrementally read all the available data from a Kinesis Data Stream and terminate afterwards. I schedule the job daily. The data retention period of the data stream is 7 days, i.e., there should be enough time to ...
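A "read everything available, then terminate" Kinesis job on Databricks is usually expressed with `Trigger.AvailableNow`. A minimal sketch, assuming placeholder names throughout (stream name, region, checkpoint path, and target table are not from the post), and noting that the `kinesis` source is Databricks-specific and needs a Databricks cluster:

```python
# Sketch of a daily "drain the stream and stop" job.
# Placeholders: stream name, region, checkpoint path, target table.
df = (
    spark.readStream
    .format("kinesis")                        # Databricks-provided Kinesis source
    .option("streamName", "my-stream")
    .option("region", "eu-west-1")
    .option("initialPosition", "trim_horizon")
    .load()
)

(
    df.writeStream
    .option("checkpointLocation", "/Volumes/main/default/chk/kinesis_daily")
    .trigger(availableNow=True)               # process what is available, then stop
    .toTable("main.default.kinesis_bronze")
)
```

With a 7-day retention period and a daily schedule, the checkpoint is what lets each run resume from where the previous one stopped.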
- 2583 Views
- 3 replies
- 0 kudos
Hello, recently I've tried to upgrade my runtime env to 13.3 LTS ML and found that it breaks my workload during applyInPandas. My job started to hang during the applyInPandas execution. A thread dump shows that it hangs on direct memory allocation: ...
Latest Reply
Having a near-identical issue just materializing a DataFrame with `.toPandas()`: an operation that now (on 14.3) takes 5 minutes used to take ~30s on 10.4.
2 More Replies
by RobinK • New Contributor III
- 547 Views
- 10 replies
- 7 kudos
Hello, since last night none of our ETL jobs in Databricks have been running, although we have not made any code changes. The identical jobs (deployed with Databricks asset bundles) run on an all-purpose cluster, but fail on a job cluster. We have no...
Latest Reply
Hello, we are also experiencing the same error message: [NOT_COLUMN] Argument `col` should be a Column, got Column. This occurs when a workflow is run as a task from another workflow, but not when said workflow is run on its own, that is, not triggered by...
9 More Replies
- 48 Views
- 0 replies
- 0 kudos
I'm using the docs here: https://docs.databricks.com/en/data-sharing/read-data-open.html#store-creds
However, I am unable to read the stored file, which is successfully created with the following code: %scala
dbutils.fs.put("dbfs:/FileStore/extraction/con...
by bampo • New Contributor
- 38 Views
- 0 replies
- 0 kudos
Each merge/update on a table with liquid clustering forces the streaming query to read the whole table. Databricks Runtime: 14.3 LTS. Below I prepared a simple script to reproduce the issue. Create the schema: %sql
CREATE SCHEMA IF NOT EXISTS test; Create table with si...
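One knob that can help in exactly this situation (it also appears in another thread on this page) is `skipChangeCommits`, which makes the stream ignore commits that only rewrite existing rows, such as merges and updates. A hedged sketch, with the table name as a placeholder:

```python
# Sketch: skip update/merge commits so the stream only processes appended rows.
# Caveat: updated or deleted rows are then NOT propagated downstream.
df = (
    spark.readStream
    .format("delta")
    .option("skipChangeCommits", "true")
    .table("test.my_clustered_table")   # placeholder table name
)
```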
- 59 Views
- 0 replies
- 0 kudos
Hi, using the Databricks CLI, I exported the jobs in JSON format from a workspace in Azure. When I use the same JSON to create a new job in a workspace on AWS, the error below occurs. To create a job via the Databricks CLI on AWS, do you need to change ...
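A common cause when porting job JSON between clouds is that the cluster spec contains cloud-specific fields: `node_type_id` uses Azure VM sizes (e.g. `Standard_DS3_v2`) that do not exist on AWS, and `azure_attributes` is invalid there. A sketch of the rewrite, where the node-type mapping and the AWS attribute values are assumptions to adjust for your workspace:

```python
import json

# Sketch: rewrite cloud-specific fields when porting a job spec from Azure to AWS.
NODE_TYPE_MAP = {"Standard_DS3_v2": "i3.xlarge"}   # assumed mapping

def port_job_to_aws(job_json: str) -> str:
    job = json.loads(job_json)
    for cluster in job.get("job_clusters", []):
        spec = cluster.get("new_cluster", {})
        if "node_type_id" in spec:
            spec["node_type_id"] = NODE_TYPE_MAP.get(spec["node_type_id"], "i3.xlarge")
        spec.pop("azure_attributes", None)          # Azure-only block is invalid on AWS
        spec.setdefault("aws_attributes", {"availability": "SPOT_WITH_FALLBACK"})
    return json.dumps(job)

azure_job = json.dumps({
    "name": "daily-etl",
    "job_clusters": [{
        "job_cluster_key": "main",
        "new_cluster": {
            "spark_version": "14.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "azure_attributes": {"availability": "ON_DEMAND_AZURE"},
            "num_workers": 2,
        },
    }],
})
aws_spec = json.loads(port_job_to_aws(azure_job))["job_clusters"][0]["new_cluster"]
```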
by Dicer • Valued Contributor
- 56 Views
- 0 replies
- 0 kudos
I am using the distributed pandas on Spark, not single-node pandas. But when I try to run the following code to transform a data frame with 652 x 729803 data points, df_ps_pct = df.pandas_api().pct_change().to_spark(), it returns this error: ...
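As a sanity check of the transformation itself, plain pandas `pct_change` (whose semantics the pandas-on-Spark API mirrors) behaves like this on a tiny frame; the numbers are illustrative only, not the poster's data:

```python
import pandas as pd

# pct_change: (current - previous) / previous, NaN for the first row.
df = pd.DataFrame({"a": [100.0, 110.0, 99.0]})
pct = df["a"].pct_change()

assert pd.isna(pct.iloc[0])                  # no previous value
assert abs(pct.iloc[1] - 0.10) < 1e-9        # (110 - 100) / 100
assert abs(pct.iloc[2] + 0.10) < 1e-9        # (99 - 110) / 110
```

On the distributed side the same call has to order and shift the whole frame, which is far heavier than the local case, especially with ~730k columns.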
- 55 Views
- 0 replies
- 0 kudos
We've been running Delta Live Tables for some time with Unity Catalog, and it's as slow as a sloth on a Hawaiian vacation. Anyway, DLT had three consecutive failures (due to the data source being unreliable) and then the logs printed: "MaxRetryThreshol...
- 113 Views
- 1 replies
- 0 kudos
We have structured streaming that reads from an external Delta table, defined in the following way: try:
    df_silver = (
        spark.readStream
        .format("delta")
        .option("skipChangeCommits", True)
        .table(src_location)
    ...
Latest Reply
Hi, I see you are using `Trigger.AvailableNow`. Is this intended to be a continuous stream or an incremental batch trigger at an interval with Databricks Workflows? From the docs (https://docs.databricks.com/en/structured-streaming/triggers.html#config...
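The distinction the reply is drawing can be sketched as follows; `df` and the checkpoint path are placeholders, and both forms need a Spark/Databricks environment:

```python
# Incremental batch: process everything available, then stop
# (a good fit for a job scheduled with Databricks Workflows).
(df.writeStream
   .option("checkpointLocation", "/tmp/chk/events")   # placeholder path
   .trigger(availableNow=True)
   .toTable("bronze.events"))

# Continuous: run micro-batches every 30 seconds until the query is stopped.
(df.writeStream
   .option("checkpointLocation", "/tmp/chk/events")
   .trigger(processingTime="30 seconds")
   .toTable("bronze.events"))
```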
- 260 Views
- 1 replies
- 0 kudos
Recent changes to the workspace UI (and the introduction of Unity Catalog) seem to have discreetly sunset the ability to upload data directly to DBFS from the local filesystem using the UI (NOT the CLI). I want to be able to load a raw file (no matter the ...
Latest Reply
Have you tried using Volumes? https://docs.databricks.com/en/connect/unity-catalog/volumes.html You can do it through the UI, on the Catalog Explorer > Add Data button. Also, you could double check if your workspace admin has disabled DBFS access, b...
by Erik • Valued Contributor II
- 3263 Views
- 8 replies
- 10 kudos
Databricks Connect is a program which allows you to run Spark code locally, while the actual execution happens on a Spark cluster. Notably, it allows you to debug and step through the code locally in your own IDE. Quite useful. But it is now being...
Latest Reply
Thank you all for the interesting and useful information
7 More Replies
by TWib • New Contributor III
- 118 Views
- 0 replies
- 1 kudos
This code fails with the exception: [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
File <command-4420517954891674>, line 7
  4 spark = DatabricksSession.builder.getOrCreate()
  6 df = spark.read.table("samples.nyctaxi.trips")
---->...
by mjar • New Contributor
- 111 Views
- 0 replies
- 0 kudos
Recently we ran into an issue using foreachBatch after upgrading our Databricks cluster on Azure to runtime version 14 with Spark 3.5, with Shared access mode and Unity Catalog. The issue manifested as a ModuleNotFoundError being throw...
- 97 Views
- 1 replies
- 0 kudos
Hi All, introduction: I am trying to register my model on Databricks so that I can serve it as an endpoint. The packages that I need are "torch", "mlflow", "torchvision", "numpy" and "git+https://github.com/facebookresearch/detectron2.git". For this, ...
Latest Reply
Found an answer! Basically, pip somehow installed the dependencies from the git repo first and did not follow the given order, so to solve this I added the libraries for conda to install:```
conda_env = {
"channels": [
"defa...
- 10459 Views
- 4 replies
- 6 kudos
I try to create a table but I get this error: AnalysisException: Cannot create table ('`spark_catalog`.`default`.`citation_all_tenants`'). The associated location ('dbfs:/user/hive/warehouse/citation_all_tenants') is not empty but it's not a Delta t...
Latest Reply
Hi Team, I am facing the same issue. When we try to load data into a table in a production batch, we get an error that the table is not in Delta format. There is no recent change to the table, and we are not trying any create or replace table. This is an existing table in pr...
3 More Replies