I need help with migrating from DBFS on Databricks to workspace files. I am new to Databricks and am struggling with what is on the links provided. My workspace.yml also has DBFS hard-coded. Included is a full deployment with Great Expectations. This was don...
#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get ...
Error code: 404 - {'error_code': 'FEATURE_DISABLED', 'message': 'FEATURE_DISABLED: Tool Calling is not enabled for this workspace'}
I am receiving this error when trying to complete a chat using tools. How do I enable this feature? I am posting for the first time. ...
Hi, in the training Data Analysis with Databricks SQL, ID: E-089Z3V, a work instruction is mentioned with links to download sample .csv files. Where can I find these instructions?
Hi @Agnieszka_1987, thank you for sharing your concern on Community! To expedite your request, please list your concerns on our ticketing portal. Our support staff will be able to act faster on the resolution (our standard resolution time is 24-48...
I need to connect to a server to retrieve some files using Spark and a private SSH key. However, to manage the private key safely I need to store it as a secret in Azure Key Vault, which means I don't have the key as a file to pass down in the keyFil...
Hi @orangepepino, instead of specifying the keyFilePath, you can pass the private key as a PEM string directly. This approach avoids the need for a physical key file. Since you're already using Azure Key Vault, consider storing the private key as a s...
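To make this concrete, here is a minimal sketch. The secret scope, key name, host, and username are hypothetical placeholders, and paramiko is just one SSH library that accepts in-memory key material; adjust to your own setup:

```python
def pem_from_secret(secret_value: str) -> str:
    """Key Vault secrets often store the PEM with literal backslash-n
    sequences; restore real newlines so SSH libraries can parse the key."""
    return secret_value.replace("\\n", "\n")

# In a Databricks notebook (scope/key names are assumptions):
#   raw = dbutils.secrets.get(scope="kv-scope", key="sftp-private-key")
#   pem = pem_from_secret(raw)
#
# Then pass the key material directly instead of keyFilePath, e.g.:
#   import io, paramiko
#   pkey = paramiko.RSAKey.from_private_key(io.StringIO(pem))
#   client = paramiko.SSHClient()
#   client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
#   client.connect("sftp.example.com", username="user", pkey=pkey)
```

The helper matters because a key stored as a single-line secret will otherwise fail to parse with an "invalid key" style error.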
I have used cluster termination logic to terminate a cluster. The issue is that the cluster is not terminating gracefully and returns an exit code of 1. The cluster completes all the Spark jobs, but it then stays in a long-running state, hence I create...
Hi @Harsh-dataB, first, review your cluster termination logic. Make sure it accounts for all necessary cleanup tasks and allows sufficient time for Spark jobs to complete. If you're using custom scripts or logic, ensure that it gracefully handles a...
import os
import pandas as pd
from pyspark.sql.types import StringType, IntegerType
from pyspark.sql.functions import col

save_path = os.path.join(base_path, stg_dir, "testCsvEncoding")
d = [{"code": "00034321"}, {"code": "55964445226"}]
df = pd.Data...
@georgeyjy Try opening the CSV in a text editor. I bet that Excel is automatically trying to detect the schema of the CSV and thus thinks the column is an integer.
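A small pandas-only illustration of the point above, using the codes from the snippet: the CSV itself is fine, and an explicit string dtype on read is what preserves the leading zeros.

```python
import io
import pandas as pd

# Codes stored as strings, as in the snippet above
d = [{"code": "00034321"}, {"code": "55964445226"}]
csv_text = pd.DataFrame(d).to_csv(index=False)

# Without an explicit dtype the reader infers integers (as Excel does)
# and the leading zeros disappear:
df_int = pd.read_csv(io.StringIO(csv_text))
# Forcing a string dtype preserves them:
df_str = pd.read_csv(io.StringIO(csv_text), dtype={"code": str})

print(df_int["code"][0])  # 34321
print(df_str["code"][0])  # 00034321
```

The same applies to Spark: read the column as StringType rather than letting schema inference coerce it to an integer.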
Having issues with the PySpark DataFrames returned by delta.DeltaTable.toDF(), in what I believe is specific to shared access clusters on DBR 14.3. I recently created a near-identical workflow, with the only major difference being that one of the source ...
That works; as mentioned, it is easy to work around. So does replacing it with df = spark.table("test") followed by df.select(df.col).
Dear all, I am following the guide in this article: https://docs.databricks.com/en/notebooks/testing.html. However, I am unable to run pytest due to the following error: ImportError while importing test module '/Workspace/Users/deadmanhide@gmail.com/test...
Hi @StephanKnox, ensure that your directory structure is set up correctly. Based on your description, it should look something like this:

Workspace/
├── run_tests.py
├── test_trans.py
└── transform/
    ├── operations.py
    └── __init__.py

In both ...
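Following the pattern in the Databricks testing guide, run_tests.py can invoke pytest programmatically. A self-contained sketch is below; it generates a throwaway test file so it runs anywhere, whereas in the workspace you would point pytest at test_trans.py instead:

```python
import sys
import pytest

# On Databricks, avoid writing .pyc files into the Workspace filesystem
sys.dont_write_bytecode = True

# Stand-in for test_trans.py so this sketch is runnable on its own
with open("test_sample.py", "w") as f:
    f.write("def test_addition():\n    assert 1 + 1 == 2\n")

retcode = pytest.main(["-q", "test_sample.py"])
assert retcode == 0, "Some tests failed or could not be imported"
```

If pytest reports an ImportError while collecting, it is usually the missing __init__.py or a wrong working directory rather than the tests themselves.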
Hello everyone, in my project I am using Databricks Auto Loader to incrementally and efficiently process new data files as they arrive in cloud storage. I am using file notification mode with an Event Grid and Queue Service setup in an Azure storage account...
Hi @Sambit_S, cloudFiles.maxFilesPerTrigger: this option specifies the maximum number of files processed in each micro-batch. By default, it is set to 1000. When you set it to 50000, you expect it to trigger more files per batch, but you're observing ...
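For reference, a configuration sketch of the options involved (it needs a Databricks/Spark session to run, and the path, format, and values here are placeholders, not values from the thread):

```python
# Configuration sketch only -- runs inside a Databricks notebook.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")  # file notification mode
    .option("cloudFiles.maxFilesPerTrigger", 50000)  # per micro-batch file cap
    .option("cloudFiles.maxBytesPerTrigger", "10g")  # byte cap; whichever limit is hit first wins
    .load("abfss://container@account.dfs.core.windows.net/landing/")
)
```

Note that when both caps are set, a batch stops at whichever limit is reached first, which can make batches smaller than the file cap suggests.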
Hi Team, currently I am trying to find the size of all tables in my Azure Databricks workspace, as I am trying to get an idea of current data loading trends so I can forecast data growth (i.e., in the last 2 months approx. 100 GB of data came in, so in the next 2-3 months there ...
Hi @Devsql, for Delta tables, you can use Apache Spark™ SQL commands. To determine the size of non-Delta tables, calculate the total sum of the individual files within the underlying directory. Alternatively, you can use queryExecution.analyzed.stats...
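One way to sketch this for Delta tables: DESCRIBE DETAIL exposes a sizeInBytes column per table. The schema name below is a placeholder and the Spark part is commented out since it needs a workspace; the formatting helper is plain Python:

```python
# In a Databricks notebook (schema name is an assumption):
#   sizes = {}
#   for row in spark.sql("SHOW TABLES IN my_schema").collect():
#       detail = spark.sql(f"DESCRIBE DETAIL my_schema.{row.tableName}").first()
#       sizes[row.tableName] = detail.sizeInBytes

def human_readable(num_bytes: float) -> str:
    """Format a byte count for trend reports."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} PB"

print(human_readable(5 * 1024**3))  # 5.0 GB
```

Summing sizeInBytes across tables on a schedule gives the month-over-month growth numbers needed for the forecast.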
Hello, today I use Azure Databricks and I want to migrate my workspaces to AWS Databricks. What is the best practice, and which path should I follow? I didn't find anything in the documentation. Thanks.
Hi @thiagoawstest, for detailed guidance, consider reading Databricks' blog series on deploying Databricks on AWS. Additionally, explore the official Databricks documentation on migrating data applications to Databricks. Please let me know if you n...
Hello everyone, in my Databricks Partner Academy account there is no course material, while it should be under the lesson video. How can I resolve this problem? Does anyone else face the same problem? I have submitted a ticket to ask the Databricks team bu...
Hi, @Monsem. I'm sorry about the issue with your Databricks Partner Academy account. Since you've already submitted a ticket without a response, please follow up on your ticket or provide the ticket number. If anyone else has faced this issue and has...
Hi Team, recently we created a new Databricks project/solution (based on the Medallion architecture) with Bronze, Silver, and Gold layer tables. We have created a Delta Live Tables pipeline for the Bronze layer implementation. Source files are Parqu...
Hello @Devsql , It appears that you are creating DLT bronze tables using a standard spark.read operation. This may explain why the DLT table doesn't include "new files" during a REFRESH operation. For incremental ingestion of bronze layer data into y...
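A minimal sketch of an incremental bronze table using Auto Loader inside DLT, per the reply above. The table name, path, and landing location are placeholders, and this only runs inside a DLT pipeline, so it is shown as a configuration fragment:

```python
import dlt

@dlt.table(name="bronze_events", comment="Incremental bronze ingest via Auto Loader")
def bronze_events():
    # spark.readStream + cloudFiles makes the table pick up only new
    # Parquet files on each pipeline update, unlike a one-shot spark.read
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load("abfss://container@account.dfs.core.windows.net/bronze-landing/")
    )
```

Swapping the batch spark.read for this streaming read is what makes a REFRESH ingest new files instead of re-reading a fixed snapshot.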
Reading a file like this: Data = spark.sql("SELECT * FROM edge.inv.rm"). Getting this error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 441.0 failed 4 times, most recent failure: Lost task 10.3 in stage 441.0 (TID...
Hi @Madhawa, Ensure that the AWS credentials (access key and secret key) are correctly configured in your Spark application. You can set them using spark.conf.set("spark.hadoop.fs.s3a.access.key", "your_access_key") and spark.conf.set("spark.hadoop....
Manual Approach
We can update a SQL Warehouse manually in Databricks:
1. Click SQL Warehouses in the sidebar.
2. Open Advanced options.
3. Find the Unity Catalog toggle button there.
While updating an existing SQL Warehouse in Azure to enable Unity Catalog using Terraf...
Hello Raphael,Thank you for the update and for looking into the feature request. I appreciate your efforts in following up on this matter.If possible, could you please provide me with any updates or insights you receive from the Terraform team regard...