- 1455 Views
- 4 replies
- 0 kudos
dbt error: Data too long for column at row 1
Hi there! We are experiencing a Databricks error we don’t recognise when running one of our event-based dbt models in dbt Core (version 1.6.18). The dbt model uses the ‘insert_by_period’ materialisation, which is still experimental for version 1...
We are yet to upgrade dbt core to the latest version but will check again once we have done so.
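"Data too long for column" usually means a string value exceeds a column's declared length. As a minimal sketch of how to locate the offending row before the dbt run, assuming the source rows are available as Python dicts (the column names and length limits below are hypothetical, not the actual schema):

```python
# Hypothetical helper: find string values that exceed a column's declared length.
# The column limits here are assumptions for illustration, not the real schema.
def find_oversized(rows, limits):
    """Yield (row_index, column, actual_length) for values longer than the limit."""
    for i, row in enumerate(rows):
        for col, max_len in limits.items():
            value = row.get(col)
            if isinstance(value, str) and len(value) > max_len:
                yield (i, col, len(value))

rows = [
    {"event_type": "click", "payload": "x" * 300},
    {"event_type": "view", "payload": "ok"},
]
limits = {"event_type": 50, "payload": 255}  # assumed VARCHAR limits
violations = list(find_oversized(rows, limits))
# violations -> [(0, "payload", 300)]
```

Running a check like this against the period being inserted can pinpoint which column the error refers to.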
- 3271 Views
- 4 replies
- 2 kudos
Resolved! Unity Catalog Migration: External AWS S3 Location Tables vs. Managed Tables in Databricks!
Hey Databricks enthusiasts! Migrating to Unity Catalog? Understanding the difference between External S3 Location Tables and Managed Tables is crucial for optimizing governance, security, and cost efficiency. External S3 Location Tables: Data remains in ...
Hey! I hope I’m not too late, and I’d like to share my opinion. While it’s true that managed tables offer certain advantages over external tables, you should keep in mind that Databricks services often come with an associated cost, such as Predictiv...
- 982 Views
- 1 replies
- 0 kudos
Terminated cluster on free account
Hi, I mistakenly terminated my cluster. Could you please advise on how I can reactivate the same cluster?
Hi @Lupo123, to reactivate a terminated cluster on a free Databricks account, you will need to create a new cluster. Unfortunately, once a cluster is terminated, it cannot be reactivated.
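Creating the replacement cluster can also be scripted against the Clusters API 2.0. A minimal sketch that only builds the request (the host, token, runtime label, and node type below are placeholders, not values from the thread):

```python
import json
import urllib.request

def build_create_request(host, token, cluster_name):
    """Build (but do not send) a Clusters API 2.0 create request.
    Host, token, runtime, and node type are placeholder assumptions."""
    payload = {
        "cluster_name": cluster_name,
        "spark_version": "15.4.x-scala2.12",  # assumed runtime label
        "node_type_id": "Standard_DS3_v2",    # assumed node type
        "num_workers": 1,
    }
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/create",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
    return req, payload

req, payload = build_create_request(
    "https://example.cloud.databricks.com", "TOKEN", "replacement-cluster"
)
```

Sending the request with `urllib.request.urlopen(req)` (with a real host and token) would create the new cluster.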
- 11508 Views
- 4 replies
- 2 kudos
Gathering Data Off Of A PDF File
Hello everyone, I am developing an application that accepts PDF files and inserts the data into my database. The company that distributes this data to us only offers PDF files, which you can see attached below (I hid personal info for priv...
You can use the PDF Data Source to read data from PDF files. Examples here: https://stabrise.com/blog/spark-pdf-on-databricks/ After that, use the Scale DP library to extract data from the text in a declarative way using an LLM. Here is an example of extraction ...
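As a lighter-weight alternative to the LLM-based approach above: once the page text has been extracted from the PDF, simple documents can often be parsed with regular expressions. A sketch assuming hypothetical invoice fields (the labels and layout are assumptions, not the poster's actual document):

```python
import re

# Hypothetical field patterns for text already extracted from a PDF page;
# the "Invoice #" and "Total" labels are assumptions about the layout.
PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*#?:?\s*(\S+)"),
    "total": re.compile(r"Total:?\s*\$?([\d.,]+)"),
}

def extract_fields(text):
    """Return a dict with the first match for each pattern, or None."""
    out = {}
    for name, pattern in PATTERNS.items():
        m = pattern.search(text)
        out[name] = m.group(1) if m else None
    return out

sample = "Invoice #: INV-1042\nTotal: $1,299.50"
fields = extract_fields(sample)
# fields -> {"invoice_no": "INV-1042", "total": "1,299.50"}
```

Regex extraction is brittle against layout changes, which is why the declarative/LLM route can be worth the extra cost for messier documents.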
- 1889 Views
- 1 replies
- 0 kudos
Speaker diarization on databricks with Nemo throwing error
My compute configuration is 15.4 LTS ML (includes Apache Spark 3.5.0, GPU, Scala 2.12) on a Standard_NC8as_T4_v3 in Azure Databricks.
Hi @Nishat, it looks like there's a problem with GPU compatibility. As mentioned in the error message, FlashAttention only supports Ampere GPUs or newer. According to the following thread, the GPU architecture you've chosen is not supported: RuntimeError: FlashAt...
- 1315 Views
- 1 replies
- 0 kudos
DBT RUN Command not working while invoked using subprocess.run
Hi, I am using the code below to run a dbt model from a notebook, with parameters to pass to the dbt run command (project directory, profile directory, schema name, etc.). The issue is that when I run this code in my local workspace it works fine, but when ...
Hi @dk09, can you share the path of dbt_project_directory? Also, try inputting the folder path manually to debug it; does it still fail?
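One common pitfall with `subprocess.run` in notebooks is quoting: passing the command as an argument list (no shell) sidesteps it. A sketch with hypothetical dbt paths; the demo subprocess invokes the Python interpreter itself, since dbt may not be installed where this runs:

```python
import subprocess
import sys

def build_dbt_command(project_dir, profiles_dir, select=None):
    """Assemble a dbt run command as an argument list (paths are placeholders)."""
    cmd = ["dbt", "run", "--project-dir", project_dir, "--profiles-dir", profiles_dir]
    if select:
        cmd += ["--select", select]
    return cmd

# Hypothetical workspace paths, for illustration only.
cmd = build_dbt_command("/Workspace/Repos/me/my_dbt", "/Workspace/Repos/me/my_dbt", "my_model")

# Demo of running a subprocess and capturing output; uses the Python
# interpreter instead of dbt so the snippet works anywhere.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from subprocess')"],
    capture_output=True, text=True, check=True,
)
```

With `check=True`, a non-zero exit raises `CalledProcessError`, whose `stderr` attribute usually explains why the dbt invocation failed in the job environment.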
- 1817 Views
- 2 replies
- 0 kudos
INSERT OVERWRITE DIRECTORY
I am using this query to create a CSV in a volume named test_volsrr that I created: INSERT OVERWRITE DIRECTORY '/Volumes/DATAMAX_DATABRICKS/staging/test_volsrr' USING CSV OPTIONS ('delimiter' = ',', 'header' = 'true') SELECT * FROM staging.extract1gb DISTR...
The DISTRIBUTE BY COALESCE(1) clause is intended to reduce the number of output files to one. However, this can lead to inefficiencies and large file sizes because it forces all data to be processed by a single task, which can cause memory and perfor...
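One way around forcing everything through a single task is to let Spark write many part files and stitch them together afterwards. A sketch (not from the thread) of merging CSV part files while keeping a single header row; the demo builds its own part files in a temp directory:

```python
import csv
import os
import tempfile

def merge_csv_parts(part_paths, out_path):
    """Concatenate CSV part files into one file, keeping a single header row."""
    header_written = False
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in sorted(part_paths):
            with open(path, newline="") as f:
                reader = csv.reader(f)
                header = next(reader, None)
                if header is None:
                    continue  # skip empty part files
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                writer.writerows(reader)

# Demo: two small part files, merged into one.
tmp = tempfile.mkdtemp()
parts = []
for i, rows in enumerate([[["id", "name"], ["1", "a"]], [["id", "name"], ["2", "b"]]]):
    p = os.path.join(tmp, f"part-{i}.csv")
    with open(p, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    parts.append(p)
merged = os.path.join(tmp, "merged.csv")
merge_csv_parts(parts, merged)
with open(merged) as f:
    merged_rows = [line.strip() for line in f if line.strip()]
# merged_rows -> ["id,name", "1,a", "2,b"]
```

This keeps the write parallel on the cluster and pays the single-file cost only in a cheap post-processing step.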
- 2966 Views
- 2 replies
- 0 kudos
Discrepancy in Performance Reading Delta Tables from S3 in PySpark
Hello Databricks Community, I've encountered a puzzling performance difference while reading Delta tables from S3 using PySpark, particularly when applying filters and projections. I'm seeking insights to better understand this variation. I've attempte...
Use the explain method to analyze the execution plans for both methods and identify any inefficiencies or differences in the plans. You can also review the metrics to understand this further. https://www.databricks.com/discover/pages/optimize-data-wo...
- 2820 Views
- 1 replies
- 0 kudos
Error changing connection information of Databricks data source posted on Tableau server
Hello. There is a Databricks data source published on the Tableau server. When I click the 'Edit Data Source' button in the location where the data source is published, go to the Data Source tab, and change the Databricks connection information (HTTP...
1) I am wondering if there are saved credentials, which could cause the issue. 2) If possible, try using different authentication methods (e.g., a Personal Access Token) to see if the issue persists. This can help identify if the problem is specific to the aut...
- 2303 Views
- 2 replies
- 1 kudos
How to download the results in batches
Hello, how are you? I'm trying to download some of my results on Databricks; the sheet is around 300 MB, and unfortunately Google Sheets cannot open files larger than 100 MB. Is there any chance I could download the results in batches to ...
Hey, thinking of more alternatives to repartition: 1- Use the LIMIT and OFFSET options in your SQL queries to export data in manageable chunks. For example, if you have a table with 100,000 rows and you want to export 10,000 rows at a time, you can us...
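The LIMIT/OFFSET idea above boils down to a pagination loop. A sketch where the SQL query is stood in for by a list slice (the `fetch_page` function is a placeholder; in practice it would execute one query per page, and OFFSET paging assumes a stable ORDER BY):

```python
def fetch_page(rows, limit, offset):
    """Stand-in for a SQL query with LIMIT/OFFSET; here it just slices a list."""
    return rows[offset:offset + limit]

def export_in_batches(rows, batch_size):
    """Yield successive batches until the source is exhausted."""
    offset = 0
    while True:
        page = fetch_page(rows, batch_size, offset)
        if not page:
            break
        yield page
        offset += batch_size

data = list(range(25))
batches = list(export_in_batches(data, 10))
# batches -> three batches of sizes 10, 10, and 5
```

Each yielded batch would then be written to its own file, keeping every download under the 100 MB limit.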
- 3346 Views
- 1 replies
- 0 kudos
React.js and Databricks Apps
Is there documentation and support for React and Databricks Apps? Similar to the diagram below:
Databricks Apps documentation: https://docs.databricks.com/en/dev-tools/databricks-apps/index.html. You can use the https://react.dev/ documentation to leverage React and develop your UI.
- 906 Views
- 1 replies
- 0 kudos
COPY INTO from Volume failure (rabbit hole)
Hey guys, I am stuck on a loading task and simply can't spot what is wrong. The following query fails: COPY INTO `test`.`test_databricks_tokenb3337f88ee667396b15f4e5b2dd5dbb0`.`pipeline_state` FROM '/Volumes/test/test_databricks_tokenb3337f88ee6673...
I see you are reading just 1 file, ensure that there are no zero-byte files in the directory. Zero-byte files can cause schema inference to fail. Double-check that the directory contains valid Parquet files using parquet tools. Sometimes, even if the...
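The zero-byte-file check suggested above is easy to automate. A sketch that scans a directory for empty files; the demo creates its own temp directory with one empty and one non-empty file (the `.parquet` names are illustrative):

```python
import os
import tempfile

def zero_byte_files(directory):
    """Return sorted paths of empty files directly under `directory`."""
    empty = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            empty.append(path)
    return sorted(empty)

# Demo: one empty and one non-empty file in a temp directory.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, "empty.parquet"), "wb").close()
with open(os.path.join(tmp, "data.parquet"), "wb") as f:
    f.write(b"PAR1")
found = zero_byte_files(tmp)
# found -> the path of empty.parquet only
```

On a Volume path, the same walk can be done with `dbutils.fs.ls`, which reports a `size` field per file.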
- 842 Views
- 1 replies
- 0 kudos
How to identify the goal of a specific Spark job?
I'm analyzing the performance of a DBR/Spark request. In this case, the cluster is created using a custom image, and then we run a job on it.I've dived into the "Spark UI" part of the DBR interface, and identified 3 jobs that appear to account for an...
Spark jobs are determined by your Spark code. You can look at the Spark plan to understand what operations each Spark job/stage is executing.
- 2059 Views
- 3 replies
- 1 kudos
Databricks workspace adjust column width
Hi, is it possible to change the column width in the workspace overview? Currently I have a lot of jobs whose names are too wide for the standard overview, so it is not easy to find certain jobs.
Ahh, my mistake! You are right. It can only be done in Workflows.
- 2010 Views
- 2 replies
- 0 kudos
JDBC Invalid SessionHandle with dbSQL Warehouse
Connecting Pentaho Ctools dashboards to Databricks using JDBC to a serverless dbSQL Warehouse works fine on the initial load, but if we leave it idle for a while and come back we get this error: [Databricks][JDBCDriver](500593) Communication l...
I should have mentioned that we're using AuthMech=3 and in the JDBC docs (Databricks JDBC Driver Installation and Configuration Guide) I don't see any relevant timeout settings that would apply in that scenario. Am I missing something?