- 979 Views
- 2 replies
- 0 kudos
INSERT OVERWRITE DIRECTORY
I am using this query to create a csv in a volume named test_volsrr that i createdINSERT OVERWRITE DIRECTORY '/Volumes/DATAMAX_DATABRICKS/staging/test_volsrr'USING CSVOPTIONS ('delimiter' = ',', 'header' = 'true')SELECT * FROM staging.extract1gbDISTR...
- 979 Views
- 2 replies
- 0 kudos
- 0 kudos
The DISTRIBUTE BY COALESCE(1) clause is intended to reduce the number of output files to one. However, this can lead to inefficiencies and large file sizes because it forces all data to be processed by a single task, which can cause memory and perfor...
- 0 kudos
- 2295 Views
- 2 replies
- 0 kudos
Discrepancy in Performance Reading Delta Tables from S3 in PySpark
Hello Databricks Community,I've encountered a puzzling performance difference while reading Delta tables from S3 using PySpark, particularly when applying filters and projections. I'm seeking insights to understand this variation better.I've attempte...
- 2295 Views
- 2 replies
- 0 kudos
- 0 kudos
Use the explain method to analyze the execution plans for both methods and identify any inefficiencies or differences in the plans. You can also review the metrics to understand this further. https://www.databricks.com/discover/pages/optimize-data-wo...
- 0 kudos
- 1428 Views
- 1 replies
- 0 kudos
Error changing connection information of Databricks data source posted on Tableau server
HelloThere is a Databricks data source published on the Tableau server.When I click the 'Edit Data Source' button in the location where the data source is published and go to the Data Source tab, and change the Databricks connection information (HTTP...
- 1428 Views
- 1 replies
- 0 kudos
- 0 kudos
1) I am thinking if there are saved auth, which could cause the issue. 2) If possible, try using different authentication methods (e.g., Personal Access Token) to see if the issue persists. This can help identify if the problem is specific to the aut...
- 0 kudos
- 1062 Views
- 2 replies
- 1 kudos
How to download the results in batches
Hello, how are you?I`m trying to download some of my results on databricks and the sheets is around 300mb, unfortunately my google sheets is not open files that has more then 100mb. Is that any chance that i could download the results in batches to ...
- 1062 Views
- 2 replies
- 1 kudos
- 1 kudos
Hey, Thinking of more alternates to repartition: 1- Use the limit and offset options in your SQL queries to export data in manageable chunks. For example, if you have a table with 100,000 rows and you want to export 10,000 rows at a time, you can us...
- 1 kudos
- 1935 Views
- 1 replies
- 0 kudos
React.js and Databricks Apps
Is there documentation and support for React and Databricks apps. Similar to the diagram below:
- 1935 Views
- 1 replies
- 0 kudos
- 0 kudos
Documentation for https://docs.databricks.com/en/dev-tools/databricks-apps/index.html You can use https://react.dev/ documentation to leverage react, and develop your UI.
- 0 kudos
- 432 Views
- 1 replies
- 0 kudos
COPY INTO from Volume failure (rabbit hole)
hey guys, I am stuck on a loading task, and I simply can't spot what is wrong. The following query fails: COPY INTO `test`.`test_databricks_tokenb3337f88ee667396b15f4e5b2dd5dbb0`.`pipeline_state`FROM '/Volumes/test/test_databricks_tokenb3337f88ee6673...
- 432 Views
- 1 replies
- 0 kudos
- 0 kudos
I see you are reading just 1 file, ensure that there are no zero-byte files in the directory. Zero-byte files can cause schema inference to fail. Double-check that the directory contains valid Parquet files using parquet tools. Sometimes, even if the...
- 0 kudos
- 540 Views
- 1 replies
- 0 kudos
How to identify the goal of a specific Spark job?
I'm analyzing the performance of a DBR/Spark request. In this case, the cluster is created using a custom image, and then we run a job on it.I've dived into the "Spark UI" part of the DBR interface, and identified 3 jobs that appear to account for an...
- 540 Views
- 1 replies
- 0 kudos
- 0 kudos
The spark jobs are decided based on your spark code. You can look at the spark plan to understand what operations each spark job/stage is executing
- 0 kudos
- 1279 Views
- 3 replies
- 1 kudos
Databricks workspace adjust column width
Hi, is it possible to change the column width in the workspace overview? Currently I have a lot of jobs with a name which is too wide for the standard overview and so it not easy to find certain jobs.
- 1279 Views
- 3 replies
- 1 kudos
- 1 kudos
Ahh my mistake! You are right. It can be done only in workflow
- 1 kudos
- 1179 Views
- 2 replies
- 0 kudos
JDBC Invalid SessionHandle with dbSQL Warehouse
Connecting Pentaho Ctools dashboards to Databricks using JDBC to a serverless dbSQL Warehouse, it works fine on the initial load, but then if we leave it idle for awhile and come back we get this error:[Databricks][JDBCDriver](500593) Communication l...
- 1179 Views
- 2 replies
- 0 kudos
- 0 kudos
I should have mentioned that we're using AuthMech=3 and in the JDBC docs (Databricks JDBC Driver Installation and Configuration Guide) I don't see any relevant timeout settings that would apply in that scenario. Am I missing something?
- 0 kudos
- 1086 Views
- 6 replies
- 1 kudos
Unity Catalog for Enterprise level governance
Can we import cataloguing information from other non Databricks workloads into unity catalog? Importing metadata information from Synapse, Redshift, ADF etc. into Unity catalog for end to end lineage and tracking?
- 1086 Views
- 6 replies
- 1 kudos
- 1 kudos
Yes, it is possible, but limited at the moment. This is being implemented and under private preview. There is an API called "Bring-your-own Lineage". You can test it but for that you would need to contact your account team to allow you to use the fea...
- 1 kudos
- 634 Views
- 1 replies
- 0 kudos
Understanding Photon Row Group Skipping
Hey guys!I am using Photon to do a simple point query on a Liquid Clustered table with the purpose of understanding the statistics. I see that a significant number of files have been pruned (`files pruned`: 1104, `files read`:files read).However I am...
- 634 Views
- 1 replies
- 0 kudos
- 0 kudos
Hi @tomvogel01 , "row groups skipped via lazy materialization" refers to the process where certain row groups are not physically read into memory during query execution. This is due to the ability of Photon to perform filtering at the row group level...
- 0 kudos
- 10470 Views
- 2 replies
- 1 kudos
how to use R in databricks
Hello everyone.I am a new user of databricks, they implemented it in the company where I work. I am a business analyst and I know something about R, not much either, when I saw that databricks could use R I was very excited because I thought that the...
- 10470 Views
- 2 replies
- 1 kudos
- 1 kudos
There are some existing posts about using R in databricks:https://docs.gcp.databricks.com/en/sparkr/index.htmlhttps://docs.databricks.com/en/dev-tools/databricks-connect/cluster-config.htmlOnce you have the correct cluster started (this post is about...
- 1 kudos
- 7699 Views
- 8 replies
- 1 kudos
log signature and input data for Spark LinearRegression
I am looking for a way to log my `pyspark.ml.regression.LinearRegression` model with input and signature ata. The usual example that I found around are using sklearn and they can simply do # Log the model with signature and input example signature =...
- 7699 Views
- 8 replies
- 1 kudos
- 1 kudos
I accidentally stumbled upon this ticket when researching on a similar issue. Note that starting from MLflow 2.15.0 it supports VectorUDT. https://mlflow.org/releases/2.15.0
- 1 kudos
- 415 Views
- 0 replies
- 0 kudos
Patient Risk Score based on health history: Unable to create data folder for artifacts in S3 bucket
Hi All,we're using the below git project to build PoC on the concept of "Patient-Level Risk Scoring Based on Condition History": https://github.com/databricks-industry-solutions/hls-patient-riskI was able to import the solution into Databricks and ru...
- 415 Views
- 0 replies
- 0 kudos
- 595 Views
- 1 replies
- 0 kudos
Data Bricks x Power BI Report Server
I connected two .pbix files to the local server. In the first, I used Import connectivity, and in the second, Direct Query connectivity. However, I encountered the following problems: Import connection: The data is viewed successfully, but it is not ...
- 595 Views
- 1 replies
- 0 kudos
- 0 kudos
@Iguinrj11 wrote:I connected two .pbix files to the local server. In the first, I used Import connectivity, and in the second, Direct Query connectivity. However, I encountered the following problems: Import connection: The data is viewed successfull...
- 0 kudos
Join Us as a Local Community Builder!
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now-
.CSV
1 -
Access Data
2 -
Access Databricks
1 -
Access Delta Tables
2 -
Account reset
1 -
ADF Pipeline
1 -
ADLS Gen2 With ABFSS
1 -
Advanced Data Engineering
1 -
AI
1 -
Analytics
1 -
Apache spark
1 -
Apache Spark 3.0
1 -
API Documentation
3 -
Architecture
1 -
asset bundle
1 -
Asset Bundles
2 -
Auto-loader
1 -
Autoloader
4 -
AWS
3 -
AWS security token
1 -
AWSDatabricksCluster
1 -
Azure
5 -
Azure data disk
1 -
Azure databricks
14 -
Azure Databricks SQL
6 -
Azure databricks workspace
1 -
Azure Delta Lake
1 -
Azure Unity Catalog
5 -
Azure-databricks
1 -
AzureDatabricks
1 -
AzureDevopsRepo
1 -
Big Data Solutions
1 -
Billing
1 -
Billing and Cost Management
1 -
Blackduck
1 -
Bronze Layer
1 -
Certification
3 -
Certification Exam
1 -
Certification Voucher
3 -
CICDForDatabricksWorkflows
1 -
Cloud_files_state
1 -
CloudFiles
1 -
Cluster
3 -
Community Edition
3 -
Community Event
1 -
Community Group
1 -
Community Members
1 -
Compute
3 -
Compute Instances
1 -
conditional tasks
1 -
Connection
1 -
Contest
1 -
Credentials
1 -
CustomLibrary
1 -
Data
1 -
Data + AI Summit
1 -
Data Engineering
3 -
Data Explorer
1 -
Data Ingestion & connectivity
1 -
Databrick add-on for Splunk
1 -
databricks
2 -
Databricks Academy
1 -
Databricks AI + Data Summit
1 -
Databricks Alerts
1 -
Databricks Assistant
1 -
Databricks Certification
1 -
Databricks Cluster
2 -
Databricks Clusters
1 -
Databricks Community
10 -
Databricks community edition
3 -
Databricks Community Edition Account
1 -
Databricks Community Rewards Store
3 -
Databricks connect
1 -
Databricks Dashboard
2 -
Databricks delta
2 -
Databricks Delta Table
2 -
Databricks Demo Center
1 -
Databricks Documentation
2 -
Databricks genAI associate
1 -
Databricks JDBC Driver
1 -
Databricks Job
1 -
Databricks Lakehouse Platform
6 -
Databricks Migration
1 -
Databricks notebook
2 -
Databricks Notebooks
3 -
Databricks Platform
2 -
Databricks Pyspark
1 -
Databricks Python Notebook
1 -
Databricks Repo
1 -
Databricks Runtime
1 -
Databricks SQL
5 -
Databricks SQL Alerts
1 -
Databricks SQL Warehouse
1 -
Databricks Terraform
1 -
Databricks UI
1 -
Databricks Unity Catalog
4 -
Databricks Workflow
2 -
Databricks Workflows
2 -
Databricks workspace
2 -
Databricks-connect
1 -
DatabricksJobCluster
1 -
DataDays
1 -
Datagrip
1 -
DataMasking
2 -
DataVersioning
1 -
dbdemos
2 -
DBFS
1 -
DBRuntime
1 -
DBSQL
1 -
DDL
1 -
Dear Community
1 -
deduplication
1 -
Delt Lake
1 -
Delta
22 -
Delta Live Pipeline
3 -
Delta Live Table
5 -
Delta Live Table Pipeline
5 -
Delta Live Table Pipelines
4 -
Delta Live Tables
7 -
Delta Sharing
2 -
deltaSharing
1 -
Deny assignment
1 -
Development
1 -
Devops
1 -
DLT
10 -
DLT Pipeline
7 -
DLT Pipelines
5 -
Dolly
1 -
Download files
1 -
Dynamic Variables
1 -
Engineering With Databricks
1 -
env
1 -
ETL Pipelines
1 -
External Sources
1 -
External Storage
2 -
FAQ for Databricks Learning Festival
2 -
Feature Store
2 -
Filenotfoundexception
1 -
Free trial
1 -
GCP Databricks
1 -
GenAI
1 -
Getting started
2 -
Google Bigquery
1 -
HIPAA
1 -
import
1 -
Integration
1 -
JDBC Connections
1 -
JDBC Connector
1 -
Job Task
1 -
Lineage
1 -
LLM
1 -
Login
1 -
Login Account
1 -
Machine Learning
2 -
MachineLearning
1 -
Materialized Tables
2 -
Medallion Architecture
1 -
Migration
1 -
ML Model
1 -
MlFlow
2 -
Model Training
1 -
Module
1 -
Networking
1 -
Notebook
1 -
Onboarding Trainings
1 -
Pandas udf
1 -
Permissions
1 -
personalcompute
1 -
Pipeline
2 -
Plotly
1 -
PostgresSQL
1 -
Pricing
1 -
Pyspark
1 -
Python
5 -
Python Code
1 -
Python Wheel
1 -
Quickstart
1 -
Read data
1 -
Repos Support
1 -
Reset
1 -
Rewards Store
2 -
Schedule
1 -
Serverless
3 -
Session
1 -
Sign Up Issues
2 -
Spark
3 -
Spark Connect
1 -
sparkui
2 -
Splunk
2 -
SQL
8 -
Summit23
7 -
Support Tickets
1 -
Sydney
2 -
Table Download
1 -
Tags
1 -
Training
2 -
Troubleshooting
1 -
Unity Catalog
4 -
Unity Catalog Metastore
2 -
Update
1 -
user groups
1 -
Venicold
3 -
Voucher Not Recieved
1 -
Watermark
1 -
Weekly Documentation Update
1 -
Weekly Release Notes
2 -
Women
1 -
Workflow
2 -
Workspace
3
- « Previous
- Next »
User | Count |
---|---|
133 | |
90 | |
42 | |
42 | |
30 |