- 539 Views
- 1 replies
- 0 kudos
Data difference between SQL warehouse and all-purpose compute
Hey everyone, Executing the following query on my SQL warehouse does not return any data: select * from acc_bolt.eod.configurationhistory where netarea = '541454827900000139'; However, running the same query using an all-purpose compute does return the e...
Hi @Daan, Greetings! Can you please confirm which type of SQL warehouse you are using here? If it is serverless, can you try running the query on a Pro or Classic warehouse? Kind Regards, Ayushi
- 7237 Views
- 4 replies
- 0 kudos
Pyspark cast error
Hi All, hive> create table UK ( a decimal(10,2)) ; hive> create table IN ( a decimal(10,5)) ; hive> create view T as select a from UK union all select a from IN ; All of the above statements execute successfully in Hive and return results when select statement...
Hi Nandini, Thanks for sharing the above solution. To be sure my understanding is correct, could you please confirm the following? hive> create table test.UK ( a decimal(10,2)) ; hive> create table test.IN ( a decimal(10,5)) ; hive> create view test.T as select...
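For anyone following this thread: the two tables differ only in decimal scale, so the UNION ALL view has to settle on one common column type. The sketch below is an illustration only, not the solution discussed in the thread. It assumes the widening rule Spark applies to mismatched decimals (keep the larger scale and the larger count of integer digits, capped at precision 38). The function name wider_decimal is hypothetical.

```python
from typing import Tuple

MAX_PRECISION = 38  # assumption: Spark's maximum decimal precision


def wider_decimal(p1: int, s1: int, p2: int, s2: int) -> Tuple[int, int]:
    """Return (precision, scale) of a decimal type wide enough for both inputs.

    Assumed rule: keep the larger scale and the larger number of
    integer digits, then cap the precision at MAX_PRECISION.
    """
    scale = max(s1, s2)
    integer_digits = max(p1 - s1, p2 - s2)
    precision = min(integer_digits + scale, MAX_PRECISION)
    return precision, scale


# The two columns from the post: decimal(10,2) and decimal(10,5)
precision, scale = wider_decimal(10, 2, 10, 5)
print(f"decimal({precision},{scale})")
```

Under that assumption, casting both branches of the view to the wider type explicitly is one way to make the column type unambiguous for every engine that reads the view.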
- 364 Views
- 1 replies
- 0 kudos
EMR cluster pyspark scripts to databricks
Hi All, The PySpark scripts currently operating on the EMR cluster need to be migrated to Databricks. Are there any tools available that can assist in minimizing the time required for code conversion? Your suggestions would be appreciated. Regards, Phan...
Hello @Phani1, This guide can help you: https://www.databricks.com/resources/guide/emr-databricks-migration-guide
- 565 Views
- 1 replies
- 0 kudos
Airflow jobs migration to Databricks Workflows
Hi All, We need to move our Airflow jobs over to Databricks Workflows. Are there any tools out there that can help with this migration and make the process quicker? If you have any sample code or documents that could assist, I would really appreciate ...
Hi @Phani1, Please see this post which can help you: https://community.databricks.com/t5/data-engineering/migrating-logic-from-airflow-dags-to-databricks-workflow/td-p/104501
- 1051 Views
- 2 replies
- 3 kudos
Snowflake to Databricks migration
We are working on a proposal for our existing customer to migrate approximately 500 tables and the associated business logic from Snowflake to Databricks. The business logic is currently implemented using stored procedures, which need to be converted...
Hi @tarunnagpal! Adding to what @MariuszK said: using an LLM to accelerate the translation process is a great approach, but if the code is proprietary, it's best to use a closed model. Implementing a validation process is crucial to ensure that the tr...
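As a rough illustration of the validation step mentioned above, here is a minimal sketch that compares a source table and its migrated copy by row count plus an order-independent checksum. It assumes the rows have already been fetched from both systems into plain Python sequences; the function name table_fingerprint is hypothetical, and a real check would also need to normalise type formatting (timestamps, floats, decimals) between Snowflake and Databricks.

```python
import hashlib
from typing import Iterable, Sequence, Tuple

MODULUS = 2 ** 256  # keeps the running checksum at a fixed width


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, checksum) for a collection of rows.

    Row hashes are summed modulo 2**256, so the result does not depend
    on row order and duplicate rows still change the checksum.
    """
    count = 0
    total = 0
    for row in rows:
        # Join the values with a separator that is unlikely to occur in data;
        # None is rendered as NULL so it differs from an empty string.
        encoded = "\x1f".join("NULL" if v is None else str(v) for v in row)
        row_hash = hashlib.sha256(encoded.encode("utf-8")).digest()
        total = (total + int.from_bytes(row_hash, "big")) % MODULUS
        count += 1
    return count, f"{total:064x}"


# Hypothetical sample data standing in for query results from each system
source_rows = [(1, "alice", None), (2, "bob", "2024-01-01")]
target_rows = [(2, "bob", "2024-01-01"), (1, "alice", None)]

if table_fingerprint(source_rows) == table_fingerprint(target_rows):
    print("tables match")
else:
    print("tables differ")
```

Matching fingerprints only show that the migrated data agrees with the source; converted stored procedures would still need their outputs compared on the same inputs.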
- 1365 Views
- 4 replies
- 1 kudos
Different JSON Results when Running a Job vs Running a Notebook
I have a regularly scheduled job that runs a PySpark Notebook that GETs semi-structured JSON data from an external API, loads that data into dataframes, and saves those dataframes to delta tables in Databricks. I have the schema for the JSON defined ...
@Alberto_Umana Sounds good, thank you for looking into it and let me know if there's any additional information I can provide in the meantime!
- 2209 Views
- 4 replies
- 9 kudos
Zero to Hero - Databricks
Hi all! In a nutshell, I want to go from zero to hero with Databricks. I'd like to pursue the Databricks Data Engineering pathway; I think that makes sense as I have a background with Alteryx. I'd really like to get hands-on whilst learning. Are the le...
@MariuszK thanks for the link to your Medium article. There's some great stuff in there! Good point about the 30-day Azure free trial for Databricks.
- 861 Views
- 4 replies
- 0 kudos
dbt error: Data too long for column at row 1
Hi there! We are experiencing a Databricks error we don’t recognise when we are running one of our event-based dbt models in dbt core (version 1.6.18). The dbt model uses the ‘insert_by_period’ materialisation that is still experimental for version 1....
We are yet to upgrade dbt core to the latest version but will check again once we have done so.
- 1948 Views
- 4 replies
- 2 kudos
Resolved! Unity Catalog Migration: External AWS S3 Location Tables vs. Managed Tables in Databricks!
Hey Databricks enthusiasts! Migrating to Unity Catalog? Understanding the difference between External S3 Location Tables and Managed Tables is crucial for optimizing governance, security, and cost efficiency. External S3 Location Tables: Data remains in ...
Hey! I hope I’m not too late, and I’d like to share my opinion. While it’s true that managed services offer certain advantages over external tables, you should keep in mind that Databricks services often come with an associated cost, such as Predictiv...
- 408 Views
- 1 replies
- 0 kudos
Terminated cluster on free account
Hi, I mistakenly terminated my cluster. Could you please advise on how I can reactivate the same cluster?
Hi @Lupo123, Unfortunately, once a cluster on a free Databricks account is terminated, it cannot be reactivated. You will need to create a new cluster instead.
- 8689 Views
- 4 replies
- 2 kudos
Gathering Data Off Of A PDF File
Hello everyone, I am developing an application that accepts PDF files and inserts the data into my database. The company in question that distributes this data to us only offers PDF files, which you can see attached below (I hid personal info for priv...
You can use the PDF Data Source to read data from PDF files. Examples here: https://stabrise.com/blog/spark-pdf-on-databricks/ After that, use the Scale DP library to extract data from the text in a declarative way using an LLM. Here is an example of extraction ...
- 1299 Views
- 1 replies
- 0 kudos
Speaker diarization on databricks with Nemo throwing error
The configuration of my compute is 15.4 LTS ML (includes Apache Spark 3.5.0, GPU, Scala 2.12), Standard_NC8as_T4_v3 on Azure Databricks.
Hi @Nishat, It looks like there's a problem with GPU compatibility. As mentioned in the error message, FlashAttention only supports Ampere GPUs or newer. According to the following thread, the GPU architecture you've chosen is not supported: RuntimeError: FlashAt...
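To check this before picking a node type, a minimal sketch along these lines may help. It relies on CUDA compute capability: Ampere GPUs report 8.x, while the T4 in Standard_NC8as_T4_v3 is a Turing card reporting 7.5. The helper names are hypothetical; the only library call used is PyTorch's torch.cuda.get_device_capability(), and it is imported lazily so the comparison itself runs without PyTorch installed.

```python
from typing import Tuple

AMPERE_MAJOR = 8  # Ampere GPUs report CUDA compute capability 8.x

# Known capabilities used for illustration
T4_CAPABILITY = (7, 5)    # Turing
A100_CAPABILITY = (8, 0)  # Ampere


def is_ampere_or_newer(capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is Ampere or newer."""
    major, _minor = capability
    return major >= AMPERE_MAJOR


def current_gpu_capability() -> Tuple[int, int]:
    """Query the attached GPU; requires PyTorch and a CUDA device."""
    import torch  # lazy import so the pure check above works without PyTorch

    return torch.cuda.get_device_capability()


print("T4 is Ampere or newer:", is_ampere_or_newer(T4_CAPABILITY))
print("A100 is Ampere or newer:", is_ampere_or_newer(A100_CAPABILITY))
```

On a running GPU cluster, is_ampere_or_newer(current_gpu_capability()) gives the answer for the attached device.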
- 731 Views
- 1 replies
- 0 kudos
DBT RUN Command not working while invoked using subprocess.run
Hi, I am using the code below to run a dbt model from a notebook. I am using parameters to pass the dbt run command (project directory, profile directory, schema name, etc.). The issue is, when I run this code in my local workspace it works fine, but when ...
Hi @dk09, Can you share the path of dbt_project_directory? Also, try inputting the folder path manually to debug it. Does it still fail?
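One way to debug this is to validate the directory first and capture the command's output instead of letting it disappear. The sketch below is an assumption about how the call might be structured, not the original poster's code; the helper name run_in_directory and the example paths are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def run_in_directory(command: List[str], project_dir: str) -> subprocess.CompletedProcess:
    """Run a command inside project_dir and return the completed process.

    Fails early with a clear message if the directory does not exist,
    which is a common cause of path problems when moving between workspaces.
    """
    path = Path(project_dir)
    if not path.is_dir():
        raise FileNotFoundError(f"Project directory not found: {path}")

    # capture_output/text make stdout and stderr available as strings
    result = subprocess.run(command, cwd=path, capture_output=True, text=True)

    if result.returncode != 0:
        # Surface the command's own error output for debugging
        print(result.stderr, file=sys.stderr)
    return result
```

A call for dbt could then look like run_in_directory(["dbt", "run", "--project-dir", project_dir, "--profiles-dir", profiles_dir], project_dir), where project_dir and profiles_dir are your own paths. Printing result.stdout and result.stderr usually shows whether the failure is a missing path, a missing dbt executable, or a profile problem.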
- 976 Views
- 2 replies
- 0 kudos
INSERT OVERWRITE DIRECTORY
I am using this query to create a CSV in a volume named test_volsrr that I created: INSERT OVERWRITE DIRECTORY '/Volumes/DATAMAX_DATABRICKS/staging/test_volsrr' USING CSV OPTIONS ('delimiter' = ',', 'header' = 'true') SELECT * FROM staging.extract1gb DISTR...
The DISTRIBUTE BY COALESCE(1) clause is intended to reduce the number of output files to one. However, this can lead to inefficiencies and large file sizes because it forces all data to be processed by a single task, which can cause memory and perfor...
- 2293 Views
- 2 replies
- 0 kudos
Discrepancy in Performance Reading Delta Tables from S3 in PySpark
Hello Databricks Community, I've encountered a puzzling performance difference while reading Delta tables from S3 using PySpark, particularly when applying filters and projections. I'm seeking insights to understand this variation better. I've attempte...
Use the explain method to analyze the execution plans for both methods and identify any inefficiencies or differences in the plans. You can also review the metrics to understand this further. https://www.databricks.com/discover/pages/optimize-data-wo...