- 2091 Views
- 2 replies
- 0 kudos
Merge version data files of Delta table
Hi, I have a CDC-enabled Delta table. At version 256 the table has 50 data files. I want to merge them all into a single file, so that when I query version 256 I get one data file. How can I merge all 50 data files? Is there any com...
- 0 kudos
Hi, are you talking about merging CSV files? https://community.databricks.com/t5/machine-learning/merge-12-csv-files-in-databricks/td-p/3551#:~:text=Use%20Union()%20method%20to,from%20the%20specified%20set%2Fs.
- 1425 Views
- 0 replies
- 0 kudos
Deleting external table takes 8 hrs
Hi, I am trying to delete the data from an external partitioned table. It has around 3 years of data, and the partition is created on the date column. I am trying to delete each partition first and then the schema of the table, which takes around 8 hrs...
- 1772 Views
- 0 replies
- 0 kudos
Why does the code below break?
from pyspark.sql import SparkSession
from pyspark.ml.regression import LinearRegression
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml import Pipeline
import numpy as np
# Create a Spark...
- 1437 Views
- 2 replies
- 0 kudos
Having trouble with ARC (Automated Record Connector) Python Notebook
I'm trying to use Databricks ARC (Automated Record Connector) and running into an object issue. I assume I'm missing something rather trivial that's not related to ARC.
#Databricks Python notebook
#CMD1
import AutoLinker
from arc.autolinker import A...
- 0 kudos
https://www.databricks.com/blog/improving-public-sector-decision-making-simple-automated-record-linking and https://github.com/databricks-industry-solutions/auto-data-linkage#databricks-runtime-requirements
- 1344 Views
- 2 replies
- 0 kudos
Delta Sharing CDF API error: "RESOURCE_LIMIT_EXCEEDED"
Hi, when attempting to read a particular version from the Databricks Delta Sharing CDF (Change Data Feed) API, even when that version contains only one data file, the call times out with the following message: "errorCode": "RESOURCE_LIMIT_EX...
- 0 kudos
Hi Data_Analytics1, use OPTIMIZE on your Delta tables. Refer to https://docs.databricks.com/en/sql/language-manual/delta-optimize.html
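The OPTIMIZE compaction suggested above is, at its core, bin-packing small files toward a target file size. A minimal pure-Python sketch of that idea (the greedy strategy, the 1 GB target, and the file sizes are illustrative assumptions, not Databricks' actual algorithm):

```python
def pack_files(sizes_mb, target_mb=1024):
    """Greedy first-fit: group files into bins no larger than target_mb.
    Each bin stands in for one compacted output file."""
    bins = []  # each bin is a list of input file sizes
    for size in sorted(sizes_mb, reverse=True):
        for b in bins:
            if sum(b) + size <= target_mb:
                b.append(size)
                break
        else:
            bins.append([size])
    return bins

# 50 small files of ~20 MB each fit in a single ~1 GB output file
files = [20] * 50
print(len(pack_files(files)))  # 1
```

With files totaling under the target, everything collapses into one output file, which is the effect the original question was after.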
- 875 Views
- 1 replies
- 0 kudos
Databricks Pricing Model
Hello everyone! I'm new to the Databricks platform and I'm using Databricks for learning purposes. I want to understand the Databricks pricing model: how Databricks calculates DBUs from the compute type and instance type for AWS, Azure, and GCP. Can anyone explain it? T...
- 0 kudos
I would like to share the following URL: https://www.databricks.com/product/pricing/product-pricing/instance-types. It will help you get a price estimate.
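To expand on the pricing page linked above: the bill has two parts. Databricks charges per DBU consumed (the DBU/hour rating depends on the instance type, and the $/DBU rate on the compute type and tier), and the cloud provider charges for the VM separately. A back-of-the-envelope estimator (every rate below is a made-up placeholder, not a real price):

```python
def estimate_cost(dbu_per_hour, hours, dbu_rate_usd, instance_rate_usd=0.0):
    """Rough cost estimate: DBUs consumed * per-DBU rate, plus the cloud
    provider's separate VM charge. All rates are illustrative only."""
    dbus = dbu_per_hour * hours
    databricks_cost = dbus * dbu_rate_usd
    cloud_cost = hours * instance_rate_usd
    return round(databricks_cost + cloud_cost, 2)

# e.g. a node rated at 2 DBU/hour running 10 hours at a hypothetical
# $0.15/DBU rate, on a VM costing a hypothetical $0.50/hour:
print(estimate_cost(2, 10, 0.15, 0.50))  # 8.0
```

The actual DBU/hour rating per instance type is on the page linked in the reply; multiply it out per cloud to compare.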
- 801 Views
- 1 replies
- 0 kudos
Databricks certification
I joined two events on 7th Sept 2023, and it was said that a 50% discount voucher would be given by the first week of October. After joining, I also filled in the survey and attached my Lakehouse Fundamentals certificate. I still haven't received any mail regarding this voucher cou...
- 0 kudos
@Souvikng To expedite your request, please list your concern on our ticketing portal. Our support staff will be able to act faster on the resolution (our standard resolution time is 24-48 hours). Thank you for posting your concern on Community!
- 1722 Views
- 2 replies
- 0 kudos
Disable personal compute for everyone including workspace admins
If we disable the Personal Compute feature in the account console, it gets deactivated only for non-admin users; admin users are still able to create Personal Compute clusters. Is there a way to restrict it for everyone? If not, can you raise a feature requ...
- 0 kudos
@Debayan I am talking about the Personal Compute feature here, not the way clusters are created. If the Personal Compute feature is set to delegate, it should be disabled for workspace admin users as well. If this is not supported, it would be good to have a fea...
- 1557 Views
- 1 replies
- 0 kudos
How to upgrade pip associated with the default Python
We have a job scheduled and submitted via Airflow to Databricks using the API api/2.0/jobs/runs/submit. Each time the job runs, an ephemeral cluster is launched, and during the process a virtual env named /local_disk0/.ephemeral_nfs/cluster_librarie...
- 0 kudos
Hi, I found an interesting article on this. You can follow it and let us know if it helps. Please tag @Debayan in your next comment, which will notify me!
- 754 Views
- 0 replies
- 0 kudos
Is it possible to embed the Databricks Workspace platform in my site?
I want to use Databricks on my site, because there is some functionality the user needs to use on my website, but I want the user to use the Databricks workspace. Is it possible to embed it or do something like that?
- 1457 Views
- 1 replies
- 1 kudos
Monitor all Streaming jobs to make sure they are in RUNNING status.
Hi experts, is there any way we can monitor all our streaming jobs in a workspace to make sure they are in "RUNNING" status? I can see there is an option to create a batch job that runs frequently and checks the status (through the REST API) of all str...
- 1 kudos
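For the batch-job approach described in the question, the check itself is simple once you have the runs list from the REST API: filter for runs whose life_cycle_state is not RUNNING. A minimal sketch (the response shape is simplified, and the is_streaming marker is a hypothetical stand-in for however you identify your streaming jobs, e.g. by name or tag):

```python
def find_stopped_streams(runs):
    """Given run entries shaped loosely like a Jobs API runs list
    (simplified here), return names of streaming jobs whose
    life_cycle_state is anything other than RUNNING."""
    return [
        r["run_name"]
        for r in runs
        if r.get("is_streaming") and r["state"]["life_cycle_state"] != "RUNNING"
    ]

runs = [
    {"run_name": "orders_stream", "is_streaming": True,
     "state": {"life_cycle_state": "RUNNING"}},
    {"run_name": "clicks_stream", "is_streaming": True,
     "state": {"life_cycle_state": "INTERNAL_ERROR"}},
]
print(find_stopped_streams(runs))  # ['clicks_stream']
```

The list this returns is what you would wire into an alert (email, webhook, or a Databricks SQL alert).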
- 2994 Views
- 1 replies
- 1 kudos
Call a workspace notebook from a repository notebook
We have a Databricks workspace with several repositories. We'd like to have a place with shared configuration variables that can be accessed by notebooks in any repository. I created a folder named Shared under the root workspace and in that folder c...
- 1 kudos
- 1628 Views
- 1 replies
- 0 kudos
PySpark API reference
All, I am using Azure Databricks, and at times I refer to the PySpark APIs to interact with data in Azure Data Lake using Python and SQL, here: https://spark.apache.org/docs/3.5.0/api/python/reference/pyspark.sql/index.html. Does the Databricks website have the list o...
- 0 kudos
- 1158 Views
- 0 replies
- 0 kudos
Partitioning or Processing: Reading CSV files of 5 to 9 GB
Hi team, would you please guide me on the following, for an instance with 28 GB and 8 cores:
1. How does Databricks read 5 to 9 GB files from Blob storage? (Is the full file loaded directly into one node's memory?)
2. How many tasks will be created based on the cores? How many executors wil...
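On question 2, a rough rule of thumb: Spark splits a splittable input file into read tasks of spark.sql.files.maxPartitionBytes (128 MB by default), so the task count is roughly the file size divided by 128 MB, executed in waves over the available cores. A quick estimate (this deliberately ignores compression, openCostInBytes, and file-format details, so treat it as an approximation):

```python
import math

def estimate_tasks(file_size_gb, max_partition_mb=128):
    """Approximate number of read tasks Spark creates for a splittable
    file, using the default spark.sql.files.maxPartitionBytes (128 MB)."""
    return math.ceil(file_size_gb * 1024 / max_partition_mb)

print(estimate_tasks(5))  # 40 tasks; on 8 cores that's 5 waves
print(estimate_tasks(9))  # 72 tasks; on 8 cores that's 9 waves
```

This is also why a 5 GB CSV is not loaded into one node's memory: each 128 MB split is read independently by whichever core picks up that task.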
- 1435 Views
- 1 replies
- 2 kudos
The base provider of Delta Sharing Catalog system does not exist.
I have enabled system tables in Databricks by following the procedure mentioned here. The owner of the system catalog is the System user. I cannot see the schemas or tables of this catalog; it shows the error: The base provider of Delta Sharing C...
- 2 kudos
I have already enabled all these schemas using the Databricks CLI command. After enabling, I was able to see all the tables and data inside these schemas. Then I disabled all the schemas using the CLI command mentioned here. Now, even after re-en...