- 2253 Views
- 2 replies
- 0 kudos
New to PySpark
Hi all,I am trying to get the domain from an email field using below expression; but getting an error.Kindly help. df.select(df.email, substring(df.email,instr(df.email,'@'),length(df.email).alias('domain')))
- 2253 Views
- 2 replies
- 0 kudos
- 0 kudos
In your case, you want to extract the domain from the email, which starts from the position just after '@'. So, you should add 1 to the position of '@'. Also, the length of the substring should be the difference between the total length of the email ...
- 0 kudos
- 1575 Views
- 1 replies
- 0 kudos
Issue in inferring schema for streaming dataframe using json files
Below is the pileine design in databricks and it's not working out , kindly look on this and let me know whether it will work or not , I'm getting json files of different schemas from directory under the root directory and it read all the files using...
- 1575 Views
- 1 replies
- 0 kudos
- 0 kudos
Could you please share some sample of your dataset and code snippet of what you're trying to implement?
- 0 kudos
- 5231 Views
- 0 replies
- 0 kudos
Database: Delta Lake or PostgreSQL
Hey all,I am searching for a non-political answer to my database questions. Please know that I am a data newbie and litteraly do not know anything about this topic, but I want to learn, so please be gentle. Some context: I am working for an OEM that...
- 5231 Views
- 0 replies
- 0 kudos
- 4889 Views
- 2 replies
- 3 kudos
Resolved! Pros and cons of physically separating data in different storage accounts and containers
When setting up Unity Catalog, it is recommended by Databricks to figure out your data isolation model when it comes to physically separating your data into different storage accounts and/or contaners. There are so many options, it can be hard to be ...
- 4889 Views
- 2 replies
- 3 kudos
- 3 kudos
Hello @pernilak , Thanks for reaching out to Databricks Community! My name is Raphael, and I'll be helping out. Should all catalogs and the metastore reside in the same storage account (but different containers) Yes, Databricks recommends having o...
- 3 kudos
- 2172 Views
- 0 replies
- 0 kudos
New draft for every post I visit
When I visit my profile page, under the drafts section I see an entry for every post I visit in the discussions. Is this normal?
- 2172 Views
- 0 replies
- 0 kudos
- 1366 Views
- 1 replies
- 1 kudos
Databricks Web Editor's Cell like UI in local IDE
I want to have databricks related developement locally.There is extension that allows to run local python file on remote databricks cluster.But I want to have cell like structure that is present in databricks UI for python files in local IDE as well....
- 1366 Views
- 1 replies
- 1 kudos
- 1 kudos
@swapnilmd You can use VSCode extension for Databricks.https://docs.databricks.com/en/dev-tools/vscode-ext/index.html
- 1 kudos
- 4234 Views
- 2 replies
- 0 kudos
Databricks jdbc driver connectiion issue with apache solr
Hi,databricks jdbc version - 2.6.34I am facing the below issue with connecting databricks sql from apache solr Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature.at com.databri...
- 4234 Views
- 2 replies
- 0 kudos
- 0 kudos
Databricks team recommended to set IgnoreTransactions=1 and autocommit=false in the connection string but that didn't resolve the issue .Ultimately I had to use solr update API for uploading documents
- 0 kudos
- 2045 Views
- 3 replies
- 1 kudos
[Memory utilization in Metrics Tab still display after terminate a cluster]
Hi All,Could you guys help me to check this?I run a cluster and then terminate that cluster but when i navigate to the Metrics tab of Cluster still see the Memory utilization show metrics.Thanks
- 2045 Views
- 3 replies
- 1 kudos
- 1 kudos
here are my cluster display and my simple notebook:
- 1 kudos
- 2919 Views
- 3 replies
- 0 kudos
databricks-jdbc lists `spark_catalog` among catalogs for Standard tier Azure workspace
databricks-jdbc lists `spark_catalog` among catalogs for Standard tier Azure workspace. The UI lists `hive_metastore`. It would be better if these two were consistent.
- 2919 Views
- 3 replies
- 0 kudos
- 1988 Views
- 0 replies
- 0 kudos
Both `hive_metastore` and `spark_catalog`?
HiIs it possible for a workspace to have both a `hive_metastore` catalog and a `spark_catalog` catalog?
- 1988 Views
- 0 replies
- 0 kudos
- 3065 Views
- 0 replies
- 0 kudos
New to Spark
Hi all,I am new to Spark, trying to write below code but getting an error.Code:df1 = df.filter(df.col1 > 60 and df.col2 != 'abc') Any suggestion?
- 3065 Views
- 0 replies
- 0 kudos
- 8227 Views
- 3 replies
- 4 kudos
Resolved! Error not a delta table for Unity Catalog table
Is anyone able to advise why I am getting the error not a delta table? The table was created in Unity Catalog. I've also tried DeltaTable.forName and also using 13.3 LTS and 14.3 LTS clusters. Any advice would be much appreciated
- 8227 Views
- 3 replies
- 4 kudos
- 4 kudos
@StogponI believe if you are using DeltaTable.forPath then you have to pass the path where the table is. You can get this path from the Catalog. It is available in the details tab of the table.Example:delta_table_path = "dbfs:/user/hive/warehouse/xyz...
- 4 kudos
- 1841 Views
- 0 replies
- 0 kudos
Best practices for working with external locations where many files arrive constantly
I have an Azure Function that receives files (not volumes) and dumps them to cloud storage. One-five files are received approx. per second. I want to create a partitioned table in Databricks to work with. How should I do this? E.g.: register the cont...
- 1841 Views
- 0 replies
- 0 kudos
- 7919 Views
- 9 replies
- 0 kudos
Performance issue while calling mlflow endpoint
Hi,I have pyspark dataframe and pyspark udf which calls mlflow model for each row but its performance is too slow.Here is sample codedef myfunc(input_text): restult = mlflowmodel.predict(input_text) return resultmyfuncUDF = udf(myfunc,StringType(...
- 7919 Views
- 9 replies
- 0 kudos
- 2764 Views
- 1 replies
- 0 kudos
Resolved! Understanding Spark Architecture during Table Creation
Team ,I am trying understand how the parquet files and JSON under the delta log folder stores the data behind the scenesTable Creation:from delta.tables import *DeltaTable.create(spark) \.tableName("employee") \.addColumn("id", "INT") \.addColumn("na...
- 2764 Views
- 1 replies
- 0 kudos
- 0 kudos
@Ramakrishnan83 - Kindly go through the blog post - https://www.databricks.com/blog/2019/08/21/diving-into-delta-lake-unpacking-the-transaction-log.html which discuss in detail on delta's transaction log.
- 0 kudos
Join Us as a Local Community Builder!
Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!
Sign Up Now-
Access Data
2 -
Access Delta Tables
2 -
Account reset
1 -
ADF Pipeline
1 -
ADLS Gen2 With ABFSS
1 -
Advanced Data Engineering
1 -
AI
1 -
Analytics
1 -
Apache spark
1 -
API Documentation
3 -
Architecture
1 -
asset bundle
1 -
Asset Bundles
1 -
Auto-loader
1 -
Autoloader
2 -
AWS
3 -
AWS security token
1 -
AWSDatabricksCluster
1 -
Azure
4 -
Azure data disk
1 -
Azure databricks
13 -
Azure Databricks SQL
5 -
Azure databricks workspace
1 -
Azure Unity Catalog
5 -
Azure-databricks
1 -
AzureDatabricks
1 -
AzureDevopsRepo
1 -
Big Data Solutions
1 -
Billing
1 -
Billing and Cost Management
1 -
Bronze Layer
1 -
Certification
3 -
Certification Exam
1 -
Certification Voucher
3 -
Cloud_files_state
1 -
CloudFiles
1 -
Cluster
3 -
Community Edition
3 -
Community Group
1 -
Community Members
1 -
Compute
3 -
Compute Instances
1 -
conditional tasks
1 -
Connection
1 -
Contest
1 -
Credentials
1 -
CustomLibrary
1 -
Data
1 -
Data + AI Summit
1 -
Data Engineering
3 -
Data Explorer
1 -
Data Ingestion & connectivity
1 -
databricks
2 -
Databricks Academy
1 -
Databricks AI + Data Summit
1 -
Databricks Alerts
1 -
Databricks Assistant
1 -
Databricks Certification
1 -
Databricks Cluster
2 -
Databricks Clusters
1 -
Databricks Community
9 -
Databricks community edition
3 -
Databricks Community Edition Account
1 -
Databricks Community Rewards Store
3 -
Databricks connect
1 -
Databricks Dashboard
2 -
Databricks delta
2 -
Databricks Delta Table
2 -
Databricks Demo Center
1 -
Databricks Documentation
2 -
Databricks genAI associate
1 -
Databricks JDBC Driver
1 -
Databricks Job
1 -
Databricks Lakehouse Platform
6 -
Databricks Migration
1 -
Databricks notebook
2 -
Databricks Notebooks
2 -
Databricks Platform
2 -
Databricks Pyspark
1 -
Databricks Python Notebook
1 -
Databricks Repo
1 -
Databricks Runtime
1 -
Databricks SQL
5 -
Databricks SQL Alerts
1 -
Databricks SQL Warehouse
1 -
Databricks UI
1 -
Databricks Unity Catalog
4 -
Databricks Workflow
2 -
Databricks Workflows
2 -
Databricks workspace
1 -
Databricks-connect
1 -
DatabricksJobCluster
1 -
DataDays
1 -
Datagrip
1 -
DataMasking
2 -
DataVersioning
1 -
dbdemos
2 -
DBFS
1 -
DBRuntime
1 -
DBSQL
1 -
DDL
1 -
Dear Community
1 -
deduplication
1 -
Delt Lake
1 -
Delta
22 -
Delta Live Pipeline
3 -
Delta Live Table
5 -
Delta Live Table Pipeline
5 -
Delta Live Table Pipelines
4 -
Delta Live Tables
7 -
Delta Sharing
2 -
deltaSharing
1 -
Deny assignment
1 -
Development
1 -
Devops
1 -
DLT
10 -
DLT Pipeline
7 -
DLT Pipelines
5 -
Dolly
1 -
Download files
1 -
Dynamic Variables
1 -
Engineering With Databricks
1 -
env
1 -
ETL Pipelines
1 -
External Sources
1 -
External Storage
2 -
FAQ for Databricks Learning Festival
2 -
Feature Store
2 -
Filenotfoundexception
1 -
Free trial
1 -
GCP Databricks
1 -
GenAI
1 -
Getting started
2 -
Google Bigquery
1 -
HIPAA
1 -
Integration
1 -
JDBC Connections
1 -
JDBC Connector
1 -
Job Task
1 -
Lineage
1 -
LLM
1 -
Login
1 -
Login Account
1 -
Machine Learning
2 -
MachineLearning
1 -
Materialized Tables
2 -
Medallion Architecture
1 -
Migration
1 -
ML Model
1 -
MlFlow
2 -
Model Training
1 -
Networking
1 -
Notebook
1 -
Onboarding Trainings
1 -
Pandas udf
1 -
Permissions
1 -
personalcompute
1 -
Pipeline
2 -
Plotly
1 -
PostgresSQL
1 -
Pricing
1 -
Pyspark
1 -
Python
4 -
Python Code
1 -
Python Wheel
1 -
Quickstart
1 -
Read data
1 -
Repos Support
1 -
Reset
1 -
Rewards Store
2 -
Schedule
1 -
Serverless
3 -
Session
1 -
Sign Up Issues
2 -
Spark
3 -
Spark Connect
1 -
sparkui
2 -
Splunk
1 -
SQL
8 -
Summit23
7 -
Support Tickets
1 -
Sydney
2 -
Table Download
1 -
Tags
1 -
Training
2 -
Troubleshooting
1 -
Unity Catalog
4 -
Unity Catalog Metastore
2 -
Update
1 -
user groups
1 -
Venicold
3 -
Voucher Not Recieved
1 -
Watermark
1 -
Weekly Documentation Update
1 -
Weekly Release Notes
2 -
Women
1 -
Workflow
2 -
Workspace
3
- « Previous
- Next »
User | Count |
---|---|
124 | |
61 | |
42 | |
30 | |
20 |