Administration & Architecture

regexp_count seems to not work as it should

Rajasql
New Contributor II
The two SQL statements below should return different results: under regular expression rules, * is a special character that must be escaped with \ to be treated as a literal. The second statement should therefore match only the literal A*B and return 2, but it also matches AB and returns 3.
 
SELECT regexp_count('nA*BsABreA*Bthe', 'A*B') str_cnt;
 
SELECT regexp_count('nA*BsABreA*Bthe', 'A\*B') str_cnt;
4 REPLIES

Walter_C
Databricks Employee

Is this being executed in the SQL editor or in a notebook?

Rajasql
New Contributor II

It is being executed in a notebook.

PabloCSD
Valued Contributor

Hello @Rajasql ,

Try it this way (it worked for me in a Databricks notebook and returns 2):

SELECT regexp_count('nA*BsABreA*Bthe', 'A\\*B') str_cnt;
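
A likely explanation, assuming the notebook parses string literals with the default backslash escaping: the SQL parser consumes one level of backslashes before the regex engine ever sees the pattern. Under that assumption 'A\*B' reaches the engine as A*B (zero or more A, then B), while 'A\\*B' reaches it as A\*B (a literal asterisk). The sketch below uses Python's re module as a stand-in for the Java regex engine that Spark relies on; the helper name count_matches is hypothetical, and the two engines are assumed to agree on patterns this simple.

```python
import re


def count_matches(pattern: str, text: str) -> int:
    """Count non-overlapping regex matches, similar in spirit to regexp_count."""
    return sum(1 for _ in re.finditer(pattern, text))


text = "nA*BsABreA*Bthe"

# Pattern as the engine is assumed to see it for 'A*B' and for 'A\*B':
# "zero or more A, then B" matches B, AB, and B.
unescaped = count_matches("A*B", text)

# Pattern as the engine is assumed to see it for 'A\\*B':
# a literal asterisk, so only the two A*B substrings match.
escaped = count_matches(r"A\*B", text)

print(unescaped, escaped)  # prints "3 2"
```

This would account for both original queries returning 3 and the doubled backslash returning 2.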

Rajasql
New Contributor II

It fails with the following code: it returns 3, which it should not.

SELECT regexp_count('nA*BsA\*BreA*Bthe', 'A\\*B') str_cnt;
 
This is what I am following for Regexp standards
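
One possible reading of this result, assuming the input string literal goes through the same backslash handling as the pattern: the lone backslash in 'nA*BsA\*BreA*Bthe' is consumed by the SQL parser, so the engine receives three plain A*B substrings and 3 is the expected count. The sketch below again uses Python's re module as a stand-in for the regex engine, with the strings written as the engine is assumed to receive them; the variable names are illustrative only.

```python
import re

# What the regex engine would receive for the pattern literal 'A\\*B',
# assuming the SQL parser collapses \\ into a single backslash.
pattern = r"A\*B"

# Input literal 'nA*BsA\*BreA*Bthe': the lone backslash is assumed to be
# consumed by the parser, leaving three literal A*B substrings.
text_single = "nA*BsA*BreA*Bthe"

# Input literal 'nA*BsA\\*BreA*Bthe': one backslash is assumed to survive,
# so the middle occurrence becomes A\*B and no longer matches.
text_double = r"nA*BsA\*BreA*Bthe"

count_single = len(re.findall(pattern, text_single))
count_double = len(re.findall(pattern, text_double))
print(count_single, count_double)  # prints "3 2"
```

If the middle occurrence is meant to contain a real backslash, doubling it in the input literal as well ('nA*BsA\\*BreA*Bthe') should give 2 under the same assumption.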
