How to identify similarly pronounced words in SQL Server?

Rajanand Ilangovan
Published in SQL Server
May 7, 2022

SQL Server has two built-in functions for checking whether two strings are pronounced similarly.

They are:

  • SOUNDEX() - This function takes a string as a parameter and returns a four-character code, called the Soundex code: the first letter of the string followed by three digits. When the code is calculated, the vowels (A, E, I, O, U) and the letters H, W, and Y are ignored unless they are the first letter of the string.
  • DIFFERENCE() - This function takes two strings as parameters and returns an integer from 0 to 4. Internally, it calculates the SOUNDEX code of each string and compares the two codes; the higher the value, the more similar the codes.

-- An acronym and its spoken form
SELECT
    SOUNDEX('SQL') AS SQL,
    SOUNDEX('Sequel') AS Sequel,
    DIFFERENCE('SQL', 'Sequel') AS Similarity;

-- Two full names
SELECT
    SOUNDEX('Michael Jackson') AS Michael_Jackson,
    SOUNDEX('Mitchel Johnson') AS Mitchel_Johnson,
    DIFFERENCE('Michael Jackson', 'Mitchel Johnson') AS Similarity;

-- Two first names with the same ending
SELECT
    SOUNDEX('Ramesh') AS Ramesh,
    SOUNDEX('Suresh') AS Suresh,
    DIFFERENCE('Ramesh', 'Suresh') AS Similarity;

-- Two language names
SELECT
    SOUNDEX('Tamil') AS Tamil,
    SOUNDEX('Malayalam') AS Malayalam,
    DIFFERENCE('Tamil', 'Malayalam') AS Similarity;
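
The rule about ignored letters can be checked with a pair of spellings that differ only in a vowel versus Y. This is a small sketch rather than part of the original examples; the expected codes in the comments assume a Latin1_General collation, which is what the SOUNDEX rules are designed for.

```sql
-- 'i' and 'y' are both ignored after the first letter,
-- so the two spellings produce the same Soundex code.
SELECT
    SOUNDEX('Smith') AS Smith,                   -- S530
    SOUNDEX('Smyth') AS Smyth,                   -- S530
    DIFFERENCE('Smith', 'Smyth') AS Similarity;  -- 4

-- A name with no codable consonants after the first letter is padded with zeros.
SELECT SOUNDEX('Lee') AS Lee;                    -- L000
```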

The DIFFERENCE function returns an integer from 0 to 4:
0 - Not similar
1 - Hardly similar
2 - Slightly similar
3 - Somewhat similar
4 - Mostly similar or exact match
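
In practice, the DIFFERENCE score is often used as a filter when searching for names that sound like a given value. The query below is a minimal sketch of that idea: the table variable @Customers, its column, the sample names, and the threshold of 3 are all hypothetical and chosen only for illustration.

```sql
-- Hypothetical sample data; replace with your own table.
DECLARE @Customers TABLE (CustomerName varchar(50));

INSERT INTO @Customers (CustomerName)
VALUES ('Smith'), ('Smyth'), ('Schmidt'), ('Jones');

-- Keep only the rows that score 3 or 4 against the search term.
-- The cut-off of 3 is an assumption; tune it for your data.
SELECT
    CustomerName,
    SOUNDEX(CustomerName) AS SoundexCode,
    DIFFERENCE(CustomerName, 'Smith') AS Similarity
FROM @Customers
WHERE DIFFERENCE(CustomerName, 'Smith') >= 3
ORDER BY Similarity DESC;
```

Because DIFFERENCE has to be evaluated for every row, a filter like this generally cannot use an ordinary index on the name column, so it is best suited to small tables or to a result set that has already been narrowed down.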

Originally published at blog.rajanand.org.