Fuzzy Functions
The Fuzzy module provides functions for fuzzy string matching and similarity calculations.
Including the Fuzzy Module
To use the Fuzzy functions, include the module at the top of your mq script:
include "fuzzy"
Functions
levenshtein(s1, s2)
Calculates the Levenshtein distance between two strings. The Levenshtein distance is the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other.
Parameters:
- s1: First string to compare
- s2: Second string to compare
Returns:
- Integer representing the Levenshtein distance (0 means strings are identical)
Example:
include "fuzzy"
# Calculate Levenshtein distance
| levenshtein("hello", "hallo")
# Returns: 1
| levenshtein("kitten", "sitting")
# Returns: 3
| levenshtein("identical", "identical")
# Returns: 0
| levenshtein("", "abc")
# Returns: 3
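The definition above maps onto a short dynamic program. The following Python sketch is an illustration only, not the mq implementation: it uses the classic two-row Wagner-Fischer recurrence, and the function name simply mirrors the mq one for readability.

```python
def levenshtein(s1: str, s2: str) -> int:
    """Edit distance via the two-row Wagner-Fischer dynamic program."""
    # previous[j] is the distance between the processed prefix of s1 and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]  # turning s1[:i] into "" takes i deletions
        for j, c2 in enumerate(s2, start=1):
            current.append(min(
                previous[j] + 1,               # delete c1
                current[j - 1] + 1,            # insert c2
                previous[j - 1] + (c1 != c2),  # substitute (free when equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```

Keeping only two rows holds memory at O(len(s2)) while the running time stays O(len(s1) * len(s2)).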
jaro(s1, s2)
Calculates the Jaro similarity between two strings (often loosely called the Jaro distance). The score ranges from 0.0 (no similarity) to 1.0 (exact match).
Parameters:
- s1: First string to compare
- s2: Second string to compare
Returns:
- Float between 0.0 and 1.0 (1.0 indicates exact match)
Example:
include "fuzzy"
# Calculate Jaro distance
| jaro("hello", "hallo")
# Returns: 0.8666666666666667
| jaro("martha", "marhta")
# Returns: 0.9444444444444444
| jaro("identical", "identical")
# Returns: 1.0
| jaro("", "abc")
# Returns: 0.0
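For readers who want to see how such a score is put together, here is a textbook Jaro sketch in Python. It is an assumption-laden illustration, not the mq source: the match window (half the longer length, minus one) and the transposition count follow the common formulation, so its results need not agree with mq's for every input.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in the range [0.0, 1.0]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters count as matching only if they are at most `window` apart.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order are transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1)
            + matches / len(s2)
            + (matches - transpositions) / matches) / 3


print(jaro("martha", "marhta"))
```

For "martha" and "marhta" all six characters match and one pair ("t", "h") is transposed, giving (1 + 1 + 5/6) / 3, which agrees with the value documented above.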
jaro_winkler(s1, s2)
Calculates the Jaro-Winkler similarity between two strings (often loosely called the Jaro-Winkler distance). This variant of the Jaro measure applies a prefix scale that raises the score of strings sharing a common prefix.
Parameters:
- s1: First string to compare
- s2: Second string to compare
Returns:
- Float between 0.0 and 1.0 (1.0 indicates exact match)
Example:
include "fuzzy"
# Calculate Jaro-Winkler distance
| jaro_winkler("hello", "hallo")
# Returns: 0.88
| jaro_winkler("martha", "marhta")
# Returns: 0.9611111111111111
| jaro_winkler("prefix_test", "prefix_example")
# Returns: 0.84724
| jaro_winkler("identical", "identical")
# Returns: 1.0
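The prefix adjustment can be sketched in Python as follows. This is a hedged illustration rather than the mq implementation: the prefix scale of 0.1 and the four-character prefix cap are Winkler's conventional defaults and are assumed here, and the helper jaro is a compact textbook version included only to keep the block self-contained.

```python
def jaro(s1: str, s2: str) -> float:
    """Compact textbook Jaro similarity (helper for the sketch below)."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used = [False] * len(s2)
    seq1 = []  # matched characters of s1, in s1 order
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not used[j] and s2[j] == ch:
                used[j] = True
                seq1.append(ch)
                break
    m = len(seq1)
    if m == 0:
        return 0.0
    seq2 = [c for c, u in zip(s2, used) if u]  # matched characters, s2 order
    t = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (m / len(s1) + m / len(s2) + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1,  # assumed conventional default
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the shared prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    # The boost closes a fraction of the remaining gap to 1.0.
    return base + prefix * prefix_scale * (1.0 - base)


print(jaro_winkler("martha", "marhta"))
```

With these defaults, "hello" and "hallo" share a one-character prefix, so the Jaro score of about 0.8667 rises by 0.1 * (1 - 0.8667) to 0.88, consistent with the example above.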
fuzzy_match(candidates, query)
Performs fuzzy matching on an array of strings using the Jaro-Winkler distance algorithm. Returns results sorted by similarity score in descending order.
Parameters:
- candidates: Array of strings to search within, or a single string
- query: String to search for
Returns:
- Array of objects with "text" and "score" properties, sorted by best match first
Example:
include "fuzzy"
# Fuzzy match with multiple candidates
| fuzzy_match(["hallo", "hello", "hi", "help"], "hello")
# Returns: [
# {"text": "hello", "score": 1},
# {"text": "hallo", "score": 0.88},
# {"text": "help", "score": 0.848333},
# {"text": "hi", "score": 0.1}
# ]
# Fuzzy match with single candidate
| "testing" | fuzzy_match("test")
# Returns: [{"text": "testing", "score": 0.8095238095238095}]
fuzzy_match_levenshtein(candidates, query)
Performs fuzzy matching using Levenshtein distance. Each result's score is the edit distance to the query, so results are sorted in ascending order (a lower score means a better match).
Parameters:
- candidates: Array of strings to search within
- query: String to search for
Returns:
- Array of objects with "text" and "score" properties, sorted by lowest distance first
Example:
include "fuzzy"
# Fuzzy match using Levenshtein distance
| fuzzy_match_levenshtein(["hallo", "hello", "hi", "help"], "hello")
# Returns: [
# {"text": "hello", "score": 0},
# {"text": "hallo", "score": 1},
# {"text": "help", "score": 2},
# {"text": "hi", "score": 4}
# ]
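The score-then-sort pattern behind this function can be sketched in Python. The helper names below (levenshtein, rank_by_distance) are hypothetical and this is not the mq source; the sketch only shows one way to produce the same result shape.

```python
def levenshtein(s1: str, s2: str) -> int:
    """Edit distance via a two-row dynamic program."""
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (c1 != c2)))
        previous = current
    return previous[-1]


def rank_by_distance(candidates: list[str], query: str) -> list[dict]:
    """Score every candidate, then sort ascending (smaller distance first)."""
    scored = [{"text": c, "score": levenshtein(c, query)} for c in candidates]
    # sorted() is stable, so ties keep their original candidate order.
    return sorted(scored, key=lambda item: item["score"])


for item in rank_by_distance(["hallo", "hello", "hi", "help"], "hello"):
    print(item)
```

Run on the candidates from the example, the sketch yields distances 0, 1, 2, and 4 for "hello", "hallo", "help", and "hi", in that order.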
fuzzy_match_jaro(candidates, query)
Performs fuzzy matching using the Jaro distance algorithm. Returns results sorted by similarity score in descending order.
Parameters:
- candidates: Array of strings to search within
- query: String to search for
Returns:
- Array of objects with "text" and "score" properties, sorted by best match first
Example:
include "fuzzy"
# Fuzzy match using Jaro distance
| fuzzy_match_jaro(["hallo", "hello", "hi", "help"], "hello")
# Returns: [
# {"text": "hello", "score": 1.0},
# {"text": "hallo", "score": 0.8666666666666667},
# {"text": "help", "score": 0.7333333333333334},
# {"text": "hi", "score": 0.0}
# ]
fuzzy_filter(candidates, query, threshold)
Filters candidates by minimum fuzzy match score using Jaro-Winkler distance. Only returns matches that meet or exceed the specified threshold.
Parameters:
- candidates: Array of strings to search within
- query: String to search for
- threshold: Minimum score threshold (0.0 to 1.0)
Returns:
- Array of objects with "text" and "score" properties for matches at or above the threshold
Example:
include "fuzzy"
# Filter matches with minimum threshold
| fuzzy_filter(["hallo", "hello", "hi", "help"], "hello", 0.7)
# Returns: [
# {"text": "hello", "score": 1.0},
# {"text": "hallo", "score": 0.8666666666666667},
# {"text": "help", "score": 0.7333333333333334}
# ]
# Filter with high threshold
| fuzzy_filter(["hallo", "hello", "hi", "help"], "hello", 0.9)
# Returns: [
# {"text": "hello", "score": 1.0}
# ]
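Threshold filtering is independent of the particular similarity measure, which the Python sketch below tries to make visible. Everything here is an assumption for illustration: filter_by_threshold is a hypothetical helper, and the default scorer is the standard library's difflib.SequenceMatcher ratio, used as a stand-in rather than Jaro-Winkler, so its scores differ from the ones documented above.

```python
from difflib import SequenceMatcher
from typing import Callable


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0.0, 1.0]; not the Jaro-Winkler score."""
    return SequenceMatcher(None, a, b).ratio()


def filter_by_threshold(candidates: list[str], query: str, threshold: float,
                        scorer: Callable[[str, str], float] = similarity
                        ) -> list[dict]:
    """Keep candidates scoring at or above `threshold`, best match first."""
    scored = ({"text": c, "score": scorer(c, query)} for c in candidates)
    kept = [item for item in scored if item["score"] >= threshold]
    return sorted(kept, key=lambda item: item["score"], reverse=True)


print(filter_by_threshold(["hallo", "hello", "hi", "help"], "hello", 0.7))
```

The comparison uses >= so that a candidate scoring exactly the threshold is kept, matching the "meet or exceed" wording above. Passing a different scorer swaps the measure without touching the filtering logic.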
fuzzy_best_match(candidates, query)
Finds the best fuzzy match from candidates using Jaro-Winkler distance.
Parameters:
- candidates: Array of strings to search within
- query: String to search for
Returns:
- Object with "text" and "score" properties for the best match, or None if no matches are found
Example:
include "fuzzy"
# Find best match
| fuzzy_best_match(["hallo", "hi", "help"], "hello")
# Returns: {"text": "hallo", "score": 0.88}
# No matches case
| fuzzy_best_match([], "xyz")
# Returns: None
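The empty-input behavior is the main thing to handle when picking a single best match. The Python sketch below is a hypothetical illustration, not the mq implementation; best_match is an invented name, and the score again comes from difflib.SequenceMatcher as a stand-in for Jaro-Winkler.

```python
from difflib import SequenceMatcher
from typing import Optional


def best_match(candidates: list[str], query: str) -> Optional[dict]:
    """Return the highest-scoring candidate, or None when there are none."""
    if not candidates:
        return None  # mirrors the documented "no matches" case
    scored = [
        {"text": c, "score": SequenceMatcher(None, c, query).ratio()}
        for c in candidates
    ]
    # max() returns the first item among equal scores, so ties go to the
    # candidate that appears earliest in the input.
    return max(scored, key=lambda item: item["score"])


print(best_match(["hallo", "hi", "help"], "hello"))
print(best_match([], "xyz"))  # prints None
```

Callers should check for None before reading the "text" or "score" property of the result.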