•  Lvxferre   ( @lvxferre@lemmy.ml ) 
      English
      2
      10 months ago

      He tried it, in a rather dumb way, comparing whole strings; e.g. 123 Main St, Brooklyn, NY 11217 vs. 124 Main St, Brooklyn, NY 11217.

      It’s silly because his whole approach to the problem was assumptive. It’s fine to say “I don’t know”, or to code a program that says so. And yet he’s trying to dichotomise the program’s output into “same” vs. “different”.
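
      To make the “I don’t know” option concrete, here is a rough sketch of a comparison with three outcomes instead of two. Everything in it (the `Verdict` names, `normalise`, and the “different numbers mean different addresses” rule) is my own assumption for illustration, not what the article’s author did.

```python
import re
from enum import Enum


class Verdict(Enum):
    SAME = "same"
    DIFFERENT = "different"
    UNKNOWN = "unknown"


def normalise(address: str) -> str:
    """Lowercase and collapse whitespace; deliberately nothing smarter."""
    return re.sub(r"\s+", " ", address.strip().lower())


def compare(a: str, b: str) -> Verdict:
    na, nb = normalise(a), normalise(b)
    if na == nb:
        return Verdict.SAME
    # Assumption: differing digit runs (house number, ZIP) mean different
    # places. Crude: it would misjudge "1st Ave" vs "First Ave".
    if re.findall(r"\d+", na) != re.findall(r"\d+", nb):
        return Verdict.DIFFERENT
    # Same numbers, different wording: refuse to guess.
    return Verdict.UNKNOWN
```

      With this, `compare("123 Main St, Brooklyn, NY 11217", "124 Main St, Brooklyn, NY 11217")` gives `Verdict.DIFFERENT`, while `compare("123 Main St", "123 Main Street")` gives `Verdict.UNKNOWN` rather than a forced yes/no.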

        •  Lvxferre   ( @lvxferre@lemmy.ml ) 
          English
          1
          10 months ago

          Yup - it’s stupid. The catch is that the text is yet another example of people hyping generative bots and trying to “sell” them as the solution for everything and a bit more; one of the ways to do that is to make the alternative look worse than it is, for example by incorrectly using the other tools at your disposal.

          Even then I wouldn’t use fuzzy string matching here; it’s bound to introduce more false positives than it’s worth, such as Ant Street and Aunt Street matching (Levenshtein distance = 1). In those cases it’s simply better to say “dunno”.
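
          For anyone who wants to check the distance, here is a minimal sketch of the textbook dynamic-programming Levenshtein distance (insertions, deletions and substitutions each cost 1). The function name is mine, and it’s not tied to any particular library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


print(levenshtein("Ant Street", "Aunt Street"))
print(levenshtein("123 Main St, Brooklyn, NY 11217",
                  "124 Main St, Brooklyn, NY 11217"))
```

          Both pairs come out at distance 1, so any threshold loose enough to forgive a one-character typo also merges two different streets and two neighbouring house numbers.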