Ummm, the author used awk for the first version in the 1990s. > This was easy to...

ChancyChance · on May 5, 2023

I know. I’m talking about Perl. Try to keep up.

eesmith · on May 5, 2023

Do pay attention.

You wrote "Prior to Perl doing this was painful.".

I highlighted how you contradicted the author's claim that it was easy to do using awk. Awk existed long before Perl.

eesmith · on May 5, 2023

As a concrete example, while my awk skills are extremely rusty, here's a program that normalizes each input line, builds a table mapping the normalized form to the matching original lines, and at the end displays only the sets with at least 5 anagrams.

I tried to avoid modern awk features, like asort, aiming for something that would have worked in the 1980s:

  {
      # Convert to normal form:
      #  1. Fold to lower case
      #  2. Bin the letters to get frequency counts
      #  3. Only consider lower case ASCII letters
      split(tolower($0), letters, "");
  
      # Ignore asort() in modern awks and do a bin sort instead.
      for (i in letters) {
          c = letters[i];
          repeats[c] = repeats[c] c;
      }
  
      normal_form = "";
      for (i=97; i<=122; i++) {
          c = sprintf("%c", i); # no chr() in a 1980s awk
          normal_form = normal_form repeats[c];
      }
  
      table[normal_form] = table[normal_form] "," $0
      delete repeats;
  }
  END {
      # Only show the ones with at least 5 matches
      for (i in table) {
          match_str = table[i];
          # split() returns the element count; the leading "," adds one empty field
          num_matches = split(match_str, matches, ",") - 1;
          if (num_matches >= 5) {
              # print the number of matches, then the match string
              printf("%d%s\n", num_matches, match_str);
          }
      }
  }

When I try it on a word list I have handy, here are the largest anagram sets:

  % awk -f anagram.awk < words_alpha.txt | sort -n -t, | tail -5
  13,elaps,lapse,leaps,lepas,pales,peals,pleas,salep,saple,sepal,slape,spale,speal
  14,anestri,antsier,asterin,eranist,nastier,ratines,resiant,restain,retains,retinas,retsina,stainer,starnie,stearin
  14,apers,apres,asper,pares,parse,pears,prase,presa,rapes,reaps,repas,spaer,spare,spear
  14,arest,aster,astre,rates,reast,resat,serta,stare,strae,tares,tarse,tears,teras,treas
  15,alerts,alters,artels,estral,laster,lastre,rastle,ratels,relast,resalt,salter,slater,staler,stelar,talers

Certainly Perl is more succinct, though note that even up to Perl 4 in the early 1990s you would need to use the string concatenation method to store the list of matches in the table.

But, "painful"? No. Not to someone who knew how to use awk.

ChancyChance · on May 5, 2023

What’s the word for when snark backfires?

DonHopkins · on May 5, 2023

snark backfires = barf earns kicks

doctor_eval · on May 5, 2023

“Sorry”?

hgsgm · on May 5, 2023

krans