
The example in the article with cat, grep and awk:

    cat *.pgn | \
    grep "Result" | \
    awk '
      {
        split($0, a, "-");
        res = substr(a[1], length(a[1]), 1);
        if (res == 1) white++;
        if (res == 0) black++;
        if (res == 2) draw++;
      }
      END { print white+black+draw, white, black, draw }
    '
This can be written much more succinctly with awk alone, and you don't even need to split the string or use substr:

    awk '
      /Result/ {
        if (/1\/2/) draw++;
        else if (/1-0/) white++;
        else if (/0-1/) black++;
      }
      END { print white+black+draw, white, black, draw }
    ' *.pgn
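
For anyone who wants to sanity-check the regex version, here is a minimal sketch that runs it on a made-up file. The file name and the four sample games are hypothetical, not from the article. The `+0` in the END block is also my addition: it makes awk print 0 rather than an empty string for a counter that was never incremented.

```shell
# Throwaway directory holding a hypothetical sample file.
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.pgn" <<'EOF'
[Event "Game 1"]
[Result "1-0"]
[Event "Game 2"]
[Result "0-1"]
[Event "Game 3"]
[Result "1/2-1/2"]
[Event "Game 4"]
[Result "1-0"]
EOF

# Same logic as the regex version above, reading the files directly.
counts=$(awk '
  /Result/ {
    if (/1\/2/) draw++;
    else if (/1-0/) white++;
    else if (/0-1/) black++;
  }
  END { print white+black+draw, white+0, black+0, draw+0 }
' "$tmpdir"/*.pgn)

echo "$counts"   # total, white wins, black wins, draws
rm -rf "$tmpdir"
```

With this sample it prints `4 2 1 1`.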


Keep reading; he removes the cat and grep in the final solution.


Yes, but he still keeps the awkward Awk code with the substr and such. I haven't benchmarked it; maybe that's faster than the pretty regex matches.
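
Before timing anything, it seems worth confirming that the two versions agree. A rough sketch, assuming synthetic data: the file name, the 30000 generated games and the two function names are all made up for illustration, and the real PGN files will behave differently.

```shell
tmpdir=$(mktemp -d)

# Generate hypothetical Result lines, cycling through the three outcomes.
awk 'BEGIN {
  r[0] = "1-0"; r[1] = "0-1"; r[2] = "1/2-1/2";
  for (i = 0; i < 30000; i++) printf "[Result \"%s\"]\n", r[i % 3];
}' > "$tmpdir/synthetic.pgn"

# The split/substr approach from the article.
substr_version() {
  awk '/Result/ {
    split($0, a, "-");
    res = substr(a[1], length(a[1]), 1);
    if (res == 1) white++;
    if (res == 0) black++;
    if (res == 2) draw++;
  }
  END { print white+black+draw, white, black, draw }' "$@"
}

# The regex-match approach.
regex_version() {
  awk '/Result/ {
    if (/1\/2/) draw++;
    else if (/1-0/) white++;
    else if (/0-1/) black++;
  }
  END { print white+black+draw, white, black, draw }' "$@"
}

a=$(substr_version "$tmpdir/synthetic.pgn")
b=$(regex_version "$tmpdir/synthetic.pgn")
echo "substr: $a"
echo "regex:  $b"
rm -rf "$tmpdir"
```

Both should print the same four numbers. To get a rough timing, prefix each call with `time` in bash and use a much larger input, keeping in mind that results will vary between awk implementations.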


I believe this is meant to be more instructive about how to build a pipeline. Also, building such solutions quickly and iteratively often leads to such "inefficiencies" but makes things easier to reason about. Besides, the awk step might have been factored out in the end, so it wouldn't make sense to optimise early. And by the time the author reaches the end, he is IO-bound, so there's not much need to optimise further (in the context of the exercise).



