If you use and like `ag`, I suggest taking a look at ripgrep (`rg`). It seems to be by far the fastest of the three (`ack`, `ag`, `rg`), and it has a pretty interesting codebase (written in Rust).
If you're working in a git repository, then IMO the most appropriate search tool is simply `git grep`. I don't think there's any reason to use ripgrep, ag, ack, etc. in that situation. (Personally, if I'm working with text files, I'm nearly always in a git repo.)
Well, at least one reason is that ripgrep is faster. On simple literal queries they have comparable speed, but beyond that, `git grep` is _a lot_ slower. Here's an example on a checkout of the Linux kernel:
$ time rg '\w+_PM_RESUME' | wc -l
8
real 0.127
user 0.689
sys 0.589
maxmem 19 MB
faults 0
$ time LC_ALL=C git grep -E '\w+_PM_RESUME' | wc -l
8
real 4.607
user 28.059
sys 0.442
maxmem 63 MB
faults 0
$ time LC_ALL=en_US.UTF-8 git grep -E '\w+_PM_RESUME' | wc -l
8
real 21.651
user 2:09.54
sys 0.413
maxmem 64 MB
faults 0
ripgrep supports Unicode by default, so it's actually comparable to the LC_ALL=en_US.UTF-8 variant.
There are other reasons. It's nice to use a single tool for searching in all circumstances, and ripgrep can fill that role. And in case you didn't know: ripgrep respects your .gitignore file.
Thanks! I knew ripgrep was praised in particular for its performance, but I didn't know the difference was that large. The repo I usually work in has 8.7M lines of code, and I had been finding `git grep` performance very adequate (I use it in combination with the Emacs helm library, where it forms part of an incremental search UI and hence gets called multiple times in quick succession in response to changing search input). It should be fun to swap in ripgrep as the helm search backend; I'll give it a try.
I wouldn't recommend anyone use this. Besides the really poor implementation quality of the algorithms (some of which are outright incorrect), the code in that repo is anything but Pythonic: it reimplements a lot of the standard library, avoids list, dict, and set comprehensions, uses indexes instead of iterators, copies things around for no reason, etc. They didn't even bother to run a linter to PEP8-ify it.
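To illustrate the kind of thing I mean (a made-up example in the same spirit, not code taken from that repo):

# Non-Pythonic: manual indexing plus a full list copy on every append.
def evens_unpythonic(numbers):
    result = []
    for i in range(len(numbers)):
        if numbers[i] % 2 == 0:
            result = result + [numbers[i]]
    return result

# Pythonic: iterate directly and use a comprehension.
def evens(numbers):
    return [n for n in numbers if n % 2 == 0]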
To the people commenting that $35 is too much for the content included: an equivalent set of channels from Comcast/XFINITY would cost you almost 3x more. So it's a no-brainer for me, and the fact that I don't have to deal with Comcast is worth even more than saving ~60% of my monthly cable bill.
This of course is only true if you pay for cable TV.
I have Netflix and Amazon and the rest of the internet. I had an HDHomeRun hooked up to an over-the-air antenna, but since I moved I haven't set it up again, as I'd need a huge mast to get reception.
You can also use a VPN-like service to get worldwide streams.
I do pay Comcast for a business internet connection, however, to stay away from the data caps.
Not sure why you're getting downvoted; I came here to make the same comment.
QPM is a useless metric. When talking about distributed systems from an engineering point of view, you always want to use QPS. QPM is simply not fine-grained enough to show whether the traffic is bursty. In this particular case, 1M QPM could mean anything: they might be idle for 50 seconds and then get 100k QPS for the next ten, or they might be getting a steady ~15k QPS the whole time (as is visible on the graph). Distributed systems are designed for the peak workload, not the average one, so misleading numbers like QPM lead to bad design and sizing decisions.
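A quick back-of-the-envelope sketch of the gap, using the numbers from above (the steady figure is idealized):

QUERIES_PER_MINUTE = 1_000_000

# Steady traffic: the load is spread evenly across the minute.
steady_qps = QUERIES_PER_MINUTE / 60        # ~16,667 QPS

# Bursty traffic: idle for 50 s, then all 1M queries arrive in 10 s.
bursty_peak_qps = QUERIES_PER_MINUTE / 10   # 100,000 QPS

# Both systems report "1M QPM", but the bursty one must be
# provisioned for a peak roughly 6x higher.
print(steady_qps, bursty_peak_qps, bursty_peak_qps / steady_qps)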
The only case where you would use QPM, QPD, and similar metrics is when you want to make your numbers look bigger than they are (10M transactions a day sounds better than ~115 transactions a second). But those should be used by sales, not by engineers.
I read it originally as 1M QPS, and thought that was a nice number. It was upon further inspection that I saw it was 1M QPM, and I was no longer intrigued.
I'm not sure if my org qualifies; it depends on how you count it, I guess. We have 80k+ RPS at times at SendGrid, and each request can generate 4 to 8 external events and at least a dozen internal API calls. If you count total internal QPS, that would be something on the order of 80k * 4 * 12 ~= 3.8M QPS (I'd have to check with an operations person to see if that holds up), but I don't know if it's fair to count that. So, let's go back to the 80k RPS. If someone were doing 10x that, I'd be intrigued to learn more about their setup for sure. I imagine the Googles, Facebooks, and Amazons of our industry do this level of traffic.
It's basically the Sieve of Eratosthenes[1] algorithm, and it's possible to do with regular expressions because the numbers here are represented in the unary[2] numeral system, where the number of characters/tokens equals the number itself. It's a common trick for testing various Turing-machine stuff.
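For the curious, the classic unary trick can be sketched in a few lines of Python (this is the well-known composite-matching regex, not necessarily the exact one from the article):

import re

# Matches 0, 1, and any composite written in unary: the group (11+?)
# captures a candidate factor of length >= 2, and \1+ requires the rest
# of the string to be that same factor repeated, i.e. n = k * m with
# k >= 2 and m >= 2.
NOT_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n):
    return not NOT_PRIME.match("1" * n)

print([n for n in range(20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]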
Sometimes I find typing grep/cut/awk/etc. easier to remember than custom flags, and thus faster to type. Oftentimes the time I'd spend looking through the man page is better spent just writing a more verbose command line.
+1. You can see the same effect in natural language: modern English has fewer tenses and declensions and makes heavier use of helper words, as contrasted with Old English. Same with Latin vs. the modern Romance languages.
One thing I was pretty annoyed about while testing the (server) beta and alpha was the Cockpit web UI, which is enabled by default. I know it's easy to disable with `systemctl disable cockpit.socket`, but if you select a "minimal/base install" you shouldn't get a full-blown web UI management console installed and enabled by default.
I'm interested in this issue. As I understand it, Cockpit shouldn't be included in a minimal install. It is included in the default Fedora Server install.
If you like, join us on IRC in #cockpit on FreeNode, and we can work through this there.
It's installed by default but not running by default. So the only resources it consumes are a small amount of disk space, plus one listening socket so that it autostarts if you try to use it (by connecting to example.com:8888 or whatever it is).
If it's listening on a socket and spins up on a request sent to the port that socket is bound on, for all intents and purposes, it is running and enabled. This is the same behavior as old inetd-based servers.
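For anyone unfamiliar with the mechanism, here's a rough sketch of systemd-style socket activation (illustrative only, not Cockpit's actual code):

import os, socket

# systemd binds the port itself and starts the service only when a
# client connects, handing over the listening socket as fd 3.
SD_LISTEN_FDS_START = 3

if os.environ.get("LISTEN_FDS") == "1":
    # Socket-activated: adopt the already-listening socket from systemd.
    # In this case, this process didn't exist until a client connected.
    server = socket.socket(fileno=SD_LISTEN_FDS_START)
else:
    # Started by hand: bind and listen ourselves.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("0.0.0.0", 9090))  # Cockpit's default port
    server.listen()

conn, addr = server.accept()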
Correct, if you try to open http://<hostname>:9090/ it will get started automatically. I'm not concerned about the resources it uses; I just don't like having services I don't use installed and listening on ports in a minimal install, especially on servers.
Even if the API gave you strong durability guarantees, it still wouldn't mean much. Disk caches, big enterprise SAN-attached storage, etc. can also "cheat", saying they flushed the cache when they actually didn't.
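To make that concrete, here's the textbook application-level durability dance (a sketch; the function name is mine). Even when every call below succeeds, a cache that lies about flushing can still lose the data on power failure:

import os

def durable_write(path, data):
    # Write bytes to a temp file, fsync it, rename it into place, then
    # fsync the directory so the rename itself is persisted.
    tmp = path + ".tmp"
    fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        os.fsync(fd)   # the kernel says the data hit stable storage...
    finally:
        os.close(fd)
    os.rename(tmp, path)
    dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dfd)  # ...but a lying disk cache below can still drop it
    finally:
        os.close(dfd)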
A trustworthy API would at least allow blaming the right cog in the machinery. As it stands, everyone gets a Get Out Of Jail Free card: the kernel, filesystem drivers, userspace libraries, application developers, and disk controllers can all point at one another, and nothing forces a single direction in that cyclic graph of blame.