
The process where resources accrue to those with more resources is called the Matthew Effect. It explains, amongst other things, why the degree distribution of social networks follows a power law.

There's a nice experimental test of this: showing the number of previous downloads a song has makes it more likely to be downloaded (though not to the extent that it entirely overrides the song's quality). <https://www.princeton.edu/~mjs3/salganik_dodds_watts06_full....>


> Examples include converting boxplots into violins or vice versa, turning a line plot into a heatmap, plotting a density estimate instead of a histogram, performing a computation on ranked data values instead of raw data values, and so on.

Most of this is not about Python, it’s about matplotlib. If you want the admittedly very thoughtful design of ggplot in Python, use plotnine.
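
For instance, a minimal plotnine sketch (mpg is a demo dataset bundled with the package; this is just to show the grammar-of-graphics style, not a full recipe):

  from plotnine import ggplot, aes, geom_violin
  from plotnine.data import mpg

  # ggplot2-style grammar of graphics in Python: map columns to
  # aesthetics, then add a geom layer.
  p = ggplot(mpg, aes(x="class", y="hwy")) + geom_violin()
  p.save("violin.png")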

> I would consider the R code to be slightly easier to read (notice how many quotes and brackets the Python code needs)

This isn’t about Python, it’s about the tidyverse. The reason you can use this simpler syntax in R is that its non-standard evaluation allows packages to extend the syntax in a way Python does not expose: http://adv-r.had.co.nz/Computing-on-the-language.html
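
To make the contrast concrete, a small sketch of the Python side: column references stay quoted strings, and the closest pandas gets to bare-name syntax is an expression mini-language parsed out of a string at runtime:

  import pandas as pd

  df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

  # Standard evaluation: column names must be quoted strings.
  subset = df[df["x"] > 1]

  # query() parses a string expression instead; the nearest thing
  # pandas has to R's non-standard evaluation.
  same = df.query("x > 1")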


"The reason you can use this simpler syntax in R is because it’s non-standard-evaluation ..."

So it actually is about Python vs R.

That said, while this kind of non-standard evaluation is nice when working interactively on the command line, I don't think it's that relevant when writing code for more elaborate analyses. In that context, I'd actually see this as a disadvantage of R, because you suddenly have to jump through hoops to make trivial things work with that non-standard evaluation.


The increasing prevalence of non-standard evaluation in R packages was one of the major reasons I switched from R to Python for my work. The amount of ceremony and constant API changes just to pass something as an argument to a function drove me mad.


> and constant API changes

Yeah, this was so very very painful. I once ended up maintaining a library that basically used all the different NSE approaches, which was not very much fun at all.


>> I would consider the R code to be slightly easier to read (notice how many quotes and brackets the Python code needs)

Oh god no, do people write R like that, pipes at the end? Elixir-style pipe operators at the beginning are the way.

And if you really wanted to "improve" readability by confusing arguments/functions/vars just to omit quotes, Python can do that: you'll just need a wrapper object and getattr hacks to get from `my_magic_strings.foo` -> `'foo'` (sketch below). As for the brackets... OK, that's a legitimate improvement, but again it's not language related, it's library API design for function sigs.
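
Something like this hypothetical wrapper, for the record:

  class MagicStrings:
      # Attribute access just returns the attribute's own name as a
      # string, so magic.foo == "foo".
      def __getattr__(self, name):
          return name

  magic = MagicStrings()
  assert magic.foo == "foo"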


The right way is putting the pipe operator at the beginning of the expression.

  (-> (gather-some-data)
    (map 'Vector #'some-functor)
    (filter #'some-predicate)
    (reduce #'some-gatherer))
Or for those who have an irrational fear of brackets:

  ->
    gather-some-data
    map 'Vector #'some-functor
    filter #'some-predicate
    reduce #'some-gatherer
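
For comparison, a hedged Python sketch of the same shape; pipe here is a hypothetical helper, not a built-in:

  from functools import reduce

  def pipe(value, *fns):
      # Thread value through fns left to right, a minimal stand-in
      # for a pipe operator.
      return reduce(lambda acc, fn: fn(acc), fns, value)

  total = pipe(
      range(10),
      lambda xs: [x * x for x in xs],       # map
      lambda xs: [x for x in xs if x % 2],  # filter
      sum,                                  # reduce
  )
  assert total == 165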


IIRC, putting the pipe operator `|>` at the end of a line prevents the expression from terminating early. Otherwise the newline would terminate it.


Upvoted for pipes at the beginning


Or seaborn. It was built exactly for this purpose: abstracting some of the annoying kinks of matplotlib while still offering a rich set of features.

https://seaborn.pydata.org/tutorial/introduction.html
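
For example, a minimal sketch ("tips" is a demo dataset that seaborn downloads on first use):

  import seaborn as sns

  # One high-level call instead of several matplotlib ones.
  tips = sns.load_dataset("tips")
  sns.violinplot(data=tips, x="day", y="total_bill")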


I wonder what the last example, "logistic regression without libraries", would look like in R. Based on my experience of having to do "low-level" R, it's gonna be a true horror show.

In R, things for which there are ready-made libraries and recipes are easy, but when those don't exist, things become extremely hard. And the usual approach is that if something isn't easy with a library recipe, it just doesn't get done.


Python: easy things are easy, hard things are hard.

R: easy things are hard, hard things are easy.


The way you describe it, can we say that R was AI-first without even knowing it?


R is overtly and heavily inspired by Lisp, which was a big deal in AI at one point. They knew what they were doing.


> This isn’t about Python, it’s about the tidyverse.

> its non-standard evaluation allows packages to extend the syntax in a way Python does not expose

Well this is a fundamental difference between Python and R.


The point is that the ability to extend the syntax of R leads to chaos and mess (in general) but when used correctly and effectively in the tidyverse, improves the experience of writing and reading code.


Python is nothing without its batteries.


The design and success of e.g. Golang is pretty strong support for the idea that you can't and shouldn't separate a language from its broader ecosystem of tooling and packages.


The success of python is due to not needing a broader ecosystem for A LOT of things.

They are of course now abandoning this idea.


> The success of python is due to not needing a broader ecosystem for A LOT of things.

I honestly think that was a coincidence. Perl and Ruby had other disadvantages, Python won despite having bad package management and a bloated standard library, not because of it.


The bloated standard library is the only reason I kept using Python in spite of the packaging nightmare. I can do most things with no dependencies, or with one dependency I need over and over, like matplotlib.

If Python had been lean and needed packages to do anything useful, while still having a packaging nightmare, it would have been unusable.
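
A small illustration of that point, standard library only (hypothetical data, just to show the shape):

  import csv
  import io
  import statistics

  # Parse CSV and compute a summary statistic with zero
  # third-party dependencies.
  raw = "name,score\na,1\nb,2\nc,4\n"
  rows = list(csv.DictReader(io.StringIO(raw)))
  print(statistics.mean(float(r["score"]) for r in rows))  # ~2.33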


Well, sure, but equally I think there would have been a lot more effort to fix the packaging nightmare if it had been more urgent.


There was a massive effort though, the proliferation of several different package managers is evidence of that.


Maybe. A lot of them felt like one-person projects that not many people cared about. I think that on the contrary, part of the reason so many different package managers could coexist with no clear winner emerging was that the problem wasn't very serious for a lot of the community.


The bloated standard library is the reason why you can send around a single .py file to others and they can execute it instantly.

Most Python users are not aware of, let alone able to use, venv, uv, pip and all of that.


It's because Ruby captured the web market and Python everything else, and I guess "everything else" is more timeless than a single segment.


Ruby was competing in the web market and lost to many others, including Python. In part because Python had a much broader ecosystem, PHP had wide adoption through WordPress and others, and JavaScript was expanding beyond browsers.


Python is its batteries.


But why, whenever I try to use it, does it try to hurt me, like it's kicking me right in my batteries?


What language is used to write the batteries?


C/C++, in large part


These days it's a whole lot of Rust.


These days it’s still a whole lot of Fortran, with some Rust sprinkled on top. (:


Which since Fortran 2003, or even Fortran 95, has gotten rather nice to use.


IDK it's become too verbose IMHO, looks almost like COBOL now. (I think it was Fortran 66 that was the last Fortran true to its nature as a "Formula Translator"...)


We are way beyond comparing languages to COBOL, now that plenty of folks type whole book-sized descriptions into tiny chat windows for their AI overlords.


And below that, FORTRAN :)


I hear this so much from Python people -- almost like they are paid by the word to say it. Is it different from Perl, Ruby, Java, or C# (DotNet)? Not in my experience, except people from those communities don't repeat that phrase so much.

The irony here: We are talking about data science. 98% of "data science" Python projects start by creating a virtual env and adding Pandas and NumPy, which have numerous (really: squillions of) dependencies outside the standard library.


Someone correct me if I'm completely wrong, but by default (i.e. precompiled wheels) numpy has 0 dependencies and pandas has 5, one of which is numpy. So not really "squillions" of dependencies.

  pandas==2.3.3
  ├── numpy [required: >=1.22.4, installed: 2.2.6]
  ├── python-dateutil [required: >=2.8.2, installed: 2.9.0.post0]
  │   └── six [required: >=1.5, installed: 1.17.0]
  ├── pytz [required: >=2020.1, installed: 2025.2]
  └── tzdata [required: >=2022.7, installed: 2025.2]
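
To reproduce this check locally, a stdlib-only sketch (assumes pandas is installed in the current environment):

  from importlib.metadata import requires

  # Runtime requirements declared by the installed pandas
  # distribution (returns None if it declares none).
  for req in requires("pandas") or []:
      print(req)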


Read https://numpy.org/devdocs/building/blas_lapack.html.

NumPy will fall back to internal and very slow BLAS and LAPACK implementations if your system does not have a better one. But assuming you're using NumPy for its performance, and not just for the convenience of adding array-programming features to Python, you're really gonna want better ones, and which one is best depends heavily on the computer you're using.

This isn't really a Python thing, though. It's a hard problem to solve with any kind of scientific computing. If you insist on using a dynamic interpreted language, which you probably have to do for exploratory interactive analysis, and you still need speed over large datasets, you're gonna need to have a native FFI and link against native libraries. Thanks to standardization, you'll have many choices and which is fastest depends heavily on your hardware setup.
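
To see which backend a given install actually linked against, one quick check:

  import numpy as np

  # Reports the BLAS/LAPACK implementation this NumPy build links
  # against (e.g. OpenBLAS, MKL, or the slow reference fallback).
  np.show_config()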


The wheels will most likely come with OpenBLAS, so while you can get the original BLAS (which is really only slow by comparison; for small tasks users likely won't notice), this is generally not an issue.


I don't know about _squillions_, but numpy definitely has _requirements_, even if they're not represented as such in the python graph.

e.g.

  https://github.com/numpy/numpy/blob/main/.gitmodules (some source code requirements)
  https://github.com/numpy/numpy/tree/main/requirements (mostly build/ci/... requirements)
  ...


They're not represented because those are build-time dependencies. Most users, when they do pip install numpy or equivalent, just get the precompiled binaries, and none of those get installed. And even if you compile it yourself, you still don't need those for running numpy.


It's not about Python, it's about how R lets you do something Python can't?


R is more of a statistical package than a programming language. So, if you are a so-called "statistician," then R will feel familiar to you.


No, R is a serious general purpose programming language that is great for building almost any type of complex scientific software with. Projects like Bioconductor are a good example.


Perhaps in the context of a comparison with Python?

In my limited experience, using R feels like using JavaScript in the browser: it's a platform heavily focused on advanced, feature-rich objects (such as DataFrames and specialized plot objects), but you can also just build almost anything with it.


No, it's not. Even established packages have bugs caused by R weirdness. I like it nevertheless.


Yes, R is a proper general-purpose programming language: Turing-complete, functional, procedural, object-oriented...


Just in case someone reads this far and sees blubber's confident "No." Blubber is definitely wrong here. I used to do all of my programming in R. Throw the question into an LLM if you're wondering if R has a package like ___ in python.


I know people who used Visual Basic for all of their programming. I'd say no either way, unless people explained to me, without bursting out into laughter, that they also have extensive experience with, e.g., Kotlin, Rust, C#, Java, etc. and still prefer VB or R for non-trivial programs.


Of course R isn't a compiled language, and it's probably not in the same category as C/Rust as a systems language, but it's not in the same category as VB either. R is a serious scientific programming language used in non-trivial programs for industrial applications. See Posit's customers. I suggest John Chambers's ( https://en.wikipedia.org/wiki/John_Chambers_(statistician) ) book Software for Data Analysis ( https://link.springer.com/book/10.1007/978-0-387-75936-4 ), where he explains how he designed the S language, R's grandfather so to speak.


This isn't about compilation vs interpretation. R is simply badly designed as a programming language. This doesn't change just because its inventor wrote a book.


blubber, I think there might be some misconceptions. Just for the record.

R is not actually competing with those languages. R's design purpose is different: it is a general-purpose computational language for scientists. There are FFIs (Foreign Function Interfaces) for all those languages.

R-Kotlin-Java: https://journal.r-project.org/articles/RJ-2018-066/
R-Rust: https://cran.r-project.org/web/packages/using_rust.html
R-C#: https://github.com/Open-Systems-Pharmacology/rsharp/

R supports C integration natively anyhow (see Chambers's book).

Regarding the VB reference: VB was used a lot in finance to do some advanced maths. Just a side remark.
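
Going the other direction, Python can call into R as well; a minimal rpy2 sketch (assuming rpy2 and an R installation are present):

  import rpy2.robjects as ro

  # Evaluate an R expression from Python and pull the result back.
  result = ro.r("mean(c(1, 2, 4))")
  print(result[0])  # 2.333...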


I do.

And I'm still waiting for your examples of "established R packages with bugs caused by R weirdness".


Care to give some examples?


I already did in my comment


Hmm? I was referring to blubber's claim that "established packages have bugs caused by R weirdness."


This specific analysis isn’t p-hacking because although they conduct multiple tests, they report all of them rather than just the statistically significant ones.

They should however account for multiple testing. The Bonferroni correction (which is conservative) would set the alpha level to 0.05/5 = 0.01, for which the 1-day-after result is still (just) statistically significant.
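
In statsmodels terms, something like this (the p-values below are hypothetical, purely to show the mechanics):

  from statsmodels.stats.multitest import multipletests

  # Five made-up p-values standing in for the five tests.
  pvals = [0.008, 0.03, 0.20, 0.50, 0.90]
  reject, p_adj, _, _ = multipletests(pvals, alpha=0.05,
                                      method="bonferroni")
  print(reject)  # only the first survives the 0.05/5 = 0.01 cutoff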

Not to say there couldn’t be other problems.


> Thread is called “fils” in French, meaning “son”, considering channels are parents of threads in a sense.

Could be, but “fil” also literally means “thread” in the sewing sense.


You need assumptions exactly for the things which cannot be tested against reality. Otherwise why assume rather than measure?


> Otherwise why assume rather than measure?

You need a model to design an experiment.

This is how all science is done. You hypothesize. Then experiment.


all of

> designed a dozen sites I never published
> I was never on time
> I couldn't get up for class on time

And depending on why

> don't drive
> I feel like an alien, and most everyone drives me insane.
> most of them I cut off without a word, and those that reach out I resent

also sounds like how someone with ADHD could describe themselves, and the other issues could be downstream from that


Can buy-and-hold investors use margin?


Of course, but margin is expensive for retail people.


This study can't see past its own midwit view that there is an objective “detailed, literal” reading that necessarily produces the same interpretation of the text that the authors have.

The students in the study are responding in a rational way to the way HS English is taught: the pretense is that you're deriving meaning/themes/symbolism from the text, but these interpretations are often totally made-up[^0] to the extent that authors can't answer the standardized tests about their own work[^1]. The real task is then to flatter the teacher/professor/test-setter's preconceptions about the work — and if the goal is to guess some external source's perspective, why shouldn't that external source be SparkNotes?

This ambivalent literalism is evident in the paper itself:

- one student is criticized for "imagin[ing] dinosaurs lumbering around London", because the authors think this language is obviously "figurative". But it's totally plausible that Dickens was a notch more literal than only describing the mud as prehistoric! In the mid-1850s the first descriptions and statues of dinosaurs were being produced, and there was a common theory that prehistoric lizards were as developed as present mammals, so maybe he's referring to (or making fun of) that idea?

- the authors criticize readers for relying on SparkNotes instead of looking up individual words in the dictionary. But "Chancery" has ~8 definitions, only one of which is about a court, and "advocate" has ~4. Is it more competent to guess which of those 32 combinations is correct, or to look up the meaning of the whole passage instead? There are whole texts dedicated to explaining other texts, especially old ones — does pulling from those make you a bad reader?

- they say that a student only locates the fog vaguely rather than seeing that "it moves throughout the shipyards". That's not in the text though: the fog is only described as moving laterally in two of the locations, and never between different parts of the yard. Maybe the fog is instead being generated in each ship and by each person, as is the confusion in the High Court of Chancery? (More pedantically still: are all these boats just being built? If not, wouldn't they be at docks or wharfs rather than shipyards?)

I think the underlying implicit belief is that there is always one correct interpretation of the text, at one exactly correct level of literalness, derivable from only the text itself. But by the time students are in college, they will have been continuously rebuffed for attempting literal interpretations that don't produce the required result, and unsurprisingly they end up unsure which parts of understanding are mechanical and which are imaginative.

[^0] https://www.theparisreview.org/blog/2011/12/05/document-the-...
[^1] https://www.huffpost.com/entry/standardized-tests-are-so-bad...


I think this (jokingly) assumes that PhDs become academics, which is not always true. Also, PhDs are generally free and if you leave early you get a Master's anyway. Terminal Master's you have to pay for.


Let's say that The Rebel Times has a headline "Member of the Imperial Senate on a diplomatic mission boarded and arrested without cause" while the Empire Daily reports that "Leia Organa, part of the Rebel Alliance and a traitor, taken into custody". Following your process, the "what" is just that Leia was arrested.

Then, the Rebel Times says "Moisture farmer with magic powers joins fight against Empire", but the Empire Daily has "Moisture farmer joins fight against Empire". The common whats are just that a moisture farmer joined the Rebel Alliance, which is true, but much less consequential than if he had magic powers.

Later, the Rebel Times says "Secret Empire super-weapon destroyed at the Battle of Yavin", and the Empire Daily publishes... nothing because they don't want to admit defeat. There's no common information between these stories (because there is no second story), so looking for common whats would conclude that nothing happened.

If the process of analysing the news accounted for the fact that the different outlets are interested in presenting different whats, it could conclude that the fact that the Empire Daily published nothing about the third story doesn't mean that it didn't happen. In the second case, if it could account for the Empire wanting to suppress information about the Force, the conclusion would be that Luke joining the Alliance is somewhat more of a big deal than otherwise. Even in the first case, it might realise that the fact that the two sources don't agree about Leia doesn't mean that one side isn't right.

