
It's an interesting line of inquiry to think about how many of these evaluation heuristics, which are all described as things a person can do manually, could instead be built into the package manager itself to do for you automatically.

The package manager could run the package's test suite, for instance, and warn you if the tests don't all pass, or make you jump through extra hoops to install a package that doesn't have any test coverage at all. The package manager could read the source code and tell you how idiomatically it was written. The package manager could try compiling from source with warnings on and let you know if any are emitted, and compare the compiled artifacts with the ones that ship with the package to ensure that they're identical. The package manager could check the project's commit history and warn you if you're installing a package that's no longer actively maintained. The package manager could check whether the package has a history of entries in the National Vulnerability Database. The package manager could learn what licenses you will and won't accept, and automatically filter out packages that don't fit your policies. And so on.

In other words, the problem right now is that package managers are undiscriminating. To them a package is a package is a package; the universe of packages is a flat plane where all packages are treated equally. But in reality all packages aren't equal. Some packages are good and others are bad, and it would be a great help to the user if the package manager could encourage discovery and reuse of the former while discouraging discovery and reuse of the latter. By taking away a little friction in some places and adding some in others, the package manager could make it easy to install good packages and hard to install bad ones.
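To make that concrete, here's a minimal sketch of what such a pre-install gate might look like. Everything here is invented for illustration (the report fields, the thresholds, the helper names); it isn't how any existing package manager behaves:

    # Sketch of a pre-install "gate": the package manager collects a few
    # facts about a package and decides how much friction to add before
    # installing it.  All field names and thresholds are illustrative.
    from dataclasses import dataclass
    from datetime import datetime, timedelta


    @dataclass
    class PackageReport:
        name: str
        tests_passed: bool | None       # None = no test suite found
        compiler_warnings: int
        artifact_matches_source: bool   # rebuilt artifact == shipped artifact
        last_commit: datetime
        known_cves: int
        license: str


    def evaluate(report: PackageReport, accepted_licenses: set[str]) -> list[str]:
        """Return a list of warnings; an empty list means 'install freely'."""
        warnings = []
        if report.tests_passed is None:
            warnings.append("no test suite found")
        elif not report.tests_passed:
            warnings.append("test suite fails")
        if report.compiler_warnings:
            warnings.append(f"{report.compiler_warnings} compiler warnings")
        if not report.artifact_matches_source:
            warnings.append("shipped artifact does not match a clean rebuild")
        if datetime.now() - report.last_commit > timedelta(days=2 * 365):
            warnings.append("no commits in the last two years")
        if report.known_cves:
            warnings.append(f"{report.known_cves} entries in the NVD")
        if report.license not in accepted_licenses:
            warnings.append(f"license {report.license!r} not in your policy")
        return warnings


    if __name__ == "__main__":
        report = PackageReport(
            name="example-package",
            tests_passed=None,
            compiler_warnings=3,
            artifact_matches_source=False,
            last_commit=datetime(2016, 1, 1),
            known_cves=1,
            license="WTFPL",
        )
        for w in evaluate(report, {"MIT", "Apache-2.0", "BSD-3-Clause"}):
            print("WARNING:", w)

The "friction" part would then be policy: an empty warning list installs silently, anything else requires an explicit confirmation flag.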



NPM (the Node.js package manager) has started doing that, showing three combined metrics for each package in the search results. It assesses every package by popularity (number of downloads), quality (a dozen or so small heuristics about how carefully the project has been put together) and maintenance (whether the project is actively maintained, keeps its dependencies up to date and has more closed than open issues on GitHub). The idea is that when you search for a package, you can see at a glance the relative quality and popularity of the modules you're choosing between.

It's not perfect - there's no way to tell if the packages under consideration are written in a consistent style or if they have thorough unit tests - but it's a clever idea. And by rating packages on these metrics they encourage a reasonable set of best practices (write a readme and a changelog, use whitelists / blacklists, close issues on GitHub, etc.). The full list of metrics is here:

https://itnext.io/increasing-an-npm-packages-search-score-fb...
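You can also inspect the sub-scores yourself: the npm registry exposes them through its search endpoint. A small sketch below; the response shape is as I understand it today, so treat the field names as an assumption rather than a stable contract:

    # Query the npm registry search endpoint and print the quality /
    # popularity / maintenance sub-scores it reports for each result.
    import json
    import urllib.parse
    import urllib.request

    query = urllib.parse.urlencode({"text": "left-pad", "size": 5})
    url = f"https://registry.npmjs.org/-/v1/search?{query}"

    with urllib.request.urlopen(url) as resp:
        results = json.load(resp)

    for obj in results["objects"]:
        pkg = obj["package"]
        detail = obj["score"]["detail"]
        print(f"{pkg['name']:<30}"
              f" quality={detail['quality']:.2f}"
              f" popularity={detail['popularity']:.2f}"
              f" maintenance={detail['maintenance']:.2f}")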


To some extent, R works like this... packages only make it onto CRAN if they pass automated checks, and there's a pretty strong culture of testing with testthat. You can host your own package on GitHub or an external repo, but installing it then becomes a non-standard, extra step.


Badges on GitHub seem to be getting more popular, a manual way to show this information in project readme files: https://github.com/badges/shields


> The package manager could run the package's test suite, for instance, and warn you if the tests don't all pass

That's what CPAN does by default. It provides assurance, as well as invaluable real-environment test results back to package maintainers.


Those are really good ideas!

A vague additional idea:

Can we improve the rough assessment of code quality?

1) Suppose we have pseudonym reputation ("error notice probability"): anyone can create a pseudonym, start auditing code, and mark the parts of the code they have inspected. Those marks are publicly associated with the pseudonym, and after enough activity, and bugs eventually being found by others in inspected code, the "noticing probability" can be computed+ (see the sketch after this list).

2) Consider the birthday paradox: if reviewers each draw what to inspect independently from a uniform distribution, their attention will collide and overlap, whereas with coordinated assignment we can spread attention across the codebase much more uniformly...

+ Of course there are different kinds of issues, e.g. new features, arguments about whether something is an improvement or was an overlooked defect, etc. But the types of future patches don't necessarily correlate with the individuals who inspected the code...
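A rough sketch of how that "noticing probability" in 1) might be estimated. This is purely illustrative; the record-keeping and the smoothed estimate are my own assumptions, not an existing system:

    # Toy estimate of a pseudonym's "error notice probability": of all the
    # bugs eventually confirmed in code regions this pseudonym had marked
    # as inspected, what fraction did they report themselves?
    # Laplace smoothing keeps the estimate sane when there is little data.
    from dataclasses import dataclass


    @dataclass
    class AuditRecord:
        region: str         # e.g. "parser.c:1-200"
        reported_bug: bool  # pseudonym flagged a bug here while inspecting


    def notice_probability(records: list[AuditRecord],
                           bugs_later_found: set[str]) -> float:
        """Estimate P(pseudonym notices a bug present in code they inspect)."""
        noticed = sum(1 for r in records if r.reported_bug)
        missed = sum(1 for r in records
                     if not r.reported_bug and r.region in bugs_later_found)
        # smoothed estimate: (noticed + 1) / (noticed + missed + 2)
        return (noticed + 1) / (noticed + missed + 2)


    if __name__ == "__main__":
        audits = [
            AuditRecord("parser.c:1-200", reported_bug=True),
            AuditRecord("lexer.c:1-150", reported_bug=False),
            AuditRecord("eval.c:1-300", reported_bug=False),
        ]
        later_bugs = {"lexer.c:1-150"}   # someone else found a bug here later
        print(f"estimated notice probability: "
              f"{notice_probability(audits, later_bugs):.2f}")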

ALSO: I still believe formal verification is, counterintuitively, cheaper (in money and time) and less effort per unit of certainty achieved. But as long as most people refuse to believe this, I encourage strategies like these...


There's some relevant work going on in the "crev" project, discussed here a couple of weeks ago:

https://news.ycombinator.com/item?id=18824923

The big idea is for people to publish cryptographically signed "proofs" that they've reviewed a particular version of a given module, allowing a web-of-trust structure for decentralised code review. I particularly like how, thanks to the signatures, a module's author can distribute reviews alongside the module without compromising their trustworthiness - so there's an incentive for authors to actively seek out reviewers to scrutinise their code.
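A toy illustration of the signed-review mechanism; this is not crev's actual proof format, it just shows why shipping reviews alongside the module is safe (using the PyNaCl library for ed25519 signatures):

    # Toy version of a signed code-review "proof": a reviewer signs a small
    # JSON document naming the package, version, and content hash they
    # reviewed.  Anyone holding the reviewer's public key can verify it,
    # even if the proof is distributed by the package author.
    import hashlib
    import json

    from nacl.encoding import HexEncoder
    from nacl.signing import SigningKey

    # Reviewer generates a long-lived keypair (the public half is what
    # others decide to trust in the web of trust).
    signing_key = SigningKey.generate()
    verify_key = signing_key.verify_key

    tarball = b"pretend this is the reviewed source tarball"
    proof = {
        "kind": "package-review",
        "package": "left-pad",
        "version": "1.3.0",
        "digest": "sha256:" + hashlib.sha256(tarball).hexdigest(),
        "rating": "positive",
        "reviewer": verify_key.encode(encoder=HexEncoder).decode(),
    }
    payload = json.dumps(proof, sort_keys=True).encode()
    signed = signing_key.sign(payload)

    # Verification raises nacl.exceptions.BadSignatureError if either the
    # proof or the signature has been tampered with.
    recovered = verify_key.verify(signed.message, signed.signature)
    assert json.loads(recovered) == proof
    print("review proof verified for", proof["package"], proof["version"])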


There's a boatload more the package managers could be doing, but the problem is that it needs to be paid for.



