
It's an interesting line of inquiry to think about how many of these evaluation heuristics, which are all described as things a person can do manually, could instead be built into the package manager itself to do for you automatically.

The package manager could run the package's test suite, for instance, and warn you if the tests don't all pass, or make you jump through extra hoops to install a package that doesn't have any test coverage at all. The package manager could read the source code and tell you how idiomatically it was written. The package manager could try compiling from source with warnings on and let you know if any are emitted, and compare the compiled artifacts with the ones that ship with the package to ensure that they're identical. The package manager could check the project's commit history and warn you if you're installing a package that's no longer actively maintained. The package manager could check whether the package has a history of entries in the National Vulnerability Database. The package manager could learn what licenses you will and won't accept, and automatically filter out packages that don't fit your policies. And so on.

In other words, the problem right now is that package managers are undiscriminating. To them a package is a package is a package; the universe of packages is a flat plane where all packages are treated equally. But in reality all packages aren't equal. Some packages are good and others are bad, and it would be a great help to the user if the package manager could encourage discovery and reuse of the former while discouraging discovery and reuse of the latter. By taking away a little friction in some places and adding some in others, the package manager could make it easy to install good packages and hard to install bad ones.
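To make that concrete, here's a minimal sketch of what such a pre-install gate might look like. Everything here is invented for illustration (the report fields, the thresholds, the helper names); it isn't how any existing package manager behaves:

    # Sketch of a pre-install "gate": the package manager collects a few
    # facts about a package and decides how much friction to add before
    # installing it.  All field names and thresholds are illustrative.
    from dataclasses import dataclass
    from datetime import datetime, timedelta


    @dataclass
    class PackageReport:
        name: str
        tests_passed: bool | None       # None = no test suite found
        compiler_warnings: int
        artifact_matches_source: bool   # rebuilt artifact == shipped artifact
        last_commit: datetime
        known_cves: int
        license: str


    def evaluate(report: PackageReport, accepted_licenses: set[str]) -> list[str]:
        """Return a list of warnings; an empty list means 'install freely'."""
        warnings = []
        if report.tests_passed is None:
            warnings.append("no test suite found")
        elif not report.tests_passed:
            warnings.append("test suite fails")
        if report.compiler_warnings:
            warnings.append(f"{report.compiler_warnings} compiler warnings")
        if not report.artifact_matches_source:
            warnings.append("shipped artifact does not match a clean rebuild")
        if datetime.now() - report.last_commit > timedelta(days=2 * 365):
            warnings.append("no commits in the last two years")
        if report.known_cves:
            warnings.append(f"{report.known_cves} entries in the NVD")
        if report.license not in accepted_licenses:
            warnings.append(f"license {report.license!r} not in your policy")
        return warnings


    if __name__ == "__main__":
        report = PackageReport(
            name="example-package",
            tests_passed=None,
            compiler_warnings=3,
            artifact_matches_source=False,
            last_commit=datetime(2016, 1, 1),
            known_cves=1,
            license="WTFPL",
        )
        for w in evaluate(report, {"MIT", "Apache-2.0", "BSD-3-Clause"}):
            print("WARNING:", w)

The "friction" part would then be policy: an empty warning list installs silently, anything else requires an explicit confirmation flag.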



NPM (the Node.js package manager) has started doing that, showing three combined metrics for each package in the search results. It assesses every package by popularity (number of downloads), quality (a dozen or so small heuristics about how carefully the project has been put together) and maintenance (whether the project is actively maintained, keeps its dependencies up to date and has more closed than open issues on GitHub). The idea is that when you search for a package, you can see at a glance the relative quality and popularity of the modules you're choosing between.

It's not perfect - there's no way to tell if the packages under consideration are written in a consistent style or if they have thorough unit tests - but it's a clever idea. And by rating packages on these metrics they encourage a reasonable set of best practices (write a readme and a changelog, use whitelists / blacklists, close issues on GitHub, etc.). The full list of metrics is here:

https://itnext.io/increasing-an-npm-packages-search-score-fb...
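You can also inspect the sub-scores yourself: the npm registry exposes them through its search endpoint. A small sketch below; the response shape is as I understand it today, so treat the field names as an assumption rather than a stable contract:

    # Query the npm registry search endpoint and print the quality /
    # popularity / maintenance sub-scores it reports for each result.
    import json
    import urllib.parse
    import urllib.request

    query = urllib.parse.urlencode({"text": "left-pad", "size": 5})
    url = f"https://registry.npmjs.org/-/v1/search?{query}"

    with urllib.request.urlopen(url) as resp:
        results = json.load(resp)

    for obj in results["objects"]:
        pkg = obj["package"]
        detail = obj["score"]["detail"]
        print(f"{pkg['name']:<30}"
              f" quality={detail['quality']:.2f}"
              f" popularity={detail['popularity']:.2f}"
              f" maintenance={detail['maintenance']:.2f}")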


To some extent, R works like this... packages only make it onto CRAN if they pass automated checks, and there's a pretty strong culture of testing with testthat. You can host your own package on GitHub or an external repo, but installing it then becomes a non-standard, extra step.


Badges on GitHub seem to be getting more popular, a manual way to show this information in project readme files: https://github.com/badges/shields


> The package manager could run the package's test suite, for instance, and warn you if the tests don't all pass

That's what CPAN does by default. It provides assurance, as well as invaluable real-environment test results back to package maintainers.


Those are really good ideas!

A vague additional idea:

Can we improve the rough assessment of code quality?

1) Suppose we have pseudonym reputation ("error notice probability"): anyone can create a pseudonym, start auditing code, and mark the parts of the code they have inspected. Those marks are publicly associated with the pseudonym, and after enough activity, and bugs eventually being found by others in inspected code, the "noticing probability" can be computed+ (see the sketch after this list).

2) Consider the birthday paradox: if reviewers each draw what to inspect independently from a uniform distribution, their attention will collide and overlap, whereas with coordinated assignment we can spread attention across the codebase much more uniformly...

+ Of course there are different kinds of issues, e.g. new features, arguments about whether something is an improvement or was an overlooked defect, etc. But the types of future patches don't necessarily correlate with the individuals who inspected the code...
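A rough sketch of how that "noticing probability" in 1) might be estimated. This is purely illustrative; the record-keeping and the smoothed estimate are my own assumptions, not an existing system:

    # Toy estimate of a pseudonym's "error notice probability": of all the
    # bugs eventually confirmed in code regions this pseudonym had marked
    # as inspected, what fraction did they report themselves?
    # Laplace smoothing keeps the estimate sane when there is little data.
    from dataclasses import dataclass


    @dataclass
    class AuditRecord:
        region: str         # e.g. "parser.c:1-200"
        reported_bug: bool  # pseudonym flagged a bug here while inspecting


    def notice_probability(records: list[AuditRecord],
                           bugs_later_found: set[str]) -> float:
        """Estimate P(pseudonym notices a bug present in code they inspect)."""
        noticed = sum(1 for r in records if r.reported_bug)
        missed = sum(1 for r in records
                     if not r.reported_bug and r.region in bugs_later_found)
        # smoothed estimate: (noticed + 1) / (noticed + missed + 2)
        return (noticed + 1) / (noticed + missed + 2)


    if __name__ == "__main__":
        audits = [
            AuditRecord("parser.c:1-200", reported_bug=True),
            AuditRecord("lexer.c:1-150", reported_bug=False),
            AuditRecord("eval.c:1-300", reported_bug=False),
        ]
        later_bugs = {"lexer.c:1-150"}   # someone else found a bug here later
        print(f"estimated notice probability: "
              f"{notice_probability(audits, later_bugs):.2f}")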

ALSO: I still believe formal verification is, counterintuitively, cheaper (in money and time) and less effort per unit of certainty achieved. But as long as most people refuse to believe this, I encourage strategies like these...


There's some relevant work going on in the "crev" project, discussed here a couple of weeks ago:

https://news.ycombinator.com/item?id=18824923

The big idea is for people to publish cryptographically signed "proofs" that they've reviewed a particular version of a given module, allowing a web-of-trust structure for decentralised code review. I particularly like how, thanks to the signatures, a module's author can distribute reviews alongside the module without compromising their trustworthiness - so there's an incentive for authors to actively seek out reviewers to scrutinise their code.
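A toy illustration of the signed-review mechanism; this is not crev's actual proof format, it just shows why shipping reviews alongside the module is safe (using the PyNaCl library for ed25519 signatures):

    # Toy version of a signed code-review "proof": a reviewer signs a small
    # JSON document naming the package, version, and content hash they
    # reviewed.  Anyone holding the reviewer's public key can verify it,
    # even if the proof is distributed by the package author.
    import hashlib
    import json

    from nacl.encoding import HexEncoder
    from nacl.signing import SigningKey

    # Reviewer generates a long-lived keypair (the public half is what
    # others decide to trust in the web of trust).
    signing_key = SigningKey.generate()
    verify_key = signing_key.verify_key

    tarball = b"pretend this is the reviewed source tarball"
    proof = {
        "kind": "package-review",
        "package": "left-pad",
        "version": "1.3.0",
        "digest": "sha256:" + hashlib.sha256(tarball).hexdigest(),
        "rating": "positive",
        "reviewer": verify_key.encode(encoder=HexEncoder).decode(),
    }
    payload = json.dumps(proof, sort_keys=True).encode()
    signed = signing_key.sign(payload)

    # Verification raises nacl.exceptions.BadSignatureError if either the
    # proof or the signature has been tampered with.
    recovered = verify_key.verify(signed.message, signed.signature)
    assert json.loads(recovered) == proof
    print("review proof verified for", proof["package"], proof["version"])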


There's a boatload more the package managers could be doing, but the problem is that it needs to be paid for.



