LLVM 5.0.0 Release (llvm.org)
239 points by zmodem on Sept 7, 2017 | hide | past | favorite | 76 comments


> ./configure scripts generated by GNU autoconf determine whether a linker supports modern GNU-compatible features or not by searching for "GNU" in the --help message. To be compatible with the scripts, we decided to add the string "(compatible with GNU linkers)" to our --help message. This is a hack, but just like the web browser's User-Agent string (in which everyone still claims to be "Mozilla/5.0"), we had no choice other than doing this to claim that we accept GNU-compatible options.

http://releases.llvm.org/5.0.0/tools/lld/docs/ReleaseNotes.h...

Even though I wrote it, I found this part a bit funny. Configure scripts are hacky by their nature, and we needed another hack to make their hack work. I'm not happy about that though.


Yup, this is why feature detection is better than version detection.

jQuery changed their strategy in 2009: http://blog.jquery.com/2009/01/14/jquery-1-3-released/

"No More Browser Sniffing"

Also, autoconf has a lot of faults, but as far as I recall it's firmly based on the philosophy of feature detection vs. version detection.

Maintainers of packages can write bad custom checks that test for features, but if you use the built-in autoconf checks, they're all about features.

CMake on the other hand seems to use version detection more, which I don't like.


Good comment, triggers a bunch of associations:

1. the rule of thumb to prefer feature detection to user-agent sniffing is generally a reasonable one, BUT if you have something like WURFL to map UA to device (and device to capabilities) then the one-time, up-front lookup can be a better choice.

2. that said I remember using yepnope.js, and sharing this hilarious (because accurate) post on the history of the user-agent: http://webaim.org/blog/user-agent-string-history/

hard to believe it's been almost 20 years since I started messing with HTML...


It helps when you have a platform that supports feature detection reliably. Most platforms don't have that lucky accident.


What's an example of a feature of cc or ld that you can't detect with feature detection?

Shell scripts are Turing-complete and can do arbitrary I/O. In theory you can write a feature-detection check for any feature by compiling, linking, and running a test program. If the feature couldn't be detected this way, then it would be useless.

Likewise, JavaScript and DOM features should always be detectable by eval(). I guess one instance where it's impossible is if the API changes the pixels on the screen and does nothing else.

Although you can actually detect this in some cases with JavaScript! JavaScript can leak information through the colors of elements on the screen; e.g. you can tell if a user visited a site by checking whether the link is purple!


> In theory you can write a feature detection for any feature by compiling, linking, and running a test program. If the feature couldn't be detected this way, then it would be useless.

It's hard to do feature-detection against cross-compilation toolchains, since "toolchain" doesn't imply a standardized emulator/debugger component for the resulting executables. Under cross-compilation, you can detect that your test programs compile and link, but you can't run them!

This, if you're wondering, is why SDKs for embedded architectures always seem stuck in the time before autoconf/cmake were invented: without feature-detection support, such toolchains (and the projects developed for them) have to get by with manually-written Makefiles.

This is also why the more modern game console SDKs (such as the Wii U SDK, my own most recent experience) are delivered as components that slide into an IDE like Visual Studio, rather than as standalone CLI toolchains for an IDE to drive. This way, they do have a (proprietary) standard way to add an emulator/debugger component, which can then serve as a core for the IDE's build system.


Yes good point. Though I think most features can be tested without running a program -- e.g. testing if a library function exists or has a certain signature.

I think embedded toolchains use plain makefiles because they tend to fork code rather than upstream changes. And that is due to the ship-it-and-forget-it mindset of embedded companies that is well known among open source maintainers.


No modern browser lets you know the state of a:visited, and will lie to you if you query for colors etc. In fact, if you can find a way for JS to determine if a link has been visited, you likely qualify for $$$ bug bounties (which has happened).


>No modern browser lets you know the state of a:visited

There are loads of ways to extract that information, and the number increases all the time as browsers get more complex. It's not something browser vendors actually consider to be a bug. They have historically only 'fixed' the trivial methods of getting the visited status because it became an issue that was widely publicised. If they cared at all they would never have implemented features like Shared Array Buffers.

This stuff is all sitting in years old open bug reports. Nobody cares.

https://bugzilla.mozilla.org/show_bug.cgi?id=884270 Link Visitedness can be detected by redraw timing

https://bugs.chromium.org/p/chromium/issues/detail?id=508166 Security: Chrome provides high-res timers which allow cache side channel attacks

Further reading on some of the exciting new timing attacks added to web standards in recent years:

https://arxiv.org/pdf/1502.07373v2.pdf The Spy in the Sandbox – Practical Cache Attacks in Javascript

https://www.contextis.com/resources/white-papers/pixel-perfe... Pixel Perfect Timing Attacks with HTML5 (now fixed because you could use it to steal the contents of any page)

http://www.cs.vu.nl/~herbertb/download/papers/anc_ndss17.pdf ASLR on the Line: Practical Cache Attacks on the MMU


That's not really feature detection, though.


Isn't there a technique involving custom fonts to detect history? Though that's server-side detection.


Ok thanks for the info, that makes sense!


Which is why jQuery throws (and handles) exceptions every single time I run it under a debugger. Ugh.


This is the first time I've seen the Zig language [1]. A (self-proclaimed?) C successor with manual memory management, ironed-out edge cases, a new take on error handling (that resembles well-written error handling in C), generics, compile-time reflection/execution (macros?), importing .h works directly, exporting to C works directly, nullable types etc... all sound quite interesting actually. Does anybody have experience with or comments on Zig, please?

[1] http://ziglang.org/


I've been writing a fair bit of Zig for a few months now, and have written a few things for the stdlib. Here are some things I like about it.

Hassle-free error management

Consider you write a function

  fn div(a: u8, b: u8) -> u8 {
      a / b
  }
If you later find an error case that needs to be handled, updating the code usually doesn't require much extra work.

  error DivideByZero;
  fn div(a: u8, b: u8) -> %u8 {
      if (b == 0) {
          error.DivideByZero
      } else {
          a / b
      }
  }
The caller can choose to ignore errors using `%%div(5, 1)` or can propagate errors to the caller, similar to Rust's `try!`, `?` using `%return div(5, 2)`.

I find this so easy that I'm much more inclined to think about edge cases and handle errors up front. I find when writing Rust the extra setup and management of errors adds a fair bit of tedium (although to be fair, with error-chain and proper setup at the beginning of a project this isn't too bad).
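For comparison, a rough Rust rendering of the same function (the DivError enum and helper names are mine): `?` is the propagation step, and `.unwrap()` plays the role of `%%`.

```rust
// A minimal Rust analogue of the Zig example above.
#[derive(Debug, PartialEq)]
enum DivError {
    DivideByZero,
}

fn div(a: u8, b: u8) -> Result<u8, DivError> {
    if b == 0 {
        Err(DivError::DivideByZero)
    } else {
        Ok(a / b)
    }
}

fn div_then_double(a: u8, b: u8) -> Result<u8, DivError> {
    let q = div(a, b)?; // like `%return div(a, b)` in Zig
    Ok(q * 2)
}

fn main() {
    assert_eq!(div(5, 1).unwrap(), 5); // like `%%div(5, 1)`: panic on error
    assert_eq!(div_then_double(5, 0), Err(DivError::DivideByZero));
}
```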

Compile-time programming

Zig has pretty strong compile-time programming support. For example, its printf formatting capability is all written in userland code [1]. It doesn't at this moment support code-generation like D's mixins but I personally have not found this too problematic.

Generic functions can be written in a duck-typing fashion. With compile-time assertions the inputs can be limited to what they need pretty clearly and the errors during usage are pretty self-explanatory.

  error Overflow;
  pub fn absInt(x: var) -> %@typeOf(x) {
      const T = @typeOf(x);
      comptime assert(@typeId(T) == builtin.TypeId.Int); // must pass an integer to absInt
      comptime assert(T.is_signed); // must pass a signed integer to absInt
      if (x == @minValue(@typeOf(x))) {
          return error.Overflow;
      } else {
          @setDebugSafety(this, false);
          return if (x < 0) -x else x;
      }
  }
Zig doesn't have any form of macros. Everything is done in the language itself.

[1]: http://ziglang.org/documentation/#case-study-printf


>The caller can choose to ignore errors using `%%div(5, 1)` or can propagate errors to the caller, similar to Rust's `try!`, `?` using `%return div(5, 2)`.

>I find this so easy that I'm much more inclined to think about edge cases and handle errors up front. I find when writing Rust the extra setup and management of errors adds a fair bit of tedium (although to be fair, with error-chain and proper setup at the beginning of a project this isn't too bad).

How is this different from, say:

   5_u8.checked_div(1).unwrap()
That unwrap() is performing the same thing as your %% example, if I understand it correctly.

Are the Error types in Zig on the stack or the heap? By default Rust puts everything, including errors, on the stack, which means that the size of the return type always needs to be known. To make this easier you can return Boxed errors:

   fn this_errors() -> Result<u32, Box<Error>> { ... }
And then error types can be very simple. Also, I recommend people getting into Rust really check out error_chain!, a macro that helps combine all the errors your library might need to deal with: https://docs.rs/error-chain/0.11.0/error_chain/
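A minimal self-contained sketch of both points, with a toy ParseFailed error type of my own (using the modern `Box<dyn Error>` spelling):

```rust
use std::error::Error;
use std::fmt;

// A trivial hypothetical error type, just for illustration.
#[derive(Debug)]
struct ParseFailed;

impl fmt::Display for ParseFailed {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "parse failed")
    }
}
impl Error for ParseFailed {}

// Boxing erases the concrete error type, so the caller only needs to
// know "some Error, on the heap" — the return size is always known.
fn this_errors(ok: bool) -> Result<u32, Box<dyn Error>> {
    if ok { Ok(42) } else { Err(Box::new(ParseFailed)) }
}

fn main() {
    assert_eq!(5_u8.checked_div(1), Some(5)); // checked_div returns None on /0
    assert_eq!(5_u8.checked_div(0), None);
    assert!(this_errors(true).is_ok());
    assert!(this_errors(false).is_err());
}
```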


Yes you have it right, `unwrap()` is equivalent to `%%`. It will panic if the result is an error.

Error values under the hood are just unsigned integers and are returned on the stack. In fact, the granularity at which allocators are exposed in the stdlib makes any possible dynamic allocation very explicit in the language.

This link [1] provides an overview of errors and some of the surrounding control flow.

[1]: http://ziglang.org/documentation/#errors


Ah, thanks for the link. I didn't see anything in that section about passing values along with errors. Based on your comment, is that even possible?

If you had an error you wished to pass a reason string for, would you need to use a global static or something?


Not with the built-in error type. It's pretty much analogous to a C error code.

If you wanted to send data you would need some other means like you suggest. I'd be interested in finding an ergonomic solution to this but it probably wouldn't be at the language level.
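One such "other means", sketched in Rust purely as an analogy (all names are mine): an errno/strerror-style thread-local that carries a message beside the bare error code.

```rust
use std::cell::RefCell;

// Out-of-band error detail, errno/strerror style — a sketch of the
// workaround discussed above, not a feature of Zig or Rust stdlib.
thread_local! {
    static LAST_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

fn fail_with(msg: &str) -> Result<(), u32> {
    LAST_ERROR.with(|e| *e.borrow_mut() = Some(msg.to_string()));
    Err(1) // the bare "error code", analogous to Zig's integer error value
}

fn last_error() -> Option<String> {
    LAST_ERROR.with(|e| e.borrow().clone())
}

fn main() {
    assert!(fail_with("disk on fire").is_err());
    assert_eq!(last_error().as_deref(), Some("disk on fire"));
}
```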


You say that Zig has no macros, just like FORTRAN, but that isn't a language property at all. Technically, C has no macros, but most people use the C preprocessor. I've seen people preprocess with sed scripts, perl scripts, tr, python scripts, and m4. FORTRAN too was often preprocessed.

So we'll take your Zig and shove it through a preprocessor or several. I've seen code get preprocessed 3 times during builds, but I'm sure that isn't a record.

If a language designer fails to provide a suitable well-matched and effective preprocessor, we'll add something nasty. Oh well. Stuff has to get done.


> Technically, C has no macros, but most people use the C preprocessor

technically the C preprocessor is a non-optional part of the C language and specification (and in fact embedded in the actual C compiler binary in many implementations). So both in theory and in practice C very much has macros.

/pedantic


Previous threads:

https://news.ycombinator.com/item?id=12378922

https://news.ycombinator.com/item?id=11060282

Actually I'm surprised that Zig didn't come up on the "Some Were Meant for C" thread, because this strategy is exactly what I wished something like Rust had support for (importing header files directly, more direct support for unit-by-unit translation, etc.):

https://news.ycombinator.com/item?id=15179941


Are you familiar with this?

https://github.com/rust-lang-nursery/rust-bindgen

It will translate a C header into Rust for all your bindings. Something could probably be built to do this more on the fly, if it doesn't already exist.

edit: docs link https://docs.rs/bindgen/0.30.0/bindgen/

edit2: and I should mention the inverse, Rust to C bindings,

https://github.com/eqrion/cbindgen

https://docs.rs/cbindgen/0.1.23/cbindgen/


Yeah, that is part of it, but I also like the lighter design philosophy I see in Zig. It's the difference between thinking about compatibility up front vs. clean-slate design + compatibility.

If you look at the Python vs. Lua C API you can see that difference. Lua was designed to interoperate with C (albeit after breaking the language 4+ times); Python just took whatever their implementation happened to be, and then exposed that API to users. With thousands of functions and macros and GIL and whatnot.

Then a bunch of people came along later and wrote 5 different binding generators, and all of them solved some of the problem, but not all of it. And some of them created new problems (SWIG).

Also (to child comment), is it really possible to do bindgen as a macro rather than as a separate code generating tool? Because I assume bindgen has to invoke a C++ parser like Clang.

I looked at Zig and it indeed uses libclang to parse C. I don't know about Rust's macro system but it would be somewhat surprising if it can invoke external tools or run C++ code at compile time.


Procedural macros aren't the macro_rules and custom derives that are currently in stable.

They allow you to do much more. You can see this put to use in the Rocket project quite heavily. I believe it should be possible to link against bindgen as described and generate the bindings inline.

The caveat here is that it would be fairly opaque to the user as you wouldn't see the generated bindings in code. For situations like that I rely on 'cargo doc' to generate the documentation such that I can see the actual interface generated.

Rocket codegen for example: https://github.com/SergioBenitez/Rocket/tree/master/codegen


You also have to take into account that Rust has a borrow checker that makes opaque C calls 2nd class citizens anyway. Zig doesn't have that as far as I can tell.

So it means that even if Rust had a standard C header include directive it still wouldn't be completely transparent because most of the time you want to wrap around C APIs in Rust to provide safe, borrow checked and pointer-free alternatives. There's a different philosophy I think.


> I looked at Zig and it indeed uses libclang to parse C. I don't know about Rust's macro system but it would be somewhat surprising if it can invoke external tools or run C++ code at compile time.

Bindgen is often invoked in a "build script", which Cargo runs before building your project. So not exactly, but basically.


Bindgen is excellent with C APIs but what I've found really impressive is that it can handle C++ pretty well too. Most of my issues with using C++ code via bindgen are due to deviations in calling conventions that are hard to solve at that level.


And as soon as procedural macros are stabilized, I'm sure that someone will provide a bindgen macro that would let you do something like bindgen!("somelib.h").


> Zig competes with C instead of depending on it. The Zig Standard Library does not depend on libc.

That's an interesting choice. I'm sure there are other semi-recent languages that have made the same choice, it would be interesting to hear the benefits and problems of that approach.

I say semi-recent, because there are of course many from before C or that competed with C and Unix initially, but unless they are still used much (Lisp?) it's not necessarily the best for a comparison of modern issues.


A useful aspect of this is the portability story. Go, for example, is pretty good at cross-compiling to different targets, as far as I'm aware. The main cost is the extra implementation work, instead of just writing a binding to libc. Static linking has also become more popular lately than dynamic linking for deployment ease, and not requiring libc helps there too.

For zig it is not all or nothing of course. It is pretty easy to link and use c if wanted. The following for example asks the compiler to link against libc.

  zig build_exe main.zig --library c
One other benefit Zig gets is more opportunity for compile-time execution. The math library being written in Zig, for example, means we can evaluate `math.log(7)` at compile time, which we couldn't do when calling out to libm.
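The same effect can be sketched in Rust with a `const fn` (my analogy, not from the thread): because the routine is written in the language itself, the compiler can fold calls to it at compile time, which it can't do through an opaque libm call.

```rust
// An integer log2, written as a const fn so calls with constant
// arguments are evaluated entirely at compile time.
const fn ilog2(mut x: u32) -> u32 {
    let mut r = 0;
    while x > 1 {
        x >>= 1;
        r += 1;
    }
    r
}

// Folded by the compiler; no runtime computation happens here.
const LOG7: u32 = ilog2(7);

fn main() {
    assert_eq!(LOG7, 2);
    assert_eq!(ilog2(1024), 10);
}
```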


Free Pascal has its own standard library that talks directly with the underlying OS too.


  it would be interesting to hear the benefits and problems of
  that approach.
A big problem is that libc interfaces are more stable, which means with libc you don't need to recompile as much when the system is upgraded, if ever.

Solaris and Linux work hard to keep their kernel interfaces stable, so it's less of an issue there. But Apple, for example, very explicitly says that kernel interfaces (Unix syscalls and Mach ports) are unstable and unsupported, generally speaking. Consumable interfaces are provided via the userland runtime (e.g. libc). Notably, Apple's toolchains don't even support static compilation.

A system like OpenBSD isn't so gratuitous as Apple in terms of breaking compatibility, but because they're focused on improving and simplifying a cohesive system, they don't flinch about breaking compatibility when needed. For various reasons that happens more often with kernel interfaces than with libc interfaces.


Go


Are you sure? I'm seeing reports that Go sometimes won't run with musl instead of GNU libc, segfaulting at runtime[1]. That seems to depend on libc. Are you saying that Go's standard library primitives are not shims (or more complex structures) built on top of libc?

1: https://github.com/golang/go/issues/14476


Go uses libc optionally (for some things like DNS resolution). You can build Go programs which do not use libc by disabling cgo and building static binaries. By default the biggest part of Go's stdlib isn't based on libc, but some small parts are.


Wow, that looks awesome. The syntax is pretty much what I look for in a language, much more approachable to me than Rust. Will tinker with it.


The syntax actually looks fairly similar to Rust, just with generics replaced with `comptime`. What do you find more approachable about it?


Interesting. I've been somewhat disappointed with Rust thus far and this looks a lot more like what I was hoping for.


What did you find disappointing? What were you hoping for (so I can compare it to my own hopes)?

Rust is the one (out of Rust, Nim and D) that looks most promising to me from the outside for my goals[1], but I haven't really settled on one to devote my very limited free time to.

1: If I'm going to drop down from a high-level dynamic interpreted language to a low-level strongly typed compiled one, I might as well go the extra distance Rust asks for, given the gains it promises.


Mostly I ran into a lot of friction with shared data structures that are being accessed by multiple threads. Stuff like ring buffers, flow state databases, and similar systems.

There's probably Rust ways to do that stuff, but it was not obvious to me.


If you're able to write a robust shared ring buffer in C, you should be able to do the exact same implementation in unsafe Rust using raw pointers, declare it as Sync, and use it from safe Rust with no issue. Or am I missing something?

Or, if you're like me and not confident enough to do this, you could check on crates.io[1] to see if no available lib already does what you need. In which case you basically have it for free.

[1] https://crates.io/keywords/ring-buffer
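For the unsafe-core-plus-`Sync` recipe, here is a minimal sketch under the (strong) assumption of exactly one producer thread and one consumer thread; the `unsafe impl Sync` line is the hand-written proof obligation the parent describes.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

const CAP: usize = 8; // capacity; kept a power of two for cheap indexing

// Raw storage behind UnsafeCell, with head/tail as monotonically
// increasing counters. Sound ONLY under SPSC discipline.
struct SpscRing {
    buf: UnsafeCell<[u64; CAP]>,
    head: AtomicUsize, // next slot to pop (owned by the consumer)
    tail: AtomicUsize, // next slot to push (owned by the producer)
}

// SAFETY: the "declare it as Sync" step — we assert, rather than let the
// compiler prove, that sharing &SpscRing across threads is fine, given
// the single-producer/single-consumer usage contract.
unsafe impl Sync for SpscRing {}

impl SpscRing {
    fn new() -> Self {
        SpscRing {
            buf: UnsafeCell::new([0; CAP]),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    fn push(&self, v: u64) -> bool {
        let t = self.tail.load(Ordering::Relaxed);
        if t.wrapping_sub(self.head.load(Ordering::Acquire)) == CAP {
            return false; // full
        }
        unsafe { (*self.buf.get())[t % CAP] = v; }
        self.tail.store(t.wrapping_add(1), Ordering::Release);
        true
    }

    fn pop(&self) -> Option<u64> {
        let h = self.head.load(Ordering::Relaxed);
        if h == self.tail.load(Ordering::Acquire) {
            return None; // empty
        }
        let v = unsafe { (*self.buf.get())[h % CAP] };
        self.head.store(h.wrapping_add(1), Ordering::Release);
        Some(v)
    }
}

fn main() {
    let r = SpscRing::new();
    assert!(r.push(1) && r.push(2));
    assert_eq!(r.pop(), Some(1));
    assert_eq!(r.pop(), Some(2));
    assert_eq!(r.pop(), None);
}
```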


I'm pretty sure that's intentional friction (having to prove what you're doing is safe).


Figuring out how to prove it is safe to the compiler is definitely a challenge.


Wow, nice find! I've been looking recently for a language like that. I like Swift but I dislike the heavy coupling with macOS/iOS and ObjC. I like D but dislike its somewhat arcane syntax and bloat, plus it has a GC so it's not the same category. I like Rust but the borrow checker is brutal.

I will give ziglang a try!


D now has "Better C" mode, which might interest you.

https://dlang.org/blog/2017/08/23/d-as-a-better-c/


One feature I'm excited about in this release is proper support for non-integral address spaces. It allows us to do significantly more optimization in the presence of GC roots in Julia.


Does anyone know the state of the project to compile the Linux kernel with clang? Does this release help with that goal?


From an August 23 email ( https://lists.linuxfoundation.org/pipermail/llvmlinux/2017-A... )

Over the past months efforts have been made to upstream the remaining LLVMLinux patches (http://llvm.linuxfoundation.org) and to address other outstanding issues in order to build a usable kernel with clang. To my knowledge upstream is in a relatively good shape by now for x86 and arm64 (I heard the same about PowerPC, but have no first hand experience), most of the patches are already in Linus' tree, others have landed in maintainer trees.

...

It goes on, but there has been some pretty significant work done, and you can easily compile with clang as your default cc with simple patches.


I don't recall seeing much recent activity on the mailing lists along these lines.

I have a very vague recollection that Linux may be dependent on gcc specific features that clang doesn't intend to support?

EDIT: "One of the major compilation problems with LLVM/CLang is that they do not support VLA’s ... widely used inside the linux kernel." from [1]

There was some activity over the prior six months or so to get lld to link the kernel (can't recall if it was built by gcc or clang for those tests).

[1] https://www.quora.com/What-is-the-state-of-the-art-of-compil...


Those LLD benchmarks are looking very impressive! Well done to all involved.


Where are you seeing the LLD benchmarks? I'm very curious about the prospect of LLD performance improvements for the benefit of the Rust compiler, and having benchmarks would be lovely.


Maybe these [1] ones comparing ld, gold, lld?

[1] https://lld.llvm.org/#performance


Better AVR support for Rust? Can anyone comment on this?


Well, the AVR backend is being actively developed, but I have no idea how good it is. (Although some tooling for embedded systems is rather poor, so one would imagine that LLVM with the right heuristics couldn't do badly.)

rustc is apparently (https://github.com/rust-embedded/rfcs/issues/3) not far off being usable.

However, I don't regularly write code for uC-ers - and when I do, it's usually simple enough to be in assembly - so I can't really say anymore than that.


AVR-Rust author here

The backend itself is working very well - there are only a handful of known bugs at this point, and many projects can be compiled with no issues.

The issue you linked also links to (https://github.com/rust-lang/rust/issues/44052), which shows that once these bugs are fixed, we can start working on merging avr-rust upstream.

A number of projects/libraries have also been developed recently

* https://github.com/avr-rust/blink

* https://github.com/avr-rust/avrd

* https://github.com/avr-rust/arduino

* https://github.com/gergoerdi/rust-avr-chip8-avr


> Added support for AMD Lightweight Profiling (LWP) instructions.

Funnily enough, AMD already deprecated those. They're not in Zen.


What's a good reference to get started with LLVM? I've been wanting to write an Oberon-2 compiler, but I don't know what LLVM provides, nor how I might use it from Rust.


It provides IR, low level optimization passes, native code generation, and common metadata features (e.g. for debugging). Rust probably has a library somewhere that binds to it.

So you would have to write linking (with help from LLVM), ASTs, high-level optimizations and validation, garbage collection (with help from LLVM), and standard library (or link an existing one). Also intermediate IRs or ASTs you plan on using (for example Rust's MIR) and their infrastructure.


> Added heuristics to convert CMOV into branches when it may be profitable

Does anyone know why this is the case? I thought CMOVs were a straight win over branches, but I guess modern CPUs are more complicated than that.


"LLVM compiler recognizes opportunities to transform a branch into IR select instruction(s) - later it will be lowered into X86::CMOV instruction, assuming no other optimization eliminated the SelectInst. However, it is not always profitable to emit X86::CMOV instruction. For example, branch is preferable over an X86::CMOV instruction when:

- Branch is well predicted

- Condition operand is expensive, compared to True-value and the False-value operands"

https://reviews.llvm.org/D34769

"We have seen periodically performance problems with cmov where one operand comes from memory. On modern x86 processors with strong branch predictors and speculative execution, this tends to be much better done with a branch than cmov. We routinely see cmov stalling while the load is completed rather than continuing, and if there are subsequent branches, they cannot be speculated in turn."

https://reviews.llvm.org/D36858


CMOV introduces a data dependency. Predictable branches, on the other hand, are basically free.
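A toy way to see the two shapes side by side (source level only; whether the compiler actually emits a cmov for either is its own decision): the branchless form always computes with both operands, which is the data dependency, while the branchy form lets the CPU speculate past an unpredictable or slow operand.

```rust
// Branchless select: both a and b feed into the result unconditionally,
// so the result can't retire until both operands are ready (cmov-like).
fn select_branchless(c: bool, a: u64, b: u64) -> u64 {
    let mask = (c as u64).wrapping_neg(); // all-ones if c, else zero
    (a & mask) | (b & !mask)
}

// Branchy select: a predictable branch costs roughly nothing, and the
// untaken side's operand doesn't have to be waited on.
fn select_branchy(c: bool, a: u64, b: u64) -> u64 {
    if c { a } else { b }
}

fn main() {
    for &(c, a, b) in &[(true, 1, 2), (false, 1, 2)] {
        assert_eq!(select_branchless(c, a, b), select_branchy(c, a, b));
    }
}
```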


Just in time for Xcode 9? :)


Now if only Apple would quit mysteriously stripping out features from the full LLVM release...


It seems more likely that they simply don't use upstream release tags, and instead put their changes (and maybe some backported ones) on top of a randomly chosen revision. The same way Google released Android NDK r15 a few months ago with clang "5.0".

The sad part is that their version is neither older nor newer than the official release: it doesn't contain everything from the upstream release, and upstream doesn't yet contain all of their changes.


It can't possibly be pure coincidence that a major llvm release is timed about one week off from the annual major apple macOS/iOS/toolchain + iPhone hardware release, with their history of employing big llvm contributors?


It is. The toolchain going into those releases is locked down months in advance.


What are they stripping out?


OpenMP, for one. They at least used to strip thread_local too.


Working in scientific software development, I've seen quite a few people developing on Macs who would really appreciate OpenMP support. They end up having to use GCC or a vanilla Clang, but it's extra work and it comes with its own caveats.


Or just `brew install llvm` and `CC=/usr/local/opt/llvm/bin/clang`.


Naive question: Isn't this just "vanilla clang" that GP refers to?


I misunderstood, thanks.


Yes.


They just don't ship the runtime library, no?


Given the current release manager works for me at Google, probably not related :)

I thought Swift still had parts of LLVM forked anyway, IIRC.



