Asm-declaration – Embed assembly language code within a C++ program (2017) (cppreference.com)
45 points by ____Sash---701_ on July 26, 2019 | 51 comments


In a string literal?!?! We did better almost 40 years ago. This is assembly in Turbo Pascal:

  procedure init; assembler;
  asm
    mov ax,13h
    int 10h
  end;
You can even put just an asm block inside a regular Pascal function and use the function's regular arguments in the asm block.


For some cultural reason C and C++ compilers for UNIX do it the hard way.

PC compilers have always followed the block-syntax path instead.

The same in Turbo C would be:

    void init()
    {
        asm {
            mov ax, 0x13
            int 0x10
        }
    }


The style used by Turbo C is strictly inferior. There is no way for the author of the assembly code to cooperate with the compiler's register allocator or instruction scheduler.

You wouldn't even need both of those lines with gcc. You could ask the compiler to put 0x13 into ax, and the compiler might schedule that instruction far earlier or even take advantage of the value already being in the register by luck.

Turbo C makes a simplistic assumption about what registers might have been trashed. With gcc, the compiler knows because you told it. Turbo C must save things to memory before the assembly and then reload registers afterward, which adds enough slowness that the assembly might not even be worthwhile.
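
To illustrate the point about telling the compiler which register to use, here is a minimal sketch assuming GCC or Clang extended asm on x86. The names `swap_bytes` and `usage_check` are hypothetical, not from the thread. It uses a harmless `xchgb` instead of `int 0x10`, since a BIOS call would fault in a normal user-space process on a modern OS. The `"+a"` constraint is the part that asks the compiler to place the value in ax; on other targets the sketch falls back to plain C++.

```cpp
#include <cassert>
#include <cstdint>

// Swap the two bytes of a 16-bit value.
std::uint16_t swap_bytes(std::uint16_t value) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // "+a" asks the compiler to have `value` in the a register (ax for a
    // 16-bit operand) before the asm runs and to read the result from it.
    // The compiler emits the load itself, or skips it if the value is
    // already there. xchgb changes nothing else, so no clobbers are listed.
    asm("xchgb %%ah, %%al" : "+a"(value));
    return value;
#else
    // Assumption: non-x86 target or non-GNU compiler, so use plain C++.
    return static_cast<std::uint16_t>((value << 8) | (value >> 8));
#endif
}

// Small usage check for the sketch above.
inline void usage_check() {
    assert(swap_bytes(0x1234) == 0x3412);
}
```

Only the one instruction of interest is written by hand; moving the value into ax is left to the register allocator.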


That has nothing to do with their actual point; Turbo C could perfectly well support:

  void init() {
    asm("ax"(0x13)) { int 0x10 }
  }
and as TFA suggests, plenty of implementations use the string-syntax without any register/clobber extensions.


Turbo C was a compiler done in 1990 for MS-DOS; naturally, it was just an example.

Your remarks are a mute point in modern Windows compilers with Assembly intrinsics, the evolution of those inline assembly instructions.


It's not mute or even moot. Assembly intrinsics are not enough.

I've dealt with this, getting software to run in Visual Studio. The intrinsics are simply not available. You end up running code through gcc to produce assembly, then hacking up the assembly (way too much to write by hand) into a separate *.asm file for Visual Studio.

Vector stuff, if it isn't very new, is covered by intrinsics. Well, it is badly covered, with terrible failures to keep things in registers.

Once you get into exotic OS-level stuff, the intrinsics simply don't exist. The most important one is the ability to put an arbitrary byte sequence into the instruction stream. For example, suppose you wanted to add a Spectre fix to your JIT on day 1. You needed to fix a security problem, so waiting for a new release of Visual Studio isn't an acceptable option. You really truly need the ability to put weird byte sequences into the code. Visual Studio doesn't provide an intrinsic for that.
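
As a rough sketch of what "an arbitrary byte sequence in the instruction stream" looks like with GCC-style inline asm, assuming GCC or Clang targeting x86-64: the bytes 0x0f 0xae 0xe8 encode `lfence`, picked here only as an example of an instruction one might need before the toolchain knows its mnemonic. The function names are hypothetical, and this is not a complete Spectre mitigation.

```cpp
#include <atomic>
#include <cassert>

// Emit a barrier by spelling out the raw instruction bytes.
inline void raw_lfence() {
#if defined(__GNUC__) && defined(__x86_64__)
    // .byte places these three bytes (the encoding of lfence) at this exact
    // point in the function. "memory" stops the compiler from moving memory
    // accesses across the asm statement.
    asm volatile(".byte 0x0f, 0xae, 0xe8" ::: "memory");
#else
    // Assumption: other targets get only a compiler-level barrier here;
    // no instruction is emitted.
    std::atomic_signal_fence(std::memory_order_seq_cst);
#endif
}

// Hypothetical bounds-checked read with the barrier placed after the check.
// Returns -1 for an out-of-range index.
inline int checked_read(const int* data, int size, int index) {
    if (index < 0 || index >= size) {
        return -1;
    }
    raw_lfence();
    return data[index];
}
```

The barrier sits in the middle of an ordinary C++ function, between the bounds check and the load, which is the placement the comment is describing.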


For such corner cases, using MASM on day 1 until the new VC++ release arrives isn't much of an issue.


MASM is just an assembler, with no awareness of the C or C++ code. MASM is unable to place arbitrary bytes into the middle of functions that are written in C or C++.

To fix a problem like Spectre, and generally to solve unusual problems that the compiler vendor isn't dealing with, the full capability of an inline assembler is required.


Sure it can. I really don't see any difference between that and a naked call with LTO, beyond the convenience of saving about a minute by not having to write a prototype for the function being called and add MASM to the build.


Also Rust uses strings for asm. It can't be bad.


It’s a fairly major point of contention, and we may not keep it. Note that it’s not a stable feature yet. As pointed out below, it’s this way because it’s basically a convenience pass-through to LLVM.


Keep it.

Porting to Rust is easier if it doesn't require a person to be an expert at two different kinds of inline assembly syntax. Being able to grab a chunk of inline assembly from a C project is very useful.

Whatever you do, don't embed knowledge of the assembly language into the compiler. That way lies madness. Assembly is often used for new CPU features that are not yet supported by the compilers that people are using. Constantly putting out minor compiler updates for every CPU revision would be miserable, and the users won't want to force those upgrades anyway.

There is also the issue, I'm sorry, of the Rust preprocessor. It will be written and it will be used. It may even be popular and ultimately written into an ISO standard. The irregularity of switching suddenly to a radically different CPU-specific syntax for assembly code would make the preprocessor situation much more nasty and gross.


> Porting to Rust is easier if it doesn't require a person to be an expert at two different kinds of inline assembly syntax.

Sadly, this doesn't save you from that. In fact, it can be argued that the string syntax is what makes you need to learn a whole second set of syntax, because of clobbers.

> There is also the issue, I'm sorry, of the Rust preprocessor. It will be written and it will be used.

I don't foresee this happening; Rust has powerful enough generic capabilities that even with tens of millions of lines of Rust existing today (I'd actually guess we're in the hundreds right now, but still), nobody has invented one yet. Getting away from the pre-processor is considered a pro, not a con.


> nobody has invented one yet.

No need to invent one when cpp exists and works. I've seen people use it in Java, and it is sometimes used in Unix config files (e.g. .Xresources).


Sure; I'm not aware of anybody using cpp with Rust either.


> Getting away from the pre-processor is considered a pro, not a con.

Yep, even ISO C++ is putting energy into giving each standard revision one less reason to use it.


Unisys ClearPath MCP, the modern version of the Burroughs lineage, never supported any kind of assembly. Instead, it was built with the first high-level systems programming language to use compiler intrinsics, in 1961.

Microsoft has followed the same path since it introduced 64-bit support: compiler intrinsics.

Also, copy-pasting inline assembly from C into Rust only works for a specific C compiler.


> Being able to grab a chunk of inline assembly from a C project is very useful.

Wasn't Rust's current inline assembly syntax subtly different from the gcc-compatible one you'd find in most C projects? IIRC, it uses the syntax from the LLVM IR to specify inputs and outputs, instead of the syntax from GCC inline assembly.


Rust stdlib uses strings because it's just piping through to LLVM and that's what LLVM expects: http://llvm.org/docs/LangRef.html#inline-assembler-expressio...

There's also a Rust version of Dynasm.


It gives me nostalgia seeing mode 13 graphics. What's next, mode x? Triangle drawing routines? :)


I actually picked this up as a hobby about a year ago.

I grew up in the late 80's early 90's doing exactly this type of thing as a kid. Having worked as a professional dev for the past few decades I started to notice that I've been taking my work home for all that time, and even though I like my work, I felt I needed a hobby as I'm getting older.

So, I'm currently making a shooting game for the 286/16 in Mode X VGA with SoundBlaster digitised sound and Adlib for music. All programmed from scratch...


Cool! Got a public repo of code? What toolchain are you using?


Sorry, no public repo, yet. Once I have something coherent working I will definitely publish it.

I'm using Watcom C v11. I actually bought it sometime in the 90s. However, the Watcom toolchain is still being developed as Open Watcom. So, great for 32-bit DOS:

https://github.com/open-watcom


Next, we implement Bresenham's algorithms, then we add antialiasing.


Ah yes. And Borland Pascal. Mode X VGA hackery in unreal mode. Later, when I worked for a software store in high school, I acquired Borland C++ 3.1, the physically-largest and heaviest (27 lbs / 12.2 kg) retail software package that I know of. It was a small software shop, they gave us crazy discounts, vendors gave us NFRs and they let us borrow anything on the shelf (trusty-dusty shrinkwrap machine). The profiler, debugger, and assembler were also good as there were protected-mode variants that could sometimes keep the machine from crashing.


> In a string literal?!?!

It's a string literal because it is literally dumped straight into the generated .s file by the compiler, as most C compilers don't do the actual assembly or generation of object files themselves.

Of course these days it's not literally dumped in unmodified -- various compilers do substitutions for you. But that's the legacy.


Assembler? Inline LLVM intermediate representation might be extremely useful instead, for many applications, see:

https://idea.popcount.org/2013-07-24-ir-is-better-than-assem...


Lots of languages have inline assembly. Pascal does:

- Free Pascal: https://www.freepascal.org/docs-html/prog/progch3.html#progs...

- Delphi: http://docwiki.embarcadero.com/RADStudio/Rio/en/Inline_Assem...

You can even do assembly in batch files via Debug if you really want to:

https://thestarman.pcministry.com/asm/debug/debug2.htm

https://www.robvanderwoude.com/debug.php



D has a very nice inline assembler (assuming you're on a supported platform; if not, you get to use the standard GNU style).


Odd bit of news reporting on something that was part of a standard published in 1997. What's next, breaking news on a standard 7-bit code used by Americans for information interchange?


This is a link to a reference page, not a news report.


> This is a link to a reference page, not a news report.

I don't know about you, but to me the name "Hacker News" implies what I read here is a news report, not a reference work of any sort.


I don't see anything on that page referencing C++20. Am I missing something?

On a related note, I always find the official docs for GCC inline assembly are insufficient for figuring out what I am trying to do. I nearly always have to resort to dumb trial and error. I was just recently planning to write some docs of my own on the subject. Not tutorial docs, but reference docs.


I believe it was discussed in Cologne, 'Enabling Constexpr Intrinsics......' - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p166...

Quite a meaty document; it should give you some material on the subject.


Even as a C++ amateur, I didn't find it a meaty or intimidating proposal; rather, it was refreshingly brief. If you understand that constexpr functions can be evaluated at either run time or compile time, it makes sense that asm isn't allowed in a compile-time context (it could get really trippy if it was!). This allows for flow analysis to permit asm when evaluating at run time within a constexpr function, as long as there's a non-asm constexpr alternative path available at compile time. If I'm misunderstanding or missing some subtleties, as a novice I'd appreciate any corrections.
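
A minimal sketch of that reading, assuming a C++20 compiler that accepts unevaluated asm in constexpr functions (recent GCC or Clang) on x86-64. The function name `twice` is hypothetical. Under an older standard or on another target, the preprocessor guard drops the asm entirely so the sketch still compiles.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical example: doubles x, using asm only on the run-time path.
constexpr int twice(int x) {
#if __cplusplus >= 202002L && defined(__GNUC__) && defined(__x86_64__)
    if (std::is_constant_evaluated()) {
        return x * 2;  // compile-time path: plain C++, the asm is never reached
    }
    // Run-time path: add the register to itself.
    asm("addl %0, %0" : "+r"(x) : : "cc");
    return x;
#else
    // Assumption: older standard or other target, so skip the asm entirely.
    return x * 2;
#endif
}

// Forces a compile-time evaluation, which must take the non-asm branch.
static_assert(twice(21) == 42, "compile-time evaluation avoids the asm");
```

Calling `twice` with a value that is only known at run time takes the asm branch instead, where the guard enables it.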


Trial and error is valuable, but if you go to GitHub and search for "movq", "ld a,(hl)", or a similar string, you can often find examples of code that is presumably working.

I'm reminding myself of Z80 assembly at the moment, building a simple computer and I've done a bit of that.


Years ago, I started work on a library that does this in Java: https://github.com/dvx/jssembly

The syntax looks like this:

    jsm.define("test", new Block(x64) {{
        __asm(
            "nop",  // no-op
            "ret"   // return
        );
    }}).invoke();


Tangentially related but in gcc and clang you can jump to or call machine language embedded within string literals.

https://stackoverflow.com/questions/48593734/calling-goto-on...

https://codegolf.stackexchange.com/questions/2203/tips-for-g...

This is not recommended for anything in production but is an amusing parlor trick.

Edit: Recent toolchains may require the -zexecstack flag to avoid a segfault.


Old, well-known feature of one of the most popular languages. I must be missing something, because I have no idea how this became a hacker "news" :)


Hacker News Guidelines

What to Submit

On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.


It's not technically a feature of the language, as it is not standardized and is compiler-specific.

But the thing you are talking about does indeed recur here. My usual guess is that new generations of programmers are coming up all the time and may be unfamiliar with some of the old stuff. Or maybe it's even some experienced people who never happened to touch C or C++ [there are more of those as time goes on].


I'm scratching my head, too. I could've sworn the Linux kernel itself had a few chunks of hand-written assembly for better performance in key spots.

EDIT: I see several, just from doing a simple search against 'movq'

https://github.com/torvalds/linux/search?q=movq&unscoped_q=m...

I am not a kernel programmer, though, so I can't say whether it's being used in any capacity on modern, x64 systems or whether it's a compatibility mode for low-powered embedded architectures. Maybe someone more knowledgeable can chime in.


In the case of the kernel it's not just a performance thing. There are a lot of things that are totally irrelevant to the high level notion of the C execution model that are therefore not exposed, and certainly not in the C standard.

Things like: Hm, I need to swap my stack register and page table with this other process.

Or writing interrupt handlers.

Atomic operations and memory barriers used to be one of those things, but compiler extensions and new language standards have been catching up on some of that... Though to be honest a kernel will want enough control that it will likely still go outside the standard or extra compiler support for these anyway.


The real question remains unanswered: why is this on the front page?


There is no QC; it works on votes. I think anything could end up here, which casts quite a bit of shadow over what has been here.


It does not work entirely on votes. The mods have quite a bit of weight in deciding what gets pushed to the front page.

For example, there's an article about gorillas on the front page right now that has exactly 7 points. HN is nowhere near so anemic that 7 votes would be enough to push a story to the front page on their own.

To boot, I once submitted something that went unnoticed. An HN mod then emailed me and said it looked cool, asking me to resubmit it so that it could be featured more prominently.


Maybe what's new here is the suggestion of using a raw string literal for the assembly source? That does look neat (if minor).


I wonder if that has anything to do with not having to use escaped newlines and tabs. Without newlines, multiple instructions may fail to parse in the assembler; tabs are just for readability when printing the asm, not for assembling it.
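
For comparison, a small sketch of the raw-string form, assuming GCC or Clang extended asm on x86 (the function name `add_then_double` is hypothetical). With an ordinary string literal, the same template would be written as "addl %1, %0\n\taddl %0, %0"; the raw string lets the real line breaks separate the instructions.

```cpp
#include <cassert>

// Computes (a + b) * 2.
int add_then_double(int a, int b) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    int result = a;
    // The line breaks inside R"( ... )" reach the assembler as-is, so each
    // instruction sits on its own line without any "\n\t" escapes.
    asm(R"(
        addl %1, %0
        addl %0, %0
    )"
        : "+r"(result)  // read and written by the asm
        : "r"(b)        // read only
        : "cc");        // both adds modify the flags
    return result;
#else
    // Assumption: non-x86 target or non-GNU compiler, so use plain C++.
    return (a + b) * 2;
#endif
}
```

The leading whitespace inside the raw string is only there for readability of the source and of any printed .s file.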


I was a Forth dev a long time ago (on a planet far away...). Switching to & from assembly seemed as natural as breathing. Oh for the days of 16bit cores and minuscule register banks.


Did this in college over 10 years ago.



