
I certainly should voice my opinion here.

I've done an analysis of the GA144 before: http://news.ycombinator.com/item?id=1810641

Most of Chuck Moore's designs (I reviewed several, starting with the M17) can be described by a quote from The Devil Wears Prada: "the same girl - stylish, slender, of course... worships the magazine. But so often, they turn out to be - I don't know - disappointing and, um... stupid". Chuck Moore's designs are all slick, slender, stylish, and worship Forth, but they turn out to be disappointing and stupid in the end. Often the only beneficiary is Chuck Moore himself. You just cannot apply his experience elsewhere in the world of computing.

Let's see what we have here in GA144.

The memory inside each core is far too small for general-purpose programs, even if you split your program into 144 parts and spread them across the cores. 64 words of RAM, i.e., 128 bytes. 128 bytes times 144 cores is about 18 KB. The same goes for the (program) ROM, and you have to factor communication code in there. Communication costs eat into RAM as well.
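To put that memory budget in numbers, here is a back-of-the-envelope sketch using the figures from the text (64 words per core, 2 bytes per word, 144 cores):

```python
# Back-of-the-envelope RAM budget for the GA144, using the
# figures from the text: 64 words per core at 2 bytes per word.
words_per_core = 64
bytes_per_word = 2
cores = 144

ram_per_core = words_per_core * bytes_per_word   # 128 bytes per core
total_ram = ram_per_core * cores                 # total across the chip

print(ram_per_core)        # 128
print(total_ram)           # 18432
print(total_ram / 1024)    # 18.0  (the ~18 KB figure)
```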

They offer no compiler from a high-level language like C. You have to learn a specific dialect of Forth and a bizarre (albeit small) assembly language.

The only benefit of this affair for the general population is the relative ease of designing asynchronous hardware.

http://en.wikipedia.org/wiki/Asynchronous_system



You are focusing on classic applications only.

The GA144 is so different from the classic way of computing that it requires new approaches to development. One very interesting feature of the GA144 is that all 144 cores can share instructions over I/O lines. That means every core can send instructions to its neighbors, which execute them directly, without conversion.
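The "neighbors execute received instructions directly" idea can be illustrated with a toy model (plain Python; the instruction names and the 4-slot word are invented for illustration and are not real F18/GA144 code):

```python
# Toy illustration of port execution: a neighbor core executes an
# instruction word it receives over a port directly, without first
# storing it in its own RAM. The opcodes here are invented stand-ins.

def neighbor_execute(word, stack):
    """Execute a received word of up to 4 tiny stack instructions."""
    for op in word:
        if op == 'dup':                 # duplicate top of stack
            stack.append(stack[-1])
        elif op == 'add':               # add top two stack items
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 'nop':               # filler slot in the word
            pass

stack = [21]
neighbor_execute(('dup', 'add', 'nop', 'nop'), stack)  # 21 -> 21 21 -> 42
print(stack)   # [42]
```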

I/O is fast enough. I guess it should be possible (someday) to have an external SRAM interface that circumvents the low-memory problem.

> They offer no compiler from high-level language like C.

That's right, and that's the real weak spot of the GA144. Almost every microcontroller board today comes with C or something similar. Mr. Moore loves Forth, but I doubt there are many developers out there who like being forced to learn Forth just for this single platform. I know Forth; it was one of the first languages I ever learned. It is perfectly suitable for embedded systems, but you have to learn a lot to master it.


I prototyped a dynamic dataflow machine which (in theory) could be scaled to hundreds of cores (corelets: something very small that does not even have a jump instruction). In my experiments, readying information to be sent accounts for a hefty 30%+ of the code.

http://thesz.mskhug.ru/svn/hhdl/previous/HSDF/CoreletTest.hs

The link above contains a simple "Hello, world!" program in five "big instructions", which contain 21 corelet instructions in total. 8 of those 21 instructions are send and front-advancing instructions; their only purpose is to establish communication between program parts. My machine sends pointers to "big instructions" up to 32 bytes long (up to 32 instructions if you're lucky), while the GA144 can send only 4 instructions max.

30% of 4 instructions is 1 instruction. Another one of those 4 is a loop or jump or something like that. So you have two instructions left to perform program logic. And this is the best case.
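The overhead accounting above can be sketched numerically, using the 21-instruction / 8-send figures and the 4-instruction GA144 word from the text:

```python
# Communication-overhead arithmetic from the text.
total_instructions = 21      # corelet instructions in the example
send_instructions = 8        # send / front-advancing instructions

overhead = send_instructions / total_instructions
print(round(overhead, 2))    # 0.38 -> the "hefty 30%+" figure

# Applying ~30% overhead to a 4-instruction GA144 word:
word_size = 4
comm = round(0.30 * word_size)   # ~1 instruction for communication
control = 1                      # one loop/jump instruction
useful = word_size - comm - control
print(useful)                    # 2 instructions left for program logic
```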

So I again express dislike of GA144 as a computing machine. And I again express my gratitude to Chuck Moore for proving that clockless design works.


> In my experiments readying information to be sent accounts for hefty 30%+ of code.

Unfortunately I don't have time to dive into your design, but AFAIK the GA144 doesn't need 30% preparation code, because every instruction can be executed immediately by neighboring nodes.

That means (correct me if I am wrong) that if core X has to evaluate a Forth function of, say, five arguments, it could pass all five arguments to its neighbors (without any preparation) by sending them the code addresses of the arguments, wait until they have finished, and then use their results to compute the function's result. These neighbor nodes could themselves evaluate (or delegate) subexpressions to other (free) nodes, and so on.
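The delegation scheme described above can be sketched as a toy model. This is plain Python with invented names (`evaluate`, a worker pool standing in for neighbor cores); real GA144 nodes would exchange instruction words over ports, not Python objects:

```python
# Toy model of the delegation idea: a node evaluates a multi-argument
# function by handing each argument expression to a "neighbor" worker,
# waiting for all of them, and combining the results. Neighbors may
# recursively delegate subexpressions further.
from concurrent.futures import ThreadPoolExecutor

def evaluate(neighbors, expr):
    """Evaluate expr, delegating subexpressions to neighbor workers."""
    if isinstance(expr, int):          # a literal needs no delegation
        return expr
    op, *args = expr                   # e.g. ('sum', a1, ..., a5)
    futures = [neighbors.submit(evaluate, neighbors, a) for a in args]
    results = [f.result() for f in futures]   # wait for all neighbors
    if op == 'sum':
        return sum(results)
    raise ValueError(op)

with ThreadPoolExecutor(max_workers=8) as neighbors:
    expr = ('sum', 1, 2, ('sum', 3, 4), 5, 6)
    print(evaluate(neighbors, expr))   # 21
```

The pool size must cover the nesting depth, since a delegating worker blocks while its own sub-delegations run.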

This form of parallelization would require efficient shared memory access. That problem needs to be solved, because AFAIK the I/O ports are accessible by the edge cores only. It doesn't make much sense to shuttle every piece of shared data through several columns or rows of cores.


I think you're wrong about passing many arguments with a command sent to another core.

You can send one word, i.e., a "big command" composed of four MISC commands. One of the MISC commands in a "big command" can retrieve data from another core.

So most of the time you will be waiting to send a command or to receive some data.



