Acme Thttpd (acme.com)
80 points by assttoasstmgr on June 7, 2022 | 23 comments


I worked at Demon Internet in the late 1990s, and our “homepages” service ran on thttpd. My colleagues had hacked an earlier version to support mass virtual hosting - there were tens of thousands of homepages web sites at a time when that was a lot.

On the front end we had a few caching reverse proxies running Squid on FreeBSD, like an early CDN (we had nodes in London and New York). The FreeBSD kernel was hacked to support configuring IP addresses by the CIDR block instead of individually. When Homepages was set up, the HTTP Host: header was not yet well supported, so it needed an IP address per web site. The Squid boxes had 96k addresses each (a /16 and two /18s) and the network was configured to spread the load across them.
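As a quick check of the "96k" figure: a /16 covers 65,536 addresses and each /18 covers 16,384, so the total is 98,304, which is 96 × 1024. A minimal sketch of that arithmetic follows; the function names are made up for illustration and are not from any code mentioned in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Number of IPv4 addresses covered by a CIDR prefix of the given length.
   Only prefix lengths 0..32 make sense for IPv4. */
static uint64_t cidr_addresses(int prefix_len)
{
    assert(prefix_len >= 0 && prefix_len <= 32);
    return (uint64_t)1 << (32 - prefix_len);
}

/* One /16 plus two /18s, the per-box allocation described above. */
static uint64_t squid_box_addresses(void)
{
    return cidr_addresses(16) + 2 * cidr_addresses(18);
}
```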

Squid was modified to translate the server IP address into a Host header as necessary, so the back end did not need many IP addresses. There was one box running thttpd, one box running an FTP server for uploads, and a NetApp NFS storage server. There were hacks in thttpd to shard the web site names across multiple directories to avoid performance problems with very large directories.
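The comment does not say how the sharding was done, so the following is only a sketch of one common approach, not Demon's scheme: hash the site name and use one byte of the hash to pick one of 256 subdirectories, so no single directory holds every site. The `shard_path` name, the FNV-1a hash, and the `<root>/<xx>/<site>` layout are all assumptions chosen for illustration. The caller is assumed to have already validated the site name (no `/` or `..`).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* FNV-1a, 32-bit: a small non-cryptographic hash, used here only to pick a bucket. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Write "<root>/<xx>/<site>" into out, where xx is one of 256 hex buckets.
   Returns 0 on success, -1 if the result does not fit in outlen bytes. */
static int shard_path(char *out, size_t outlen, const char *root, const char *site)
{
    assert(out && root && site);
    unsigned bucket = fnv1a(site) & 0xffu;
    int n = snprintf(out, outlen, "%s/%02x/%s", root, bucket, site);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With tens of thousands of sites, 256 buckets would leave a few hundred entries per directory on average. Sharding on leading characters of the name is another option, at the cost of less even distribution.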

We made some effort to get the modifications upstream, tho thttpd had diverged a lot by the time we sent our code to Jef. But, IIRC thttpd’s vhost setup has some resemblance to the way Demon did things. (This is also why Apache httpd mod_vhost_alias is so weird.) And these days, firewalls can be configured to do the CIDR IP address trick, so there is no need for kernel hacking.

(edited to add) Here's an old blog post on the topic, with the links I left out of this comment https://dotat.at/@/2012-09-25-large-scale-ip-based-virtual-h...


Still one of the funniest stories about the clueless https://www.acme.com/software/thttpd/repo.html



Less funny, but the creator of curl gets mistaken messages too.

https://daniel.haxx.se/blog/2021/02/19/i-will-slaughter-you/


That’s a sad story. Mental illness is such a tragedy. The author clearly has a great deal of technical depth shining through which makes it all the more tragic.


This is becoming a scary story thread. This reads like a threat that could actually be carried out. It could end up like the Ebay/EcommerceBytes case or worse.

edit: this turned out to be pretty sad.


Ah man, I miss that sort of early-2000s directness-to-the-point-of-rudeness (not that it was unjustified!). Nobody talks like that online anymore.


Yeah I don't. It seems needlessly inflammatory in those email threads.


You have no idea what you are talking about.


?


I do miss that sort of thing, but these days I would stop responding much, much quicker than I used to.


I guess it's funny in retrospect. I sell my own software and have to deal with these "confidently incorrect" people all too often. Seems like they are either mentally ill or have some kind of sociopathic personality. I'm thinking about quitting the game because of it.

I guess the "smart" thing to do would be to pay someone to deal with these people, but I really don't want to subject anyone else to the abuse. Not to mention that having another person on the payroll shifts the calculus of whether or not it is even worth it.

I thought I'd get used to it. But, after 10 years, it still makes my fight or flight response kick in. Every time.

At least I get paid to take the abuse. I can't imagine having to deal with it for an open source project. I guess it explains why so many open source project maintainers seem so testy.

Thanks for letting me air my grievances. I feel a little better :)


Oh, I have the SysAdmin style of humor because you either laugh or cry. I've sold software and experienced that fun angle to customers. My favorite is still the customer who got confused and blamed me for another vendor's software. It was an interesting e-mail chain. Being a SysAdmin for a number of years has led to me seeing some of the worst in people.

I used to think these people are mentally ill or sociopaths too, but I think I have come to a much simpler explanation. Steve Jobs was correct about computers being a bicycle for the mind, but I think we need to look at the negative part of that statement too. Computers just emphasize what is already there, and a cost of the added strength and speed is the removal of societal filters.

We grow up learning a bunch of societal filters that we use to take our raw thoughts and make them acceptable to the world around us. Most people get this right. At our hearts, most of us are good people and don't want to cause harm.

Sadly, most people don't really write e-mail, tweets, or posts to another person. They write it to some image in their head where the basic societal filters are no longer operating. Communication is dehumanized and you get the raw person. Sadly, many of our fellow humans are not very pleasant without those learned filters. In fact, they are rather primitive scum. Add to that the voices that say you don't have to be nice or even be humane to "them" for some value of "them" that is not "us" or "me". I really wish there were some fix, but I guess it's AH from here on out.


See also: Tuttle, Oklahoma vs CentOS. https://www.theregister.com/2006/03/24/tuttle_centos/


I love such focused, cooperative and mature tools. They do one thing well, play nicely and keep their promises.

However, I'm using lighttpd because I need HTTPS.


Why not a separate TLS terminator like Hitch[1]? (Why do people in general insist that TLS be built and linked into everything?)

[1] https://hitch-tls.org/


yes, that would be it. I thought of stunnel. But that's extra moving parts.


The scale goes from “less moving parts” to “more focused”; the optimal point varies, but I don’t think you can get away from the fact that those are in opposition. Of course, the usual vernacular meaning of “more focused” is “does less things I don’t need”, but that’s not unrelated: because everyone needs a different subset of things, the more things a given piece of software does, the larger, on average, the proportion of those that you don’t need; so running more pieces of (less feature-rich) software seems necessary in order to have less stuff you don’t need or understand overall.

(In the specific case of TLS, I get additional warm fuzzies from being sure that, however screwed up the web server is, it cannot be confused[1] into revealing the secret keys when it does not have access to them in the first place. I don’t know to which extent this is actually important, though. A factotum[2]-like approach is a compromise that gives the web server the ephemeral keys in exchange for not having to pass the entirety of the traffic through the terminator, but I’m not aware of any practical implementations except for the one Akamai, disgustingly, patented[3] 10 years after the actual invention.)

[1] http://www.cap-lore.com/CapTheory/ConfusedDeputy.html

[2] https://www.usenix.org/conference/11th-usenix-security-sympo... or http://doc.cat-v.org/plan_9/4th_edition/papers/auth

[3] https://patents.google.com/patent/US9531685B2 (wow, I thought this was the Cloudflare patent, but that one, https://patents.google.com/patent/US8782774B1, is even more trivial)


> warm fuzzies from being sure that, however screwed up the web server is, it cannot be confused[1] into revealing the secret keys when it does not have access to them in the first place.

indeed a big one. THE big one.

Usually I do watch out for 'https' in CGIs which requires the webserver to know. Need it e.g. to build absolute URLs https://codeberg.org/mro/geohash/src/branch/master/lib/cgi.m...
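To make the point concrete, here is a rough sketch of how a CGI might build an absolute URL. It assumes the web server exports an `HTTPS` environment variable set to `on` for TLS requests; that is a widespread server convention, not part of the CGI specification (RFC 3875), which does define `SERVER_NAME` and `SERVER_PORT`. The `absolute_url` function is hypothetical and is not taken from the linked code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<scheme>://<host>[:port]<path>" into out.
   https_flag is the value of the HTTPS environment variable (may be NULL);
   host and port correspond to SERVER_NAME and SERVER_PORT.
   The port is omitted when it is the default for the scheme.
   Returns 0 on success, -1 if the result does not fit in outlen bytes. */
static int absolute_url(char *out, size_t outlen, const char *https_flag,
                        const char *host, const char *port, const char *path)
{
    assert(out && host && port && path);
    int secure = https_flag && strcmp(https_flag, "on") == 0;
    const char *scheme = secure ? "https" : "http";
    const char *default_port = secure ? "443" : "80";
    int n;
    if (strcmp(port, default_port) == 0)
        n = snprintf(out, outlen, "%s://%s%s", scheme, host, path);
    else
        n = snprintf(out, outlen, "%s://%s:%s%s", scheme, host, port, path);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

In a real CGI the arguments would come from `getenv("HTTPS")`, `getenv("SERVER_NAME")` and `getenv("SERVER_PORT")`. Behind a separate TLS terminator the web server only sees plain HTTP on its own port, so this would produce `http://` URLs unless the scheme is passed along some other way.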


To test, chdir to directory with files to be served before running

    #ifndef IDX
    #define IDX 1
    #endif
    #if IDX==1 // final
    
          #include/*           micro HTTP server           */<stdio.h>
         #include/*           usage: ./http [port]          */<stdlib.h>
        #include/*                                           */<string.h>
         #include/*   the default port is 8080, files are   */<unistd.h>
          #include/*    read from the current directory    */<netdb.h>
    
    
                      int main(int n,char**
                      V){int t=SOCK_STREAM,
                      N=SO_REUSEADDR,i=1,c=
                             htons(n
      >1?atoi       (V[1]):  404*20)  ;struct sockaddr_in s
      ={000};       void*f=  &s;char  *m,b[1036];s.sin_port
      =c;for(       N=!((t=  socket(  s.sin_family=AF_INET,
      #define       http(c)  setsock         ##opt(\
      t,SOL_S       ##OCKET  ,N,&i,c         (i)),bi  ##nd(t,f,c(s))<0)||
      t,0))<0       ||(http  (sizeof         )listen  (t,5))<<10;N&&(0)<=(c
      =accept(t,0,0));close  (c)){b[         n=recv(  c,b,N,0),0>n?0:n]=0;n
      =!memcmp(b,"GET /",5)  <<6;for         (i=4;n^  '?'&&n<       127&&n>
      32;){n=b[++i];}strcpy  (b+i,b[         f=0,i-1  ]=='/'?       "index"
      ".html"       :"");m=  n?puts(         b+5),b[  03]='.'       ,strstr
      (b,"/."       )||0==(  f=fopen         (&b[3],  "rb"))?"404 Not Foun"
      "d":"2"       "00 OK"                  :"501 "  "Not Implemented";for
      ((send)       ((c),b,                  sprintf  (b,"HTTP/1.1 %s\r\n"
      "\r\n%"       "s",m,f                  ?"":m),  0);f&&!
      ((send)       (c,b+0,                  fread(b  ,1,N,f)
                                                      ,0)-N&&
                                                      (fclose
                                                      (f),404
                                                      )););}}
    
    #elif IDX==2 // almost readable
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #ifdef _WIN32 /* needs linking with Ws2_32 */
    #include <winsock2.h>
    #define close closesocket
    #define strstr(a,b)strstr(a,b)||strchr(a,':')||strchr(a,'\\')
    #define struct WSADATA w={0};WSAStartup(514,&w);struct
    #else
    #include <unistd.h>
    #include <netdb.h>
    #endif
    int main(int N,char**V){
            int i=1,t,c,n;struct sockaddr_in s={0};
            char b[1036];const char*m;void*f;
            s.sin_port=htons(N>1?atoi(V[1]):8080);
            for(N=(t=socket(s.sin_family=AF_INET,SOCK_STREAM,0))<0||
                            (setsockopt(t,SOL_SOCKET,SO_REUSEADDR,f=&i,sizeof(i)),
                            bind(t,f=&s,sizeof(s)))||listen(t,5)?0:1024;
                            N&&(c=accept(t,f=0,0))>=0;close(c)){
                    b[0>(n=recv(c,b,N,0))?0:n]=0;
                    for(n=!memcmp(b,"GET /",i=5)<<6;'?'^n&&32<n&&n<127;)n=b[i++];
                    m=n?strcpy(b+i-1,b[i-2]-'/'?"":"index.html"),
                                    puts(b+5),b[3]='.',strstr(b,"/.")||!(f=fopen(b+3,"rb"))?
                                    "404 Not Found":"200 OK":"501 Not Implemented";
                    for(send(c,b,sprintf(b,"HTTP/1.1 %s\r\n\r\n%s",m,f?"":m),0);
                                    f&&!(send(c,b,fread(b,1,N,f),0)-N&&fclose(f)|N););
            }
    }
    
    #endif

https://github.com/ilyakurdyukov/ioccc/blob/main/practice/20...


> In typical use it's about as fast as the best full-featured servers (Apache, NCSA, Netscape).

Those are names I haven't heard in a long time!


For static content and cgi-bin, busybox is much faster.


Busybox lacks a chroot, which might improve security.

On the other hand, everything in busybox will itself run in a chroot, which might nullify this benefit.



