Lies About Logs (honeycomb.io)
80 points by infogulch on April 17, 2017 | hide | past | favorite | 32 comments


This is a great article. A lot of people treat logs as something you write and never read, so the only "cost" is the storage. The actual VALUE is in the retrieval, the analysis, etc. That's bandwidth plus software development, which is worth far more than a hard disk on a server.

It would be hilarious if people replied to this blog post with comparisons of Honeycomb's monthly price to the price of a DO droplet or an EC2 host with the equivalent amount of storage. Those people would be showing their complete and total misunderstanding of logs.

Someone should write a blog post about misconceptions about logs to help educate such people... oh wait... she did!


Wow, really, $600/month for 5 GB of storage? Even a GPU instance with 8 GB of GPU RAM costs less than that. Not trying to undermine the work, but it really doesn't have enough utility to justify such a high price for such low capacity.



Splunk is far superior to this startup.


In what ways?


Yeah, those prices are... wow?!

It even says these are introductory prices, which to me at least normally means a time-limited, cheaper offer.


It's actually a lot cheaper. Honeycomb collects and stores data far more efficiently than any other option available. But yeah, side by side it looks worse. At this stage we are mostly talking to good engineers, though, who understand how this works.


> It's actually a lot cheaper

Just so I understand: these prices are a lot cheaper than you plan on making them?

> At this stage we are mostly talking to good engineers though, who understand how this works

Is that a dig?! For the record, I am more than familiar with how logging works. If Honeycomb is so much better than other offerings, I think you need to do some convincing on your website. Because as it stands, it's not a compelling offering.


>... how this works.

How what works? Logging or pricing or storage or what?


> “They’re just logs, they can’t possibly impact my code efficiency.” More bad news: oh yes they can, particularly in the critical execution path. Lots of people seem to think that there is no cost to dropping DEBUG statements around their code willy-nilly. This is usually true on your laptop, …
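The quoted point can be made concrete with a small sketch (my own illustration, not from the article; `expensive_repr` and the logger name are made up): even when the DEBUG level is disabled, arguments passed to a log call are evaluated eagerly, so the statement still burns CPU on the critical path.

```python
import logging

# Hedged sketch of the "DEBUG is free" fallacy. Even with DEBUG disabled,
# the argument expression runs before the level check inside log.debug().
log = logging.getLogger("hotpath")
log.setLevel(logging.INFO)            # DEBUG disabled, as in production

calls = {"n": 0}
def expensive_repr(obj):
    calls["n"] += 1                   # stand-in for costly serialization
    return repr(obj)

state = {"user": 42}

# The argument is computed before the level check: CPU spent, nothing logged.
log.debug("state=%s", expensive_repr(state))

# Guarding skips the work entirely when DEBUG is off.
if log.isEnabledFor(logging.DEBUG):
    log.debug("state=%s", expensive_repr(state))

print(calls["n"])                     # only the unguarded call paid the cost
```

On a laptop that cost is invisible; in a tight loop in production it is not.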

Haha - this happened to me last week and I found out the hard way. My log had grown so quickly that the code was taking 15 seconds just to append to it.


> My log had grown so quickly that the code was taking 15 seconds just to append to it.

How is that possible? Opening a file with the O_APPEND flag automatically seeks to the end of it.
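For what it's worth, the O_APPEND behavior is easy to demonstrate (a minimal sketch; the filename is made up): the kernel positions every write at the current end of file, so the append itself does not get slower as the file grows.

```python
import os, tempfile

# Minimal O_APPEND sketch: each write lands at EOF with no explicit seek,
# so append cost is independent of file size (device speed is another story).
path = os.path.join(tempfile.mkdtemp(), "app.log")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
os.write(fd, b"first entry\n")
os.write(fd, b"second entry\n")   # appended after the first automatically
os.close(fd)

with open(path, "rb") as f:
    data = f.read()
print(data)   # b'first entry\nsecond entry\n'
```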


The log itself was a 2.7 GB file sitting on a slow SD card on a Raspberry Pi.


> “They’re just logs, they can’t impact production reliability.” See also, “Logging to disk is awesome!” You can take down nearly any production database or I/O intensive workload just by sharing a filesystem with its logging.

I hope nobody reads this and decides to stop logging to disk instead of diverting those logs to a separate device. Disk-buffered logs are essential for high availability: they loosen the otherwise-too-tight coupling between the log consumer and the log-producing application.
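One common way to loosen that coupling in Python is a bounded in-process queue drained by a background thread (a sketch of the general idea, not anyone's production setup; the `StreamHandler` stands in for a `FileHandler` pointed at a separate device):

```python
import io, logging, logging.handlers, queue

# Hedged sketch: records go into a bounded queue on the request path and a
# listener thread does the actual I/O, so a slow log device can't stall the
# application. A bounded queue means you drop logs rather than block.
sink = io.StringIO()
q = queue.Queue(maxsize=10000)
listener = logging.handlers.QueueListener(q, logging.StreamHandler(sink))
listener.start()

log = logging.getLogger("app")
log.addHandler(logging.handlers.QueueHandler(q))
log.setLevel(logging.INFO)
log.info("request served")        # enqueue only; I/O happens off-thread
listener.stop()                   # drains the queue before returning

print(sink.getvalue())            # request served
```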


I am really dumbfounded by the pricing. $600 a month for 5 GB of storage and 50 writes per second?? Meanwhile, a DO droplet is $5/mo with 1 TB of traffic and a 20 GB SSD. Am I overlooking something? I agree with most of the points this article makes, but leveled logging has always served me well, with the exact levels she* mentions in the article. And of course logging is expensive if you pay $600 per month for 50 writes per second. Also, there's no be-all-and-end-all logging solution; it always has to be tailored specifically to your needs.

*EDIT


Rolling your own makes sense only to a point; anything much more operationally complex than "an rsyslog collector" rapidly crosses the boundary of "I don't care what the hardware costs me, it's dwarfed by the costs of engineering time / loss of focus."

Unless logging is the core of what your company does, I'd suggest not reinventing this particular wheel; those stories never end well.


The author isn't a "he", and the hardware cost of running 512 MB of RAM worth of infrastructure doesn't capture the cost of running a logging solution at all. The article uses "expensive" in the context of performance overhead, which is absolutely right.

Their pricing seems a bit high for the write throughput but the equivalent cost in engineering time, for equal capabilities, is way in their favor.


>The author isn't a "he"

Historically, "he" has been a gender-neutral way of referring to another person when gender is irrelevant to the conversation (as it is now). "They" may be more politically correct, but "he" is still a completely valid gender-neutral term. There's no need to correct that; gender is completely irrelevant to this discussion.


Using "they" makes gender irrelevant. It's not politically correct, it's precise. Meanwhile, using "he" is a statement of assumption, or a statement of not caring about changes in society. I think this was involuntary by OP. Yet, I also think those need to be corrected. What was seen as the norm historically is not an argument for what we should make the future from. If gender is irrelevant in a sentence, then let's correct that by using and encouraging gender neutral pronouns.

If anything, this struck me enough in the OP's paragraph to wonder if I had misattributed the article, which is why I raised the point after verifying that the author was indeed who I thought it was. To a sample size of at least 1, the usage of the gendered pronoun was noticeable. Maybe that's just me.


"They" is possibly more precise on gender, but adds a new layer of ambiguity about how many people you're talking about. If the OP saying "he" led you to second guess the gender of the author, saying "they" may lead you to second guess how many authors had contributed to the article. What really stands out to me is how quick some people are to correct the gender of someone when a comment may misrepresent them as if the content of the article was somehow altered by the word. The OP "involuntarily" wrote "he" because "he" is completely grammatically correct.

But you're right, since what was seen as the norm historically is not an argument for what we should make the future from, I propose we just use the word "floobity" to represent the singular, gender-neutral pronoun. Floobity wrote a great article, no matter what gender floobity happens to be.


Yeah, I know about the performance overhead of logging. We have some embedded devices that do minimal logging to flash storage, and if logging that you missed a timing on the critical path causes you to miss another timing, it leads to a cascade that bricks the entire device: the CPU does nothing but logging and can't even receive commands.
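One standard way out of that cascade (my own sketch of the general technique, not this poster's firmware) is to record a tiny fixed-size event into a bounded ring buffer on the critical path and flush it to slow storage from an idle context:

```python
from collections import deque
import time

# Hedged sketch: hot-path logging becomes an O(1) in-memory append with no
# I/O; a bounded deque overwrites the oldest entries instead of blocking.
ring = deque(maxlen=1024)

def hot_path_event(code):
    # No I/O, no unbounded allocation -- safe inside a timing deadline.
    ring.append((time.monotonic(), code))

def drain():
    # Called outside the critical path, where slow flash writes are harmless.
    out = [f"{t:.6f} evt={c}" for t, c in ring]
    ring.clear()
    return out

hot_path_event("missed_deadline")
hot_path_event("retry")
lines = drain()
print(lines)
```

The trade-off is that you can lose the newest-minus-1024 events on a crash, which is exactly where something like the MRAM suggestion below helps.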


Throw an MRAM chip or something similar on your next design. They're basically battery backed SRAM without the battery. Awesome for logging data plane errors.


Their competition is more like Splunk Cloud, or Dynatrace APM.


It's not apparent how this (Honeycomb) is different from, say, ELK (ElasticSearch-Logstash-Kibana).

We aggregate logging from all nodes, where the log entries/events are JSON payloads with a bunch of JSON metadata. This makes searching/slicing/etc. in Kibana easy.

It feels like the website glossed over this a bit too quickly, dismissing ELK as dumb string logging.
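The structured-events idea both stacks share is simple enough to sketch (field names here are made up for illustration): emit one JSON object per event instead of a formatted string, so the aggregator can group and filter on fields.

```python
import json, sys, time

# Hedged sketch of structured event logging: one JSON object per line,
# which tools like Kibana (or Honeycomb) can slice by field.
def log_event(**fields):
    fields.setdefault("ts", time.time())
    line = json.dumps(fields, sort_keys=True)
    sys.stdout.write(line + "\n")
    return line

line = log_event(endpoint="/v1/messages", status=200, duration_ms=37)
```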


We've been using Honeycomb at Nylas for 6+ months at this point and love it - the product keeps getting better and better, it's fast and reliable, and it's given us new debugging capabilities that make investigating outages and digging into system performance easier than they were before.

We've also built our own ELK cluster (three times!) and can attest that it takes very significant engineering effort to get a scalable, reliable, high performance cluster, and the ongoing cost in hardware is easily hundreds of dollars a month or more.

Three things that we can do with Honeycomb that we can't do with ELK (or at least not easily):

* Quickly iterate on questions to explore datasets. It makes a huge difference in how you approach a tool when the time between question and answer is milliseconds, not seconds.

* Compare trends in data - Honeycomb can easily group data by field and display many lines on the same graph. As an example: easily break down your API traffic by endpoint to pinpoint a spike in traffic to a specific endpoint. (Or by endpoint and customer, etc.)

* Calculate percentiles, averages, etc. on our data in real-time - without having to set up the calculated metrics beforehand. This makes Honeycomb a better tool for performance monitoring than ELK imho.

There may be ways to do some of these things with ELK, but not out of the box and, given the indexed data store behind the stack, it's just not architected to have these same capabilities.

I suspect if you dug into some of Honeycomb's other blog posts, this difference would become more apparent. :)


$600/month for a SaaS "Basic" plan is questionable, to say the least. Get your pricing right. For $600/month I can easily allocate resources within my company to come up with a solution tailored to my needs. Maybe check what the competition is doing:

https://papertrailapp.com/plans
https://logentries.com/pricing/
https://www.loggly.com/plans-and-pricing/


"I can build a better version myself by retasking my $200K a year engineers" is the battle cry of someone screwing up the ROI analysis.


One company was paying $35k/month in EC2 hardware costs alone to run an ELK cluster, and we were able to slash that substantially while still making money. Not to mention the savings in core engineering time.

Just goes to show you, people have no idea how much their shit costs. And/or don't value their own time.


That seems like a lot of money. Can you give a really high level breakdown of where it was spent?


Really? $600 sounds pretty cheap to me considering what you get.


The blog post says strings are expensive, but then the product is marketed as being built around JSON - a pretty expensive way to serialize log data.

Generally, launching your startup with a passive-aggressive blog post full of false information and economically presented half-truths is..... I'll leave that one open :)

There may well be a technical use-case for this type of logging system but I'm not sure the way it is being presented is going to reach that audience.


It's modeled after the way Facebook does their own internal analytics, with Scuba.

Go ask them how they like it.


Oh yeah, I wrote the blog post. And it had nothing to do with our launch. It was supposed to be funny, not a comprehensively detailed argument.

If you haven't been woken up by any or all of these things, I wouldn't expect you to find it as amusing as those of us who have.



