dtkav's comments

IMHO there are a couple of axes that are interesting in this space.

1. What do the tokens look like that you are storing in the client? This could just be the secret (but encrypted), or you could design a whole granular authz system. It seems like Tokenizer is the former and Formal is the latter. I think macaroons are an interesting choice here.

2. Is the MITM proxy transparent? Node, curl, etc. allow you to specify a proxy as an environment variable, but if you're willing to mess with the certificate store then you can run arbitrary unmodified code. It seems like both Tokenizer and Formal are explicit proxies.

3. What proxy are you using, and where does it run? Depending on the authz scheme/token format you could run the proxy centrally, or locally as a "sidecar" for your dev container/sandbox.
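
To make the explicit-proxy case in point 2 concrete, here is a minimal sketch of pointing a child process at a proxy through environment variables. The function name and the proxy address are hypothetical, and which variable spellings a given client honors varies, so treat this as an illustration rather than a guarantee for any particular tool.

```python
import os


def explicit_proxy_env(proxy_url, base_env=None):
    """Return a copy of the environment with explicit-proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # Both spellings are set because clients differ in which one they read.
    for name in ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY"):
        env[name] = proxy_url
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run`. For HTTPS interception the client would also have to trust the proxy's CA certificate, and a transparent setup would redirect traffic at the network level instead of relying on these variables.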


I'm working on something similar called agent-creds [0]. I'm using Envoy as the transparent (MITM) proxy and macaroons for credentials.

The idea is that you can arbitrarily scope down credentials with macaroons, both in terms of scope (only certain endpoints) and time. This really limits the damage that an agent can do, but also means that if your credentials are leaked they are already expired within a few minutes. With macaroons you can design the authz scheme that *you* want for any arbitrary API.

I'm also working on a fuse filesystem to mount inside of the container that mints the tokens client-side with short expiry times.

https://github.com/dtkav/agent-creds


> With macaroons you can design the authz scheme that you want for any arbitrary API.

How would you build such an authz scheme? When claude asks permissions to access a new endpoint, if the user allows it, then reissue the macaroons?


There are two parts here:

1. You can issue your own tokens which means you can design your own authz in front of the upstream API token.

2. Macaroons can be attenuated locally.

So at the time that you decide you want to proxy an upstream API, you can add restrictions like endpoint path to your scheme.

Then, once you have that authz scheme in place, the developer (or agent) can attenuate permissions within that authz scheme for a particular issued macaroon.

I could grant my dev machine the ability to access e.g. /api/customers and /api/products. If I want to have claude write a script to add some metadata to my products, I might attenuate my token to /api/products only and put that in the env file for the script.

Now claude can do development on the endpoint, the token is useless if leaked, and Claude can't read my customer info.
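
The attenuation flow above can be sketched with a toy macaroon-style token built on chained HMACs. This is not the format of any real macaroon library: the caveat syntax (`paths=...`, `expires=...`) and all function names are assumptions made up for illustration.

```python
import hashlib
import hmac
import time


def _chain(sig, caveat):
    # Each caveat is folded into the signature, keyed by the previous signature.
    return hmac.new(sig, caveat.encode(), hashlib.sha256).digest()


def mint(root_key, identifier):
    """Issue a token as (identifier, caveats, signature). Needs the root key."""
    sig = hmac.new(root_key, identifier.encode(), hashlib.sha256).digest()
    return (identifier, [], sig)


def attenuate(token, caveat):
    """Add a restriction locally. The root key is not needed."""
    identifier, caveats, sig = token
    return (identifier, caveats + [caveat], _chain(sig, caveat))


def _caveat_holds(caveat, path, now):
    key, _, value = caveat.partition("=")
    if key == "paths":  # hypothetical caveat: comma-separated path prefixes
        return any(path.startswith(p) for p in value.split(","))
    if key == "expires":  # hypothetical caveat: Unix timestamp
        return now < float(value)
    return False  # unknown caveats fail closed


def verify(root_key, token, path, now=None):
    """Recompute the HMAC chain, then require every caveat to hold."""
    now = time.time() if now is None else now
    identifier, caveats, sig = token
    expected = hmac.new(root_key, identifier.encode(), hashlib.sha256).digest()
    for caveat in caveats:
        expected = _chain(expected, caveat)
    if not hmac.compare_digest(expected, sig):
        return False
    return all(_caveat_holds(c, path, now) for c in caveats)
```

Because caveats can only be appended, a token for `/api/customers` and `/api/products` can be narrowed to `/api/products` with a short expiry, but stripping a caveat breaks the signature. Plain prefix matching is crude (it would also match `/api/products-internal`), so a real scheme would want stricter path rules.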

Stripe actually does offer granular authz and short lived tokens, but the friction of minting them means that people don't scope tokens down as much.


I understand that, but how do you come up with the endpoints you want claude to have access to ahead of time?

For example, how do you collect all the endpoints that have access to customer info per your example.

I thought about it and couldn't find a way to do it.


I'm not sure I'm fully understanding you, but in my experience I have a few upstream APIs I want to use for internal tools (stripe, gmail, google cloud, anthropic, discord, my own pocketbase instance, redis) but there are a lot of different scripts/skills that need differing levels of credentials.

For example, if I want to write a skill that can pull subscription cancellations from today, research the cancellation reason, and then push a draft email to gmail, then ideally I'd have...

- a 5 minute read-only token for /subscriptions and /customers for stripe

- a 5 minute read-write token to push to gmail drafts

- a 5 minute read-only token to customer events in the last 24h

Claude understands these APIs well (or can research the docs) so it isn't a big lift to rebuild authz, and worst case you can do it by path prefix and method (GET, POST, etc) which works well for a lot of public APIs.

I feel like exposing the API capability is the easy part, and being able to get tight-fitting principle-of-least-privilege tokens is the hard part.
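
As a rough sketch of the "path prefix and method" fallback mentioned above, a rule check can be very small. The `Rule` type, the function name, and the example paths (taken from the bullet list, not from any API's actual routes) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    methods: frozenset  # allowed HTTP methods, upper case
    path_prefix: str    # request path must start with this


def is_allowed(method, path, rules):
    """Allow the request if any rule matches both method and path prefix."""
    method = method.upper()
    return any(
        method in rule.methods and path.startswith(rule.path_prefix)
        for rule in rules
    )


# A read-only grant for the two resources named in the example above.
READ_ONLY_BILLING = [
    Rule(frozenset({"GET"}), "/subscriptions"),
    Rule(frozenset({"GET"}), "/customers"),
]
```

Here `is_allowed("GET", "/subscriptions/sub_123", READ_ONLY_BILLING)` passes while a `POST` to the same path does not. The time limit would come from a separate expiry check.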


made with ai?

Yeah, it says so at the top of the README (though I suppose I could have put that in the comment too). I'm not building a product, just sharing a pattern for internal tooling.

Someone on another thread asked me to share it so I had claude rework it to use docker-compose and remove the references to how I run it in my internal network.


Writing your own skill is actually a lot better for context efficiency.

Your skill will be tuned to your use case over time, so if there's something that you do a lot you can hide most of the back-and-forth behind the python script / cli tool.

You can even improve the skill by saying "I want to be more token efficient, please review our chat logs for usage of our skill and factor out common operations into new functions".

If anything context waste/rot comes from documentation of features that other people need but you don't. The skill should be a sharp knife, not a multi-tool.


The name is excellent.

The main things I think are missing are (1) how much am I spending, (2) why isn't my sprite paused, and (3) how can I get my stuff out (it would be nice to be able to mount in either direction or else integrate with git/git worktrees).

I ended up using it (and enjoying yolo mode!) but then my sprites weren't pausing and I was worried about spending too much so I deleted them.


Same. Claude Opus 4.5 one-shots the basics of chrome debug protocol, and then you can go from there.

Plus, now it is personal software... just keep asking it to improve the skill based on your usage. Bake in domain knowledge or business logic or whatever you want.

I'm using this for e2e testing and debugging Obsidian plugins and it is starting to understand Obsidian inside and out.


Cool! Have you written more about this? (EDIT: from your profile, is that what https://relay.md is about?)

https://relay.md is a company I'm working on for shared knowledge management / AI context for teams, and the Obsidian plugin is what I am driving with my live-debug and obsidian-e2e skills.

I can try to write it up (I am a bit behind this week though...), but I basically opened claude code and said "write a new skill that uses the chrome debug protocol to drive end to end tests in Obsidian" and then whenever it had problems I said "fix the skill to look up the element at the x,y coordinate before clicking" or whatever.

Skills are just markdown files, sometimes accompanied by scripts, so they work really naturally with Obsidian.
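
For a sense of what "look up the element at the x,y coordinate before clicking" can mean at the protocol level, here is a sketch that only builds the Chrome DevTools Protocol messages. `DOM.getNodeForLocation` and `Input.dispatchMouseEvent` are real protocol methods, but the helper names are hypothetical, nothing is sent over a socket, and this is not the actual skill described above.

```python
import json
from itertools import count

_ids = count(1)  # CDP messages carry a caller-chosen numeric id


def cdp_message(method, params):
    """Serialize one CDP command as a JSON string."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


def click_messages(x, y):
    """Messages to inspect the node at (x, y) and then click at that point."""
    lookup = cdp_message("DOM.getNodeForLocation", {"x": x, "y": y})
    mouse = {"x": x, "y": y, "button": "left", "clickCount": 1}
    press = cdp_message("Input.dispatchMouseEvent", {"type": "mousePressed", **mouse})
    release = cdp_message("Input.dispatchMouseEvent", {"type": "mouseReleased", **mouse})
    return [lookup, press, release]
```

A real test would send these over the debugger WebSocket, check the lookup response before clicking, and may need to enable the DOM domain first.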


Hey FWIW Relay is AWESOME!! The granular sharing of a given dir within a vault (vs the whole thing) finally solves the split-brain problem of personal (private) vault on my own hardware vs mandated use of a company laptop... it's fast, intuitive, and SOLVES this long-time thorn in my side. Thanks for creating it, high five, hope it leads to massive success for you! :)

Thank you for the kind words <3

Sorry it took me a while. Hopefully this helps:

https://notes.danielgk.com/Obsidian/Obsidian+E2E+testing+Cla...


Thanks! It does help, it's a great blog. You should consider posting a "show hn".

Incredible work.

fly.io is doing really good work. I've super enjoyed building our product on their platform. I love fly-replay combined with super fast start-up.

I've been thinking a lot about how to run agents (and skills) securely while giving them a lot of powerful capabilities.

I recently used their macaroons library to turn arbitrary API keys (e.g. for stripe's API) into macaroons. I route requests for an upstream host (like stripe) through Envoy as a mitm proxy which injects the real creds after verifying the macaroon.

It is such a powerful pattern. I'm always worried about leaking sensitive keys through prompt injection attacks (or just sending them to anthropic), but in this model you can attenuate the keys (both capabilities & validity window) client side. The Envoy proxy lives inside my flycast network so it can't be accessed externally.

It would be so cool if fly built something like this into sprites.dev (though I can see how it would be spooky to have fly install their own certs for stripe, etc...)
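
The verify-then-inject step described above can be sketched in plain Python. This is not Envoy configuration and not the actual setup from the comment: the function name, the `verify_token` callback, and the assumption that the client sends its token as a `Bearer` value under an exact-case `Authorization` key are all illustrative.

```python
def inject_upstream_credential(headers, verify_token, upstream_key):
    """Return the headers to forward upstream, or None to reject the request.

    headers:      dict of request headers (assumed already case-normalized)
    verify_token: callable returning True if the client's token is valid
    upstream_key: the real API key, known only to the proxy
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token or not verify_token(token):
        return None
    forwarded = dict(headers)
    # Swap the short-lived client token for the real upstream credential.
    forwarded["Authorization"] = f"Bearer {upstream_key}"
    return forwarded
```

The client side only ever holds the attenuated token; the real key stays inside the proxy, which is why a leaked token is of limited use.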


If you read Ben Toews's work on the tokenizer you have a good sense of where I want Sprites to go with key leaks and prompt injection:

https://fly.io/blog/tokenized-tokens/


Awesome stuff! Thanks for the reply.

Tokenizer is an explicit proxy though right?

My use case is very similar, but I wanted a transparent proxy so I could run unmodified scripts. It is a tricky design decision though.

I also mount a little fuse filesystem that mints a macaroon on read (with a shorter lifetime, probably inspired by y'all but I forget from where).

I work on realtime collaboration of markdown files (currently in Obsidian), which has become a shared-context substrate for agents, skills, etc. Our own company workspace has skills that have scoped access to fly, stripe, gmail, etc. We're definitely drinking the file-over-app personal-software-for-teams Kool-Aid, so the problem space for us includes access control and auditing.

Love your work :)


We have enough control over the execution environment in a Sprite (unlike a Fly Machine, where the implied Linux contract we have with our users gets in the way) that we can trivially hide explicit proxies.

We can also attach Macaroons to Fly Machines and Sprites for configurable ambient privileges, something I've wanted us to expose as a feature for a very long time.


Awesome, I look forward to that. I think that could be a major differentiator for sprites. I wish I could work on that problem at fly.io scale.

What is the contract with sprites? Is it just built-with-linux but not promising Linux? Or is it more like a machine but y'all control the container image?


There's no "formal" contract in either place but people running on Fly Machines expect that there's nothing at all between them and the kernel, and we don't have that expectation in Sprites; we can do whatever we want. :)

I don't want to get too far into the rest of the details only because I'm writing this up for next week. They're not that interesting technically, but they're a really big deal for us in other ways.


Great, I look forward to reading it.

Did you write up anything about this? Is this off the shelf behavior for Envoy or did you create this API yourself?

Sorry for the delayed response. I ended up posting on this [0] thread where they (Formal) are doing something similar.

Here's the repo [1]. I modified it a bit to post publicly and remove the details of my setup within my tailnet/flycast network.

[0] https://news.ycombinator.com/item?id=46605155

[1] https://github.com/dtkav/agent-creds


I can open source it next week when I get a chance.

This is the way. If you symlink the .claude directory (so Obsidian can see the files) then you can also super easily add and manage claude skills.
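
One way to read the symlink tip, sketched under assumptions: the comment does not say which direction the link goes, so this version assumes the real `.claude` directory lives in the vault and adds a non-dot alias next to it, on the premise that the file explorer skips dot-directories. The function and the alias name `claude` are hypothetical.

```python
from pathlib import Path


def expose_claude_dir(vault, visible_name="claude"):
    """Create vault/<visible_name> as a symlink to vault/.claude."""
    vault = Path(vault)
    target = vault / ".claude"
    target.mkdir(parents=True, exist_ok=True)
    link = vault / visible_name
    # Leave any existing file, directory, or (possibly broken) link untouched.
    if not (link.is_symlink() or link.exists()):
        link.symlink_to(target, target_is_directory=True)
    return link
```

Skill files written under `.claude` would then show up under the alias as ordinary markdown files.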

I've spent 20 years living in the terminal, but with claude code I'm more and more drafting markdown specs, organizing context, building custom views / plugins / etc. Obsidian is a great substrate for developing personal software.


How was the migration process?

I work on a plugin that makes Obsidian real-time collaborative (relay.md), so if the migration is smooth I wonder how close we are to Obsidian being a suitable Notion replacement for small teams.


I've been waiting for Logseq DB to come out to replace Google docs for my team. So your offering is interesting, but

1) is it possible to use Obsidian like Logseq, with a primary block based system (the block based system, which allows building documents like Lego bricks, and easily cross referencing sections of other documents is key to me) and

2) Don't you expect to be sherlocked by the obsidian team?


In Obsidian you can have transclusions, which are basically embeds of a section of another note. It isn't perfect, but worth looking into.

Regarding getting sherlocked: Obsidian does have realtime collaboration on their roadmap. There are likely to be important differences in approach, though.

Our offering is available now and we're learning a ton about what customers want.

If anything, I'd actually love to work more closely with them. They are a huge inspiration in how to build a business and are around the state of the art of a philosophy of software.

I'm interested in combining the unix philosophy with native collaboration (with both LLMs and other people).

That vision is inherently collaborative, anti lock-in, and also bigger than Obsidian. The important lasting part is the graph-of-local-files, not the editor (though Obsidian is fantastic).


> 1) is it possible to use Obsidian like Logseq, with a primary block based system (the block based system, which allows building documents like Lego bricks, and easily cross referencing sections of other documents is key to me) and

More or less yes, embeddable templates basically give you that out of the box, and Obsidian "Bases" let you query them.

> 2) Don't you expect to be sherlocked by the obsidian team?

I seem to remember that someone from the team once said they have no interest in building "real-time" collaboration features, but I might misremember and I cannot find it now.

And after all, Obsidian is a for-profit company who can change their mind, so as long as you don't try to build your own for-profit business on top of a use case that could be sherlocked, I think they're fine.


From their roadmap page:

> Multiplayer
>
> Share notes and edit them collaboratively

https://obsidian.md/roadmap


Doesn't say real-time there though? But yeah, must be what they mean, because you can in theory already collaborate on notes, via their "Sync", although it sucks for real-time collaboration.


Sorry for the late reply. The migration was really easy actually. I used the official migration plugin. There were a few things it couldn’t transfer over though (voice transcription notes).

Very helpful, thank you.
