Specifically: “Although our pharmaceutical armamentarium is very good at the moment (the combination of statin-ezetimibe-proprotein convertase subtilisin/kexin type 9 [PCSK9] can reduce LDL cholesterol [LDL-C] levels by 85%), new drugs are emerging through the different pitfalls of current drugs.”
The lack of self-consistency does seem like a sign of a deeper issue with reliability. In most fields of machine learning, robustness to noise is something you need to "bake in" (often through data augmentation using knowledge of the domain) rather than something you get for free in training.
The trick with the VO2 max measurement on the Apple Watch, though, is that the person cannot waste any time during their outdoor walk and needs to maintain a brisk pace.
Then there are confounders like altitude and elevation gain that can sully the numbers.
It can be pretty great, but it needs a bit of control in order to get a proper reading.
Seems like Apple's 95% accuracy estimate for VO2 max holds up.
Thirty participants wore an Apple Watch for 5–10 days to generate a VO2 max estimate. Subsequently, they underwent a maximal exercise treadmill test in accordance with the modified Åstrand protocol. The agreement between measurements from Apple Watch and indirect calorimetry was assessed using Bland-Altman analysis, mean absolute percentage error (MAPE), and mean absolute error (MAE).
Overall, Apple Watch underestimated VO2 max, with a mean difference of 6.07 mL/kg/min (95% CI 3.77–8.38). Limits of agreement indicated variability between measurement methods (lower -6.11 mL/kg/min; upper 18.26 mL/kg/min). MAPE was calculated as 13.31% (95% CI 10.01–16.61), and MAE was 6.92 mL/kg/min (95% CI 4.89–8.94).
These findings indicate that Apple Watch VO2 max estimates require further refinement prior to clinical implementation. However, further consideration of Apple Watch as an alternative to conventional VO2 max prediction from submaximal exercise is warranted, given its practical utility.
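For context on how those agreement numbers are computed, here is a minimal sketch in Python using made-up paired measurements (not the study's data); the values and variable names are purely illustrative:

    import numpy as np
    from scipy import stats

    # Hypothetical paired VO2 max values in mL/kg/min (illustrative, not study data)
    treadmill = np.array([44.0, 48.2, 41.5, 50.1, 46.7, 40.2, 51.3, 45.0])
    watch     = np.array([38.1, 42.5, 35.0, 47.2, 40.3, 33.8, 45.6, 39.9])

    diff = treadmill - watch              # positive values = watch underestimates
    bias = diff.mean()                    # Bland-Altman bias (the "mean difference")
    sd   = diff.std(ddof=1)

    # 95% confidence interval for the mean difference (uncertainty about the bias)
    ci_low, ci_high = stats.t.interval(0.95, len(diff) - 1,
                                       loc=bias, scale=stats.sem(diff))

    # 95% limits of agreement (spread of individual differences, much wider than the CI)
    loa_low, loa_high = bias - 1.96 * sd, bias + 1.96 * sd

    mape = np.mean(np.abs(diff) / treadmill) * 100   # mean absolute percentage error
    mae  = np.mean(np.abs(diff))                     # mean absolute error

    print(f"bias={bias:.2f}  CI=({ci_low:.2f}, {ci_high:.2f})  "
          f"LoA=({loa_low:.2f}, {loa_high:.2f})  MAPE={mape:.1f}%  MAE={mae:.2f}")

Note that the 95% CI describes uncertainty about the average bias, while the limits of agreement describe how far an individual reading can deviate, which is why the LoA span a much wider range.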
That’s a 95% confidence interval on the mean difference: they’re 95% confident that the average watch measurement is lower than the treadmill estimate, not that the watch is 95% accurate. In other words, they’re confident that the watch systematically underestimates VO2 max.
The basic idea was to adapt JEPA (Yann LeCun's Joint-Embedding Predictive Architecture) to multivariate time series, in order to learn a latent space of human health from purely unlabeled data. Then we tested the model using supervised fine-tuning and evaluation on a bunch of downstream tasks, such as predicting a diagnosis of hypertension (~87% accuracy). In theory, this model could also be aligned to the latent space of an LLM, similar to how CLIP-style alignment is used to connect vision models to LLMs.
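To make the recipe concrete, here is a generic JEPA-style sketch for multivariate time series (not our actual code; the Transformer backbone, zero-masking scheme, and all sizes are placeholders for illustration):

    import copy
    import torch
    import torch.nn as nn

    # JEPA-style objective: predict the latent embeddings of masked time steps
    # from the visible context, rather than reconstructing raw sensor values.

    class TSEncoder(nn.Module):
        def __init__(self, n_channels=8, d_model=128, n_layers=4):
            super().__init__()
            self.proj = nn.Linear(n_channels, d_model)
            layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)

        def forward(self, x):                     # x: (batch, time, channels)
            return self.encoder(self.proj(x))     # (batch, time, d_model)

    context_enc = TSEncoder()
    target_enc = copy.deepcopy(context_enc)       # updated as an EMA of context_enc
    for p in target_enc.parameters():
        p.requires_grad_(False)
    predictor = nn.Sequential(nn.Linear(128, 128), nn.GELU(), nn.Linear(128, 128))

    def jepa_loss(x, mask):
        with torch.no_grad():
            target = target_enc(x)                # embeddings of the full series
        x_ctx = x.clone()
        x_ctx[mask] = 0.0                         # hide masked steps from the context encoder
        pred = predictor(context_enc(x_ctx))
        return ((pred[mask] - target[mask]) ** 2).mean()   # regression in latent space

    # Toy batch: 16 series, 96 time steps, 8 sensor channels, ~25% of steps masked
    x = torch.randn(16, 96, 8)
    mask = torch.rand(16, 96) < 0.25
    loss = jepa_loss(x, mask)
    loss.backward()
    # After each optimizer step: p_target = tau * p_target + (1 - tau) * p_context

The key design point is that the loss lives in embedding space rather than raw sensor space, which is what makes the learned latent representation reusable for downstream fine-tuning (and, in principle, alignable to another model's latent space).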
IMO, this shows that accuracy in consumer health will require specialized models alongside standard LLMs.
"Fibermaxxing" is admittedly a silly term, but not only does a high fiber diet reduce cardiovascular mortality by 26%, it also reduces risk of cancer by 22%. Your grandparents were right!
Long Covid patients often face post-exertional malaise (PEM).
One of the main treatments for Long Covid is graded exercise therapy. This works for some subset of patients. But for other patients, it actually makes things worse. Right now we have no objective test that tells us who falls into which category.
I like this study since it identifies a specific mechanism for why PEM happens. And maybe that will lead to a more objective test for different sub-types of Long Covid, so that we can actually prescribe the right treatments to people.
Congrats on the launch. I always love to see smart ML founders applying their talents to health and bio.
What were the biggest challenges in getting major pharma companies onboard? And how was that similar to or different from what previous generations of YC companies (like Benchling) faced?
Thanks! I think the advantage we had over previous generations of companies is that the demand for, and value of, software has become much clearer to biopharma. The models are beginning to actually work for practical problems, most companies have AI, data science, or bioinformatics teams that apply these workflows, and AI has management buy-in.
Some of the same problems exist: large enterprises don't want to run their unpatented, future billion-dollar drug through a startup, because a data leak could destroy 10,000 times the value of the product being bought.
Pharma companies are also especially not used to buying products rather than research services, and there are historical issues with the industry being underserved by high-quality software, so building custom things internally has become something of a habit.
But I think the biggest unlock was just that the tools are actually working as of a few years ago.
What tools are "actually working" as of a few years ago? Foundation models, LLMs, computer vision models? Lab automation software and hardware?
If you look at the recent research on ML/AI applications in biology, the majority of work has not provided any tangible benefit for improving the drug discovery pipeline (e.g., clinical trial efficiency, or drugs with low ADR/high efficacy).
The only areas showing real benefit have been off-the-shelf LLMs for streamlining informatics work, and protein folding/binding research. But protein structure work is arguably a tiny fraction of the overall cost of bringing a drug to market, and the space is massively oversaturated right now, with dozens of startups chasing the same solved problem post-AlphaFold.
Meanwhile, the actual bottlenecks—predicting in vivo efficacy, understanding complex disease mechanisms, navigating clinical trials—remain basically untouched by current ML approaches. The capital seems to be flowing to technically tractable problems rather than commercially important ones.
Maybe you can elaborate on what you're seeing? But from where I'm sitting, most VCs funding bio startups seem to be extrapolating from AI success in other domains without understanding where the real value creation opportunities are in drug discovery and development.
These days it's almost trivial to design a binder against a target of interest with computation alone (tools like boltzgen, among many others). While that's not the main bottleneck to drug development (IMO you're correct about the main bottlenecks), it's still a huge change from the state of the technology even one or two years ago, when finding that same binder could take months or years, generally with a lot more resources thrown at the problem. These kinds of computational tools only started working really well quite recently (e.g., hit rates high enough for small-scale screening where you just order a few designs, good Kd, and target specificity out of the box).
So both things can be true: the more important bottlenecks remain, but progress on discovery work has been very exciting.
As noted, I agree on the great strides made in the protein space. However, the oversaturation and redundancy of tools and products in this space should make it pretty obvious that selling API calls and compute time for protein binding, and related tasks, isn’t a viable business beyond the short term.