<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Kevin&#39;s Homepage</title>
<link>https://kjablonka.com/</link>
<atom:link href="https://kjablonka.com/index.xml" rel="self" type="application/rss+xml"/>
<description>Kevin Maik Jablonka&#39;s personal homepage</description>
<image>
<url>https://kjablonka.com/quarto.png</url>
<title>Kevin&#39;s Homepage</title>
<link>https://kjablonka.com/</link>
</image>
<generator>quarto-1.8.24</generator>
<lastBuildDate>Tue, 24 Mar 2026 23:00:00 GMT</lastBuildDate>
<item>
  <title>Science is Too Successful</title>
  <link>https://kjablonka.com/blog/posts/scientific_freedom/</link>
  <description><![CDATA[ 




<p>I am, again, in a personal crisis about science, my career, and stuff like this. I should probably be doing something about that. Instead, I am writing a blog post.</p>
<p>This is, I recognize, a form of procrastination. But it is also, maybe, a form of thinking — and as we all know, <a href="https://www.nature.com/articles/s44222-025-00323-4">writing is thinking</a>. Since the crisis is partly about not having enough time to think — because the time goes to administration, to procurement, to review rebuttals, to forms — perhaps writing about it counts as a small act of resistance. Or perhaps I’m rationalizing. Either way, here we are.</p>
<section id="the-money-i-cant-spend" class="level2">
<h2 class="anchored" data-anchor-id="the-money-i-cant-spend">The money I can’t spend</h2>
<p>I should start with something that will sound strange to most scientists: I have too much money. Not personally — personally, as I’ve written <a href="https://kjablonka.com/blog/posts/finances/">elsewhere</a>, I pay for my research tools out of my own pocket because I deprioritize bureaucracy and administration. But my group has funding that sits unspent. Not because there’s nothing to do with it, but because spending it responsibly would mean hiring more people, and hiring more people would most likely mean growing my group past the size where I can still understand what everyone is working on.</p>
<p>At the moment, I keep my group small on purpose, and even so it’s already hard. We just came back from a group retreat where I told everyone that I’m struggling to keep all our projects in my head, and that this means we’re not leveraging my strengths in the way we should. If I doubled the group, I’d gain output (more publications) but lose whatever grasp I still have. I’d become even more a manager, not a scientist. The funding system would be delighted — a larger group, more publications, more “impact.” But I would understand less about my own research, which is a trade I’m not willing to make.</p>
<p>I would happily give the money back if it meant removing the administrative apparatus that comes with having it. The reporting, the oversight, the hierarchy, the coordination overhead — all of it exists because the money exists. The money was supposed to enable the science. Instead, a growing share of it pays for the system that manages the money. I am sometimes not sure who is working for whom.</p>
</section>
<section id="how-we-got-here" class="level2">
<h2 class="anchored" data-anchor-id="how-we-got-here">How we got here</h2>
<p>None of this was built by villains. Science succeeded. It worked. Governments funded it, and the funding produced results, and the results justified more funding. The system grew — more researchers, more institutions, more publications, more money. Science is great, and we should all be grateful for it.</p>
<p><a href="https://press.stripe.com/scientific-freedom">But growth created coordination problems.</a> When you have ten researchers at an institute, you can talk to each of them over lunch. <a href="https://www.experimental-history.com/p/the-rise-and-fall-of-peer-review">When you have a thousand, you need an organizational chart, a procurement office, a grants administration team, an HR department, a compliance unit.</a> Each layer solves a real problem. Each layer also costs money, takes time, and adds friction. And each layer, once established, develops its own logic, its own incentives, its own instinct for self-preservation.</p>
<p>The funding also required a story. To justify public investment at this scale, science needed to promise something large. The story it told was: we produce truth. Give us money, and we will deliver facts about the world that are reliable, verified, and objective. This is my speculation — I haven’t dug into the historical evidence for this reading, and I would be curious to see it tested (reach out to me!). Metrics also come from a different pressure — science consumes resources that could go to other things society values and needs, and there is an urge to justify that expense. So you build metrics, and metrics need oversight, and oversight needs administrators. And so the system grows — not to produce more understanding, but to demonstrate that it’s delivering on its promises.</p>
<p>Science doesn’t produce truth. It produces better working hypotheses. That’s a different thing, and it might imply a different kind of institution. A hypothesis is provisional. It’s meant to be revised. It’s useful precisely because it might be wrong. An institution built around truth needs to verify and control. An institution built around hypotheses needs to explore and tolerate uncertainty. We built the first kind and are now surprised that it doesn’t feel like the second.</p>
</section>
<section id="we-also-just-got-icml-reviews-back" class="level2">
<h2 class="anchored" data-anchor-id="we-also-just-got-icml-reviews-back">We also just got ICML reviews back</h2>
<p>I am writing this shortly after getting reviews back on our ICML papers, so I should be transparent that the timing is not a coincidence. Many of the reviews ask for more benchmarks, more models, more ablations. I don’t see where this actually stops, and I don’t see how adding this form of “more” leads to more insight.</p>
<p>But I notice the same thing in myself when I review. The form has a field for strengths and a field for weaknesses. The weaknesses field is empty, and it’s waiting for me to fill it. I have noticed that this field, by existing, changes what I do. Even when a paper is good — when I read it and think, yes, this is interesting, this advances understanding — I find myself (subconsciously) scanning for something to put in the weaknesses box. There is always another benchmark the authors didn’t run, another model they didn’t compare against, another dataset they didn’t test on. These are technically valid criticisms. They are also, in most cases, beside the point. The paper taught me something. The missing benchmark would not have taught me more.</p>
<p>The form asks for weaknesses, so I supply them. The authors write a rebuttal, addressing each weakness with more experiments, more comparisons, more pages. The paper gets longer. I’m not sure it gets better. The cycle consumes weeks of work on both sides, and the decision at the end — accept or reject — is, as multiple studies have shown, <a href="https://arxiv.org/pdf/2109.09774">sometimes barely more reliable than random assignment</a>. We all know this. We do it anyway.</p>
<p>The request for “more” is the easiest criticism to make because it’s unfalsifiable. There will always be another experiment you didn’t run. If we require proof that something works on everything, we will never be done with anything, and we can stop the business of doing research in the first place.</p>
<p>And now we have AI reviews. Researchers submit papers, and on the other end, an LLM fills in the same form — strengths, weaknesses, questions for the authors — producing the shape of evaluation without necessarily having any understanding behind it. <a href="https://www.argmin.net/p/information-transit-got-the-wrong">Whatever relationship LLM-generated text has to semantic content and truth is always accidental or incidental.</a> The conferences respond with detection policies, new rules, more oversight. The system treats the symptom by growing the apparatus. <a href="https://artificialbureaucracy.substack.com/p/context-widows">Kevin Baker put it well: “Systems can persist in dysfunction indefinitely, and absurdity is not self-correcting.”</a></p>
<p>I sometimes think about what peer review was before it scaled. When a field was small enough, a program committee sat in a room and discussed each paper. They argued, they disagreed, they changed their minds. The process was biased, clubby, and imperfect. But it happened at a human scale, which meant that understanding was at least possible. At the current scale, it’s not. No committee can discuss thousands of submissions. So we distribute the work to anonymous individuals, give them forms, aggregate their scores, and pretend the numbers mean something. We scaled the process and lost the thing that made it work.</p>
</section>
<section id="science-is-human" class="level2">
<h2 class="anchored" data-anchor-id="science-is-human">Science is human</h2>
<p>I think the thing I keep circling around is that science is a human activity. Understanding — the actual thing science is for — happens inside a human mind. A person reads a paper, thinks about it, connects it to what they already know, and updates their picture of how something works. That process doesn’t scale. You can’t make it faster by adding metrics. You can’t outsource it to an AI and call the output “understanding.” You can produce more papers, more data, more benchmarks, but understanding is still bounded by the pace at which a person can absorb and integrate ideas.</p>
<p>Hartmut Rosa has written about this as a dissonance between the pace of production and the pace of comprehension. We publish more than anyone can read — and perhaps I am also producing blog posts at a higher pace than anyone cares to read. We produce data faster than anyone can analyze. We run experiments faster than anyone can think about what they mean. The institution optimizes for production. Understanding happens on its own schedule, and that schedule has not gotten faster. I honestly don’t see how it can, unless we upgrade the human hardware — and the evidence from recent studies suggests that measured intelligence <a href="https://www.bbc.com/future/article/20190709-has-humanity-reached-peak-intelligence">is maybe even declining</a>.</p>
<p>And the questions we choose to ask in science are, in the end, value judgments. What’s worth studying? What matters? These are human decisions, rooted in human experience and human priorities. I think, on a societal level, we shouldn’t leave these to an AI scientist or an optimization algorithm or a metric that rewards citation counts. The selection of questions is where human judgment matters most, and it’s exactly the part of the process that gets squeezed when the system demands more output.</p>
<p>If we’re honest about the fact that science is irreducibly human, then we can start questioning the institutions we’ve built. Because many of them are, in a meaningful sense, inhuman. They operate at scales where no individual can comprehend the whole. They substitute measurement for judgment and volume for depth. They were built to manage a system too large for humans to manage, and in doing so they created an environment that is increasingly difficult for humans to do good work inside.</p>
</section>
<section id="a-german-asking-for-austerity" class="level2">
<h2 class="anchored" data-anchor-id="a-german-asking-for-austerity">A German asking for austerity</h2>
<p>And now I am here: the German asking for austerity. But I think the answer, for at least parts of the system, is to want less. Not because cutting budgets is virtuous — I have lived through enough austerity discourse to know it isn’t — but because some things that matter in science only survive at a scale where humans can still be fully present.</p>
<p>Smaller groups where a PI understands every project. Peer review at a scale where people can discuss papers rather than fill in forms. Less pressure to produce more, and more room to understand what’s already been produced. Fewer metrics, more trust. And I mean trust in the full human sense — I crave it, almost, from my institution. The feeling that they trust me to spend wisely the money I brought in through third-party funding. That they trust my judgment about whom I talk to, what I work on, how I run my group. The word “enough” as a legitimate answer to “how much?”</p>
<p>Is more even better? More papers, more citations, more funding, more people, more benchmarks, more administration to manage all of the above? The system says yes. My experience says: past a certain point, more produces more of everything except the thing science is actually for.</p>
<p>I would give back money to have less overhead. I would accept fewer publications to have deeper ones. In my group, we have already started saying no — to new projects, to quick workshop papers, to collaborations that would spread us thinner. At the retreat, we discussed doing even more of that. These are not popular positions in a system that equates growth with success. But I think they might be correct. And I also realize that this cannot be the universal prescription. We have important questions to answer and we need science to answer them. And I also recognize that a lot of great science happens in large groups and that humanity benefitted a lot from the growth of science. But I think we can also recognize that there are some things that only happen in small groups, and that we should be careful not to lose those things in the rush to produce more.</p>
</section>
<section id="a-caveat" class="level2">
<h2 class="anchored" data-anchor-id="a-caveat">A caveat</h2>
<p>There is frustration in all of this, and frustration is not always a reliable guide to systemic problems. Maybe some of what bothers me is just the normal friction of working inside any institution. Maybe I could solve some of it by being more patient, more organized, better at administration. Maybe I am a person with a higher desire for freedom and trust than is usual, and maybe I am coupling some personal frustrations into what I’m presenting as a systemic critique. I don’t claim to have perfectly separated the two. And I don’t claim that any of what I describe is universally true or can serve as an answer to systemic problems.</p>
<p>This blog post is itself a symptom. I am a researcher, procrastinating on my research, writing about why the system makes it hard to do research. In a system that worked, the frustration wouldn’t accumulate to the point where it demands an essay. It should just be smoother from the start.</p>
<p>Where, if not in academia, should we be willing to dream about what the thing could be? Where, if not among people whose job is to ask hard questions, should we ask whether our own institutions are the right ones? Maybe this is utopian. But we are in the business of ideas that sound unrealistic until someone tests them. The least we can do is extend that courtesy to ourselves.</p>


<!-- -->

</section>

 ]]></description>
  <category>metascience</category>
  <category>peer-review</category>
  <category>scientific-freedom</category>
  <guid>https://kjablonka.com/blog/posts/scientific_freedom/</guid>
  <pubDate>Tue, 24 Mar 2026 23:00:00 GMT</pubDate>
</item>
<item>
  <title>How to Get an Interest-Free Loan</title>
  <link>https://kjablonka.com/blog/posts/finances/</link>
  <description><![CDATA[ 




<p>If you are a university, this is straightforward. Hire researchers who care about their work more than their money. Make the procurement system slow enough that they cannot use it for the tools they need. They will pay out of their own pockets and forget to ask for the money back. It is an excellent deal. I should know — I am one of those researchers.</p>
<p>Recently, my partner and I had a fight about money. She told me she doesn’t want to share bank accounts with me. I was surprised this had become a topic at all. I am Swabian — from the part of Germany with a reputation for being careful with money that borders on clinical. I cook at home, I don’t buy things I don’t need, I try to save where I can. The problem is that I use my personal credit card to pay for my research. Cloud compute, API access, dataset subscriptions, development tools — I pay for all of it myself and then <em>try</em> to get reimbursed by my university. The reimbursement forms require invoices that match the charges, converted to the right currency, filed through the right cost center. I am sitting on several thousand euros in unfiled claims right now, expenses I either didn’t find time to submit or gave up on when the paperwork didn’t match. If you can make a Swabian lose track of money he is owed, your system has a serious problem.</p>
<p>From my partner’s perspective, I am funneling household money into my job. She is right. From where I sit, several things are true at once. I am giving my university an interest-free loan. I am losing money outright, because some of those claims I will never file and some I have already forgotten. But more than either of those: I just want to do research. My field moves fast, and paying out of pocket is the only way I can keep up. When a new model or cloud service becomes relevant to my work, I cannot wait eight weeks for a purchase order. So I pull out my credit card. The alternative — spending my already overwhelming days navigating procurement forms, chasing approvals, matching receipts to charges in the wrong currency — would cost me something I cannot get back, which is time I should be spending on science.</p>
<p>I suspect this is not uncommon among computational researchers, though few talk about it.</p>
<section id="the-gap" class="level2">
<h2 class="anchored" data-anchor-id="the-gap">The gap</h2>
<p>My university has a straightforward mechanism for paying research costs. I hand in an invoice, and if it is under €7,000, the finance office processes it. This works for lab equipment, conference fees, even computers — anything sold by a vendor who issues a proper invoice.</p>
<p>Cloud providers and AI companies do not issue invoices to individual researchers. They charge credit cards. My university can process an invoice. OpenAI can charge a credit card. Between those two facts lies an administrative gap that no one is responsible for closing.</p>
<p>Earlier this year, I sent my partner a photo of myself carrying a stack of new MacBooks for my research group. Tens of thousands of euros worth of hardware, purchased by the university without issue, because the third-party reseller sends invoices. She saw that photo. Weeks later, we had the fight about money, because the code running on those MacBooks makes API requests that cost fractions of a cent each, and those come out of my personal credit card. The hardware cost a hundred times more than the software. The expensive purchase was easy. The cheap one was impossible. Not because anyone decided that MacBooks matter more than API calls, but because one product comes with an invoice and the other requires a credit card.</p>
<p>This mismatch was invisible ten years ago. Most research costs were physical goods sold through conventional procurement channels, and research itself may also have moved more slowly. But the tools that computational researchers depend on are now overwhelmingly digital, subscription-based, and billed to credit cards. The gap between what researchers need to buy and what universities can pay for widens every month.</p>
<p>The result is a hidden tax on digital research, and it gets paid in three currencies: your own money, your own time, or the research you quietly decide not to attempt because the overhead isn’t worth it. That last one is the real cost. You don’t spin up GPU instances when you are rushing for a deadline. You don’t try the new model. You stick with whatever is already approved, even when better options exist. The procurement system shapes which research gets attempted.</p>
<p>And it fails on its own terms. The purpose of all this bureaucracy is financial oversight — knowing what was spent, on what, by whom. But when researchers pay personally and file reimbursements months later (or never), the university has no accurate record of research costs. My several thousand euros of unfiled claims are not in anyone’s books. The system designed for accountability produces worse accounting than the alternative.</p>
</section>
<section id="why-nobody-fixes-it" class="level2">
<h2 class="anchored" data-anchor-id="why-nobody-fixes-it">Why nobody fixes it</h2>
<p>The bottleneck persists because it sits in a crack between institutions. Cloud providers sell to consumers and enterprises. Universities buy from vendors through invoices. Individual researchers fit neither category. Some providers do offer invoicing — I asked Anthropic, and the mechanism exists — but only for customers large enough to justify the setup. A single researcher spending a few hundred euros a month does not qualify. The providers won’t adapt their billing for one lab. The universities won’t adapt their procurement for one new category of expense. Researchers lack the leverage to change either side. So the gap persists, and researchers absorb the cost.</p>
<p>Underneath this lies a design error worth naming. University procurement optimizes for a world where the optimal amount of misuse is zero — every transaction verified, pre-approved, matched to documentation. But in any system, the optimal amount of fraud is non-zero. The cost of preventing the last fraction of misuse exceeds the cost of the misuse itself. Universities spend hundreds of euros in administrative overhead to control the possibility of someone misspending twenty euros on API tokens. They turn publicly funded researchers into part-time bookkeepers. The control system costs more than what it controls.</p>
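<p>The trade-off can be made concrete with a back-of-envelope calculation. Every number below is an illustrative assumption, not a measured figure:</p>

```python
# Back-of-envelope sketch of the control-cost argument; every number is
# an illustrative assumption, not a measured figure.
admin_cost = 200.0        # EUR of overhead to pre-approve one small purchase
potential_misuse = 20.0   # EUR at stake if that purchase were misspent
misuse_rate = 0.05        # assumed share of unchecked purchases misspent

# Expected loss if the purchase were simply trusted rather than controlled.
expected_loss_if_trusted = misuse_rate * potential_misuse
print(admin_cost / expected_loss_if_trusted)  # 200.0: control costs 200x the expected loss
```

<p>Even if every assumed number here is off by an order of magnitude, the conclusion survives: the control system costs more than what it controls.</p>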
<p>My university already accepts bounded trust for invoice-based purchases — hand in an invoice under €7,000, no (or at least few) questions asked. It just hasn’t extended that trust to the new category of costs that researchers actually face.</p>
</section>
<section id="the-proposal" class="level2">
<h2 class="anchored" data-anchor-id="the-proposal">The proposal</h2>
<p>Several colleagues have independently given me the same advice: just start a nonprofit to handle your invoicing. When founding a legal entity feels like the path of least resistance for paying a monthly software bill, something has gone wrong at the systems level. But the advice points toward a real solution — not one nonprofit for me, but one that serves researchers like me across institutions.</p>
<p>The idea is a thin intermediary that does one thing: it pays the credit card, and it sends the university the invoice.</p>
<p>The intermediary holds accounts with major cloud and AI providers. Each participating researcher gets a virtual account with spending limits tied to their grant or institutional budget. The intermediary tracks usage, handles currency conversion, and generates one consolidated quarterly invoice per researcher, formatted for each university’s procurement system. From the university’s side, it looks like any other vendor. From the researcher’s side, the credit card disappears.</p>
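<p>In code, the bookkeeping described above could look something like the following sketch. Provider names, conversion rates, and limits are all hypothetical:</p>

```python
from collections import defaultdict

# Hypothetical sketch of the intermediary's bookkeeping. Provider names,
# conversion rates, and budget figures are illustrative assumptions.
EUR_RATES = {"USD": 0.92, "EUR": 1.0}   # assumed conversion rates

class VirtualAccount:
    def __init__(self, researcher, budget_eur):
        self.researcher = researcher
        self.budget_eur = budget_eur     # spending limit tied to a grant
        self.charges = []                # (provider, amount in EUR)

    def record(self, provider, amount, currency):
        eur = round(amount * EUR_RATES[currency], 2)  # currency conversion
        if self.spent() + eur > self.budget_eur:
            raise ValueError("over budget")           # enforce the grant limit
        self.charges.append((provider, eur))

    def spent(self):
        return sum(eur for _, eur in self.charges)

    def quarterly_invoice(self):
        # One consolidated line per provider, in EUR, for the university.
        totals = defaultdict(float)
        for provider, eur in self.charges:
            totals[provider] += eur
        return dict(totals)

acct = VirtualAccount("researcher-001", budget_eur=500.0)
acct.record("openai", 100.0, "USD")
acct.record("anthropic", 50.0, "EUR")
print(acct.quarterly_invoice())   # {'openai': 92.0, 'anthropic': 50.0}
```

<p>The point of the sketch is how thin the layer is: usage tracking, a spending cap, currency conversion, and one consolidated report — nothing a university finance office hasn’t already seen from any other vendor.</p>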
<p>This works because the invoicing infrastructure already exists on the provider side — it is gated by scale, not by technical barriers. An organization that aggregates demand from dozens of researchers becomes the kind of customer that qualifies for invoiced billing. The bridge can be built. No individual researcher is heavy enough to justify building it for themselves alone.</p>
<p>Universities would change nothing. Their finance offices process the intermediary’s invoice the way they process every other vendor invoice. They would, in fact, gain better visibility into research spending than they have now, because one clean quarterly report replaces the current mess of scattered personal reimbursements and unfiled claims.</p>
</section>
<section id="the-experiment" class="level2">
<h2 class="anchored" data-anchor-id="the-experiment">The experiment</h2>
<p>A credible pilot could look like the following: 15 to 30 researchers across three to five institutions, six months, umbrella accounts with a handful of major providers. The intermediary generates quarterly invoices and submits them through each university’s standard procurement process.</p>
<p>Four things to measure. Do the invoices get processed without friction — will universities accept the intermediary as a normal vendor? Do researchers increase their use of cloud and digital tools? Do researchers attempt work they previously avoided? And the financial picture: administrative time saved, reimbursement backlogs cleared, savings from volume pricing.</p>
<p>Setup costs are small. The intermediary needs a legal entity, a bookkeeping system, and credit card accounts with a few providers. If the pilot fails — if universities reject the invoices or researchers don’t change their behavior — that is informative. It tells us the barrier is not administrative plumbing but something deeper, something no intermediary can fix. I doubt it, but the experiment would show it.</p>
<p>If it works, it scales without institutional reform. Each new university adds one vendor to its approved list. Each new provider adds one billing relationship. The thing grows by making a previously painful resource accessible through a shared layer of infrastructure.</p>
<p>I am aware that I have spent a considerable number of words on invoicing. That is, in a way, the point. The bottleneck holding back a growing share of digital research is not a scientific problem, not a technical limitation, not a lack of ideas. It is a missing invoice. The fix is comically mundane, which is perhaps why no one has built it. It falls below the threshold of what feels like a problem worth solving. But for researchers paying out of their own pockets, fighting with their partners about shared finances, and quietly deciding which experiments are not worth the administrative pain — it is the problem.</p>


<!-- -->

</section>

 ]]></description>
  <category>metascience</category>
  <category>research-infrastructure</category>
  <guid>https://kjablonka.com/blog/posts/finances/</guid>
  <pubDate>Thu, 19 Mar 2026 23:00:00 GMT</pubDate>
</item>
<item>
  <title>OpenClaw Did Not Save Me</title>
  <link>https://kjablonka.com/blog/posts/openclaw/</link>
  <description><![CDATA[ 




<p>I, too, installed OpenClaw. At some point, I felt I had to. How can I be credible in developing benchmarks for LLM-based agents if I do not try the most popular LLM-based agent tool myself?</p>
<p>I had the illusion that it might help me reduce some of the mental load. I have never managed to use a todo list. After one day, there are always so many new tasks that todo lists become utterly useless. In practice, this means there is probably always some mental load spent worrying that I might miss something — or linked to the awareness that I have, in Oliver Burkeman’s lingo, already lost, and that there is no way to achieve everything I’d like to achieve.</p>
<p>I gave in to the illusion that I could get with OpenClaw what my institution does not give me: a personal assistant that can take some admin load off me.</p>
<section id="the-setup" class="level2">
<h2 class="anchored" data-anchor-id="the-setup">The Setup</h2>
<p>Still, I am a bit conservative. I set everything up on a somewhat hardened Hetzner VPS — the cheapest one possible. I did not give it full access to my accounts. OpenClaw runs as a non-root account. I was inspired by the advice <a href="https://x.com/JordanLyall/status/2019594755370545168">here</a>.</p>
<ol type="1">
<li>Create cheapest possible Hetzner VPS, add SSH key</li>
<li>Install OpenClaw via the <a href="https://github.com/openclaw/openclaw-ansible/tree/main">Ansible playbook</a></li>
<li><code>sudo su - openclaw</code>, then <code>openclaw onboard --install-daemon</code></li>
<li>Manual onboarding, local gateway with loopback</li>
<li>OpenAI as model provider (Peter Steinberger seems to like it more)</li>
<li>Telegram as chat connection with pairing</li>
</ol>
<p>I was naïve and did some of this on the train. Not in tmux.</p>
</section>
<section id="giving-it-a-soul" class="level2">
<h2 class="anchored" data-anchor-id="giving-it-a-soul">Giving It a Soul</h2>
<p>I started giving it a soul by complaining about what I always portray as the source of my problems: that my employer does not fund a personal assistant. I prompted it to be a critical personal board of advisors, taking in viewpoints from people I specifically mentioned — with diverse but valuable perspectives. I gave it instructions about my long-term high-level research direction, and a dump of the tasks that currently came to mind — the same tasks for which I never managed to consistently stick to any task management system.</p>
</section>
<section id="what-happened" class="level2">
<h2 class="anchored" data-anchor-id="what-happened">What Happened</h2>
<p>In my first attempt, I hardened the thing so aggressively that I locked myself out. The todo list was not even transferred completely.</p>
<p>I told it I would like to finish every day with reflection and start every day with an agenda. But right now, I said, I just want a five-minute thing to tick something off. It responded:</p>
<blockquote class="blockquote">
<p>Create a note titled “Corral — Coherent Plan v1” and write exactly 3 bullets:</p>
<ol type="1">
<li>Core question: what decision this project must answer</li>
<li>Current evidence: what results already support/contradict</li>
<li>Next experiment: single highest-value next step</li>
</ol>
</blockquote>
<p>I cannot do that in five minutes. I cannot do deep thinking in five minutes. That is the whole reason I came to this tool — I am drowning in small tasks and cannot find the space to think. I asked for the smallest possible win. And it told me to produce a coherent strategic plan.</p>
<p>My attempt at moving fast gave me something that wanted me to be even more superhuman than I already cannot be.</p>
</section>
<section id="back-to-basics" class="level2">
<h2 class="anchored" data-anchor-id="back-to-basics">Back to Basics</h2>
<p>I am now back to using Things 3. Perhaps I will even go back to paper.</p>
<p>I still have the struggle that I strive for freedom to think deeply and creatively, but that I also have just too many small admin things to do. I still did not find a way to balance them. OpenClaw certainly did not help me. Perhaps I did not invest enough. Perhaps I am also back to admitting that I have no chance — that I reached for yet another productivity trick, hoping it would finally make the difference. Perhaps I have to learn the lesson I always tell others: there is no shortcut.</p>


<!-- -->

</section>

 ]]></description>
  <category>productivity</category>
  <category>ai-agents</category>
  <guid>https://kjablonka.com/blog/posts/openclaw/</guid>
  <pubDate>Tue, 03 Mar 2026 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Deep Networks as Elastic Origami</title>
  <link>https://kjablonka.com/blog/posts/deep_learning_mechanisms_lecture/</link>
  <description><![CDATA[ 




<p>My first recorded lecture (20 min) covers the manifold hypothesis and how we can view deep networks as elastic origami.</p>
<div class="quarto-video ratio ratio-16x9"><iframe data-external="1" src="https://www.youtube.com/embed/JPH0qrRefBo" title="" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></div>
<p>The lecture explores how deep networks fold and transform data through high-dimensional space, much like origami shapes paper into new forms.</p>



 ]]></description>
  <category>teaching</category>
  <category>deep learning</category>
  <category>manifolds</category>
  <guid>https://kjablonka.com/blog/posts/deep_learning_mechanisms_lecture/</guid>
  <pubDate>Mon, 26 Jan 2026 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/deep_learning_mechanisms_lecture" medium="image"/>
</item>
<item>
  <title>The Geometry of Not Enough Data</title>
  <link>https://kjablonka.com/blog/posts/manifold/geometry_blog.html</link>
  <description><![CDATA[ 




<style>
.key-point {
    background: #f8f6f1;
    border-left: 3px solid #b8860b;
    padding: 1.2em 1.5em;
    margin: 1.8em 0;
    border-radius: 0 4px 4px 0;
}
.aside-box {
    background: #f0f0f0;
    padding: 1em 1.2em;
    margin: 1.5em 0;
    border-radius: 4px;
    font-size: 0.95em;
}
.aside-title {
    font-weight: 600;
    margin-bottom: 0.3em;
    color: #555;
}
</style>
<section id="the-puzzle-of-too-many-parameters" class="level2">
<h2 class="anchored" data-anchor-id="the-puzzle-of-too-many-parameters">The Puzzle of Too Many Parameters</h2>
<p>Monthly airline passengers from 1949 to 1960. You want to forecast the next few years. A linear model has 2 parameters. A cubic polynomial has 4. A degree-20 polynomial has 21.<sup>1</sup></p>
<p>Classical statistics warns against the high-parameter option. More parameters, more opportunities to chase noise instead of signal. The cubic should suffice.</p>
<p>Yet modern deep learning uses models with billions of parameters trained on datasets that cover a vanishing fraction of possible inputs. By classical logic, these models should memorize their training data and fail on anything new. Still, they seem to ace very difficult benchmarks.</p>
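<p>The classical warning is easy to reproduce on a synthetic stand-in for the airline series (trend plus seasonality plus noise; the real data is not bundled here, and the seed and sizes are arbitrary choices). The high-degree fit drives the training error down while its extrapolation typically runs away:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(144.0)                                   # 12 years, monthly
y = 100 + 2 * t + 20 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)

results = {}
for deg in (1, 3, 20):
    # Polynomial.fit maps t into [-1, 1] internally, keeping the
    # least-squares problem well conditioned even at degree 20.
    p = np.polynomial.Polynomial.fit(t, y, deg)
    rmse = np.sqrt(np.mean((p(t) - y) ** 2))
    results[deg] = (rmse, p(168.0))                    # forecast 2 years out
    print(f"degree {deg:2d}: train RMSE {rmse:7.2f}, forecast {p(168.0):.0f}")
```

Lower training error at degree 20, wilder forecasts: exactly the trade-off the classical account predicts.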
</section>
<section id="the-arithmetic-of-impossibility" class="level2">
<h2 class="anchored" data-anchor-id="the-arithmetic-of-impossibility">The Arithmetic of Impossibility</h2>
<p>A language model predicts the next token given all previous tokens:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(w_%7B1001%7D%20%5Cmid%20w_%7B1000%7D,%20w_%7B999%7D,%20%5Cldots,%20w_1)"></p>
<p>With vocabulary <img src="https://latex.codecogs.com/png.latex?V%20%5Capprox%20100%7B,%7D000"> and context length <img src="https://latex.codecogs.com/png.latex?L%20%5Capprox%208%7B,%7D000">, the input space contains:</p>
<p><img src="https://latex.codecogs.com/png.latex?V%5EL%20=%20100%7B,%7D000%5E%7B8%7B,%7D000%7D%20=%2010%5E%7B40%7B,%7D000%7D"></p>
<p>possible sequences. Training corpora contain perhaps <img src="https://latex.codecogs.com/png.latex?10%5E%7B14%7D"> tokens. The ratio of observed to possible:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cfrac%7B10%5E%7B14%7D%7D%7B10%5E%7B40%7B,%7D000%7D%7D%20=%2010%5E%7B-39%7B,%7D986%7D"></p>
<p>If every atom in the observable universe were a training example, we’d still have seen essentially nothing. The model encounters novel inputs on every forward pass, yet produces sensible outputs.</p>
<p>The arithmetic says learning is impossible. The models work anyway. One of these must be wrong.</p>
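<p>The ratio is quick to check in log space, since the raw numbers overflow any float (a small sketch using the round figures assumed above):</p>

```python
import math

V = 100_000          # vocabulary size (round figure from the text)
L = 8_000            # context length
tokens_seen = 1e14   # rough size of a training corpus

# Work in log10: V**L itself is far beyond floating point.
log_possible = L * math.log10(V)                    # log10 of possible sequences
log_ratio = math.log10(tokens_seen) - log_possible  # log10 of observed/possible

print(f"possible sequences ~ 10^{log_possible:.0f}")   # 10^40000
print(f"observed/possible  ~ 10^{log_ratio:.0f}")      # 10^-39986
```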
</section>
<section id="real-data-is-not-random-data" class="level2">
<h2 class="anchored" data-anchor-id="real-data-is-not-random-data">Real Data Is Not Random Data</h2>
<p>Sample an image uniformly at random from the space of all 256×256 RGB images. You get noise. Sample a million. Still noise. The space of natural images—photographs, paintings, anything a human would recognize—occupies a measure-zero subset of pixel space.</p>
<p>A striking way to see this: take two random points in pixel space and walk linearly between them. Every step is noise. Now take two real images and walk between them along the data manifold (as a generative model learns to do). The intermediate points look like images—blurry perhaps, but recognizably structured.</p>
<div id="cell-fig-random-walk" class="cell" data-execution_count="1">
<details class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb1-5"></span>
<span id="cb1-6">fig, axes <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>, figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4.5</span>))</span>
<span id="cb1-7"></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Top row: random walk in pixel space</span></span>
<span id="cb1-9">start_random <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.rand(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-10">end_random <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.rand(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-11"></span>
<span id="cb1-12"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, alpha <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(np.linspace(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>)):</span>
<span id="cb1-13">    interp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> alpha) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> start_random <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> end_random</span>
<span id="cb1-14">    axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, i].imshow(interp)</span>
<span id="cb1-15">    axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, i].axis(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'off'</span>)</span>
<span id="cb1-16">    axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, i].set_title(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f'α=</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>alpha<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.1f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">'</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb1-17"></span>
<span id="cb1-18">axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].set_ylabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Random</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">interpolation'</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>, rotation<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, ha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'right'</span>, va<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'center'</span>)</span>
<span id="cb1-19"></span>
<span id="cb1-20"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Bottom row: "manifold" interpolation (simulated with smooth structure)</span></span>
<span id="cb1-21"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i, alpha <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">enumerate</span>(np.linspace(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>)):</span>
<span id="cb1-22">    freq_blend <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha</span>
<span id="cb1-23">    x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>)</span>
<span id="cb1-24">    y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>)</span>
<span id="cb1-25">    X, Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.meshgrid(x, y)</span>
<span id="cb1-26">    </span>
<span id="cb1-27">    img <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>))</span>
<span id="cb1-28">    base <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.sin(freq_blend <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.cos(freq_blend <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> alpha)</span>
<span id="cb1-29">    img[:,:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> base <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.15</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.sin(X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-30">    img[:,:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> base <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.cos(Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-31">    img[:,:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> base <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.12</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.sin((X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> Y) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha)</span>
<span id="cb1-32">    img <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.clip(img, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-33">    </span>
<span id="cb1-34">    axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, i].imshow(img)</span>
<span id="cb1-35">    axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, i].axis(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'off'</span>)</span>
<span id="cb1-36"></span>
<span id="cb1-37">axes[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].set_ylabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Manifold</span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">\n</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">interpolation'</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>, rotation<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, ha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'right'</span>, va<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'center'</span>)</span>
<span id="cb1-38"></span>
<span id="cb1-39">plt.tight_layout()</span>
<span id="cb1-40">plt.show()</span></code></pre></div></div>
</details>
<div class="cell-output cell-output-display">
<div id="fig-random-walk" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-random-walk-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/manifold/geometry_blog_files/figure-html/fig-random-walk-output-1.png" width="1142" height="406" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-random-walk-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;1: Walking through image space. Top: linear interpolation between random points yields noise throughout. Bottom: interpolation along a learned manifold produces structured images at every step.
</figcaption>
</figure>
</div>
</div>
</div>
<p>The same holds for text. Random token sequences are gibberish. Coherent sentences, paragraphs, arguments—these live on a thin subspace of token-sequence space, constrained by grammar, semantics, and the structure of ideas worth expressing.</p>
<p>This resolves the arithmetic paradox. The impossibility proof assumes you need to cover the full input space. But the data we care about concentrates on a low-dimensional structure, and coverage of that structure is tractable.</p>
</section>
<section id="the-manifold-hypothesis" class="level2">
<h2 class="anchored" data-anchor-id="the-manifold-hypothesis">The Manifold Hypothesis</h2>
<p>The standard formalization: data lies on or near a <img src="https://latex.codecogs.com/png.latex?d">-dimensional manifold <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BM%7D"> embedded in the ambient space <img src="https://latex.codecogs.com/png.latex?%5Cmathbb%7BR%7D%5ED">, where <img src="https://latex.codecogs.com/png.latex?d%20%5Cll%20D">.</p>
<p>Covering a <img src="https://latex.codecogs.com/png.latex?d">-dimensional manifold requires samples scaling as <img src="https://latex.codecogs.com/png.latex?%5Cepsilon%5E%7B-d%7D">, not <img src="https://latex.codecogs.com/png.latex?%5Cepsilon%5E%7B-D%7D">. If images live on a manifold of dimension 1,000 rather than filling a space of dimension 200,000, the sample complexity drops from astronomical to merely large.</p>
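<p>Plugging in a resolution makes the gap concrete (a toy calculation; the choice of ε is arbitrary, and both counts remain huge — the point is the size of the gap between them):</p>

```python
import math

eps = 0.5           # covering resolution (illustrative choice)
d = 1_000           # assumed intrinsic dimension
D = 200_000         # ambient dimension, roughly 256*256*3 pixels

# Covering numbers scale as eps**(-dim); compare exponents in log10.
log_intrinsic = -d * math.log10(eps)
log_ambient = -D * math.log10(eps)

print(f"cover the manifold: ~10^{log_intrinsic:.0f} samples")
print(f"cover ambient space: ~10^{log_ambient:.0f} samples")
```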
<p><span class="citation" data-cites="pope2021intrinsic">Pope et al. (2021)</span> measured intrinsic dimensions of standard image datasets and found values in the hundreds to low thousands—far below ambient dimension, though not trivially small.</p>
<div class="aside-box">
<div class="aside-title">
<p>A circularity to notice</p>
</div>
<p>The manifold hypothesis is typically invoked after observing that deep learning works. Rarely does anyone estimate the intrinsic dimension before training and verify it’s small enough for the available data. The hypothesis is plausible, but its use is often post-hoc rationalization.</p>
</div>
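<p>Estimating the intrinsic dimension up front is not actually hard. A minimal sketch of the Two-NN estimator (the ratio of second- to first-nearest-neighbour distances follows a Pareto law with exponent equal to the intrinsic dimension), run on synthetic 2-D data embedded in 10 ambient dimensions; the sample size and seed are arbitrary choices:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a 2-D sheet embedded linearly in a 10-D ambient space.
Z = rng.uniform(size=(1000, 2))
A = rng.standard_normal((2, 10))
X = Z @ A

# Pairwise squared distances without building an (N, N, 10) tensor.
sq = (X ** 2).sum(axis=1)
d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
np.fill_diagonal(d2, np.inf)

# Two-NN: mu = r2/r1 is Pareto with exponent d; the MLE is N / sum(log mu).
r = np.sqrt(np.sort(d2, axis=1)[:, :2])
mu = r[:, 1] / r[:, 0]
d_hat = len(mu) / np.log(mu).sum()
print(f"estimated intrinsic dimension: {d_hat:.2f}")
```

On this toy data the estimate comes out close to 2, the true manifold dimension, even though the ambient dimension is 10.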
</section>
<section id="the-geometry-of-piecewise-linear-functions" class="level2">
<h2 class="anchored" data-anchor-id="the-geometry-of-piecewise-linear-functions">The Geometry of Piecewise Linear Functions</h2>
<p>ReLU networks compute piecewise linear functions. Not approximately linear. Exactly linear, within each piece.</p>
<p>The ReLU activation <img src="https://latex.codecogs.com/png.latex?%5Ctext%7BReLU%7D(x)%20=%20%5Cmax(0,%20x)"> is piecewise linear with two pieces. Compositions of affine transformations and coordinate-wise ReLUs yield piecewise linear functions on convex polytopes. Within each polytope, the network computes <img src="https://latex.codecogs.com/png.latex?f(x)%20=%20Wx%20+%20b"> for some <img src="https://latex.codecogs.com/png.latex?W"> and <img src="https://latex.codecogs.com/png.latex?b"> specific to that region.</p>
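<p>The exactness is easy to verify: the activation pattern at a point selects the polytope, and reading off which units are active gives the region's affine map directly (a minimal sketch with a random one-hidden-layer net; the weights and probe point are arbitrary):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 2)), rng.standard_normal(8)
w2, b2 = rng.standard_normal(8), rng.standard_normal()

def f(x):
    """One hidden layer of ReLUs followed by a linear readout."""
    return np.maximum(0.0, W1 @ x + b1) @ w2 + b2

# Inside a polytope the ReLUs act as the identity on active units and
# zero elsewhere, so the network collapses to a single affine map.
x0 = np.array([0.3, -0.7])
active = (W1 @ x0 + b1) > 0
W_eff = (w2 * active) @ W1          # the region-specific W
b_eff = (w2 * active) @ b1 + b2     # the region-specific b
print(np.isclose(f(x0), W_eff @ x0 + b_eff))   # True
```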
<div id="cell-fig-relu-regions" class="cell" data-execution_count="2">
<details class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb2-3"></span>
<span id="cb2-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">42</span>)</span>
<span id="cb2-5">n_units <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span></span>
<span id="cb2-6"></span>
<span id="cb2-7">w1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.randn(n_units, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-8">b1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.randn(n_units) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb2-9">w2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.randn(n_units) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> np.sqrt(n_units)</span>
<span id="cb2-10">b2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.randn() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span></span>
<span id="cb2-11"></span>
<span id="cb2-12"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(x, y):</span>
<span id="cb2-13">    inp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.stack([x.flatten(), y.flatten()], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-14">    hidden <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.maximum(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, inp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span> w1.T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b1)</span>
<span id="cb2-15">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> (hidden <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span> w2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> b2).reshape(x.shape)</span>
<span id="cb2-16"></span>
<span id="cb2-17">g <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span></span>
<span id="cb2-18">xs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, g)</span>
<span id="cb2-19">ys <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, g)</span>
<span id="cb2-20">X, Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.meshgrid(xs, ys)</span>
<span id="cb2-21">Z <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> forward(X, Y)</span>
<span id="cb2-22"></span>
<span id="cb2-23">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb2-24">contour <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ax.contourf(X, Y, Z, levels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'RdBu'</span>)</span>
<span id="cb2-25">plt.colorbar(contour, ax<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>ax, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'f(x)'</span>)</span>
<span id="cb2-26"></span>
<span id="cb2-27"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_units):</span>
<span id="cb2-28">    a, b <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> w1[i]</span>
<span id="cb2-29">    c <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> b1[i]</span>
<span id="cb2-30">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(b) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:</span>
<span id="cb2-31">        line_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])</span>
<span id="cb2-32">        line_y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>(a <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> line_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> c) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> b</span>
<span id="cb2-33">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">any</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(line_y) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.5</span>):</span>
<span id="cb2-34">            ax.plot(line_x, np.clip(line_y, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'k-'</span>, linewidth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-35">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">elif</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(a) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:</span>
<span id="cb2-36">        xi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>c <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> a</span>
<span id="cb2-37">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(xi) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:</span>
<span id="cb2-38">            ax.axvline(xi, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'k'</span>, linewidth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-39"></span>
<span id="cb2-40">ax.set_xlim(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-41">ax.set_ylim(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-42">ax.set_xlabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'$x_1$'</span>)</span>
<span id="cb2-43">ax.set_ylabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'$x_2$'</span>)</span>
<span id="cb2-44">ax.set_aspect(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'equal'</span>)</span>
<span id="cb2-45">plt.tight_layout()</span>
<span id="cb2-46">plt.show()</span></code></pre></div></div>
</details>
<div class="cell-output cell-output-display">
<div id="fig-relu-regions" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-relu-regions-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/manifold/geometry_blog_files/figure-html/fig-relu-regions-output-2.png" width="658" height="548" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-relu-regions-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;2: A ReLU network partitions the input space into convex polytopes. Within each region, the function is exactly affine.
</figcaption>
</figure>
</div>
</div>
</div>
<p>There’s a topological way to think about this. A neural network progressively transforms the input space, stretching and folding it until the data becomes linearly separable.<sup>2</sup> The network learns a coordinate system in which the problem is simple.</p>
<p>The number of linear regions grows exponentially with depth. A network can have far more regions than training points. Most regions contain no data at all.</p>
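<p>The region count is easy to check numerically. Below is a small sketch (all sizes are illustrative): one random ReLU layer, a dense grid over the input square, and a count of the distinct on/off activation patterns the grid lands in. Each pattern labels one linear region.</p>

```python
import numpy as np

# Rough numerical check of the region-counting claim. One random ReLU
# layer; each input's on/off pattern across the units labels the convex
# region it sits in. All sizes here are illustrative.
rng = np.random.default_rng(0)
n_units = 20
W = rng.normal(size=(n_units, 2))   # random first-layer weights
b = rng.normal(size=n_units)        # random first-layer biases

# Dense grid over the input square [-2, 2]^2
xs = np.linspace(-2, 2, 200)
g1, g2 = np.meshgrid(xs, xs)
grid = np.column_stack([g1.ravel(), g2.ravel()])

# Activation pattern = which units fire; one pattern per linear region
patterns = grid @ W.T + b > 0
n_regions_seen = len({tuple(p) for p in patterns})

print(n_regions_seen)  # typically a few hundred regions from just 20 units
```

<p>With a few dozen training points, most of those regions would hold no data at all; stacking layers multiplies the count further.</p>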
</section>
<section id="what-training-determines" class="level2">
<h2 class="anchored" data-anchor-id="what-training-determines">What Training Determines</h2>
<p>Consider a single linear region containing <img src="https://latex.codecogs.com/png.latex?n"> training points in <img src="https://latex.codecogs.com/png.latex?D">-dimensional space. The network computes <img src="https://latex.codecogs.com/png.latex?f(x)%20=%20Wx%20+%20b"> throughout this region. Training enforces:</p>
<p><img src="https://latex.codecogs.com/png.latex?Wx%5E%7B(i)%7D%20+%20b%20=%20y%5E%7B(i)%7D%20%5Cquad%20%5Ctext%7Bfor%20%7D%20i%20=%201,%20%5Cldots,%20n"></p>
<p>These are <img src="https://latex.codecogs.com/png.latex?n"> constraints on a gradient <img src="https://latex.codecogs.com/png.latex?W"> with <img src="https://latex.codecogs.com/png.latex?D"> components. When <img src="https://latex.codecogs.com/png.latex?n%20%3C%20D">—the typical case in high dimensions—infinitely many gradients satisfy the constraints.</p>
<p>The training points span at most an <img src="https://latex.codecogs.com/png.latex?(n-1)">-dimensional subspace. Along directions in this subspace, the gradient is pinned down. Along orthogonal directions, it’s arbitrary.</p>
<p>This connects to a classical result in learning theory. The representer theorem says that in kernel methods, the optimal solution can be written as a linear combination of kernel evaluations at the training points—only the training data matters, and only along the directions it spans. The geometry here is analogous: training constrains the function along directions spanned by the data and nowhere else.</p>
<p>If training data lies on a low-dimensional manifold, the constrained directions align with the manifold’s tangent space. Normal directions—perpendicular to the manifold—remain free. Training pins down the function on the manifold but leaves it underdetermined elsewhere.</p>
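<p>The counting argument is concrete enough to run (dimensions chosen purely for illustration): with <code>n = 5</code> points in <code>D = 50</code> dimensions, <code>lstsq</code> returns one exact interpolant, and adding any null-space vector of the data matrix yields a second gradient that fits training just as perfectly.</p>

```python
import numpy as np

# n = 5 training points in D = 50 dimensions: the gradient W is pinned
# down along the data but free along the remaining 45 orthogonal
# directions. Sizes are illustrative.
rng = np.random.default_rng(1)
n, D = 5, 50
X = rng.normal(size=(n, D))   # training inputs
y = rng.normal(size=n)        # training targets (bias folded in)

# One exact interpolant: the minimum-norm solution
W_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# Any null-space direction of X can be added without hurting the fit
null_basis = np.linalg.svd(X)[2][n:]   # (D - n) directions orthogonal to the data
W_alt = W_min + 10.0 * null_basis[0]

print(np.allclose(X @ W_min, y), np.allclose(X @ W_alt, y))  # both interpolate exactly
print(np.linalg.norm(W_alt - W_min))  # yet the two gradients differ by 10
```

<p>Both models agree on every training point and disagree arbitrarily off the data subspace; training alone cannot tell them apart.</p>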
<div class="key-point">
<p>Training determines the function along directions spanned by the data. Orthogonal directions are unconstrained. The function off the data manifold is not learned; it’s an artifact of the initialization and of whatever solution the optimization happened to find <span class="citation" data-cites="he2023side">(He, Tsai, and Ward 2023)</span>.</p>
</div>
</section>
<section id="pretraining-as-learning-the-manifold" class="level2">
<h2 class="anchored" data-anchor-id="pretraining-as-learning-the-manifold">Pretraining as Learning the Manifold</h2>
<p>Large-scale pretraining can be understood as learning the data manifold itself. When a language model predicts the next token on a massive text corpus, it learns the structure of “valid text space”—which sequences are probable, which transitions are natural, what the local geometry of text looks like.</p>
<p>This perspective is supported by work showing that deep networks learn representations capturing manifold structure <span class="citation" data-cites="bengio2013representation">(Bengio, Courville, and Vincent 2013)</span>. The hidden layers build a coordinate system aligned with the data manifold, making downstream tasks easier by providing a representation where relevant variation is explicit.</p>
<p>This explains why pretraining helps so much. A randomly initialized network must simultaneously learn manifold structure and the task-specific function. A pretrained network already knows the manifold; fine-tuning only needs to learn the function on it.</p>
<div class="aside-box">
<div class="aside-title">
<p>The role of scale</p>
</div>
<p>Larger models and more data allow learning finer manifold details. This may partly explain “emergent abilities” in large language models—capabilities appearing suddenly at scale. The model may need enough capacity and data to capture relevant structure before certain tasks become possible.</p>
</div>
</section>
<section id="why-the-underdetermined-directions-dont-usually-matter" class="level2">
<h2 class="anchored" data-anchor-id="why-the-underdetermined-directions-dont-usually-matter">Why the Underdetermined Directions Don’t (Usually) Matter</h2>
<p>Three factors make underdetermination benign in practice.</p>
<p><strong>The data manifold is where queries live.</strong> If test data comes from the same distribution as training, test points lie on or near the same manifold. The underdetermined normal directions are never queried.</p>
<p><strong>Neural networks prefer simple functions.</strong> Not all functions consistent with training data are equally likely to emerge. Networks exhibit a bias toward low-complexity functions, formalizable via Kolmogorov complexity <span class="citation" data-cites="valle2019deep goldblum2023free">(Valle-Perez, Camargo, and Louis 2019; Goldblum et al. 2023)</span>. The functions networks actually learn tend to be simple.</p>
<p><strong>Real data is generated by simple processes.</strong> Biological structures reflect evolutionary compression. Human artifacts encode low-dimensional intentions. The data we care about is often output of structured processes.</p>
<p>The match between neural networks’ simplicity bias and the simplicity of real-world data may be the deeper reason deep learning works. The manifold hypothesis is a geometric consequence of this match, not the fundamental explanation.</p>
</section>
<section id="an-unexpected-success-language-models-for-chemistry" class="level2">
<h2 class="anchored" data-anchor-id="an-unexpected-success-language-models-for-chemistry">An Unexpected Success: Language Models for Chemistry</h2>
<p>In <span class="citation" data-cites="jablonka2024leveraging">Jablonka et al. (2024)</span>, we fine-tuned large language models to predict chemical properties: bandgaps, photoswitching wavelengths, toxicity. Language models are trained on text, not molecules. The transfer seems absurd.</p>
<p>Yet it works. Fine-tuned LLMs achieve competitive performance, sometimes matching purpose-built molecular models.</p>
<p>The geometric interpretation: LLMs have learned inductive biases—preferences for compositional, hierarchical, smoothly varying functions—that transfer across domains. Chemistry and language are (sometimes?) both structured.</p>
<div class="aside-box">
<div class="aside-title">
<p>Structure in chemistry</p>
</div>
<p>In biology, evolution provides a strong prior. Existing structures have been selected for function, so <a href="https://medium.com/@jkbjablonka/the-road-to-biology-2-0-will-pass-through-black-box-data-bbd00fabf959">structure correlates with function</a> in ways that learning can exploit. Chemistry lacks this selection pressure. The space of possible molecules wasn’t shaped by any optimization process—we’re exploring territory with no guarantee of low-dimensional structure. This may make chemical property prediction fundamentally harder than biological function prediction.</p>
</div>
</section>
<section id="when-deep-learning-fails" class="level2">
<h2 class="anchored" data-anchor-id="when-deep-learning-fails">When Deep Learning Fails</h2>
<p>The geometric picture predicts specific failure modes, all involving queries that leave the training manifold.</p>
<p><strong>Distribution shift.</strong> <span class="citation" data-cites="zech2018variable">Zech et al. (2018)</span> analyzed a deep learning model for detecting pneumonia in chest X-rays. The model achieved high accuracy—but partly by detecting which hospital the X-ray came from, using equipment artifacts like metal tokens. At a new hospital with different equipment, performance collapsed. The test manifold diverged from training.</p>
<div id="cell-fig-shift" class="cell" data-execution_count="3">
<details class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb4-3"></span>
<span id="cb4-4">train_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">40</span>)</span>
<span id="cb4-5">train_y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(train_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span></span>
<span id="cb4-6">test_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>)</span>
<span id="cb4-7">test_y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(test_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span></span>
<span id="cb4-8"></span>
<span id="cb4-9">model_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb4-10">slope <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (train_y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> train_y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (train_x[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> train_x[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>])</span>
<span id="cb4-11">model_y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.where(model_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, np.sin(model_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.8</span>, train_y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (model_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> slope)</span>
<span id="cb4-12"></span>
<span id="cb4-13">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>))</span>
<span id="cb4-14">ax.axvspan(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'red'</span>)</span>
<span id="cb4-15">ax.plot(model_x, model_y, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'k-'</span>, linewidth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Model'</span>)</span>
<span id="cb4-16">ax.scatter(train_x, train_y, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'#3b82f6'</span>, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Training'</span>, zorder<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb4-17">ax.scatter(test_x, test_y, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'#dc2626'</span>, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, marker<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'x'</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Test (shifted)'</span>, zorder<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb4-18">ax.axvline(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'gray'</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">':'</span>, linewidth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-19">ax.set_xlabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'x'</span>)</span>
<span id="cb4-20">ax.set_ylabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'f(x)'</span>)</span>
<span id="cb4-21">ax.set_xlim(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.5</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb4-22">ax.set_ylim(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.5</span>)</span>
<span id="cb4-23">ax.legend(loc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'upper right'</span>)</span>
<span id="cb4-24">plt.tight_layout()</span>
<span id="cb4-25">plt.show()</span></code></pre></div></div>
</details>
<div class="cell-output cell-output-display">
<div id="fig-shift" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-shift-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/manifold/geometry_blog_files/figure-html/fig-shift-output-1.png" width="758" height="374" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-shift-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;3: Distribution shift: the model extrapolates linearly into regions without training data.
</figcaption>
</figure>
</div>
</div>
</div>
<p><strong>Adversarial examples.</strong> Small perturbations orthogonal to the data manifold move inputs into regions where the function is underdetermined. The output changes dramatically because the gradient in that direction was never constrained.</p>
<p><strong>Extrapolation.</strong> ReLU networks extend linearly beyond the convex hull of training data. If the true function curves, the linear extrapolation diverges.</p>
<p><strong>High intrinsic dimension.</strong> If data doesn’t concentrate on a low-dimensional manifold, most directions are unconstrained and generalization fails.</p>
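<p>A minimal numerical version of these failure modes (the setup is illustrative): training data on a 1-D line in 2-D, two models that interpolate it exactly, and a tiny perturbation orthogonal to the line that drives their predictions apart.</p>

```python
import numpy as np

# Toy off-manifold failure: data lies on the line x2 = 2*x1 in 2-D.
# Two exact interpolants agree everywhere on the line, but a 0.01 step
# orthogonal to it opens a large gap. Everything here is illustrative.
t = np.linspace(-1, 1, 20)
X = np.column_stack([t, 2 * t])      # data manifold: a 1-D line in 2-D
y = 3 * t                            # targets vary along the manifold

W_a, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimum-norm interpolant
normal = np.array([2.0, -1.0]) / np.sqrt(5)   # unit direction off the line
W_b = W_a + 50 * normal                       # also fits training exactly

x_on = np.array([0.5, 1.0])          # a query on the manifold
x_off = x_on + 0.01 * normal         # imperceptibly perturbed off it

print(x_on @ W_a - x_on @ W_b)       # ~0: the models agree on the manifold
print(x_off @ W_a - x_off @ W_b)     # magnitude ~0.5 from a 0.01 step
```

<p>The adversarial and extrapolation failures are the same phenomenon at different scales: the query leaves the span of the training data, where nothing ever pinned the function down.</p>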
</section>
<section id="the-problem-of-missing-negative-examples" class="level2">
<h2 class="anchored" data-anchor-id="the-problem-of-missing-negative-examples">The Problem of Missing Negative Examples</h2>
<p>A subtler failure mode matters enormously in practice: the model can only learn the manifold from data it sees.</p>
<p>Consider predicting which chemical reactions succeed. Published literature overwhelmingly reports reactions that worked. Failed reactions—attempts producing no product, conditions causing decomposition—are rarely published. The model sees only one side of the decision boundary.</p>
<p>This creates a systematic blind spot. The model learns what successful reactions look like but has no information about the landscape of failures. It can’t distinguish “this will work” from “this is unlike anything I’ve seen.” Both map to the same uncertainty—off the manifold of observed successes.</p>
<p>To learn a manifold’s boundary, you need to see both sides. Without negative examples, the learned manifold may be far too permissive.</p>
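<p>A deliberately crude sketch of this blind spot (the membership test is hypothetical, just the simplest stand-in for a boundary learned from positives alone): the true valid set is a thin ring, and a region fit only to positive examples accepts the empty middle without complaint.</p>

```python
import numpy as np

# True "valid" set: a thin ring (the unit circle). We only ever observe
# positives, so the learned region (here the crudest possible one, the
# bounding box of the positives) has no reason to exclude the hole in
# the middle. Everything here is illustrative.
rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 200)
positives = np.column_stack([np.cos(theta), np.sin(theta)])

lo, hi = positives.min(axis=0), positives.max(axis=0)

def accepts(x):
    """Membership test learned from positives only."""
    return bool(np.all((x >= lo) & (x <= hi)))

center = np.array([0.0, 0.0])
dist_to_nearest = np.linalg.norm(positives - center, axis=1).min()

print(accepts(center))      # True: accepted despite being off the ring
print(dist_to_nearest)      # ~1.0: it is nowhere near any positive example
```

<p>With even a handful of labeled failures inside the ring, the same procedure would immediately rule out the interior.</p>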
<div class="aside-box">
<div class="aside-title">
<p>Verification versus generation</p>
</div>
<p>This asymmetry suggests a strategy: verifying whether something is on the manifold is often easier than generating valid points. A verifier trained on both positive and negative examples can check a generator’s outputs. This is the logic behind RLVR: creating systems that identify good outputs from bad is easier than creating systems that generate good outputs directly.<sup>3</sup></p>
</div>
</section>
<section id="the-limits-of-the-framework" class="level2">
<h2 class="anchored" data-anchor-id="the-limits-of-the-framework">The Limits of the Framework</h2>
<p>The “real data is simple” argument works for perception, language, games—domains with abundant data from stable distributions. Scientific discovery is different. It seeks patterns in domains where structure is unknown (or questioning the structure is the point).</p>
<p>The domains where machine learning would be most valuable—genuine discovery, not pattern-matching on known distributions—are exactly where the assumptions might not hold.</p>
</section>
<section id="conclusions" class="level2">
<h2 class="anchored" data-anchor-id="conclusions">Conclusions</h2>
<p>The counting argument was correct: learning an arbitrary function in <img src="https://latex.codecogs.com/png.latex?10%5E%7B40,000%7D"> dimensions is impossible with <img src="https://latex.codecogs.com/png.latex?10%5E%7B14%7D"> samples. But the functions we care about aren’t arbitrary. They’re generated by structured processes—physics, biology, human intention—that produce structured outputs.</p>
<p>Neural networks work because they share this preference for structure. They’re biased toward simple, compositional functions, and real data happens to be simple and compositional. The manifold hypothesis is a symptom of this alignment, not the cause.</p>
<p>Understanding the geometry clarifies both successes and failures. Successes come from fitting smooth functions to structured data on low-dimensional manifolds. Failures come from queries leaving the manifold, shifting distributions, or domains where structure assumptions don’t hold.</p>
<p>For perception and pattern-matching, the framework suggests continued progress. For scientific discovery—finding patterns where structure is unknown—it counsels caution. The assumptions making deep learning work are precisely those we can’t verify in genuinely novel domains.</p>
<hr>
<p><em>This post draws on work by Andrew Gordon Wilson on inductive bias and Bayesian deep learning, Ben Recht on what machine learning can and cannot do, and Mikhail Belkin on interpolation and generalization. The geometric intuitions build on <a href="https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/">Chris Olah’s writing on topology</a> and <a href="https://12gramsofcarbon.com/p/deep-learning-is-applied-topology">Riley Goodside’s essays</a>. Errors are my own.</em></p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-bengio2013representation" class="csl-entry">
Bengio, Yoshua, Aaron Courville, and Pascal Vincent. 2013. <span>“Representation Learning: A Review and New Perspectives.”</span> <em>IEEE Transactions on Pattern Analysis and Machine Intelligence</em> 35 (8): 1798–828.
</div>
<div id="ref-goldblum2023free" class="csl-entry">
Goldblum, Micah, Marc Finzi, Keefer Rowan, and Andrew Gordon Wilson. 2023. <span>“The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning.”</span> <em>arXiv Preprint arXiv:2304.05366</em>.
</div>
<div id="ref-he2023side" class="csl-entry">
He, Juncai, Richard Tsai, and Rachel Ward. 2023. <span>“Side Effects of Learning from Low-Dimensional Data Embedded in a Euclidean Space.”</span> <em>Research in the Mathematical Sciences</em> 10 (1): 13.
</div>
<div id="ref-jablonka2024leveraging" class="csl-entry">
Jablonka, Kevin Maik, Philippe Schwaller, Andres Ortega-Guerrero, and Berend Smit. 2024. <span>“Leveraging Large Language Models for Predictive Chemistry.”</span> <em>Nature Machine Intelligence</em> 6: 161–69.
</div>
<div id="ref-pope2021intrinsic" class="csl-entry">
Pope, Phillip, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, and Tom Goldstein. 2021. <span>“The Intrinsic Dimension of Images and Its Impact on Learning.”</span> <em>arXiv Preprint arXiv:2104.08894</em>.
</div>
<div id="ref-valle2019deep" class="csl-entry">
Valle-Perez, Guillermo, Chico Q Camargo, and Ard A Louis. 2019. <span>“Deep Learning Generalizes Because the Parameter-Function Map Is Biased Towards Simple Functions.”</span> <em>arXiv Preprint arXiv:1805.08522</em>.
</div>
<div id="ref-zech2018variable" class="csl-entry">
Zech, John R, Marcus A Badgeley, Manway Liu, Anthony B Costa, Joseph J Titano, and Eric Karl Oermann. 2018. <span>“Variable Generalization Performance of a Deep Learning Model to Detect Pneumonia in Chest Radiographs: A Cross-Sectional Study.”</span> <em>PLoS Medicine</em> 15 (11): e1002683.
</div>
</div>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>This example is from Andrew Gordon Wilson’s <a href="https://www.youtube.com/watch?v=lhwk4ESlyMA">talk on the foundations of deep learning</a>.↩︎</p></li>
<li id="fn2"><p>For visualizations of this perspective, see <a href="https://colah.github.io/posts/2014-03-NN-Manifolds-Topology/">Chris Olah’s post on neural networks and topology</a>.↩︎</p></li>
<li id="fn3"><p>As <a href="https://12gramsofcarbon.com/p/deep-learning-is-applied-topology">noted by Riley Goodside</a>: “Creating systems that can identify good reasoning from bad is a much easier task than creating systems that can reason well to begin with.”↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>ml</category>
  <category>deeplearning</category>
  <guid>https://kjablonka.com/blog/posts/manifold/geometry_blog.html</guid>
  <pubDate>Sun, 28 Dec 2025 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/manifold" medium="image"/>
</item>
<item>
  <title>Why I Start Things Uncomfortably Early</title>
  <link>https://kjablonka.com/blog/posts/write_early/</link>
  <description><![CDATA[ 




<p>Over time, I’ve developed a habit that feels wrong: I start writing, integrating, and testing way earlier than seems reasonable. When I have just the first “signs of life” in a project, I’m already drafting paper sections. When my code modules are 70% done, I’m already trying to wire them together.</p>
<p>This goes against the instinct to wait until things are “ready.” But I think there are good reasons for doing it this way.</p>
<p>Ambitious research has a fundamental difficulty: rewards are sparse and delayed. I won’t know if my approach works until months in, when I finally have the complete system and results. But I need to make decisions now: which direction to explore, which component to prioritize, what to try next.</p>
<p>This is essentially the credit assignment problem from reinforcement learning. When the signal (the discovery of a new method, a successful validation) is very far away, how do I navigate?</p>
<section id="the-working-memory-problem" class="level2">
<h2 class="anchored" data-anchor-id="the-working-memory-problem">The Working Memory Problem</h2>
<p>I can only hold about 4 things in working memory at once. When I’m juggling 15 design decisions and vague intuitions about what might work, most of it is slipping away. I need concrete signals.</p>
<p>Writing and sketching force me to make things concrete. <a href="https://www.nature.com/articles/s44222-025-00323-4">Writing is thinking.</a> When I try to draft a results section for experiments I haven’t run yet, I immediately hit questions: “Wait, what’s the y-axis here? How would I actually measure this?” Those questions become my experimental plan.</p>
<p>Similarly, when I integrate code modules before they’re polished, the failures tell me what actually matters. Often the “polish” I was planning turns out to be unnecessary.</p>
</section>
<section id="what-i-actually-do" class="level2">
<h2 class="anchored" data-anchor-id="what-i-actually-do">What I Actually Do</h2>
<ul>
<li>I start writing paper drafts at the first signs of life. Playing with different introductions helps me figure out which story actually makes sense. Sketching figure layouts exposes which experiments I’m missing.</li>
<li>I wire together code modules when they’re partially done. The integration failures are informative.</li>
<li>I run simplified experiments first—1000 examples instead of 1M, 2 layers instead of 12. I can add complexity once I know the direction has promise. This is to maximize <a href="https://web.stanford.edu/class/cs197/lectures/cs197-05-velocity.pdf">research velocity</a>.</li>
<li>I submit to workshops before papers feel “fully baked.” The writing process and feedback catch blind spots.</li>
<li>I share half-baked ideas. A conversation can save months of heading the wrong direction.</li>
</ul>
</section>
<section id="pride-vs.-progress" class="level2">
<h2 class="anchored" data-anchor-id="pride-vs.-progress">Pride vs.&nbsp;Progress</h2>
<p>These practices aren’t technically hard. What makes them difficult is the discomfort of showing work that isn’t impressive yet, of risking visible mistakes, of accepting that my initial intuitions might be wrong.</p>
<p>But I’ve noticed that waiting for things to feel “ready” often means operating in the dark for too long. The cost of being wrong in private—spending months on the wrong path—tends to be much higher than the awkwardness of being wrong in public. And if we talk about spending public research money, <a href="https://kjablonka.com/blog/posts/on_impactful_research/">we should minimize this cost</a>.</p>
</section>
<section id="how-fast-is-too-fast" class="level2">
<h2 class="anchored" data-anchor-id="how-fast-is-too-fast">How Fast Is Too Fast?</h2>
<p>There is a limit, of course. If I’m moving so fast that I can’t reproduce last week’s experiments or I’ve lost track of what I’ve tried, I’ve gone too far. Clean experiments matter even when iterating quickly.</p>
<p>But I find I’m rarely limited by going too fast. More often, I’m limited by waiting too long—by postponing the test that would have given me signal.</p>
<p>So now when I catch myself thinking “this isn’t ready yet,” I try to ask: ready for what? Often the answer is: ready to learn something from. And that bar is much lower than I think.</p>


</section>

 ]]></description>
  <category>academia</category>
  <guid>https://kjablonka.com/blog/posts/write_early/</guid>
  <pubDate>Thu, 30 Oct 2025 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/write_early" medium="image"/>
</item>
<item>
  <title>On Impactful Research</title>
  <link>https://kjablonka.com/blog/posts/on_impactful_research/</link>
  <description><![CDATA[ 




<p>In today’s literature seminar, the participants noticed that I perhaps have grown cynical. On most papers presented, I scribble the same verdict: not impactful. My perception—perhaps wrong, perhaps not—is that the fraction of impactful research is shrinking. Not because we’ve stopped producing good work, but because it is drowned in an avalanche of papers screaming for attention.</p>
<p>I am part of the problem. I cannot call most of my research truly impactful. A small subset may be important if I’m generous.</p>
<div class="page-columns page-full"><p>It is hard not to be part of the problem. PhD students must graduate, preferably with cumulative theses that meet minimum paper requirements. The old-fashioned monograph still exists, but it’s a hard sell when paper counts determine everything: tenure decisions, grant reviews, a graduate’s prospects on the job market. In industry, different pressures apply—internal funding mechanisms, research as PR, research as recruitment. The result is the same.</p><div class="no-row-height column-margin column-container"><span class="margin-aside">Academia has the additional challenge that it is an <a href="https://www.sam-rodriques.com/post/academia-is-an-educational-institution">educational institution</a></span></div></div>
<p>We assemble talented people. We invest public money. We do not advance research optimally. The opportunity cost is real. That funding could have expanded healthcare coverage, supported developing nations, and accelerated renewable energy. Instead, we feed the paper mill.</p>
<p>Students, confronted with this tension, asked me how to identify impactful work. Borrowing from Kierkegaard, I can only say: “Research can only be understood backwards; but it must be executed forwards.” Some kinds of impact are only “obvious” in hindsight.</p>
<p>Still, some heuristics exist. In my field, research proves impactful by providing a tool that others use, by opening a new perspective, or by crystallizing evidence that had been merely anecdotal.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/on_impactful_research/cover.png" class="img-fluid figure-img"></p>
<figcaption>Tree of ideas.</figcaption>
</figure>
</div>
<p>Think of research as a tree of ideas. Three paths to impact emerge:</p>
<p>First, take a branch and plant a new tree. This grows into a product or tool that supports novel growth elsewhere.</p>
<p>Second, grow one branch deeper, enabling its transition into something usable.</p>
<p>Third, grow a new branch entirely.</p>
<p>You can identify the first by checking for a usable tool. If nothing exists that I can use, it cannot support new growth.</p>
<p>You can identify the second by counting the hypotheses explored and the ablation studies conducted. The branch grows deeper and deeper.</p>
<p>You can identify the third by its complete novelty—not applying an existing technique to a new system, not adding one twist to extend a branch slightly, but doing something without precedent.</p>
<p>What does this mean for leading a research group? I want to push harder for our work to fall into one of these categories. If something can start as its own tree, give it everything: documentation, user support, tests, and community building.</p>
<p>If something can go deep, let it go deep. So deep that the next work can cut this branch—either to discard it or to plant a new tree.</p>
<p>If something is novel, go broad. Explore how far the novelty extends.</p>
<p>I worry about my career and my students’ careers. But I also think about the research money I manage. It could have funded some other public good. In a leadership seminar, I learned that “Leadership operates in areas of tension that cannot be resolved but can be balanced.” The tension in research is real. So is our failure to balance it.</p>



 ]]></description>
  <category>academia</category>
  <guid>https://kjablonka.com/blog/posts/on_impactful_research/</guid>
  <pubDate>Mon, 13 Oct 2025 22:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/on_impactful_research" medium="image"/>
</item>
<item>
  <title>Lessons Learned from Writing My First ERC Proposal</title>
  <link>https://kjablonka.com/blog/posts/writing_erc/</link>
  <description><![CDATA[ 




<p>Dear younger me,</p>
<p>I’m writing this instead of revising the proposal. I know, I know—classic avoidance behavior. But maybe writing this down will help me (you? us?) make sense of what just happened. Or rather: what is happening.</p>
<p>You’re about to write your first ERC proposal, and it’s going to take a bigger emotional toll than any writing you’ve done before. Yes, even though you love writing. Especially because you love writing.</p>
<p>Let me tell you what I wish I’d known.</p>
<section id="youre-going-to-forget-which-game-youre-playing" class="level2">
<h2 class="anchored" data-anchor-id="youre-going-to-forget-which-game-youre-playing">You’re Going to Forget Which Game You’re Playing</h2>
<p>Here’s what’s going to happen: You’ll get so caught up in what James Carse calls the “finite game” <span class="citation" data-cites="carse1986finite">(Carse 1986)</span> that you’ll forget about the infinite game entirely.</p>
<p>The finite game is winning the ERC. The infinite game? That’s the research itself—the questions that genuinely excite you, the work you’d want to do regardless of whether anyone gives you money for it.</p>
<p>Carse distinguishes between games we play to win (finite) and games we play to keep playing (infinite). Your proposal will be full of research you find genuinely important, exciting, and novel. But somewhere along the way, the narrative will become “winning the ERC is the only way to do this exciting work.”</p>
<p>That’s not true. You know that’s not true. But you’re going to forget it anyway.</p>
<p>And look—I know you’re hungry for a win right now. After the losses you’ve experienced recently (and we both know what I’m talking about), you need this. I get it. But being thirsty for a win makes it almost impossible to stay in the infinite game mindset.</p>
<p>Try anyway.</p>
</section>
<section id="someone-will-tell-you-youre-late-youre-not" class="level2">
<h2 class="anchored" data-anchor-id="someone-will-tell-you-youre-late-youre-not">Someone Will Tell You You’re Late (You’re Not)</h2>
<p>An advisor is going to tell you you’re late. This will happen when you have a 17-page draft completed, 1.5 months before the deadline.</p>
<p>You’re not late.</p>
<p>But their anxiety will become your anxiety, and you’ll carry that weight through the rest of the process. I wish I could tell you how to avoid internalizing this, but I can’t. Just… know that it’s happening. Maybe that awareness will help a little.</p>
</section>
<section id="youre-writing-for-everyone-which-means-no-one" class="level2">
<h2 class="anchored" data-anchor-id="youre-writing-for-everyone-which-means-no-one">You’re Writing for Everyone, Which Means No One</h2>
<p>You’re going to write a transdisciplinary proposal that fits into multiple panels. This will feel strategic. It’s actually a trap.</p>
<p>You can’t assume background knowledge in anything. Every discipline has its own language, its own implicit assumptions, its own way of framing problems. How do you write for everyone without writing for no one? How do you balance depth and accessibility when your readers might come from entirely different fields?</p>
<p>Maybe the answer is simpler than you think: assume less in general, but write it in an interesting way. Don’t dumb it down—just explain more. Make your enthusiasm for the ideas carry the reader through the explanations they need.</p>
<p>I still don’t know if you solved this problem. I hope the reviewers will tell us.</p>
</section>
<section id="the-structure-you-think-you-have" class="level2">
<h2 class="anchored" data-anchor-id="the-structure-you-think-you-have">The Structure You Think You Have</h2>
<p>You think you have good structure. You don’t.</p>
<p>What you need: the same subheadings for each Work Package. Visual emphasis—color, even. Clear blocks for deliverables and contingency planning. Make it obvious what you’ll produce and what you’ll do when (not if) things don’t go according to plan.</p>
<p>Here’s what structure is really about: convincing reviewers that you have a plan and that you’re worth the money. Every Work Package should make it crystal clear what you’re going to deliver, when you’re going to deliver it, and what could go wrong.</p>
<p>Also, try to write it so readers don’t need to read everything linearly. Each section should stand somewhat independently. Reviewers are busy. Give them permission to jump to what matters to them. They should be able to skip around and still understand that you know what you’re doing and that funding you makes sense.</p>
<p>You won’t get this right on the first try. Or the second. Keep iterating.</p>
</section>
<section id="on-intensity-and-timing" class="level2">
<h2 class="anchored" data-anchor-id="on-intensity-and-timing">On Intensity and Timing</h2>
<p>You’re going to wonder if you should start earlier. Don’t.</p>
<p>Work in waves. One full day exclusively on the proposal, then wait for new feedback or inspiration before opening the document again. This rhythm keeps the intensity without burning you out.</p>
<p>And yes, the intensity is necessary. Some things can’t be done slowly.</p>
</section>
<section id="lead-with-the-answer" class="level2">
<h2 class="anchored" data-anchor-id="lead-with-the-answer">Lead with the Answer</h2>
<p>You’re going to slowly—too slowly—move toward using what’s called the McKinsey Pyramid Principle <span class="citation" data-cites="minto2009pyramid">(Minto 2009)</span>. Stop resisting this. Just do it from the start.</p>
<p>Here’s the deal: Instead of building up to your point (how you naturally think), start with the answer and then provide supporting arguments and evidence. Barbara Minto developed this approach at McKinsey for communicating with busy executives—which is basically what grant reviewers are.</p>
<p>Why? Because reviewers are busy. They need to grasp your core contribution immediately. Lead with the answer. Tell them what you’re going to do and why it matters. Then—and only then—show them the supporting details.</p>
<p>You’re going to resist this because it feels unnatural, like you’re giving away the ending. Do it anyway.</p>
</section>
<section id="ai-is-helpful-but-forgetful" class="level2">
<h2 class="anchored" data-anchor-id="ai-is-helpful-but-forgetful">AI Is Helpful But Forgetful</h2>
<p>You’re going to use AI extensively:</p>
<ul>
<li>Prompting it to act as reviewers from various disciplines</li>
<li>Using Deep Research and platforms like FutureHouse for literature review</li>
<li>Getting help finding effective examples</li>
</ul>
<p>But here’s what you need to know: <strong>AI forgets things across long documents.</strong></p>
<p>When you’re working on a 15-20 page proposal, edits in one section require changes elsewhere. The introduction needs updating after you rework the objectives. A change in methodology impacts your timeline. Your risk mitigation needs to align with your deliverables.</p>
<p>AI doesn’t naturally track these dependencies. You become the keeper of the narrative arc, the one who remembers what you said ten pages ago. The iterative process of checking for consistency, updating multiple sections, ensuring coherence—that’s all you.</p>
<p>Is it worth using AI? Yes. But it’s not magic. It’s a tool that requires active management.</p>
</section>
<section id="give-yourself-permission-to-write-badly" class="level2">
<h2 class="anchored" data-anchor-id="give-yourself-permission-to-write-badly">Give Yourself Permission to Write Badly</h2>
<p>Anne Lamott writes in <em>Bird by Bird</em> that perfectionism is “the voice of the oppressor” and “the main obstacle between you and a shitty first draft.” <span class="citation" data-cites="lamott1994bird">(Lamott 1994)</span></p>
<p>Listen to her.</p>
<p>Don’t start in the official ERC template. Write one long messy document first. Give yourself permission to write badly, to explore, to ramble. When you share it for feedback, you can say: “This is not yet the ERC proposal.” That distinction matters. It gives both you and your readers permission to focus on the ideas rather than the polish.</p>
<p>Then, after feedback, move it to the template. Write Part B1 first, then B2. Share both. Iterate. Rewrite. Iterate again.</p>
<p>The messiness is part of the process, not a sign that you’re doing it wrong.</p>
</section>
<section id="that-harsh-feedback-you-need-it" class="level2">
<h2 class="anchored" data-anchor-id="that-harsh-feedback-you-need-it">That Harsh Feedback? You Need It</h2>
<p>Someone is going to destroy your Part B2. Just absolutely tear it apart.</p>
<p>It’s going to hurt. And then, surprisingly, it’s going to help enormously.</p>
<p>The harsh feedback will give you the push you need to attempt another complete rewrite when you’re feeling unmotivated and lacking confidence. Sometimes the feedback that stings the most is exactly what you need to hear.</p>
<p>Try to remember that when you’re in the moment of pain.</p>
</section>
<section id="what-youd-change-and-what-you-wouldnt" class="level2">
<h2 class="anchored" data-anchor-id="what-youd-change-and-what-you-wouldnt">What You’d Change (And What You Wouldn’t)</h2>
<p>You’re going to wish you’d converged earlier on which panel to target. But the story develops in the process of writing it. How can you know which panel you’re targeting before you know what story you’re telling? Maybe this is one of those things that can’t be optimized.</p>
<p>You’re also going to almost miss formal requirements—confirming your PhD defense date, for instance. Without institutional support, you would have completely overlooked these. Next time, make a checklist of procedural matters from the start.</p>
<p>But starting earlier? No.&nbsp;The intensity of the compressed timeline is necessary, not just stressful.</p>
</section>
<section id="what-this-is-really-about" class="level2">
<h2 class="anchored" data-anchor-id="what-this-is-really-about">What This Is Really About</h2>
<p>Here’s what you need to understand: Grant writing is as much about managing yourself—your anxiety, your relationship with success and failure, your writing process—as it is about the research.</p>
<p>Are you playing the finite game or the infinite game? Are you optimizing for winning, or for continuing to play?</p>
<p>The proposal you’re about to write is valuable beyond its immediate outcome. It’s a crystallization of your thinking, a test of your communication, a forcing function for clarity. Whether you win this particular finite game or not, the infinite game—your research, your contribution to the field, the questions that genuinely excite you—continues.</p>
<p>Writing the proposal will help you crystallize ideas. It will help you find people who are good sounding boards. It will give you clear signals that your writing still isn’t as clear as you think it is.</p>
<p>These things are valuable regardless of the outcome.</p>
</section>
<section id="a-final-note" class="level2">
<h2 class="anchored" data-anchor-id="a-final-note">A Final Note</h2>
<p>I don’t know yet what the outcome will be. I’m still skeptical about whether this proposal will “fly”—some ideas may require too much shared background knowledge that reviewers won’t have.</p>
<p>But I’m excited about the ideas regardless. We’ll find ways to pursue them no matter what happens.</p>
<p>Remember which game you’re really playing.</p>
<p>And now, I suppose, back to those revisions.</p>
<p>With solidarity and hope,<br>
Your slightly-less-young self</p>



</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-carse1986finite" class="csl-entry">
Carse, James P. 1986. <em>Finite and Infinite Games</em>. New York: Free Press.
</div>
<div id="ref-lamott1994bird" class="csl-entry">
Lamott, Anne. 1994. <em>Bird by Bird: Some Instructions on Writing and Life</em>. New York: Anchor Books.
</div>
<div id="ref-minto2009pyramid" class="csl-entry">
Minto, Barbara. 2009. <em>The Pyramid Principle: Logic in Writing and Thinking</em>. 3rd ed. Harlow, England: Pearson Education.
</div>
</div></section></div> ]]></description>
  <category>academia</category>
  <category>writing</category>
  <category>grants</category>
  <guid>https://kjablonka.com/blog/posts/writing_erc/</guid>
  <pubDate>Tue, 30 Sep 2025 22:00:00 GMT</pubDate>
</item>
<item>
  <title>All of AI is Context Engineering</title>
  <link>https://kjablonka.com/blog/posts/context/</link>
  <description><![CDATA[ 




<p>Recent discussions around context engineering (for LLMs) have exploded across the AI community.</p>
<p>Our work follows this trend—we optimize and understand context to improve agent performance. But zooming out reveals a deeper truth: context has always been the way to make things work.</p>
<section id="context-shapes-everything" class="level2">
<h2 class="anchored" data-anchor-id="context-shapes-everything">Context Shapes Everything</h2>
<p>Theodore Roszak constructed a thought experiment that illustrates this perfectly <span class="citation" data-cites="roszak1992">(Roszak 1992)</span>:</p>
<p>Imagine watching a skilled psychiatrist at work. His waiting room overflows with patients suffering various emotional and mental disorders—some nearly hysterical, others plagued by suicidal thoughts, hallucinations, nightmares, or paranoid delusions about being watched by people who will hurt them.</p>
<p>The psychiatrist listens attentively and tries his best to help, but without success. His patients worsen despite heroic efforts.</p>
<p>Now Roszak asks us to consider the larger context: The psychiatrist’s office sits in a building. The building sits in a place. That place is Buchenwald, and the patients are concentration camp prisoners.</p>
<p><strong>Context changes everything.</strong></p>
<p>This principle extends beyond clinical settings. We experience this daily in human interactions—we can only meaningfully connect with others when we understand their full context: their background, experiences, current circumstances, and unspoken assumptions. Without this context, even well-intentioned communication fails.</p>
</section>
<section id="the-systems-science-perspective" class="level2">
<h2 class="anchored" data-anchor-id="the-systems-science-perspective">The Systems Science Perspective</h2>
<p>Systems science teaches us that emergent properties arise from interactions between components, not from components themselves <span class="citation" data-cites="meadows2008">(Meadows 2008)</span>. A material’s performance emerges from atomic structure <em>within</em> its manufacturing context, operating environment, and lifecycle constraints.</p>
<p>Yet current AI for science systems remain myopically focused on small subsets, failing to integrate broader contextual factors. We optimize binding energies while ignoring synthesis routes. We predict properties while ignoring cost, scalability, or environmental impact.</p>
<p>Kenneth Stanley and Joel Lehman argue in “Greatness Cannot Be Planned” that optimizing for ambitious goals fails because the stepping stones to greatness are deceptive <span class="citation" data-cites="stanley2015">(Stanley and Lehman 2015)</span>.</p>
<p>In practice, designing materials with single goals that might seem sensible (e.g., optimize CO₂ binding energy) proves deceptive. Materials exist within complex systems, and myopically optimizing one metric prevents success.</p>
<p>Consider a battery material with perfect energy density that degrades after ten cycles, costs $10,000 per gram, or requires mining rare elements from conflict zones. The system context reveals why single-metric optimization fails.</p>
</section>
<section id="the-tacit-knowledge-gap" class="level2">
<h2 class="anchored" data-anchor-id="the-tacit-knowledge-gap">The Tacit Knowledge Gap</h2>
<p>This context problem connects to missing tacit knowledge in our AI systems.</p>
<p>Philosopher-chemist Michael Polanyi famously observed: “We know more than we can tell” <span class="citation" data-cites="polanyi1966">(Polanyi 1966)</span>. Much of this unverbalized knowledge drives scientific greatness.</p>
<p>Tacit knowledge helps scientists prune search spaces and recognize when experiments or spectra “don’t look right.” This intuition emerges from years of contextual experience—understanding how synthesis conditions affect structure, how processing history influences properties, how real-world operating conditions differ from laboratory ideals.</p>
<p>Current AI systems lack this systems-level understanding, focusing instead on isolated predictions divorced from broader context.</p>
</section>
<section id="moving-forward" class="level2">
<h2 class="anchored" data-anchor-id="moving-forward">Moving Forward</h2>
<p>Building AI systems that understand context requires integrating knowledge across scales, disciplines, and domains. We need systems that consider not just atomic structure but also synthesis pathways, processing conditions, operating environments, economic constraints, and sustainability impacts. They need to be able to gather this context by experiencing it.</p>
<p>The best materials aren’t just optimized—they’re appropriate for their contexts: manufacturability, cost, environmental impact, supply chains, regulatory approval, and real-world operating conditions.</p>
<p><strong>Context engineering isn’t just another AI technique—it’s the foundation of intelligent systems that work in the real world.</strong></p>
<p><strong>Just as meaningful human connection requires understanding full context, breakthrough AI for science demands systems-level thinking that integrates the complex, interconnected reality in which materials actually exist and function.</strong></p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-meadows2008" class="csl-entry">
Meadows, Donella H. 2008. <em>Thinking in Systems: A Primer</em>. White River Junction, VT: Chelsea Green Publishing.
</div>
<div id="ref-polanyi1966" class="csl-entry">
Polanyi, Michael. 1966. <em>The Tacit Dimension</em>. Chicago: University of Chicago Press.
</div>
<div id="ref-roszak1992" class="csl-entry">
Roszak, Theodore. 1992. <em>The Voice of the Earth: An Exploration of Ecopsychology</em>. New York: Simon &amp; Schuster.
</div>
<div id="ref-stanley2015" class="csl-entry">
Stanley, Kenneth O., and Joel Lehman. 2015. <em>Why Greatness Cannot Be Planned: The Myth of the Objective</em>. New York: Springer.
</div>
</div>


</section>

 ]]></description>
  <category>ai</category>
  <category>science</category>
  <guid>https://kjablonka.com/blog/posts/context/</guid>
  <pubDate>Mon, 04 Aug 2025 22:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/context" medium="image"/>
</item>
<item>
  <title>The Epistemic Risks of AI-Only Science</title>
  <link>https://kjablonka.com/blog/posts/ai_scientists/</link>
  <description><![CDATA[ 




<p>I work on machine learning for science, and I believe deeply in its transformative potential. From predicting protein structures <span class="citation" data-cites="Abramson_2024">(Abramson et al. 2024)</span> to modeling climate systems <span class="citation" data-cites="Bodnar_2025 Allen_2025">(Bodnar et al. 2025; Allen et al. 2025)</span>, AI is accelerating research in ways that seemed implausible just a few years ago. The potential is genuinely remarkable, and I’m not here to argue otherwise.</p>
<p>But there’s something troubling about the current trajectory toward fully autonomous “AI scientists”—systems designed to independently formulate hypotheses, design experiments, and draw conclusions. While these systems will certainly produce impressive results, their widespread adoption might risk something more subtle and perhaps more dangerous: the gradual erosion of epistemic diversity that makes science robust.</p>
<section id="on-value-lock-in-and-scientific-monocultures" class="level2">
<h2 class="anchored" data-anchor-id="on-value-lock-in-and-scientific-monocultures">On Value Lock-In and Scientific Monocultures</h2>
<p>Value lock-in describes what happens when systems become so entrenched that they perpetuate specific ways of thinking, making alternatives prohibitively difficult to pursue.<sup>1</sup></p>
<p>Science faces an analogous risk. Once AI scientists become the dominant research paradigm—and given their speed and cost advantages, this seems likely—they won’t simply reflect current scientific values. They’ll crystallize them. What appears today as one approach among many could become the only viable approach, not through conscious choice but through path dependence.</p>
<p>The concern isn’t that AI scientists will be poorly designed, but rather that they’ll be too successful at optimizing for current definitions of scientific progress. This creates the “lock-in effect”: the more we invest in AI-driven research infrastructure, the harder it becomes to pursue alternatives, regardless of their potential merit.</p>
</section>
<section id="the-myth-of-value-neutral-science" class="level2">
<h2 class="anchored" data-anchor-id="the-myth-of-value-neutral-science">The Myth of Value-Neutral Science</h2>
<p>Science has never been value-neutral, though we often pretend otherwise. Every research program embeds choices about what questions matter, what methods are legitimate, what constitutes adequate evidence <span class="citation" data-cites="Huff_2017 Bronowski_1961">(Huff 2017; Bronowski 1961)</span>.</p>
<p>Take computer vision’s relationship with ImageNet. This single benchmark, created by a particular research community with specific assumptions about visual recognition, shaped an entire field for over a decade. It privileged approaches that performed well on a large, static set of pre-labeled photographs <span class="citation" data-cites="dotan2019value0laden">(Dotan and Milli 2019)</span>.</p>
<p>Value-laden choices like these run along the entire chain of (AI) research <span class="citation" data-cites="dehghani2021benchmark hooker2020hardware alampara2025lessons">(Dehghani et al. 2021; Hooker 2020; Alampara, Schilling-Wilhelmi, and Jablonka 2025)</span>.</p>
</section>
<section id="concentration-and-its-discontents" class="level2">
<h2 class="anchored" data-anchor-id="concentration-and-its-discontents">Concentration and Its Discontents</h2>
<p>Contemporary AI scientists emerge from a remarkably homogeneous ecosystem: similar academic backgrounds, shared technical assumptions, a handful of dominant companies providing the underlying infrastructure. This concentration creates troubling dynamics <span class="citation" data-cites="paul2019disastrous crawford2021atlas">(Paul 2019; Crawford 2021)</span>.</p>
<p>First, epistemic monoculture becomes increasingly likely. When AI systems trained on similar data with comparable objectives dominate research, they’ll systematically favor certain types of questions. Approaches that don’t translate well into current AI paradigms—perhaps because they rely on tacit knowledge, or require forms of reasoning that resist formalization—risk being dismissed as unscientific rather than simply incompatible with our current tools.</p>
<p>Second, we face the prospect of algorithmic gatekeeping. As AI scientists become more productive, human researchers will encounter mounting pressure to adopt these tools or become irrelevant (with “current science” surviving only as a form of “art”). A small number of AI platforms could effectively determine which ideas get explored and which get ignored—not through explicit censorship, but through the subtler mechanism of making alternatives economically unviable. Power tends to concentrate, and scientific institutions aren’t immune to this tendency.</p>
</section>
<section id="normal-science-and-revolutionary-potential" class="level2">
<h2 class="anchored" data-anchor-id="normal-science-and-revolutionary-potential">Normal Science and Revolutionary Potential</h2>
<p>According to Thomas Kuhn, science alternates between periods of “normal science”—where researchers work productively within established paradigms—and revolutionary episodes that fundamentally reframe entire fields <span class="citation" data-cites="kuhn1962structure">(Kuhn 1962)</span>. Crucially, Kuhn observed that normal science “often suppresses fundamental novelties because those novelties are necessarily subversive of its basic commitments.”</p>
<p>AI scientists, optimized on existing scientific literature, might end up being just sophisticated normal science machines. They could excel at incremental advances within current paradigms but struggle with the radical reconceptualization that drives scientific revolutions. This is because current machine learning systems cannot easily question their foundational assumptions: those assumptions are embedded in their training data and optimization targets. It’s an inevitable consequence of how these systems work: they are trained to maximize the likelihood of the training data. This suggests that a science dominated by AI scientists might become extraordinarily good at certain kinds of progress while systematically failing at others.</p>
</section>
<section id="toward-epistemic-pluralism" class="level2">
<h2 class="anchored" data-anchor-id="toward-epistemic-pluralism">Toward Epistemic Pluralism</h2>
<p>None of this constitutes an argument against AI in science, which would be both futile and counterproductive. Rather, it’s a case for what we might call epistemic pluralism: the deliberate maintenance of diverse approaches to scientific inquiry.</p>
<p>We need something like a portfolio approach to scientific methodology. Some research should leverage AI scientists for their remarkable speed and scale. Other investigations should preserve space for human-led inquiry. Still other work should explore hybrid approaches that combine artificial and human intelligence in novel configurations.</p>
<p>Monocultures are efficient under stable conditions but catastrophically vulnerable to unexpected challenges. Diverse ecosystems sacrifice some efficiency for resilience. Given the stakes involved in scientific knowledge production, resilience seems worth prioritizing.</p>
<p>The robustness of scientific knowledge depends not on any single approach, however sophisticated, but on the productive tension between multiple ways of understanding the world. Preserving that tension, even as AI transforms scientific practice, may be one of the most important challenges facing the scientific community today.</p>
<hr>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-Abramson_2024" class="csl-entry">
Abramson, Josh, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, et al. 2024. <span>“Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3.”</span> <em>Nature</em> 630 (8016): 493–500. <a href="https://doi.org/10.1038/s41586-024-07487-w">https://doi.org/10.1038/s41586-024-07487-w</a>.
</div>
<div id="ref-alampara2025lessons" class="csl-entry">
Alampara, Nawaf, Mara Schilling-Wilhelmi, and Kevin Maik Jablonka. 2025. <span>“Lessons from the Trenches on Evaluating Machine-Learning Systems in Materials Science.”</span> <em>arXiv Preprint arXiv:2503.10837</em>.
</div>
<div id="ref-Allen_2025" class="csl-entry">
Allen, Anna, Stratis Markou, Will Tebbutt, James Requeima, Wessel P. Bruinsma, Tom R. Andersson, Michael Herzog, et al. 2025. <span>“End-to-End Data-Driven Weather Prediction.”</span> <em>Nature</em> 641 (8065): 1172–79. <a href="https://doi.org/10.1038/s41586-025-08897-0">https://doi.org/10.1038/s41586-025-08897-0</a>.
</div>
<div id="ref-Bodnar_2025" class="csl-entry">
Bodnar, Cristian, Wessel P. Bruinsma, Ana Lucic, Megan Stanley, Anna Allen, Johannes Brandstetter, Patrick Garvan, et al. 2025. <span>“A Foundation Model for the Earth System.”</span> <em>Nature</em> 641 (8065): 1180–87. <a href="https://doi.org/10.1038/s41586-025-09005-y">https://doi.org/10.1038/s41586-025-09005-y</a>.
</div>
<div id="ref-Bronowski_1961" class="csl-entry">
Bronowski, Jacob. 1961. <em>Science and Human Values</em>. ISSR Library. London: Hutchinson.
</div>
<div id="ref-crawford2021atlas" class="csl-entry">
Crawford, Kate. 2021. <em>Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence</em>. New Haven, CT: Yale University Press. <a href="https://yalebooks.yale.edu/book/9780300209570/atlas-of-ai/">https://yalebooks.yale.edu/book/9780300209570/atlas-of-ai/</a>.
</div>
<div id="ref-dehghani2021benchmark" class="csl-entry">
Dehghani, Mostafa, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, and Oriol Vinyals. 2021. <span>“The Benchmark Lottery.”</span> <em>arXiv Preprint arXiv: 2107.07002</em>.
</div>
<div id="ref-dotan2019value0laden" class="csl-entry">
Dotan, Ravit, and Smitha Milli. 2019. <span>“Value-Laden Disciplinary Shifts in Machine Learning.”</span> <em>FAT*</em>. <a href="https://doi.org/10.1145/3351095.3373157">https://doi.org/10.1145/3351095.3373157</a>.
</div>
<div id="ref-hooker2020hardware" class="csl-entry">
Hooker, Sara. 2020. <span>“The Hardware Lottery.”</span> <em>Communications of the ACM</em>. <a href="https://doi.org/10.1145/3467017">https://doi.org/10.1145/3467017</a>.
</div>
<div id="ref-Huff_2017" class="csl-entry">
Huff, Toby E. 2017. <em>The Rise of Early Modern Science</em>. 1st ed. Cambridge: Cambridge University Press.
</div>
<div id="ref-kuhn1962structure" class="csl-entry">
Kuhn, Thomas S. 1962. <em>The Structure of Scientific Revolutions</em>. University of Chicago Press.
</div>
<div id="ref-macaskill2022what" class="csl-entry">
MacAskill, William. 2022. <em>What We Owe the Future</em>. New York: Basic Books.
</div>
<div id="ref-paul2019disastrous" class="csl-entry">
Paul, Kari. 2019. <span>“<span>‘Disastrous’</span> Lack of Diversity in AI Industry Perpetuates Bias, Study Finds.”</span> <em>The Guardian</em>. <a href="https://www.theguardian.com/technology/2019/apr/16/artificial-intelligence-lack-diversity-new-york-university-study">https://www.theguardian.com/technology/2019/apr/16/artificial-intelligence-lack-diversity-new-york-university-study</a>.
</div>
</div>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Consider American urban development: what began as apparently rational choices about highways and suburbs eventually made walkable neighborhoods economically unfeasible to build. The infrastructure didn’t merely reflect certain values—it enforced them, long after those values might have been questioned. Similarly, the economic investments in American slavery created powerful incentives to maintain the system—once millions of dollars were invested in enslaved people as “property,” with entire industries and political systems built around protecting those investments, the institution became extremely resistant to change even as moral opposition grew. <span class="citation" data-cites="macaskill2022what">(MacAskill 2022)</span>↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>ai</category>
  <category>science</category>
  <guid>https://kjablonka.com/blog/posts/ai_scientists/</guid>
  <pubDate>Sat, 31 May 2025 22:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/ai_scientists" medium="image"/>
</item>
<item>
  <title>The Optimal Amount of Wasted Research Funding Is Non-Zero</title>
  <link>https://kjablonka.com/blog/posts/ai4science_european_academia/</link>
  <description><![CDATA[ 




<p>Europe’s universities are caught between measured caution and the breakneck pace of AI-driven discovery. We prize rigorous scholarship and public service, yet solving climate crises or designing new materials demands speed, scale—and yes, daring. This means accepting that some portion of research funding will be wasted—or even misused—but the optimal amount of “waste” is non‑zero: enough risk to ignite breakthroughs, without unraveling public trust <span class="citation" data-cites="Klein2025-rf">(Klein and Thompson 2025)</span>.</p>
<section id="counting-beans-vs.-cultivating-breakthroughs" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="counting-beans-vs.-cultivating-breakthroughs">Counting Beans vs.&nbsp;Cultivating Breakthroughs</h2>
<p>Imagine a director of university research budgets aiming for zero waste. Every euro precisely tracked, every grant stringently justified. At first glance, it sounds prudent—until innovation grinds to a halt. PIs slice projects into the smallest publishable nibble, chase citations instead of ideas, and hide minor shortcuts under the rug. Edwards and Roy call this the <a href="https://blog.regehr.org/archives/632">perverse incentive</a>: metrics as targets erode integrity and stifle creativity <span class="citation" data-cites="EdwardsRoy2017">(Edwards and Roy 2017)</span>.</p>
<div class="page-columns page-full"><p>In AI4Science bureaucracy does more than nudge corners: it constrains bright minds to granular targets and routine forms. We select researchers through <a href="https://de.wikipedia.org/wiki/Artikel_33_des_Grundgesetzes_f%C3%BCr_die_Bundesrepublik_Deutschland#Prinzip_der_Bestenauslese,_Art._33_Absatz_2_GG">“Bestenauslese,”</a>  celebrating the “highest achievers”, then tie them to Excel sheets and compliance checklists. Such perverse incentives shrink ambitions and discourage moonshots. Yet true leaps require room to experiment: some ideas will flop, data pipelines will fail, and yes, a few grants will yield nothing of note. That’s not waste alone; it’s the price of possibility.</p><div class="no-row-height column-margin column-container"><span class="margin-aside">“Bestenauslese” (“selection of the best”) is Germany’s rigorous (ostensibly) merit‑based academic selection process for professors, involving multiple stages of peer review, public lectures, and politiekorale vetting to appoint only the top candidates. Ironically, those once hailed as the nation’s brightest are then often constrained by procedural minutiae that discourage bold thinking.</span></div></div>
</section>
<section id="a-fraudinspired-analogy" class="level2">
<h2 class="anchored" data-anchor-id="a-fraudinspired-analogy">A Fraud‑Inspired Analogy</h2>
<p><a href="https://www.bitsaboutmoney.com/archive/optimal-amount-of-fraud/">In credit‑card fraud, businesses calculate an acceptable fraud rate—say</a>, 0.5% of transactions—because the cost of preventing every single fraudulent swipe would choke off legitimate commerce. They bake that “waste” into budgets, balancing losses against user friction. Too little fraud tolerance, and customers face endless identity checks; too much, and bad actors thrive.</p>
<p>Similarly, Europe’s research ecosystem must decide its fraud‑rate equivalent: how many dead‑end experiments, unused datasets or stalled hires do we permit to enable the rest to flourish? The answer is not zero.</p>
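<p>To make the analogy concrete, here is a toy expected-value calculation (all payoff numbers below are invented for illustration, not empirical estimates): if stricter screening also filters out high-risk, high-payoff projects, the portfolio-maximizing waste rate is strictly positive.</p>

```python
# Toy model with invented numbers: stricter screening (a lower tolerated
# waste rate) also screens out risky "moonshot" projects, so the expected
# value of a grant portfolio peaks at a non-zero waste rate.

def portfolio_value(waste_tolerance, n_grants=100, grant_size=1.0):
    # The share of moonshots we can fund grows with the tolerated waste rate.
    moonshot_share = min(2 * waste_tolerance, 0.5)
    safe_share = 1.0 - moonshot_share
    # Safe projects pay off modestly and reliably ...
    expected_safe = safe_share * n_grants * grant_size * 1.1
    # ... moonshots rarely (5% of the time), but hugely (50x).
    expected_moonshot = moonshot_share * n_grants * grant_size * 0.05 * 50
    # Outright waste scales with the tolerance we grant.
    waste = waste_tolerance * n_grants * grant_size
    return expected_safe + expected_moonshot - waste

values = {w: portfolio_value(w) for w in (0.0, 0.1, 0.25, 0.5)}
best = max(values, key=values.get)  # an interior optimum, not zero
```

<p>Under these made-up parameters the best tolerated waste rate is neither zero (which forfeits every moonshot) nor maximal (which pays for waste without funding more moonshots); the exact optimum is an artifact of the numbers, but the non-zero shape of the curve is the point.</p>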
</section>
<section id="building-the-right-ecosystem" class="level2">
<h2 class="anchored" data-anchor-id="building-the-right-ecosystem">Building the Right Ecosystem</h2>
<ul>
<li><p><em>New Data Organizations.</em> Establish mission‑driven entities whose sole purpose is to create and share scientific data at minimal cost per replicable datapoint. Fund them to accept researcher proposals, execute experiments—robotic or computational—and retain public rights so that every dataset becomes a reusable building block. Those organizations would also be best placed to organize competitions such as CASP to measure the real-world impact of AI innovations.</p></li>
<li><p><em>Data as Public Good.</em> Complement these organizations with micro‑grants for labs and individual researchers to curate and submit annotated datasets—including negative results—to a pan‑European repository.</p></li>
<li><p><em>Engineering Partnerships and Product Teams.</em> Embed research software engineers and product managers in academic groups to build, maintain and ship AI tools, applications and data products. Treat code libraries and computational platforms as first‑class research outputs, fostering shared solutions rather than isolated prototypes.</p></li>
</ul>
</section>
<section id="catalyzing-realworld-impact" class="level2">
<h2 class="anchored" data-anchor-id="catalyzing-realworld-impact">Catalyzing Real‑World Impact</h2>
<p>Only by stepping beyond the lab walls—talking with clinicians, industry users, policy makers and citizens—can AI4Science tackle system‑level problems and rediscover its path toward truth. As Daniel Sarewitz argues <span class="citation" data-cites="Sarewitz2016">(Sarewitz 2016)</span>, science must shed its ivory‑tower aloofness, embrace accountability, and co‑create solutions with the communities it aims to serve.</p>
<p>Imagine an “Academic Free Zone” pilot in which select institutions are empowered to hire swiftly, manage their own budgets, and report not on form‑counts but on real societal outcomes. Such “bubbles of exploration”, zones where shared optimism, dedicated infrastructure, and a tolerable failure rate yield transformative breakthroughs, might reflect the best of positive bubble dynamics <span class="citation" data-cites="Sargeant2025">(Sargeant 2025)</span>.</p>
<p>Rather than endless approvals, researchers would report progress toward tangible public benefits. In this way, we treat researchers as accountable professionals and align incentives with Europe’s mission: ensuring that AI4Science delivers societal value, not just publication counts.</p>
</section>
<section id="personal-reflections" class="level2">
<h2 class="anchored" data-anchor-id="personal-reflections">Personal Reflections</h2>
<p>I often ask myself: “Where can my AI4Science efforts matter most?” I want a skill set like mine to remain in public service: I worry about centralization of power in AI <span class="citation" data-cites="Harari2018">(Harari 2018)</span>. Yet I share colleagues’ frustration at bureaucratic inertia: a promising algorithm may sit unused for years behind grant cycles and compliance checks. If Europe is serious about impact, we must dismantle these barriers.</p>
</section>
<section id="trustdriven-transformation" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="trustdriven-transformation">Trust‑Driven Transformation</h2>
<div class="page-columns page-full"><p>Europe’s strength is freedom <span class="citation" data-cites="Charlemagne2025">(Charlemagne 2025)</span>. By shifting incentives from bean‑counting to value‑driven autonomy, investing boldly in shared data and infrastructure, and tolerating a non‑zero rate of “waste,” we can lead the AI for Science revolution on our own terms.</p><div class="no-row-height column-margin column-container"><span class="margin-aside">In the words of the Economist’s Charlemange: “But in their own plodding way, Europeans have created a place where they are guaranteed rights to what others yearn for: life, liberty, and the pursuit of happiness.” <span class="citation" data-cites="Charlemagne2025">(Charlemagne 2025)</span></span></div></div>


<!-- -->


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-Charlemagne2025" class="csl-entry">
Charlemagne. 2025. <span>“The Thing about Europe: It’s the Actual Land of the Free Now.”</span> <em>The Economist</em>, April. <a href="https://www.economist.com/europe/2025/04/10/the-thing-about-europe-its-the-actual-land-of-the-free-now">https://www.economist.com/europe/2025/04/10/the-thing-about-europe-its-the-actual-land-of-the-free-now</a>.
</div>
<div id="ref-EdwardsRoy2017" class="csl-entry">
Edwards, Marc A., and Siddhartha Roy. 2017. <span>“Academic Research in the 21st Century: Maintaining Scientific Integrity in a Climate of Perverse Incentives and Hypercompetition.”</span> <em>Environmental Engineering Science</em> 34 (1): 51–61. <a href="https://doi.org/10.1089/ees.2016.0223">https://doi.org/10.1089/ees.2016.0223</a>.
</div>
<div id="ref-Harari2018" class="csl-entry">
Harari, Yuval Noah. 2018. <span>“Why Technology Favors Tyranny.”</span> <em>Foreign Affairs</em>, October.
</div>
<div id="ref-Klein2025-rf" class="csl-entry">
Klein, Ezra, and Derek Thompson. 2025. <em>Abundance</em>. Reno, NV: Simon &amp; Schuster.
</div>
<div id="ref-Sarewitz2016" class="csl-entry">
Sarewitz, Daniel. 2016. <span>“Saving Science.”</span> <em>The New Atlantis</em> Spring/Summer: 6–41.
</div>
<div id="ref-Sargeant2025" class="csl-entry">
Sargeant, Leah Libresco. 2025. <span>“Are We Under‐bubbled?”</span> <em>The New Atlantis</em> Spring: 118–22.
</div>
</div></section></div> ]]></description>
  <category>academia</category>
  <category>Europe</category>
  <guid>https://kjablonka.com/blog/posts/ai4science_european_academia/</guid>
  <pubDate>Thu, 24 Apr 2025 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Autoencoders as Digital Archaeologists for Spectroscopic Data</title>
  <link>https://kjablonka.com/blog/posts/autencoder_spectroscopy/</link>
  <description><![CDATA[ 




<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Picture this: An archaeologist stands at a dig site, surrounded by layers of earth that haven’t seen sunlight since dinosaurs were a hot new trend. With painstaking care, they brush away dirt and sediment, revealing pottery shards, and that one graduate student who fell asleep on the job.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/archeology_process.png" class="img-fluid figure-img"></p>
<figcaption>The process of archeology</figcaption>
</figure>
</div>
<p>Now imagine replacing dirt with noise, shards with molecular signatures, and the reconstructed vase with a clean spectrum. Welcome to the world of spectroscopic data analysis using autoencoders: where we excavate molecular treasures from layers of noise and complexity through what the fancy folks call “unsupervised representation learning” (which is “teaching computers to find patterns without telling them what patterns to find”).</p>
<p>Every spectroscopic measurement is like an archaeological dig, except instead of finding ancient coins, we’re finding molecular transitions between energy states.</p>
<p>But just as ancient artifacts come to us covered in dirt and damaged by time (and occasionally by that one archaeologist who thought dynamite was a good excavation tool), our spectroscopic data arrives buried under multiple layers of contamination, such as:</p>
<ul>
<li><em>noise</em> due to random fluctuations (“electrons having a dance party”)</li>
<li><em>environmental interference</em>: for example, water vapor and CO₂ absorption bands</li>
<li><em>instrumental artifacts</em>: baseline drift (“detector getting tired”)</li>
<li><em>physical degradation</em>: sample fluorescence and aging effects</li>
</ul>
<p>Traditional smoothing techniques are the equivalent of using a bulldozer to dust off a delicate vase. Sure, you’ll remove the dirt, but you might also remove, well, everything else. One use case of autoencoders is to do this in a better way.</p>
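<p>To see the difference on a toy example, here is a minimal sketch on synthetic data (a <em>linear</em> autoencoder, which reduces to PCA, with made-up peak shapes and noise levels rather than real spectra): projecting noisy spectra onto a learned low-dimensional basis removes more noise, and destroys less peak, than a wide moving average.</p>

```python
import numpy as np

# Synthetic "spectra": sharp Gaussian peaks at varying positions, plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)

def spectrum(center):
    # One sharp Gaussian "molecular" peak.
    return np.exp(-((x - center) ** 2) / (2 * 0.01**2))

centers = rng.uniform(0.2, 0.8, 500)
clean = np.stack([spectrum(c) for c in centers])
noisy = clean + rng.normal(0.0, 0.1, clean.shape)

# Linear autoencoder == PCA. Encode: project onto the top-k principal
# directions (the "excavation"); decode: project back (the "reconstruction").
mean = noisy.mean(axis=0)
_, _, Vt = np.linalg.svd(noisy - mean, full_matrices=False)
k = 48
denoised = (noisy - mean) @ Vt[:k].T @ Vt[:k] + mean

# The bulldozer: a wide moving average flattens the peak along with the noise.
kernel = np.ones(21) / 21
smoothed = np.apply_along_axis(
    lambda s: np.convolve(s, kernel, mode="same"), 1, noisy
)

def mse(a):
    return float(np.mean((a - clean) ** 2))
```

<p>On this toy dataset the learned basis keeps the sharp peaks (they are exactly what varies across the training set) while discarding directions dominated by noise, so <code>mse(denoised)</code> comes out well below both <code>mse(noisy)</code> and <code>mse(smoothed)</code>; the nonlinear encoder in the post plays the same role with a far more flexible basis.</p>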
</section>
<section id="the-digital-archaeologists-toolkit" class="level2">
<h2 class="anchored" data-anchor-id="the-digital-archaeologists-toolkit">The Digital Archaeologist’s Toolkit</h2>
<p>In our metaphor, an autoencoder is like a three-phase archaeological expedition.</p>
<section id="phase-1-the-excavation-encoding" class="level3">
<h3 class="anchored" data-anchor-id="phase-1-the-excavation-encoding">Phase 1: The Excavation (Encoding)</h3>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/excavation_comp.png" class="img-fluid figure-img"></p>
<figcaption>Comparison of archeological and spectroscopic excavations.</figcaption>
</figure>
</div>
<p>The first phase in our archeology mission begins with cleaning the artifacts we find: brushing away the sand and sediment, removing the unnecessary. Similarly, our spectroscopic application starts by removing the unnecessary and “compressing” the spectrum to its essentials.</p>
<p>In the simplest form this can be done using a sequence of linear layers:</p>
<div id="eef31d52" class="cell" data-execution_count="2">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> ExcavationBrush(nn.Module):</span>
<span id="cb1-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, spectral_channels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, artifact_dimensions<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>):</span>
<span id="cb1-3">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb1-4"></span>
<span id="cb1-5">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.compressor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb1-6">            nn.Linear(spectral_channels, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb1-7">            nn.ReLU(),</span>
<span id="cb1-8">            nn.BatchNorm1d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb1-9">            nn.Dropout(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>),</span>
<span id="cb1-10">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb1-11">            nn.ReLU(),</span>
<span id="cb1-12">            nn.BatchNorm1d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb1-13">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, artifact_dimensions)</span>
<span id="cb1-14">        )</span>
<span id="cb1-15"></span>
<span id="cb1-16">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, buried_spectrum):</span>
<span id="cb1-17">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.compressor(buried_spectrum)</span></code></pre></div></div>
</div>
<p>The encoder takes our high-dimensional spectrum and compresses it into something more manageable. But this isn’t a simple compression algorithm: the model learns which features matter most, like an experienced archaeologist who can tell the difference between “priceless artifact” and “rock that looks vaguely interesting.”</p>
</section>
<section id="phase-2-the-latent-space" class="level3">
<h3 class="anchored" data-anchor-id="phase-2-the-latent-space">Phase 2: The latent space</h3>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/latent_space_comp.png" class="img-fluid figure-img"></p>
<figcaption>The archeological vs.&nbsp;spectroscopic latent space</figcaption>
</figure>
</div>
<p>The latent space is our archaeological museum’s storage room—not the fancy public galleries with mood lighting and gift shops, but the back room where things actually get done. Here, each spectrum becomes a neat little index card with just the essential information.</p>
<div id="e4c0cc74" class="cell" data-execution_count="3">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># In our latent space, each spectrum becomes coordinates on an ancient map</span></span>
<span id="cb2-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># It's like Google Maps, but for molecules</span></span>
<span id="cb2-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> encode_spectrum(encoder, noisy_spectrum):</span>
<span id="cb2-4">    latent_artifacts <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> encoder(noisy_spectrum)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Returns a point in hyperspace</span></span>
<span id="cb2-5">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> latent_artifacts</span></code></pre></div></div>
</div>
<p>But this is no ordinary storage room. It’s a magical space where similar artifacts naturally cluster together, like teenagers at a high school cafeteria. Polymers hang out in their valley, ceramics claim the mountain peaks, and metal oxides spread across their plains like they own the place.</p>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center collapsed" data-bs-toggle="collapse" data-bs-target=".callout-1-contents" aria-controls="callout-1" aria-expanded="false" aria-label="Toggle callout">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>The Math Behind Clustering
</div>
<div class="callout-btn-toggle d-inline-block border-0 py-1 ps-1 pe-0 float-end"><i class="callout-toggle"></i></div>
</div>
<div id="callout-1" class="callout-1-contents callout-collapse collapse">
<div class="callout-body-container callout-body">
<p>The clustering emerges from what we call the <a href="http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/">manifold hypothesis</a>—the idea that high-dimensional data actually lives on a lower-dimensional surface.</p>
<p>Mathematically, our encoder learns a mapping: <img src="https://latex.codecogs.com/png.latex?%0Af_%5Cphi:%20%5Cmathcal%7BX%7D%20%5Crightarrow%20%5Cmathcal%7BZ%7D%0A"> Where <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BX%7D%20%5Csubset%20%5Cmathbb%7BR%7D%5En"> is where our data lives (the messy real world) and <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BZ%7D%20%5Csubset%20%5Cmathbb%7BR%7D%5Em"> is our nice, clean latent space. This mapping preserves important properties:</p>
<p><strong>Distance preservation</strong>: Similar inputs map to nearby points:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Ad_%7B%5Cmathcal%7BZ%7D%7D(f_%5Cphi(x_i),%20f_%5Cphi(x_j))%20%5Capprox%20d_%7B%5Cmathcal%7BX%7D%7D(x_i,%20x_j)%0A"></p>
<p><strong>Continuity</strong>: Small changes in input create small changes in output (the Lipschitz condition):</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5C%7Cf_%5Cphi(x_1)%20-%20f_%5Cphi(x_2)%5C%7C%20%5Cleq%20L%5C%7Cx_1%20-%20x_2%5C%7C%0A"></p>
<p>So materials with similar spectra end up as neighbors in latent space, forming these natural clusters. It’s like chemical social networking!</p>
</div>
</div>
</div>
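<p>As a sanity check on the distance-preservation idea, here is a small sketch (not the trained encoder, just a random linear map on made-up data) showing that even a random projection to a low-dimensional space roughly preserves pairwise distances, à la Johnson–Lindenstrauss:</p>

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# 100 fake "spectra" with 1000 channels each (purely illustrative numbers)
X = rng.normal(size=(100, 1000))

# Random linear map to 32 dimensions, scaled so squared distances
# are preserved in expectation (Johnson-Lindenstrauss style)
W = rng.normal(size=(1000, 32)) / np.sqrt(32)
Z = X @ W

# Ratios of latent-space to input-space pairwise distances cluster around 1
ratios = pdist(Z) / pdist(X)
```

<p>A trained encoder can do better than this on the actual data manifold, since it spends its capacity on the directions that really vary.</p>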
</section>
<section id="phase-3-bringing-it-back-together-reconstruction" class="level3">
<h3 class="anchored" data-anchor-id="phase-3-bringing-it-back-together-reconstruction">Phase 3: Bringing it back together (Reconstruction)</h3>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/reconstruction.png" class="img-fluid figure-img"></p>
<figcaption>Archaeological vs.&nbsp;spectroscopic reconstruction.</figcaption>
</figure>
</div>
<p>Using only our compressed representation (those index cards), we attempt to reconstruct the original spectrum. It’s like trying to rebuild a dinosaur from a few bones and a lot of imagination, except our imagination is constrained by mathematics.</p>
<div id="346bbd50" class="cell" data-execution_count="4">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> Reconstructor(nn.Module):</span>
<span id="cb3-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, artifact_dimensions<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, spectral_channels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>):</span>
<span id="cb3-3">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb3-4">        </span>
<span id="cb3-5">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.reconstruction_process <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb3-6">            nn.Linear(artifact_dimensions, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb3-7">            nn.ReLU(),</span>
<span id="cb3-8">            nn.BatchNorm1d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb3-9">            </span>
<span id="cb3-10">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb3-11">            nn.ReLU(),</span>
<span id="cb3-12">            nn.BatchNorm1d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb3-13">            </span>
<span id="cb3-14">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, spectral_channels),</span>
<span id="cb3-15">            nn.Sigmoid() </span>
<span id="cb3-16">        )</span>
<span id="cb3-17">    </span>
<span id="cb3-18">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, artifact_description):</span>
<span id="cb3-19">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.reconstruction_process(artifact_description)</span></code></pre></div></div>
</div>
<p>The decoder takes our compressed representation and attempts to rebuild the original spectrum. If we’ve done our job right (and haven’t accidentally trained our network to just output pictures of cats), the reconstruction should be faithful to the original.</p>
</section>
</section>
<section id="the-mathematics-of-archaeological-documentation" class="level2">
<h2 class="anchored" data-anchor-id="the-mathematics-of-archaeological-documentation">The Mathematics of Archaeological Documentation</h2>
<p>Just as physical conservation laws govern the preservation of matter and energy (thanks, Emmy Noether!), information theory dictates how we can compress and reconstruct data without turning it into digital gibberish.</p>
<p>The fundamental equation governing our autoencoder is the reconstruction loss:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D_%7B%5Ctext%7Breconstruction%7D%7D%20=%20%5C%7Cx%20-%20%5Chat%7Bx%7D%5C%7C%5E2%0A"></p>
<p>Where <img src="https://latex.codecogs.com/png.latex?x"> is our original spectrum (the truth, the whole truth, and nothing but the truth) and <img src="https://latex.codecogs.com/png.latex?%5Chat%7Bx%7D"> is our reconstruction.</p>
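<p>In code, this loss is one line of PyTorch (the shapes below are made up to match the examples above):</p>

```python
import torch
import torch.nn.functional as F

x = torch.rand(8, 1000)      # batch of original spectra
x_hat = torch.rand(8, 1000)  # batch of reconstructions

# Mean squared error, averaged over batch and spectral channels
loss = F.mse_loss(x_hat, x)
```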
<div class="callout callout-style-default callout-tip callout-titled">
<div class="callout-header d-flex align-content-center collapsed" data-bs-toggle="collapse" data-bs-target=".callout-2-contents" aria-controls="callout-2" aria-expanded="false" aria-label="Toggle callout">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Tip</span>Why MSE Makes Statistical Sense
</div>
<div class="callout-btn-toggle d-inline-block border-0 py-1 ps-1 pe-0 float-end"><i class="callout-toggle"></i></div>
</div>
<div id="callout-2" class="callout-2-contents callout-collapse collapse">
<div class="callout-body-container callout-body">
<p>Let me tell you a tale about why MSE and Gaussian noise are BFFs.</p>
<p>If we assume our noise is Gaussian with mean 0 and variance <img src="https://latex.codecogs.com/png.latex?%5Csigma%5E2">:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Ap(x%7Cz)%20=%20%5Cmathcal%7BN%7D(x;%20f_%5Ctheta(z),%20%5Csigma%5E2I)%0A"></p>
<p>The likelihood for a single data point becomes: <img src="https://latex.codecogs.com/png.latex?%0Ap(x%7Cz)%20=%20%5Cfrac%7B1%7D%7B(2%5Cpi%5Csigma%5E2)%5E%7Bn/2%7D%7D%20%5Cexp%5Cleft(-%5Cfrac%7B%5C%7Cx%20-%20f_%5Ctheta(z)%5C%7C%5E2%7D%7B2%5Csigma%5E2%7D%5Cright)%0A"></p>
<p>Taking the negative log-likelihood:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A-%5Clog%20p(x%7Cz)%20=%20%5Cfrac%7Bn%7D%7B2%7D%5Clog(2%5Cpi%5Csigma%5E2)%20+%20%5Cfrac%7B%5C%7Cx%20-%20f_%5Ctheta(z)%5C%7C%5E2%7D%7B2%5Csigma%5E2%7D%0A"></p>
<p>Since the first term is constant w.r.t. θ (our parameters), minimizing negative log-likelihood is equivalent to minimizing:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7B%5C%7Cx%20-%20f_%5Ctheta(z)%5C%7C%5E2%7D%7B2%5Csigma%5E2%7D%0A"></p>
<p>Which is just MSE in a fancy hat! So when you use MSE loss, you’re implicitly assuming Gaussian noise.</p>
</div>
</div>
</div>
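<p>You can verify the bookkeeping numerically: the negative log-likelihood is a constant (in θ) plus the squared error scaled by 1/(2σ²). All numbers below are made up:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)          # observed spectrum
f_theta_z = rng.normal(size=n)  # decoder output f_theta(z)
sigma2 = 0.25                   # assumed noise variance

# Negative log-likelihood of x under N(f_theta(z), sigma2 * I)
nll = 0.5 * n * np.log(2 * np.pi * sigma2) \
    + ((x - f_theta_z) ** 2).sum() / (2 * sigma2)

# Summing per-channel Gaussian log-densities gives the same number
log_p = -0.5 * np.log(2 * np.pi * sigma2) - (x - f_theta_z) ** 2 / (2 * sigma2)

# The theta-dependent part is just a scaled MSE
mse = ((x - f_theta_z) ** 2).mean()
```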
<p>But we can be fancier with a composite loss function:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D_%7B%5Ctext%7Btotal%7D%7D%20=%20%5Cunderbrace%7B%5C%7Cx%20-%20%5Chat%7Bx%7D%5C%7C%5E2%7D_%7B%5Ctext%7BBe%20accurate%7D%7D%20+%20%5Clambda_1%20%5Cunderbrace%7B%5C%7C%5Cnabla%20x%20-%20%5Cnabla%20%5Chat%7Bx%7D%5C%7C%5E2%7D_%7B%5Ctext%7BBe%20smooth%7D%7D%20+%20%5Clambda_2%20%5Cunderbrace%7B%5Csum_%7Bp%20%5Cin%20%5Ctext%7Bpeaks%7D%7D%20%7Cx_p%20-%20%5Chat%7Bx_p%7D%7C%7D_%7B%5Ctext%7BDon't%20mess%20up%20the%20peaks%7D%7D%20+%20%5Clambda_3%20%5Cunderbrace%7B%5Cmathcal%7BR%7D(%5Cphi,%20%5Ctheta)%7D_%7B%5Ctext%7BDon't%20go%20crazy%7D%7D%0A"></p>
<p>Each term has a job:</p>
<ul>
<li>Fidelity term: “Make it look like the original”</li>
<li>Gradient penalty: “Keep it smooth, no sudden jumps”</li>
<li>Feature preservation: “Those peaks are important, don’t lose them!”</li>
<li>Regularization: “Stay humble, don’t overfit”</li>
</ul>
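<p>A minimal PyTorch sketch of this composite loss (the function name, the peak indices, and the λ values are illustrative, not from a real training run):</p>

```python
import torch
import torch.nn as nn

def composite_loss(x, x_hat, peak_idx, model, lam1=0.1, lam2=0.1, lam3=1e-4):
    fidelity = ((x - x_hat) ** 2).mean()                        # be accurate
    smooth = ((torch.diff(x, dim=-1)
               - torch.diff(x_hat, dim=-1)) ** 2).mean()        # be smooth
    peaks = (x[:, peak_idx] - x_hat[:, peak_idx]).abs().mean()  # keep the peaks
    reg = sum((p ** 2).sum() for p in model.parameters())       # stay humble
    return fidelity + lam1 * smooth + lam2 * peaks + lam3 * reg

model = nn.Linear(1000, 1000)  # stand-in for a real decoder
x = torch.rand(4, 1000)
loss = composite_loss(x, model(x), peak_idx=[100, 420, 700], model=model)
```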
</section>
<section id="the-manifold-hypothesis-why-this-archaeological-dig-makes-sense-at-all" class="level2">
<h2 class="anchored" data-anchor-id="the-manifold-hypothesis-why-this-archaeological-dig-makes-sense-at-all">The Manifold Hypothesis: Why This Archaeological Dig Makes Sense At All</h2>
<p>Let’s address a fundamental question: why should this even work? Shouldn’t compressing our beautiful high-dimensional spectrum lose valuable information? Welcome to the manifold hypothesis, the reason dimensionality reduction isn’t just mathematical vandalism.</p>
<p>The manifold hypothesis suggests that high-dimensional data (like our spectroscopic signals) aren’t actually using all those dimensions effectively. Instead, the data lies on or near a lower-dimensional surface (a manifold) embedded in that high-dimensional space. It’s like discovering that what looks like a complex 3D sculpture is actually just a cleverly folded 2D sheet of paper.</p>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center collapsed" data-bs-toggle="collapse" data-bs-target=".callout-3-contents" aria-controls="callout-3" aria-expanded="false" aria-label="Toggle callout">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>Why spectroscopic data probably lives on a manifold
</div>
<div class="callout-btn-toggle d-inline-block border-0 py-1 ps-1 pe-0 float-end"><i class="callout-toggle"></i></div>
</div>
<div id="callout-3" class="callout-3-contents callout-collapse collapse">
<div class="callout-body-container callout-body">
<p>Spectroscopic data is fundamentally constrained by:</p>
<ul>
<li><strong>Physics</strong>: Certain combinations of absorption bands are physically impossible due to quantum mechanical selection rules. You can’t just have arbitrary patterns of peaks!</li>
<li><strong>Chemistry</strong>: Molecular structures create specific patterns of vibrations, rotations, and electronic transitions. A carbonyl group will always give you that telltale peak around 1700 cm⁻¹ in IR spectroscopy. The space of possible chemicals is itself constrained (you cannot combine all atoms in all possible ways).</li>
<li><strong>Instrumental limitations</strong>: Your spectrometer has a specific resolution and response function, further constraining the space of possible measurements.</li>
</ul>
<p>These constraints mean that despite having thousands of wavelength points, your spectrum is likely determined by a much smaller number of underlying variables—chemical compositions, molecular structures, temperature, etc.</p>
<p>Mathematically, if your spectral data points <img src="https://latex.codecogs.com/png.latex?%5C%7Bx_1,%20x_2,%20%5Cdots,%20x_n%5C%7D%20%5Cin%20%5Cmathbb%7BR%7D%5Ed"> (where d might be thousands of wavelengths), they likely lie on or near a <img src="https://latex.codecogs.com/png.latex?k">-dimensional manifold <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BM%7D%20%5Csubset%20%5Cmathbb%7BR%7D%5Ed"> where <img src="https://latex.codecogs.com/png.latex?k%5Cll%20d">.</p>
<p>The goal of our autoencoder is to learn this manifold—the archaeological site map, if you will.</p>
</div>
</div>
</div>
<p>To visualize this, imagine our spectra are actually faces of ancient masks (stay with me here). Each mask has thousands of pixels (dimensions), but you could describe any mask with far fewer parameters: eye size, mouth width, nose shape, etc. That’s your manifold! Autoencoders discover these “facial features” of spectra automatically. (You might be familiar with <a href="https://en.wikipedia.org/wiki/Eigenface">eigenfaces</a>, which are “basis vectors” of human faces one can derive with PCA.)</p>
<div id="cell-fig-manifold" class="cell" data-execution_count="5">
<div class="cell-output cell-output-display">
<div id="fig-manifold" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-manifold-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/index_files/figure-html/fig-manifold-output-1.png" width="919" height="471" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-manifold-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;1: Illustration of how data might lie on a lower-dimensional manifold in a higher-dimensional space.
</figcaption>
</figure>
</div>
</div>
</div>
</section>
<section id="from-classical-to-neural-the-connection-between-pca-and-linear-autoencoders" class="level2">
<h2 class="anchored" data-anchor-id="from-classical-to-neural-the-connection-between-pca-and-linear-autoencoders">From Classical to Neural: The Connection Between PCA and Linear Autoencoders</h2>
<p>Long before neural networks were cool, archaeologists (well, statisticians) had their own dimensionality reduction technique: Principal Component Analysis, or PCA.</p>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center collapsed" data-bs-toggle="collapse" data-bs-target=".callout-4-contents" aria-controls="callout-4" aria-expanded="false" aria-label="Toggle callout">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>The Mathematical Connection Between PCA and Linear Autoencoders
</div>
<div class="callout-btn-toggle d-inline-block border-0 py-1 ps-1 pe-0 float-end"><i class="callout-toggle"></i></div>
</div>
<div id="callout-4" class="callout-4-contents callout-collapse collapse">
<div class="callout-body-container callout-body">
<p>Let’s consider a linear autoencoder with:</p>
<ul>
<li>Input dimension <img src="https://latex.codecogs.com/png.latex?d"></li>
<li>Latent dimension <img src="https://latex.codecogs.com/png.latex?k"> (where <img src="https://latex.codecogs.com/png.latex?k%20%3C%20d">)</li>
<li>Encoder weight matrix <img src="https://latex.codecogs.com/png.latex?W_1%20%5Cin%20%20%5Cmathbb%7BR%7D%5E%7Bk%20%5Ctimes%20d%7D"></li>
<li>Decoder weight matrix <img src="https://latex.codecogs.com/png.latex?W_2%20%5Cin%20%20%5Cmathbb%7BR%7D%5E%7Bd%20%5Ctimes%20k%7D"></li>
<li>No biases or activation functions</li>
</ul>
<p>For an input <img src="https://latex.codecogs.com/png.latex?x%20%5Cin%20%5Cmathbb%7BR%7D%5Ed">, the encoding and reconstruction process is:</p>
<ol type="1">
<li>Encode: <img src="https://latex.codecogs.com/png.latex?z%20=%20W_1%20x"> (where <img src="https://latex.codecogs.com/png.latex?z%20%5Cin%20%5Cmathbb%7BR%7D%5Ek">)</li>
<li>Decode: <img src="https://latex.codecogs.com/png.latex?%5Chat%7Bx%7D%20=%20W_2%20z%20=%20W_2W_1x"></li>
</ol>
<p>The reconstruction error we minimize is: <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D%20=%20%5C%7Cx%20-%20%5Chat%7Bx%7D%5C%7C%5E2%20=%20%5C%7Cx%20-%20W_2W_1x%5C%7C%5E2%0A"></p>
<p>Under the constraint that <img src="https://latex.codecogs.com/png.latex?W_1"> and <img src="https://latex.codecogs.com/png.latex?W_2"> minimize this reconstruction error, the optimal solution has the following properties:</p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?W_2%20=%20W_1%5ET"> (the decoder is the transpose of the encoder)</li>
<li>The rows of <img src="https://latex.codecogs.com/png.latex?W_1"> are the first k principal components of the data</li>
</ul>
<p>To see why, let’s decompose our data matrix <img src="https://latex.codecogs.com/png.latex?X"> using SVD: <img src="https://latex.codecogs.com/png.latex?%0AX%20=%20U%5CSigma%20V%5ET%0A"></p>
<p>Where:</p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?U"> contains the left singular vectors</li>
<li><img src="https://latex.codecogs.com/png.latex?%5CSigma"> contains the singular values on its diagonal</li>
<li><img src="https://latex.codecogs.com/png.latex?V%5ET"> contains the right singular vectors</li>
</ul>
<p>The optimal linear projection to <img src="https://latex.codecogs.com/png.latex?k"> dimensions is given by:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AW_1%20=%20U_k%5ET%0A"></p>
<p>Where <img src="https://latex.codecogs.com/png.latex?U_k"> contains the first <img src="https://latex.codecogs.com/png.latex?k"> columns of <img src="https://latex.codecogs.com/png.latex?U"> (corresponding to the <img src="https://latex.codecogs.com/png.latex?k"> largest singular values).</p>
<p>And the optimal reconstruction matrix is: <img src="https://latex.codecogs.com/png.latex?%0AW_2%20=%20U_k%0A"></p>
<p>Which is exactly <img src="https://latex.codecogs.com/png.latex?W_1%5ET">.</p>
<p>Therefore, our reconstructed data is: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7BX%7D%20=%20W_2W_1X%20=%20U_kU_k%5ETX%0A"></p>
<p>Which is precisely the reconstruction you’d get from projecting X onto the first k principal components and back.</p>
<p>This means our linear autoencoder will learn the same subspace as PCA, just with more computational effort and the possibility of getting stuck in local minima. It’s like taking a road trip to your neighbor’s house—you’ll get there, but was the scenic route necessary?</p>
</div>
</div>
</div>
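<p>This equivalence is easy to check numerically. The sketch below uses sklearn’s rows-are-samples convention, so the top-k right singular vectors of the centered data matrix play the role of U_k from the column convention in the callout:</p>

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))   # rows are samples
Xc = X - X.mean(axis=0)          # PCA works on centered data
k = 3

# SVD route: project onto the span of the top-k singular vectors
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_svd = Xc @ Vt[:k].T @ Vt[:k]

# sklearn route: compress to k components and reconstruct
pca = PCA(n_components=k)
X_pca = pca.inverse_transform(pca.fit_transform(X)) - X.mean(axis=0)
```

<p>Both routes produce the same reconstruction, and a linear autoencoder trained to convergence should end up spanning this same subspace.</p>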
<section id="a-practical-example-finding-the-redundant-dimension" class="level3">
<h3 class="anchored" data-anchor-id="a-practical-example-finding-the-redundant-dimension">A Practical Example: Finding the Redundant Dimension</h3>
<p>Let’s make this concrete with an example. Imagine we have a spectrum where two neighboring wavelengths always vary together—perhaps due to a broad absorption band or some instrumental correlation.</p>
<div id="2cc21370" class="cell" data-execution_count="6">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a dataset with a redundant dimension</span></span>
<span id="cb4-2"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> create_redundant_spectrum(num_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>):</span>
<span id="cb4-3">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Independent features</span></span>
<span id="cb4-4">    independent_features <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.randn(num_samples, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb4-5">    </span>
<span id="cb4-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a 5D spectrum where dimensions 2 and 3 are correlated</span></span>
<span id="cb4-7">    spectra <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros((num_samples, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb4-8">    spectra[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Independent</span></span>
<span id="cb4-9">    spectra[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Independent</span></span>
<span id="cb4-10">    spectra[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Independent</span></span>
<span id="cb4-11">    spectra[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.random.randn(num_samples)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Correlated with dim 2</span></span>
<span id="cb4-12">    spectra[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> independent_features[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Another linear combination</span></span>
<span id="cb4-13">    </span>
<span id="cb4-14">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> spectra</span>
<span id="cb4-15"></span>
<span id="cb4-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a linear autoencoder</span></span>
<span id="cb4-17"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> LinearAutoencoder(nn.Module):</span>
<span id="cb4-18">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>, latent_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>):</span>
<span id="cb4-19">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb4-20">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(input_dim, latent_dim, bias<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># No bias</span></span>
<span id="cb4-21">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(latent_dim, input_dim, bias<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># No bias</span></span>
<span id="cb4-22">    </span>
<span id="cb4-23">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, x):</span>
<span id="cb4-24">        latent <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder(x)</span>
<span id="cb4-25">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder(latent)</span>
<span id="cb4-26">    </span>
<span id="cb4-27">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> tie_weights(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb4-28">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This enforces W_2 = W_1^T </span></span>
<span id="cb4-29">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder.weight.data.t()</span></code></pre></div></div>
</div>
<p>When we train this model, it should identify dimension 3 as redundant (it is nearly a copy of dimension 2) and dimension 4 as a linear combination of dimensions 0 and 1. A 3-dimensional latent space therefore captures essentially all the variance in the 5-dimensional input.</p>
<div id="cell-fig-pca" class="cell" data-execution_count="7">
<details class="code-fold">
<summary>Code</summary>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Generate redundant spectrum</span></span>
<span id="cb5-2">spectra <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> create_redundant_spectrum()</span>
<span id="cb5-3"></span>
<span id="cb5-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Apply PCA</span></span>
<span id="cb5-5">pca <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PCA(n_components<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Get all components to see variance</span></span>
<span id="cb5-6">spectra_reduced <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pca.fit_transform(spectra)</span>
<span id="cb5-7"></span>
<span id="cb5-8">pca_three <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PCA(n_components<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb5-9">spectra_reduced_three <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pca_three.fit_transform(spectra)</span>
<span id="cb5-10">spectra_reconstructed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pca_three.inverse_transform(spectra_reduced_three)</span>
<span id="cb5-11"></span>
<span id="cb5-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate reconstruction error</span></span>
<span id="cb5-13">reconstruction_error <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean((spectra <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> spectra_reconstructed) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-14"></span>
<span id="cb5-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Plot the explained variance</span></span>
<span id="cb5-16">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb5-17">plt.bar(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>), pca.explained_variance_ratio_)</span>
<span id="cb5-18">plt.xlabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Principal Component"</span>)</span>
<span id="cb5-19">plt.ylabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Explained Variance Ratio"</span>)</span>
<span id="cb5-20">plt.title(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"PCA Explained Variance (Reconstruction Error: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>reconstruction_error<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.6f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">)"</span>)</span>
<span id="cb5-21">plt.xticks(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb5-22">plt.ylim(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb5-23">plt.tight_layout()</span>
<span id="cb5-24">plt.show()</span></code></pre></div></div>
</details>
<div class="cell-output cell-output-stderr">
<pre><code>/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:152: RuntimeWarning:

divide by zero encountered in matmul

/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:152: RuntimeWarning:

overflow encountered in matmul

/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:152: RuntimeWarning:

invalid value encountered in matmul

/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:205: RuntimeWarning:

divide by zero encountered in matmul

/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:205: RuntimeWarning:

overflow encountered in matmul

/opt/miniconda3/lib/python3.13/site-packages/sklearn/decomposition/_base.py:205: RuntimeWarning:

invalid value encountered in matmul
</code></pre>
</div>
<div class="cell-output cell-output-display">
<div id="fig-pca" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-pca-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/index_files/figure-html/fig-pca-output-2.png" width="950" height="470" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-pca-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;2: PCA analysis of the redundant spectrum data, showing the explained variance ratio.
</figcaption>
</figure>
</div>
</div>
</div>
<p>In the real world, spectroscopic data often has many such redundancies. Neighboring wavelengths are correlated, certain patterns of peaks occur together, and baseline effects introduce further correlations. These redundancies are exactly what autoencoders exploit—the manifold structure of our data.</p>
<p>The difference is that nonlinear autoencoders can capture more complex manifolds that PCA misses. It’s like upgrading from a 2D map to a 3D hologram of your archaeological site.</p>
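To make this concrete, here is a minimal sketch (not from the original post; dataset choices and variable names are illustrative) showing that a 2-component PCA reconstructs data lying on a flat plane almost perfectly, but loses information on a curved swiss-roll manifold of the same intrinsic dimension:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA

# Curved manifold: a swiss roll (intrinsically 2D, embedded in 3D)
X_curved, _ = make_swiss_roll(n_samples=2000, random_state=0)

# Flat manifold: a 2D plane embedded in 3D
rng = np.random.default_rng(0)
X_flat = rng.normal(size=(2000, 2)) @ rng.normal(size=(2, 3))

def pca_recon_error(X, n_components=2):
    """Mean squared error after projecting to n_components and back."""
    pca = PCA(n_components=n_components)
    X_rec = pca.inverse_transform(pca.fit_transform(X))
    return np.mean((X - X_rec) ** 2)

err_flat = pca_recon_error(X_flat)      # essentially zero: PCA nails flat data
err_curved = pca_recon_error(X_curved)  # large: the curvature is lost
print(err_flat, err_curved)
```

Both datasets are intrinsically two-dimensional, yet only the flat one is recoverable with two linear components; a nonlinear encoder can "unroll" the curved one.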
</section>
</section>
<section id="beyond-linear-maps-where-neural-networks-actually-shine" class="level2">
<h2 class="anchored" data-anchor-id="beyond-linear-maps-where-neural-networks-actually-shine">Beyond Linear Maps: Where Neural Networks Actually Shine</h2>
<p>Now that we’ve seen that linear autoencoders are just PCA in disguise, let’s talk about why we still bother with neural networks.</p>
<p>The magic happens when we add nonlinearities: those lovely activation functions like ReLU, sigmoid, or tanh. These allow autoencoders to learn complex, curved manifolds that PCA could never dream of capturing.</p>
<div id="78003da1" class="cell" data-execution_count="8">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> NonlinearArchaeologist(nn.Module):</span>
<span id="cb7-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, latent_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>):</span>
<span id="cb7-3">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb7-4">        </span>
<span id="cb7-5">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Now with extra nonlinear goodness!</span></span>
<span id="cb7-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb7-7">            nn.Linear(input_dim, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb7-8">            nn.ReLU(),  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This is where the magic happens</span></span>
<span id="cb7-9">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb7-10">            nn.ReLU(),  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># More magic!</span></span>
<span id="cb7-11">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, latent_dim)</span>
<span id="cb7-12">        )</span>
<span id="cb7-13">        </span>
<span id="cb7-14">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb7-15">            nn.Linear(latent_dim, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb7-16">            nn.ReLU(),</span>
<span id="cb7-17">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb7-18">            nn.ReLU(),</span>
<span id="cb7-19">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, input_dim)</span>
<span id="cb7-20">        )</span></code></pre></div></div>
</div>
<div class="callout callout-style-default callout-note callout-titled">
<div class="callout-header d-flex align-content-center collapsed" data-bs-toggle="collapse" data-bs-target=".callout-5-contents" aria-controls="callout-5" aria-expanded="false" aria-label="Toggle callout">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Note</span>The Power of Nonlinearity
</div>
<div class="callout-btn-toggle d-inline-block border-0 py-1 ps-1 pe-0 float-end"><i class="callout-toggle"></i></div>
</div>
<div id="callout-5" class="callout-5-contents callout-collapse collapse">
<div class="callout-body-container callout-body">
<p>Consider a simple nonlinear manifold: data points lying on a curved surface, like a <a href="https://en.wikipedia.org/wiki/Swiss_roll">swiss roll</a> or a spiral. Linear methods like PCA can only find a flat subspace that minimizes the average distance to all points.</p>
<p>But with nonlinear transformations, we can “unroll” or “straighten” the manifold.</p>
<p>For autoencoders, this means:</p>
<ul>
<li>The encoder can learn a function <img src="https://latex.codecogs.com/png.latex?f:%20%5Cmathbb%7BR%7D%5Ed%20%5Cto%20%5Cmathbb%7BR%7D%5Em"> that maps the curved manifold to a flat latent space</li>
<li>The decoder learns the inverse mapping <img src="https://latex.codecogs.com/png.latex?g:%20%5Cmathbb%7BR%7D%5Em%20%5Cto%20%5Cmathbb%7BR%7D%5Ed"> to bring it back</li>
</ul>
<p>The nonlinear functions effectively learn to “straighten” the manifold in latent space, making it more amenable to analysis and visualization.</p>
<p>It’s like being able to translate an ancient text written on a curved vase simply by “unwrapping” it digitally!</p>
</div>
</div>
</div>
<section id="the-nonlinear-archaeologists-advantage" class="level3">
<h3 class="anchored" data-anchor-id="the-nonlinear-archaeologists-advantage">The Nonlinear Archaeologist’s Advantage</h3>
<p>Imagine two archaeological sites with similar artifacts. A traditional archaeologist might classify them identically based on simple metrics. But our advanced neural archaeologist notices subtle nonlinear patterns.</p>
<p>Similarly, nonlinear autoencoders can distinguish between spectral patterns that would be indistinguishable to linear methods. They can capture:</p>
<ul>
<li><strong>Peak shifting</strong> - When peaks move slightly based on local environment</li>
<li><strong>Multiplicative interactions</strong> - When components don’t just add linearly</li>
<li><strong>Complex baselines</strong> - When background signals have complicated, nonlinear forms</li>
</ul>
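The peak-shifting point is easy to demonstrate with a toy example (illustrative, not from the original post): spectra generated by a single Gaussian peak whose center is controlled by one latent parameter still require many linear components, because a shifting peak is a nonlinear function of its position:

```python
import numpy as np
from sklearn.decomposition import PCA

# "Peak shifting" toy data: one Gaussian peak whose center moves with a
# single latent parameter -- a one-dimensional manifold in 200-dim space.
wavelengths = np.linspace(0.0, 1.0, 200)
centers = np.linspace(0.2, 0.8, 500)  # the single underlying degree of freedom
spectra = np.exp(-((wavelengths[None, :] - centers[:, None]) ** 2) / (2 * 0.02**2))

pca = PCA().fit(spectra)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_linear = int(np.searchsorted(cumulative, 0.99)) + 1  # components for 99% variance

# Despite one degree of freedom, PCA needs many components here.
print(n_linear)
```

A nonlinear encoder could, in principle, compress this dataset down to a single coordinate (the peak position), which is exactly the advantage described above.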
<p>This is why, despite the elegance and interpretability of PCA, we still train these complex nonlinear beasts for real spectroscopic data. The archaeology of molecules is rarely a linear affair!</p>
</section>
</section>
<section id="the-probabilistic-excavation-variational-autoencoders" class="level2">
<h2 class="anchored" data-anchor-id="the-probabilistic-excavation-variational-autoencoders">The Probabilistic Excavation: Variational Autoencoders</h2>
<p>What if our archaeologist isn’t completely certain about what they’ve found? Enter <a href="https://jaan.io/what-is-variational-autoencoder-vae-tutorial/">the Variational Autoencoder (VAE)</a>—the probabilistic archaeologist who deals in uncertainties rather than absolutes.</p>
<div id="069ae57d" class="cell" data-execution_count="9">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> ProbabilisticArchaeologist(nn.Module):</span>
<span id="cb8-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, latent_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>):</span>
<span id="cb8-3">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb8-4">        </span>
<span id="cb8-5">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Encoder produces distribution parameters</span></span>
<span id="cb8-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder_base <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb8-7">            nn.Linear(input_dim, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb8-8">            nn.ReLU(),</span>
<span id="cb8-9">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb8-10">            nn.ReLU()</span>
<span id="cb8-11">        )</span>
<span id="cb8-12">        </span>
<span id="cb8-13">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Two outputs: mean and log-variance</span></span>
<span id="cb8-14">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc_mu <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, latent_dim)</span>
<span id="cb8-15">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc_logvar <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, latent_dim)</span>
<span id="cb8-16">        </span>
<span id="cb8-17">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Decoder reconstructs from samples</span></span>
<span id="cb8-18">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Sequential(</span>
<span id="cb8-19">            nn.Linear(latent_dim, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>),</span>
<span id="cb8-20">            nn.ReLU(),</span>
<span id="cb8-21">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">256</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>),</span>
<span id="cb8-22">            nn.ReLU(),</span>
<span id="cb8-23">            nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">512</span>, input_dim)</span>
<span id="cb8-24">        )</span>
<span id="cb8-25">    </span>
<span id="cb8-26">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> encode(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, x):</span>
<span id="cb8-27">        h <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encoder_base(x)</span>
<span id="cb8-28">        mu <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc_mu(h)         <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># "I think the artifact is here"</span></span>
<span id="cb8-29">        logvar <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc_logvar(h)  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># "But I could be wrong by this much"</span></span>
<span id="cb8-30">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> mu, logvar</span>
<span id="cb8-31">    </span>
<span id="cb8-32">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> reparameterize(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, mu, logvar):</span>
<span id="cb8-33">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The famous reparameterization trick</span></span>
<span id="cb8-34">        std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.exp(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> logvar)</span>
<span id="cb8-35">        eps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.randn_like(std)</span>
<span id="cb8-36">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> mu <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> eps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> std</span>
<span id="cb8-37">    </span>
<span id="cb8-38">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, x):</span>
<span id="cb8-39">        mu, logvar <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.encode(x)</span>
<span id="cb8-40">        z <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.reparameterize(mu, logvar)</span>
<span id="cb8-41">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.decoder(z), mu, logvar</span></code></pre></div></div>
</div>
<section id="manifold-cartography-the-kl-divergence-as-map-making" class="level3">
<h3 class="anchored" data-anchor-id="manifold-cartography-the-kl-divergence-as-map-making">Manifold Cartography: The KL Divergence as Map-Making</h3>
<p>Here’s where the VAE truly shines: it doesn’t just learn the manifold, it learns a <em>probabilistic</em> manifold with a well-behaved coordinate system. The VAE loss function has two terms:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cmathcal%7BL%7D_%7B%5Ctext%7BVAE%7D%7D%20=%20%5Cunderbrace%7B%5Cmathbb%7BE%7D_%7Bq_%5Cphi(z%7Cx)%7D%5B%5Clog%20p_%5Ctheta(x%7Cz)%5D%7D_%7B%5Ctext%7BReconstruction:%20Make%20it%20look%20right%7D%7D%20-%20%5Cunderbrace%7BD_%7B%5Ctext%7BKL%7D%7D(q_%5Cphi(z%7Cx)%20%5C%7C%20p(z))%7D_%7B%5Ctext%7BKL%20divergence:%20Keep%20it%20reasonable%7D%7D%0A"></p>
<p>The first part is our familiar reconstruction loss - “make the reconstruction look like the input.”</p>
<p>The second part is the Kullback-Leibler divergence, which measures how much our learned distribution <img src="https://latex.codecogs.com/png.latex?q_%5Cphi(z%7Cx)"> differs from a prior distribution <img src="https://latex.codecogs.com/png.latex?p(z)"> (typically a standard normal distribution).</p>
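For a diagonal Gaussian posterior and a standard normal prior, this divergence has a convenient closed form, which is what the loss implementation below computes term by term:

```latex
D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\middle\|\, p(z)\right)
  = -\frac{1}{2} \sum_{j=1}^{m} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)
```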
<div class="callout callout-style-default callout-important callout-titled">
<div class="callout-header d-flex align-content-center">
<div class="callout-icon-container">
<i class="callout-icon"></i>
</div>
<div class="callout-title-container flex-fill">
<span class="screen-reader-only">Important</span>Why the KL Term Matters for the Manifold
</div>
</div>
<div class="callout-body-container callout-body">
<p>The KL divergence term in VAEs serves multiple crucial purposes that make it perfect for learning manifolds:</p>
<ol type="1">
<li><p><strong>It creates a continuous latent space</strong>: By encouraging overlap between the distributions of similar data points, the KL term ensures that nearby points in input space map to overlapping regions in latent space. This creates a smooth manifold where interpolation makes sense.</p></li>
<li><p><strong>It regularizes the coordinate system</strong>: Without the KL term, the autoencoder could learn any arbitrary mapping that preserves information. The KL term acts like a cartographer imposing a standard coordinate system on a newly discovered land.</p></li>
<li><p><strong>It enables generative sampling</strong>: By forcing the aggregate posterior to match the prior distribution, we can sample from the prior and generate new data points that lie on the learned manifold - essentially “discovering” new artifacts that could plausibly exist.</p></li>
<li><p><strong>It prevents overfitting</strong>: The KL term acts as a complexity penalty that prevents the model from learning an overly complex mapping that might not generalize well.</p></li>
</ol>
</div>
</div>
<p>When applied to spectroscopic data, this is particularly powerful because:</p>
<ol type="1">
<li>We can generate new realistic spectra by sampling from the latent space</li>
<li>We can perform meaningful interpolation between spectra</li>
<li>We can quantify uncertainty in our representations</li>
</ol>
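Point 1 is worth a quick sketch. Once a VAE is trained, generating new spectra is just sampling the prior and decoding; the decoder here is an untrained stand-in with the same shape as the model above (a real run would use the trained model's decoder):

```python
import torch
import torch.nn as nn

# Stand-in decoder mirroring the VAE decoder above (untrained, for shape only)
decoder = nn.Sequential(
    nn.Linear(32, 256), nn.ReLU(),
    nn.Linear(256, 512), nn.ReLU(),
    nn.Linear(512, 1000),
)

with torch.no_grad():
    z = torch.randn(5, 32)    # 5 samples from the standard normal prior p(z)
    new_spectra = decoder(z)  # decoded into 5 candidate spectra

print(new_spectra.shape)  # torch.Size([5, 1000])
```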
<div id="00cab27b" class="cell" data-execution_count="10">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> vae_loss(reconstruction, x, mu, logvar, beta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.0</span>):</span>
<span id="cb9-2">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""Calculate the VAE loss with reconstruction and KL terms"""</span></span>
<span id="cb9-3">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Reconstruction loss (how well does the output match the input?)</span></span>
<span id="cb9-4">    recon_loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> F.mse_loss(reconstruction, x, reduction<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'sum'</span>)</span>
<span id="cb9-5">    </span>
<span id="cb9-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># KL divergence (how much does our distribution differ from the prior?)</span></span>
<span id="cb9-7">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># For the standard normal prior, this has a nice closed form</span></span>
<span id="cb9-8">    kl_loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> logvar <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> mu.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">pow</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> logvar.exp())</span>
<span id="cb9-9">    </span>
<span id="cb9-10">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Total loss with β weighting</span></span>
<span id="cb9-11">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> recon_loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> kl_loss</span></code></pre></div></div>
</div>
<p>By adjusting the β parameter, we can control the trade-off between reconstruction quality and the “niceness” of our latent space. Higher β values force the latent space to be more like a standard normal distribution, while lower values prioritize reconstruction accuracy.</p>
<p>This gives us a powerful tool for exploring the manifold of spectroscopic data - not just finding it, but mapping it in a way that makes it useful for generation, interpolation, and understanding the underlying physical parameters.</p>
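Interpolation, for instance, amounts to walking a straight line in latent space. A minimal sketch (`z_a` and `z_b` are placeholders for the latent means the encoder would produce for two real spectra):

```python
import torch

# Stand-ins for the latent means of two encoded spectra
z_a = torch.randn(32)
z_b = torch.randn(32)

steps = torch.linspace(0.0, 1.0, 7).unsqueeze(1)  # 7 interpolation steps
z_path = (1 - steps) * z_a + steps * z_b          # straight line in latent space

# Decoding each row with a trained decoder yields a smooth morph between
# the two spectra -- something interpolating raw intensities cannot give.
print(z_path.shape)  # torch.Size([7, 32])
```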
</section>
</section>
<section id="conclusion-the-journey-continues" class="level2">
<h2 class="anchored" data-anchor-id="conclusion-the-journey-continues">Conclusion: The Journey Continues</h2>
<p>Our archaeological expedition through the world of autoencoders has revealed powerful tools for uncovering the hidden structure in spectroscopic data. We’ve seen how:</p>
<ol type="1">
<li>Linear autoencoders connect to classical methods like PCA</li>
<li>Nonlinear autoencoders can capture complex manifold structures</li>
<li>Variational autoencoders add a probabilistic perspective that enables generation and interpolation</li>
</ol>
<p>Just as archaeologists piece together ancient civilizations from fragments, we can piece together the underlying molecular and material properties from noisy, complex spectral data.</p>
<p>And just like archaeology, the field continues to evolve with new techniques and approaches. From graph neural networks to attention mechanisms to diffusion models, the tools for spectroscopic data analysis keep getting more sophisticated - allowing us to uncover ever more subtle patterns and relationships in our molecular artifacts.</p>
<p>So grab your digital trowel and start digging!</p>
<div id="cell-fig-vae" class="cell" data-execution_count="11">
<div class="cell-output cell-output-display">
<div id="fig-vae" class="quarto-float quarto-figure quarto-figure-center anchored">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-vae-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://kjablonka.com/blog/posts/autencoder_spectroscopy/index_files/figure-html/fig-vae-output-1.png" width="1142" height="566" class="figure-img">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-vae-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;3: Visualization of latent space sampling in a VAE, showing how we can generate new spectra.
</figcaption>
</figure>
</div>
</div>
</div>
<p>You can find a short lecture on this on <a href="https://youtu.be/fibGQX3nlM0?si=cWmN3VQnBLtEu5j2">YouTube</a>. <iframe width="560" height="315" src="https://www.youtube.com/embed/fibGQX3nlM0?si=sXc7Ne5f7mBSMITn" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe></p>



</section>

 ]]></description>
  <category>machine-learning</category>
  <category>teaching</category>
  <guid>https://kjablonka.com/blog/posts/autencoder_spectroscopy/</guid>
  <pubDate>Tue, 22 Apr 2025 22:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/autencoder_spectroscopy/cover.png" medium="image" type="image/png" height="96" width="144"/>
</item>
<item>
  <title>Notes on My Peer Review Process: An Invitation to Compare Practices</title>
  <link>https://kjablonka.com/blog/posts/reviewing/</link>
  <description><![CDATA[ 




<section id="how-i-approach-peer-review" class="level2">
<h2 class="anchored" data-anchor-id="how-i-approach-peer-review">How I Approach Peer Review</h2>
<p>Peer review is something most of us learn by doing, with little formal training. After stumbling through my early reviews, I’ve gradually developed practices that work for me. I’m sharing them here not as a model to follow, but to start a conversation about how we might all improve this critical part of science.</p>
<section id="impact-neutrality" class="level3">
<h3 class="anchored" data-anchor-id="impact-neutrality">Impact Neutrality</h3>
<p>I rarely suggest rejection. This comes from my attempt to be “impact neutral” in reviewing, an approach I’ve found useful after reading <a href="https://proteinsandwavefunctions.blogspot.com/2016/01/writing-impact-neutral-review.html">Jan Jensen’s thoughts</a> and seeing the <a href="https://en.wikipedia.org/wiki/PLOS_ONE#Publication_concept">PLoS ONE</a> model in action.</p>
<p>By “relatively impact neutral,” I mean I focus primarily on scientific soundness. I’ll still praise work I find particularly important and note when novelty seems lacking, but these observations inform rather than dictate my recommendations. I try to keep acceptance/rejection opinions out of my review’s main body. If there are scoring forms, I do not fill them in unless I have to: knowledge work is notoriously difficult to measure <span class="citation" data-cites="drucker1999knowledge">(Drucker 1999)</span>, and the stepping stones that lead to discoveries often cannot be anticipated <span class="citation" data-cites="stanley2015why">(Stanley and Lehman 2015)</span>.</p>
</section>
<section id="looking-for-value" class="level3">
<h3 class="anchored" data-anchor-id="looking-for-value">Looking for Value</h3>
<p>Even in papers I initially find underwhelming, I deliberately search for strengths. What makes this work add to our existing knowledge? How could the authors better emphasize these aspects?</p>
<p>This isn’t just kindness—it’s also practical. By focusing on what works, I can help authors build on their strengths and often discover value I initially missed.</p>
</section>
<section id="making-feedback-actionable" class="level3">
<h3 class="anchored" data-anchor-id="making-feedback-actionable">Making Feedback Actionable</h3>
<p>Vague criticism helps no one. I try to make every comment actionable by offering specific suggestions and quoting the relevant text.</p>
</section>
<section id="maintaining-an-objective-tone" class="level3">
<h3 class="anchored" data-anchor-id="maintaining-an-objective-tone">Maintaining an Objective Tone</h3>
<p>Throughout my reviews, I write in an objective, non-judgmental tone. Critical analysis doesn’t require harsh language. Scientific evaluation can be thorough and rigorous while remaining respectful of the authors’ efforts and expertise.</p>
</section>
</section>
<section id="my-review-structure" class="level2">
<h2 class="anchored" data-anchor-id="my-review-structure">My Review Structure</h2>
<p>My reviews typically include:</p>
<ol type="1">
<li><strong>Summary</strong>: My understanding of the work, which helps authors see if I’ve missed something crucial.</li>
<li><strong>Major Points</strong>: Critical flaws in design, analysis, or unsupported claims.</li>
<li><strong>Minor Points</strong>: Suggestions that don’t affect the core message but would strengthen the paper.</li>
<li><strong>Reproducibility</strong>: Assessment of code and data availability.</li>
<li><strong>Limitations of Expertise</strong>: Areas where my knowledge is limited, particularly important for interdisciplinary work.</li>
</ol>
</section>
<section id="my-process" class="level2">
<h2 class="anchored" data-anchor-id="my-process">My Process</h2>
<p>Good reviews take time. My typical approach:</p>
<ol type="1">
<li>Read the manuscript thoroughly first.</li>
<li>Do a quick literature check using tools like <a href="https://paperqa.app/">PaperQA</a> or <a href="https://scholarqa.allen.ai/chat">Ai2 ScholarQA</a>.</li>
<li>Take a few days away to let thoughts settle.</li>
<li>Write the review.</li>
<li>Get feedback from a local LLM to check if I’ve followed my own guidelines.</li>
</ol>
<p>I’ve found acknowledging <a href="https://implicit.harvard.edu/implicit/takeatest.html">my own biases</a> helps me compensate for them. We all bring preferences to reviews—naming them doesn’t eliminate them but makes them visible.</p>
<p>I don’t review for publishers <a href="https://www.predatoryjournals.org/news/is-mdpi-predatory">I consider predatory, such as MDPI</a>.</p>
</section>
<section id="the-case-for-kindness-and-diversity" class="level2">
<h2 class="anchored" data-anchor-id="the-case-for-kindness-and-diversity">The Case for Kindness and Diversity</h2>
<p>Our field would benefit from more kindness in the review process. The harsh, dismissive tone of some reviews doesn’t improve science—it discourages innovative thinking and disproportionately impacts early-career researchers and those from underrepresented groups.</p>
<p>Similarly, greater diversity of thought would strengthen our collective work. When reviewers from varied backgrounds, methodological traditions, and theoretical perspectives evaluate research, we catch blind spots and identify new possibilities. Homogeneous reviewing leads to homogeneous science.</p>
<p>Both kindness and diversity ultimately serve the same goal: creating an environment where the best ideas can emerge, regardless of their source or how they challenge conventional thinking.</p>
<section id="on-anonymous-reviews" class="level3">
<h3 class="anchored" data-anchor-id="on-anonymous-reviews">On Anonymous Reviews</h3>
<p>At this point in my career, I don’t sign my reviews. This is a personal choice in a complex debate. While signed reviews might promote accountability, anonymous reviews can allow early-career scientists to evaluate work honestly without fear of repercussion, particularly when reviewing senior colleagues’ work (who might write letters for my tenure case). The power dynamics in science are real, and our review systems should acknowledge them.</p>
<p>I suspect that as our community evolves better practices around constructive criticism and reduces the career consequences of scholarly disagreement, more reviewers may feel comfortable signing their reviews. But we’re not there yet.</p>
</section>
</section>
<section id="open-questions" class="level2">
<h2 class="anchored" data-anchor-id="open-questions">Open Questions</h2>
<ul>
<li>I’m also interested in how the community might evolve publication models. Could approaches like <a href="https://yoshuabengio.org/2020/02/26/time-to-rethink-the-publication-process-in-machine-learning/">Bengio’s proposal of submitting to journals and conference chairs picking “interesting articles”</a> or <a href="https://aclrollingreview.org/">rolling reviews</a> address some current frustrations?</li>
<li>In an era of information overload, might versioned, updateable articles serve science better than our current static approach?</li>
</ul>


<!-- -->


</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-drucker1999knowledge" class="csl-entry">
Drucker, Peter F. 1999. <span>“Knowledge-Worker Productivity: The Biggest Challenge.”</span> <em>California Management Review</em> 41 (2): 79–94. <a href="https://doi.org/10.2307/41165987">https://doi.org/10.2307/41165987</a>.
</div>
<div id="ref-stanley2015why" class="csl-entry">
Stanley, Kenneth O., and Joel Lehman. 2015. <em>Why Greatness Cannot Be Planned: The Myth of the Objective</em>. Cham: Springer International Publishing. <a href="https://doi.org/10.1007/978-3-319-15524-1">https://doi.org/10.1007/978-3-319-15524-1</a>.
</div>
</div></section></div> ]]></description>
  <category>academia</category>
  <category>peer review</category>
  <category>scientific community</category>
  <guid>https://kjablonka.com/blog/posts/reviewing/</guid>
  <pubDate>Fri, 04 Apr 2025 22:00:00 GMT</pubDate>
</item>
<item>
  <title>Beyond the Era of Accidental Discovery</title>
  <link>https://kjablonka.com/blog/posts/materials_intelligence/</link>
  <description><![CDATA[ 




<p>The foundational challenge of materials science isn’t just creating new materials - it’s developing them systematically rather than by accident. For centuries, materials discovery has remained surprisingly artisanal despite its outsized impact on human civilization. The design of new materials is the bottleneck for solving many of society’s most pressing challenges, from sustainable energy to quantum computing.</p>
<section id="building-a-collective-scientific-intelligence" class="level2">
<h2 class="anchored" data-anchor-id="building-a-collective-scientific-intelligence">Building a Collective Scientific Intelligence</h2>
<p>One of the most tragic inefficiencies in science is how poorly we transfer experience. A PhD student spends 4-5 years developing deep experimental intuition about a specific material system or characterization technique. When they leave, most of that knowledge leaves with them.</p>
<p>A large opportunity lies in general-purpose models and alignment approaches that can:</p>
<ul>
<li>Learn from unstructured experimental data across different modalities</li>
<li>Bridge the gap between synthesis conditions and material properties</li>
<li>Surface non-obvious connections between seemingly unrelated research areas</li>
</ul>
<p>The technical breakthrough enabling this is our ability to simultaneously handle:</p>
<ol type="1">
<li>Synthesis protocols (as structured text and process graphs)</li>
<li>Characterization data (spectroscopy, microscopy, diffraction)</li>
<li>Property measurements (electronic, mechanical, optical)</li>
<li>Theoretical calculations (DFT, molecular dynamics)</li>
</ol>
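<p>To make this concrete, one record spanning these modalities might look like the following sketch (the field names and units are my own illustrative assumptions, not a schema from any real database):</p>

```python
from dataclasses import dataclass, field

# Illustrative sketch only: field names, shapes, and units are assumptions.
@dataclass
class MaterialsRecord:
    """One multimodal record tying together the four data types above."""
    synthesis_protocol: str                   # structured text / process graph
    characterization: dict[str, list[float]]  # e.g. spectra as paired value lists
    properties: dict[str, float]              # e.g. {"band_gap_eV": 1.1}
    calculations: dict[str, float] = field(default_factory=dict)  # e.g. DFT energies

record = MaterialsRecord(
    synthesis_protocol="Dissolve precursor, heat to 120 C for 12 h, wash, dry.",
    characterization={"xrd_2theta": [10.2, 20.4], "xrd_intensity": [1.0, 0.4]},
    properties={"band_gap_eV": 1.1},
)
print(record.properties["band_gap_eV"])  # → 1.1
```

<p>The point of such a unified record is that a model can learn across modalities at once, rather than treating protocols, spectra, and calculations as disconnected datasets.</p>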
</section>
<section id="expert-councils-beyond-single-models" class="level2">
<h2 class="anchored" data-anchor-id="expert-councils-beyond-single-models">Expert Councils: Beyond Single Models</h2>
<p>We do not want only an average representation of materials data - we need specialized expertise for various topics and the ability to let these experts interact. This mirrors how human experts work together, bringing different perspectives to complex problems.</p>
<p>The key is bootstrapping specialized models using:</p>
<ul>
<li>Integration with physics-based simulations</li>
<li>Iterative refinement through experimental feedback</li>
<li>Domain-specific inductive biases that constrain the solution space</li>
<li>Validation through robust tools and theoretical frameworks</li>
</ul>
<p>For example, we can:</p>
<ul>
<li>Generate feedback through simulations and experiments</li>
<li>Use iterative training approaches similar to Beyond A*</li>
<li>Constrain function spaces using inductive biases</li>
<li>Hand over specific predictive tasks to specialized architectures</li>
</ul>
<p>The specialized models can be bootstrapped with information from general-purpose models, making them more data-efficient while maintaining domain expertise.</p>
</section>
<section id="guiding-discovery-through-interestingness" class="level2">
<h2 class="anchored" data-anchor-id="guiding-discovery-through-interestingness">Guiding Discovery Through “Interestingness”</h2>
<p>Optimizations - or searches through materials space - are often compared with finding a needle in a haystack. Some try to design ML approaches as a “magnet” or “filter” to more efficiently find the needle. This could not be more misguided for two reasons:</p>
<ol type="1">
<li>We often don’t even know what we’re looking for (we often cannot define what metrics would be important before we have the solution)</li>
<li>Looking for a needle in a haystack suggests searching through an unstructured space, but materials space has rich patterns we can exploit</li>
</ol>
<p>Instead, we’re developing ways to identify scientifically promising directions through:</p>
<ul>
<li>Novelty detection that can spot meaningful deviations from known patterns</li>
<li>Uncertainty quantification that highlights areas where models disagree</li>
<li>Causal reasoning that can extract mechanistic insights</li>
</ul>
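<p>The uncertainty-quantification point can be sketched in a few lines: rank candidates by how strongly an ensemble of predictors disagrees. The “models” below are toy stand-ins I made up for illustration, not real property predictors:</p>

```python
import statistics

# Toy stand-ins for independently trained property predictors.
def model_a(x): return 2.0 * x
def model_b(x): return 2.0 * x + 0.1
def model_c(x): return 1.5 * x  # diverges from the others for large x

def disagreement(x, models):
    """Standard deviation of ensemble predictions: high values flag
    candidates where the models disagree and a measurement is informative."""
    return statistics.stdev(m(x) for m in models)

models = [model_a, model_b, model_c]
candidates = [0.1, 1.0, 10.0]
# Rank candidates by ensemble disagreement, most "interesting" first.
ranked = sorted(candidates, key=lambda x: disagreement(x, models), reverse=True)
print(ranked)  # → [10.0, 1.0, 0.1]
```

<p>In an active-learning loop, the top-ranked candidate would be the one worth synthesizing or simulating next.</p>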


<!-- -->

</section>

 ]]></description>
  <category>science</category>
  <guid>https://kjablonka.com/blog/posts/materials_intelligence/</guid>
  <pubDate>Tue, 04 Feb 2025 23:00:00 GMT</pubDate>
</item>
<item>
  <title>10 Reasons to Aim Higher. And Higher.</title>
  <link>https://kjablonka.com/blog/posts/harder/</link>
  <description><![CDATA[ 




<p>I aim high. Most often, I don’t get there, but here are ten reasons why I still try:</p>
<ol type="1">
<li><p>It is fun.</p></li>
<li><p>It gives me freedom. Focusing on “low-hanging fruits” or shorter-term goals shrinks the solution space. Aiming further, my solution space expands and easily accommodates detours.</p></li>
<li><p>It demands creativity. Most challenging problems can’t be solved with existing tools, forcing creative solutions.</p></li>
<li><p><a href="https://karpathy.github.io/2016/09/07/phd/">It often isn’t that much harder.</a></p></li>
<li><p><a href="https://paulgraham.com/wealth.html">Something that is hard for you likely is impossible for your competitor.</a></p></li>
<li><p>It lets me ignore noise. With a longer path ahead, I can tune out distractions (like most new papers) along the way.</p></li>
<li><p>It allows me to relax. Oliver Burkeman <a href="https://www.penguin.co.uk/books/456705/meditations-for-mortals-by-burkeman-oliver/9781847927613">references</a> <a href="https://en.wikipedia.org/wiki/Houn_Jiyu-Kennett">Hōun Jiyu-Kennett</a> who taught by making tasks so demanding that students stop struggling, relax, and then accomplish more.</p></li>
<li><p>Even failing might leave you with something quite impactful.</p></li>
<li><p>Growth only happens by crossing boundaries.</p></li>
<li><p>It changes your default.</p></li>
</ol>
<blockquote class="blockquote">
<p>Whatever you can do or dream you can, begin it; Boldness has genius, power, and magic in it.</p>
<p>– <a href="https://quoteinvestigator.com/2016/02/09/boldness/">Often attributed to Goethe</a></p>
</blockquote>


<!-- -->


 ]]></description>
  <category>life</category>
  <guid>https://kjablonka.com/blog/posts/harder/</guid>
  <pubDate>Sat, 01 Feb 2025 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Dear Claude: Are We Getting Too Close?</title>
  <link>https://kjablonka.com/blog/posts/ai_thinking/</link>
  <description><![CDATA[ 




<p>Lately, I have been wondering a lot about what the biggest impact of generative models on science and society can be.</p>
<p>While <a href="https://kjablonka.com/blog/posts/why_llm/">I see many upsides</a>, I am also very puzzled and concerned by things happening to myself and many around me.</p>
<p>This week, the Atlantic ran an outstanding piece by Derek Thompson on <a href="https://www.theatlantic.com/magazine/archive/2025/02/american-loneliness-personality-politics/681091/">The anti-social century</a>, and last year, Kevin Roose <a href="https://www.nytimes.com/2024/12/13/technology/claude-ai-anthropic.html">described how some people now run everything, every decision, every thought through Claude</a> and perhaps talk more to Claude than to their friends.</p>
<blockquote class="twitter-tweet blockquote">
<p lang="en" dir="ltr">
i'm starting to see differences between those who have integrated claude deeply into their lives and those who haven't. its still too early for me to put words on it… i think the ones who have feel better supported? it's been ~universally healthy so far from what i can tell
</p>
— Nick (@nickcammarata) <a href="https://twitter.com/nickcammarata/status/1862000614508777779?ref_src=twsrc%5Etfw">November 28, 2024</a>
</blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>I also talk to Claude a lot. I asked Claude to review this post critically. I ask it to do the same for most of my writing. Many of my friends and colleagues do the same. Basically all of the students I work with talk to Claude. However, I am nervous about how some of us use it.</p>
<blockquote class="bluesky-embed blockquote" data-bluesky-uri="at://did:plc:u3o6qejoryiezdwepzbl4hqm/app.bsky.feed.post/3lfiu7ewr6k23" data-bluesky-cid="bafyreie7pp4xrcaibuxgoshgdycd2ffzatqbqnezaplimji5g25hswicjy">
<p lang="en">
I had big hopes for the application of AI to education. I saw it as one of the most important problems of our time. I did not expect there could be a possibility that AI would not only keep students ignorant, but in fact make them fundamentally incapable of learning anything
</p>
— François Chollet (<a href="https://bsky.app/profile/did:plc:u3o6qejoryiezdwepzbl4hqm?ref_src=embed">@fchollet.bsky.social</a>) <a href="https://bsky.app/profile/did:plc:u3o6qejoryiezdwepzbl4hqm/post/3lfiu7ewr6k23?ref_src=embed">Jan 12, 2025 at 12:26 AM</a>
</blockquote>
<script async="" src="https://embed.bsky.app/static/embed.js" charset="utf-8"></script>
<p>We have a tool in our hands that could do so much good. We could provide everyone with a personal tutor. We could use the models to bounce off ideas, think more critically, find loopholes, and brainstorm new ideas.</p>
<p>The challenge isn’t just technological - it’s deeply human: perhaps our human nature makes it too tempting to take shortcuts <span class="citation" data-cites="Easter2021-gx">(Easter 2021)</span> - to just let Claude solve a coding or homework problem directly, or to let Claude be the best friend.</p>
<div class="page-columns page-full"><p>This is worrying because, as the models continue to saturate all our benchmarks,  <a href="https://fs.blog/why-write/">the marginal value of interesting, clear, and wise thought increases</a>. “Low hanging fruit” test solving and knowledge retrieval are being commoditized - but we still need people who can set the agenda and push thought beyond the current frontiers.</p><div class="no-row-height column-margin column-container"><span class="margin-aside">Very interestingly, in creating <a href="https://arxiv.org/abs/2404.01475">our own benchmark for chemistry</a> our limiting factor was human ingenuity and knowledge in coming up with questions that are challenging enough for the models.</span></div></div>
<p>Being able to do so requires a <a href="https://paulgraham.com/know.html">broad foundation of mental models</a> and playful curiosity <span class="citation" data-cites="Feynman1985-gy">(Feynman and Leighton 1985)</span>.</p>
<p>To me, one of the big challenges is how we can ensure most people and our students use generative AI as cointelligence <span class="citation" data-cites="Mollick2024-kq Mollick_2024">(E. Mollick 2024; E. R. Mollick and Mollick 2024)</span> and not as a replacement for their own thought. <a href="https://www.oneusefulthing.org/p/post-apocalyptic-education">As Ethan Mollick pointedly observed: Education is hard.</a> Growth is hard - but this is the point of it.</p>
<p>Perhaps we need to do a better job of showing <a href="https://paulgraham.com/hwh.html">the value of going through the grind</a> and the fun it takes. <a href="https://paulgraham.com/writes.html">And that shortcuts make us miss most of the journey.</a> Perhaps we need to emphasize process over outcomes and reward original thinking over execution.</p>
<p>Learning, thinking, and talking to others <span class="citation" data-cites="Yanai_2024">(Yanai and Lercher 2024)</span> is where the real magic happens — most of my best projects emerged from seemingly random discussions about seemingly unrelated topics (which some of the new <a href="https://www.firstthings.com/article/2020/03/secular-monks">secular monks</a> might see as a waste of time).</p>
<p>The shortcuts AI offers might save time, but they could cost us something far more valuable: our capacity for genuine intellectual and personal growth and connection.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/ai_thinking/claude_at_date.png" class="img-fluid figure-img"></p>
<figcaption>I am sure that some people ask Claude for help in all situations of their life, and that scenes like this exist out in the wild. Image generated with getimg.ai</figcaption>
</figure>
</div>




<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-Easter2021-gx" class="csl-entry">
Easter, Michael. 2021. <em>The Comfort Crisis</em>. Emmaus, PA: Rodale Books.
</div>
<div id="ref-Feynman1985-gy" class="csl-entry">
Feynman, Richard P., and Ralph Leighton. 1985. <em>Surely You’re Joking, <span>Mr. Feynman</span>!</em> Edited by Edward Hutchings. New York, NY: WW Norton.
</div>
<div id="ref-Mollick2024-kq" class="csl-entry">
Mollick, Ethan. 2024. <em>Co-Intelligence</em>. New York, NY: Portfolio.
</div>
<div id="ref-Mollick_2024" class="csl-entry">
Mollick, Ethan R., and Lilach Mollick. 2024. <span>“Instructors as Innovators: A Future-Focused Approach to New AI Learning Opportunities, with Prompts.”</span> <em>SSRN Electronic Journal</em>. <a href="https://doi.org/10.2139/ssrn.4802463">https://doi.org/10.2139/ssrn.4802463</a>.
</div>
<div id="ref-Yanai_2024" class="csl-entry">
Yanai, Itai, and Martin J. Lercher. 2024. <span>“It Takes Two to Think.”</span> <em>Nature Biotechnology</em> 42 (1): 18–19. <a href="https://doi.org/10.1038/s41587-023-02074-2">https://doi.org/10.1038/s41587-023-02074-2</a>.
</div>
</div></section></div> ]]></description>
  <category>llm</category>
  <category>academia</category>
  <category>society</category>
  <guid>https://kjablonka.com/blog/posts/ai_thinking/</guid>
  <pubDate>Sat, 11 Jan 2025 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/ai_thinking" medium="image"/>
</item>
<item>
  <title>Thinking aloud about the shape of scientific data</title>
  <link>https://kjablonka.com/blog/posts/data_shape/</link>
  <description><![CDATA[ 




<section id="introduction" class="level1 page-columns page-full">
<h1>Introduction</h1>
<p>In many scientific fields, we are witnessing the emergence of “foundation models” - a term that, while widely used, often lacks precise definition. For our purposes, we consider foundation models to be those that can be readily adapted to diverse tasks within a domain, serving as a foundation for modeling various phenomena.</p>
<div class="page-columns page-full"><p>In chemistry, we observe two parallel trends. On one side, there’s growing enthusiasm for general-purpose large language models (LLMs), with some arguing that “The future of chemistry is language” <span class="citation" data-cites="White_2023">(White 2023)</span> - a perspective I largely share. Simultaneously, we see the development of specialized foundation models, such as MACE-MP <span class="citation" data-cites="batatia2024foundationmodelatomisticmaterials">(Batatia et al. 2024)</span> for molecular simulations and AlphaFold <span class="citation" data-cites="Abramson_2024">(Abramson et al. 2024)</span> for protein structure prediction.</p><div class="no-row-height column-margin column-container"><span class="margin-aside">Even though it is very interesting to ponder that <a href="https://harrisbio.substack.com/p/alphafold3-a-foundation-model-for">some equivariance features were thrown out in AF3 — in favor of scale</a>, which one might think of as the <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">Bitter lesson</a> hitting again.</span></div></div>
<p>This duality raises a crucial question: “When should we invest in specialized architectures that incorporate domain knowledge, and when might general-purpose approaches be more effective?” The question becomes particularly relevant as we observe both specialized models achieving remarkable success and general-purpose LLMs demonstrating unexpected capabilities across scientific domains.</p>
<div class="page-columns page-full"><p>In my research group, we’ve focused on applying general-purpose LLMs to chemistry - an approach that might seem counterintuitive. Here, I attempt a systematic (though admittedly preliminary) analysis of when different modeling approaches might be most appropriate by examining the fundamental structure of scientific data spaces. </p><div class="no-row-height column-margin column-container"><span class="margin-aside">Big parts of this discussion are inspired by the excellent <a href="https://towardsdatascience.com/the-road-to-biology-2-0-will-pass-through-black-box-data-bbd00fabf959">Biology 2.0 post from Michael Bronstein and Luca Naef</a>.</span></div></div>
</section>
<section id="the-shape-of-scientific-data" class="level1 page-columns page-full">
<h1>The Shape of Scientific Data</h1>
<p>To understand why different modeling approaches succeed or fail, we need to examine the inherent structure of their data spaces. We’ll focus on four fundamental types of scientific data that represent distinct points along the spectrum of structure and complexity: molecular properties (governed by physical laws), chemical experiments (complex real-world scenarios), biological sequences (shaped by evolution), and code (human-created structure).</p>
<table class="caption-top table">
<colgroup>
<col style="width: 20%">
<col style="width: 20%">
<col style="width: 20%">
<col style="width: 20%">
<col style="width: 20%">
</colgroup>
<thead>
<tr class="header">
<th style="text-align: left;">Aspect</th>
<th style="text-align: left;">Molecular Properties</th>
<th style="text-align: left;">Chemical Experiments</th>
<th style="text-align: left;">Biological Sequences</th>
<th style="text-align: left;">Code</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: left;">State Space</td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?%5Cpsi%20%5Cin%20L%5E2(%5Cmathbb%7BR%7D%5E%7B3N%7D)"></td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BR%7D(t)=%5C%7B(c_i,%20n_i,%20p_i)%5C%7D"></td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?%5C%7B0,1,%5Cldots,k%5C%7D%5En"></td>
<td style="text-align: left;">Discrete tree <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BT%7D"></td>
</tr>
<tr class="even">
<td style="text-align: left;">Governing Distribution</td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?P(%5Cpsi)%20%5Cpropto%20e%5E%7B-%5Cbeta%20E%5B%5Cpsi%5D%7D"></td>
<td style="text-align: left;">Complex, multi-modal</td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?%5Clog%20P(s)%20%5Cpropto%20f(s)"></td>
<td style="text-align: left;"><img src="https://latex.codecogs.com/png.latex?P(%5Ctext%7Bcode%7D)%20=%20P(%5Ctext%7Bsyntax%7D)%20%5Ccdot%20P(%5Ctext%7Bsemantics%7D%5C%7C%5Ctext%7Bsyntax%7D)"></td>
</tr>
<tr class="odd">
<td style="text-align: left;">Structure-to-Noise Ratio</td>
<td style="text-align: left;">High</td>
<td style="text-align: left;">Low</td>
<td style="text-align: left;">Medium</td>
<td style="text-align: left;">Very High</td>
</tr>
<tr class="even">
<td style="text-align: left;">Reproducibility</td>
<td style="text-align: left;">Very high</td>
<td style="text-align: left;">Low</td>
<td style="text-align: left;">High</td>
<td style="text-align: left;">Perfect</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Hidden Variables</td>
<td style="text-align: left;">No/Few</td>
<td style="text-align: left;">Many</td>
<td style="text-align: left;">No/Few</td>
<td style="text-align: left;">No</td>
</tr>
<tr class="even">
<td style="text-align: left;">Validation</td>
<td style="text-align: left;">Physical laws</td>
<td style="text-align: left;">Empirical</td>
<td style="text-align: left;">Functional tests</td>
<td style="text-align: left;">Compiler</td>
</tr>
<tr class="odd">
<td style="text-align: left;">Causality</td>
<td style="text-align: left;">Quantum mechanics</td>
<td style="text-align: left;">Partially hidden</td>
<td style="text-align: left;">Evolutionary force</td>
<td style="text-align: left;">Explicit</td>
</tr>
</tbody>
</table>
<p>Let’s examine each domain in detail to understand why they might require different modeling approaches:</p>
<section id="molecular-properties" class="level2">
<h2 class="anchored" data-anchor-id="molecular-properties">Molecular Properties</h2>
<p>The quantum mechanical description of molecular properties provides perhaps the cleanest example of a structured scientific data space. Here, the state space is described by wavefunctions <img src="https://latex.codecogs.com/png.latex?%5Cpsi%20%5Cin%20L%5E2(%5Cmathbb%7BR%7D%5E%7B3N%7D)">, representing the quantum state of N particles in three-dimensional space. Several key characteristics make this domain particularly amenable to specialized models:</p>
<ul>
<li><strong>High Structure-to-Noise Ratio</strong>: The underlying physics is well-understood and deterministic (up to quantum mechanical uncertainties)</li>
<li><strong>Clear Symmetries</strong>: Physical laws impose translational and rotational invariance, providing strong inductive biases for model design</li>
<li><strong>Few Hidden Variables</strong>: All molecular properties can, in principle, be determined from the wavefunction, requiring only atomic positions and types as input</li>
<li><strong>Very High Reproducibility</strong>: While numerical implementations introduce some noise, quantum mechanical expectation values are fully determined by the wavefunction and the corresponding operators</li>
</ul>
</section>
<section id="chemical-experiments" class="level2">
<h2 class="anchored" data-anchor-id="chemical-experiments">Chemical Experiments</h2>
<p>Chemical experiments present a striking contrast. Despite being fundamentally governed by quantum mechanics, “real world” experimental chemistry introduces numerous complexities:</p>
<ul>
<li><strong>Complex State Space</strong>: While we can represent basic parameters as <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BR%7D(t)=%5C%7B(c_i,%20n_i,%20p_i)%5C%7D"> (concentrations, stoichiometry, phase information), many crucial variables remain hidden</li>
<li><strong>Low Structure-to-Noise Ratio</strong>: Hidden features and their interactions lead to high variability in outcomes</li>
<li><strong>Hidden Variables</strong>: Critical factors often go unrecorded or unrecognized (impurities, atmospheric conditions, surface effects) and might only be implicitly captured in experimental protocols</li>
<li><strong>Limited Reproducibility</strong>: Even carefully controlled experiments may yield different results due to uncontrolled variables</li>
</ul>
</section>
<section id="biological-sequences" class="level2 page-columns page-full">
<h2 class="anchored" data-anchor-id="biological-sequences">Biological Sequences</h2>
<div class="page-columns page-full"><p>Biological sequences present a unique case where their distribution in sequence space (<img src="https://latex.codecogs.com/png.latex?%5C%7B0,1,%5Cldots,k%5C%7D%5En">) is shaped by evolution, creating a direct link between sequence distribution and fitness <span class="citation" data-cites="Sella_2005">(Sella and Hirsh 2005)</span>. </p><div class="no-row-height column-margin column-container"><span class="margin-aside">Notably, such a driving force does not exist in chemistry, where the space of synthetic molecules seems mostly shaped by human imagination.</span></div></div>
<ul>
<li><strong>Medium Structure-to-Noise Ratio</strong>: Evolution provides underlying structure, while neutral mutations introduce noise</li>
<li><strong>Clear Alphabet</strong>: Fixed set of building blocks (amino acids, nucleotides) constrains the possible space</li>
<li><strong>Evolutionary Causality</strong>: Natural selection provides a clear driving force for sequence distributions</li>
<li><strong>High Reproducibility</strong>: Modern sequence determination is highly reliable</li>
</ul>
</section>
<section id="code" class="level2">
<h2 class="anchored" data-anchor-id="code">Code</h2>
<p>Programming languages represent a fascinating case of highly structured but human-created information:</p>
<ul>
<li><strong>Discrete, Tree-like Structure</strong>: Abstract syntax trees provide clear organization</li>
<li><strong>Perfect Reproducibility</strong>: Same input consistently produces the same output</li>
<li><strong>Explicit Causality</strong>: Control flow and data dependencies are explicit</li>
<li><strong>Human-Created Rules</strong>: Unlike physical laws, programming language rules are human-designed and well-documented</li>
<li><strong>Rich Training Data</strong>: Vast amounts of self-documenting code examples and error messages are available</li>
</ul>
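<p>This tree-like structure is directly inspectable - a minimal illustration (my own, not part of the original argument) using Python’s standard-library <code>ast</code> module:</p>

```python
import ast

# Parse a one-line program into its abstract syntax tree.
tree = ast.parse("y = f(x) + 1")

# Every construct is an explicit node in a discrete tree:
# the assignment, the call, the addition, each name and constant.
node_types = [type(node).__name__ for node in ast.walk(tree)]
print(node_types)
```

<p>Nothing comparable exists for a wet-lab protocol: the “parse tree” of an experiment is at best implicit, which is part of why chemical experiments sit at the opposite end of the structure spectrum.</p>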
</section>
</section>
<section id="implications-for-model-choice" class="level1">
<h1>Implications for Model Choice</h1>
<p>Our analysis of data spaces reveals a nuanced framework for choosing modeling approaches, one that goes beyond simple metrics to consider the fundamental nature of structure in each domain.</p>
<section id="the-structure-to-noise-ratio-and-types-of-structure" class="level2">
<h2 class="anchored" data-anchor-id="the-structure-to-noise-ratio-and-types-of-structure">The Structure-to-Noise Ratio and Types of Structure</h2>
<p>We can (somewhat handwavily) formalize the structure-to-noise ratio as:</p>
<p><img src="https://latex.codecogs.com/png.latex?%20R%20=%20%5Cfrac%7B%5Ctext%7Bstructured%5C_information%7D%7D%7B%5Ctext%7Bunstructured%5C_variation%7D%7D%20"></p>
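<p>A toy numerical reading of this ratio (my own construction, purely illustrative): treat each observation as structured signal plus unstructured noise and compare their variances:</p>

```python
import random
import statistics

random.seed(0)

xs = [i / 10 for i in range(100)]
signal = [2.0 * x for x in xs]              # structured information
noise = [random.gauss(0, 0.5) for _ in xs]  # unstructured variation
observed = [s + n for s, n in zip(signal, noise)]  # what a model would see

# Structure-to-noise ratio R for this synthetic "domain".
R = statistics.variance(signal) / statistics.variance(noise)
print(R > 1)  # → True: a high-structure domain in the sense of the table above
```

<p>Shrink the signal or inflate the noise and R drops below one - the synthetic analogue of moving from molecular properties toward messy chemical experiments.</p>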
<p>However, this ratio alone is insufficient. We must distinguish between fundamentally different types of structure:</p>
<ol type="1">
<li><strong>Physical/Mathematical Structure</strong> (Molecular Properties):
<ul>
<li>Governed by immutable natural laws</li>
<li>Benefits from explicit architectural enforcement</li>
<li>Can be handled data-efficiently by specialized architectures (e.g., equivariant neural networks)</li>
</ul></li>
<li><strong>Human-Created Structure</strong> (Code):
<ul>
<li>Well-documented in training data</li>
<li>Can be learned statistically</li>
<li>Amenable to general-purpose models like LLMs</li>
</ul></li>
<li><strong>Mixed or Emergent Structure</strong> (Biological Sequences):
<ul>
<li>Combines physical constraints with evolutionary patterns</li>
<li>Benefits from hybrid approaches</li>
</ul></li>
</ol>
<p>This refined view explains several observed patterns in scientific machine learning:</p>
<ol type="1">
<li><strong>Domains with Physical Structure</strong> (Molecular Properties):
<ul>
<li>Specialized architectures effectively leverage conservation laws and symmetries</li>
<li>Investment in domain-specific inductive biases pays off</li>
<li>Example: Equivariant neural networks for molecular properties</li>
</ul></li>
<li><strong>Domains with Human-Created Structure</strong> (Code):
<ul>
<li>General-purpose models can learn patterns effectively</li>
<li>Benefit from large amounts of self-documenting training data</li>
<li>Example: LLMs for code generation</li>
</ul></li>
<li><strong>Low-Structure Domains</strong> (Chemical Experiments):
<ul>
<li>General-purpose models may be more effective (as we do not even know what inductive biases to design and many factors are hidden/implicit)</li>
<li>Pattern recognition and statistical approaches shine</li>
<li>Example: LLMs leveraging implicit knowledge from literature</li>
</ul></li>
<li><strong>Mixed-Structure Domains</strong> (Biological Sequences):
<ul>
<li>Hybrid approaches combining structure and statistics work well</li>
<li>Balance between specialized architectures and statistical power</li>
<li>Example: AlphaFold’s combination of structural constraints with evolutionary information</li>
</ul></li>
</ol>
</section>
<section id="the-role-of-hidden-variables" class="level2">
<h2 class="anchored" data-anchor-id="the-role-of-hidden-variables">The Role of Hidden Variables</h2>
<p>The presence and nature of hidden variables significantly impacts model choice:</p>
<ul>
<li><strong>Few Hidden Variables</strong>: Enables direct modeling with specialized architectures</li>
<li><strong>Many Unknown Hidden Variables</strong>: Benefits from models that can learn representations from data</li>
</ul>
</section>
</section>
<section id="conclusions" class="level1">
<h1>Conclusions</h1>
<p>This analysis, while admittedly preliminary, provides a framework for understanding when to apply specialized versus general-purpose models in scientific domains. The choice appears to be guided by three key factors:</p>
<ol type="1">
<li>The type of structure present (physical, human-created, or mixed)</li>
<li>The structure-to-noise ratio</li>
<li>The presence and nature of hidden variables</li>
</ol>
<p>In domains with physical structure and few hidden variables, specialized architectures can effectively leverage domain knowledge. However, in domains with human-created structure or many hidden variables, general-purpose models may be more appropriate. This explains why our group remains optimistic about applying LLMs to chemistry: the complexity and hidden variables of chemical experiments might make them particularly well suited to statistical pattern recognition.</p>



</section>

<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-Abramson_2024" class="csl-entry">
Abramson, Josh, Jonas Adler, Jack Dunger, Richard Evans, Tim Green, Alexander Pritzel, Olaf Ronneberger, et al. 2024. <span>“Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3.”</span> <em>Nature</em> 630 (8016): 493–500. <a href="https://doi.org/10.1038/s41586-024-07487-w">https://doi.org/10.1038/s41586-024-07487-w</a>.
</div>
<div id="ref-batatia2024foundationmodelatomisticmaterials" class="csl-entry">
Batatia, Ilyes, Philipp Benner, Yuan Chiang, Alin M. Elena, Dávid P. Kovács, Janosh Riebesell, Xavier R. Advincula, et al. 2024. <span>“A Foundation Model for Atomistic Materials Chemistry.”</span> <a href="https://arxiv.org/abs/2401.00096">https://arxiv.org/abs/2401.00096</a>.
</div>
<div id="ref-Sella_2005" class="csl-entry">
Sella, Guy, and Aaron E. Hirsh. 2005. <span>“The Application of Statistical Physics to Evolutionary Biology.”</span> <em>Proceedings of the National Academy of Sciences</em> 102 (27): 9541–46. <a href="https://doi.org/10.1073/pnas.0501865102">https://doi.org/10.1073/pnas.0501865102</a>.
</div>
<div id="ref-White_2023" class="csl-entry">
White, Andrew D. 2023. <span>“The Future of Chemistry Is Language.”</span> <em>Nature Reviews Chemistry</em> 7 (7): 457–58. <a href="https://doi.org/10.1038/s41570-023-00502-0">https://doi.org/10.1038/s41570-023-00502-0</a>.
</div>
</div></section></div> ]]></description>
  <category>llm</category>
  <guid>https://kjablonka.com/blog/posts/data_shape/</guid>
  <pubDate>Wed, 08 Jan 2025 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/data_shape" medium="image"/>
</item>
<item>
  <title>A wise bird</title>
  <link>https://kjablonka.com/blog/posts/a_wise_owl/</link>
  <description><![CDATA[ 




<p>As I reflect on the past and on the upcoming year, I really enjoyed <a href="https://en.wikipedia.org/wiki/A_Wise_Old_Owl">Rockefeller’s favorite poem</a> <span class="citation" data-cites="Housel2020-tx">(Housel 2020)</span>.</p>
<blockquote class="blockquote">
<p>A wise old owl lived in an oak,</p>
<p>The more he saw, the less he spoke</p>
<p>The less he spoke, the more he heard,</p>
<p>Now, wasn’t he a wise old bird?</p>
</blockquote>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/a_wise_owl/owl.jpeg" class="img-fluid figure-img"></p>
<figcaption>We need more wise owls.</figcaption>
</figure>
</div>


<!-- -->



<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-Housel2020-tx" class="csl-entry">
Housel, Morgan. 2020. <em>The Psychology of Money</em>. Petersfield, England: Harriman House Publishing.
</div>
</div></section></div> ]]></description>
  <category>life</category>
  <guid>https://kjablonka.com/blog/posts/a_wise_owl/</guid>
  <pubDate>Sat, 04 Jan 2025 23:00:00 GMT</pubDate>
</item>
<item>
  <title>Trust Me, There’s a Method to This Madness</title>
  <link>https://kjablonka.com/blog/posts/why_llm/</link>
  <description><![CDATA[ 




<p>Even though I work in a chemistry department, much of the recent work in my team has focused on Large Language Models (LLMs) - or, more generally, frontier models. This isn’t a departure from chemistry; rather, we believe these models could be crucial building blocks for solving some of the most fundamental problems in chemistry and materials science.</p>
<p><a href="https://calvinball.substack.com/p/a-preliminary-roadmap-for-ai-assisted">Sam Rodrigues</a> (as so often) put it best: Science is about doing things for the first time. What’s remarkable is that recent frontier models show sparks of an ability to perform impressive tasks they weren’t explicitly trained for. More importantly, they’re showing promising capabilities in developing what scientists have long considered crucial: good taste in choosing what is interesting. <span class="citation" data-cites="zhang2024omniopenendednessmodelshuman">(Zhang et al. 2024)</span> This intuition, traditionally developed through years of experience, can now be augmented by models that have synthesized patterns from vast amounts of scientific literature and data.</p>
<p>One of the most striking inefficiencies in academic research is how knowledge dissipates: when a PhD student leaves after four years in the lab, their accumulated experience often vanishes with them. Imagine if we could capture and share all this tacit knowledge - the failed experiments, the subtle technique adjustments, the unwritten rules - through training models on lab notes and conversations <span class="citation" data-cites="Jablonka_2022">(Jablonka, Patiny, and Smit 2022)</span>.</p>
<p>While recent research suggests that language isn’t necessarily used for reasoning <span class="citation" data-cites="reasoning">(Fedorenko, Piantadosi, and Gibson 2024)</span>, its flexibility makes it an unparalleled tool for communicating ideas, methods, and observations (just look at how synthesis protocols are reported). Yes, schemas, figures, and equations are crucial, but language remains our most versatile medium - and with multimodal approaches, we’re pushing to combine the best of all worlds <span class="citation" data-cites="alampara2024probinglimitationsmultimodallanguage">(Alampara et al. 2024)</span>. (And there are tons of things for which we will need to go beyond naively treating everything as text <span class="citation" data-cites="alampara2024mattextlanguagemodelsneed">(Alampara, Miret, and Jablonka 2024)</span>).</p>
<p>However, the practical impact is already visible: tasks that once required a PhD thesis can now be accomplished within a Master’s project. During my PhD, training a model for a novel application without existing datasets would have consumed the entire degree. Now, our team routinely collects custom datasets for new applications <span class="citation" data-cites="Schilling_Wilhelmi_2025">(Schilling-Wilhelmi et al. 2025)</span>. This scalability is crucial because science is inherently long-tailed: breakthrough innovations often emerge from unexpected corners of research, and we have so many different instruments, techniques, and questions that only a scalable approach has a shot at capturing any of it.</p>
<p>Similarly, there have been many efforts to develop ontologies, define APIs, and specify how different systems should talk to each other, and <a href="https://madices.github.io">I have been involved in those efforts</a>. But I increasingly come to the belief that we might be better off (at least for the long tail) just letting models figure out how to talk to different things and build new tools in this way. Tools are how science progresses. As Sydney Brenner noted, “Progress in science depends on new techniques, new discoveries and new ideas, probably in that order” <span class="citation" data-cites="Robertson_1980 Dyson_2012">(Robertson 1980; Dyson 2012)</span>.</p>
<p>However, working with these models daily also <a href="https://michaelnotebook.com/optimism/index.html">raises concerns</a>. While there’s <a href="https://darioamodei.com/machines-of-loving-grace#4-peace-and-governance">significant</a> <a href="https://ia.samaltman.com">potential upside</a>, we who develop these tools bear responsibility for ensuring they benefit society. Beyond immediate concerns about bio- and chemical weapons <span class="citation" data-cites="peppin2025realityaibiorisk">(Peppin et al. 2025)</span>, I worry about <a href="https://www.argmin.net/p/too-much-information">information overflow</a> and the proliferation of bullshit <span class="citation" data-cites="Frankfurt2005">(Frankfurt 2005)</span> and disinformation of all sorts <span class="citation" data-cites="Europol2023">(Europol 2023)</span>, along with the possibility of further increasing inequalities (with some dominant players accumulating nation-state-like power and an Orwellian centralization of “truth”).</p>
<p>The relative lack of investment by some governments in building AI expertise is concerning, as is the potential erosion of critical thinking skills in some quarters. “We live in a society exquisitely dependent on science and technology, in which hardly anyone knows anything about science and technology” <span class="citation" data-cites="Sagan1990">(Sagan 1990)</span>. And, clearly, the issue reaches beyond knowing things about science and technology, and perhaps even makes a general liberal arts education more valuable than ever.</p>
<blockquote class="blockquote">
<p>For progress there is no cure. Any attempt to find automatically safe channels for the present explosive variety of progress must lead to frustration. The only safety possible is relative, and it lies in an intelligent exercise of day-to-day judgement… these transformations are not a priori predictable and… most contemporary “first guesses” concerning them are wrong…</p>
<p><a href="https://sseh.uchicago.edu/doc/von_Neumann_1955.pdf">CAN WE SURVIVE TECHNOLOGY? by John von Neumann</a></p>
</blockquote>




<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-alampara2024mattextlanguagemodelsneed" class="csl-entry">
Alampara, Nawaf, Santiago Miret, and Kevin Maik Jablonka. 2024. <span>“MatText: Do Language Models Need More Than Text &amp; Scale for Materials Modeling?”</span> <a href="https://arxiv.org/abs/2406.17295">https://arxiv.org/abs/2406.17295</a>.
</div>
<div id="ref-alampara2024probinglimitationsmultimodallanguage" class="csl-entry">
Alampara, Nawaf, Mara Schilling-Wilhelmi, Martiño Ríos-García, Indrajeet Mandal, Pranav Khetarpal, Hargun Singh Grover, N. M. Anoop Krishnan, and Kevin Maik Jablonka. 2024. <span>“Probing the Limitations of Multimodal Language Models for Chemistry and Materials Research.”</span> <a href="https://arxiv.org/abs/2411.16955">https://arxiv.org/abs/2411.16955</a>.
</div>
<div id="ref-Dyson_2012" class="csl-entry">
Dyson, Freeman J. 2012. <span>“Is Science Mostly Driven by Ideas or by Tools?”</span> <em>Science</em> 338 (6113): 1426–27. <a href="https://doi.org/10.1126/science.1232773">https://doi.org/10.1126/science.1232773</a>.
</div>
<div id="ref-Europol2023" class="csl-entry">
Europol. 2023. <span>“Criminal Use of ChatGPT: A Cautionary Tale about Large Language Models.”</span> 2023. <a href="https://www.europol.europa.eu/media-press/newsroom/news/criminal-use-of-chatgpt-cautionary-tale-about-large-language-models">https://www.europol.europa.eu/media-press/newsroom/news/criminal-use-of-chatgpt-cautionary-tale-about-large-language-models</a>.
</div>
<div id="ref-reasoning" class="csl-entry">
Fedorenko, Evelina, Steven T. Piantadosi, and Edward A. F. Gibson. 2024. <span>“Language Is Primarily a Tool for Communication Rather Than Thought.”</span> <em>Nature</em> 630 (8017): 575–86. <a href="https://doi.org/10.1038/s41586-024-07522-w">https://doi.org/10.1038/s41586-024-07522-w</a>.
</div>
<div id="ref-Frankfurt2005" class="csl-entry">
Frankfurt, Harry G. 2005. <em>On Bullshit</em>. Princeton University Press.
</div>
<div id="ref-Jablonka_2022" class="csl-entry">
Jablonka, Kevin Maik, Luc Patiny, and Berend Smit. 2022. <span>“Making the Collective Knowledge of Chemistry Open and Machine Actionable.”</span> <em>Nature Chemistry</em> 14 (4): 365–76. <a href="https://doi.org/10.1038/s41557-022-00910-7">https://doi.org/10.1038/s41557-022-00910-7</a>.
</div>
<div id="ref-peppin2025realityaibiorisk" class="csl-entry">
Peppin, Aidan, Anka Reuel, Stephen Casper, Elliot Jones, Andrew Strait, Usman Anwar, Anurag Agrawal, et al. 2025. <span>“The Reality of AI and Biorisk.”</span> <a href="https://arxiv.org/abs/2412.01946">https://arxiv.org/abs/2412.01946</a>.
</div>
<div id="ref-Robertson_1980" class="csl-entry">
Robertson, Miranda. 1980. <span>“Biology in the 1980s, Plus or Minus a Decade.”</span> <em>Nature</em> 285 (5764): 358–59. <a href="https://doi.org/10.1038/285358a0">https://doi.org/10.1038/285358a0</a>.
</div>
<div id="ref-Sagan1990" class="csl-entry">
Sagan, Carl. 1990. <em>Why We Need to Understand Science</em>. Vol. 14. 3.
</div>
<div id="ref-Schilling_Wilhelmi_2025" class="csl-entry">
Schilling-Wilhelmi, Mara, Martiño Ríos-García, Sherjeel Shabih, María Victoria Gil, Santiago Miret, Christoph T. Koch, José A. Márquez, and Kevin Maik Jablonka. 2025. <span>“From Text to Insight: Large Language Models for Chemical Data Extraction.”</span> <em>Chemical Society Reviews</em>. <a href="https://doi.org/10.1039/d4cs00913d">https://doi.org/10.1039/d4cs00913d</a>.
</div>
<div id="ref-zhang2024omniopenendednessmodelshuman" class="csl-entry">
Zhang, Jenny, Joel Lehman, Kenneth Stanley, and Jeff Clune. 2024. <span>“OMNI: Open-Endedness via Models of Human Notions of Interestingness.”</span> <a href="https://arxiv.org/abs/2306.01711">https://arxiv.org/abs/2306.01711</a>.
</div>
</div></section></div> ]]></description>
  <category>academia</category>
  <category>llm</category>
  <guid>https://kjablonka.com/blog/posts/why_llm/</guid>
  <pubDate>Sat, 04 Jan 2025 23:00:00 GMT</pubDate>
  <media:content url="https://kjablonka.com/blog/posts/why_llm" medium="image"/>
</item>
<item>
  <title>Take it easy, my friend</title>
  <link>https://kjablonka.com/blog/posts/take_it_easy/</link>
  <description><![CDATA[ 




<p>It’s easy to feel caught up in the (perceived) pressure to produce results quickly. Academia seems to prioritize speed and quantity over depth and quality, which can be overwhelming. But is this really true? If it is something you must rush to publish, is there any real value in it? Is it really of value if you need to “compete”? <span class="citation" data-cites="thiel2014competition">(Thiel 2014)</span> Is the project you are rushing to publish really the question that you are best positioned to answer and that you deeply care about? Isn’t rushing things another form of cargo cult science?</p>
<p>What if, instead of frantically racing to publish, you took the time to slow down, breathe, and really dig into the research you truly care about, taking pleasure in craftsmanship? We must admit that we are not immune to the allure of “fast science.” There’s something undeniably exciting about chasing quick breakthroughs and racking up publications. Yet this isn’t the path to meaningful, impactful, sustainable research, or to happiness.</p>
<p>Great outcomes take time and persistence. Take the story of Rosalind Franklin, whose research laid the groundwork for understanding the structure of DNA. Or consider the godfathers of deep learning, who persisted through the AI winter. Galileo took 18 years to finish and write up his pendulum experiments, and Newton took four years for his initial writings about gravity. <span class="citation" data-cites="newport2024slow">(Newport 2024)</span></p>
<p>These scientists didn’t rush their work or cut it into “salami papers.” Instead, they took their time, and their persistence paid off.</p>
<p>Great outcomes also cannot easily be optimized for. This is, as Peter Drucker already realized, <span class="citation" data-cites="drucker1999knowledge">(Drucker 1999)</span> especially difficult for knowledge work. Metrics are deceiving, and the stepping stones that lead to discoveries cannot be anticipated. <span class="citation" data-cites="stanley2015why">(Stanley and Lehman 2015)</span> Thus, as a field, we are bound to be less successful in the long run if we optimize only for bibliometrics. Trust yourself that your unique point of view will lead to something exciting. Spending less time comparing Google Scholar profiles, and instead being in awe of the exciting times we are in, will also lead to more happiness. <span class="citation" data-cites="brooks2022dont shiota2007nature dambrun2017self">(Brooks 2022; Shiota, Keltner, and Mossman 2007; Dambrun 2017)</span></p>
<p>Great outcomes are also very diverse and happen on very different timescales. Even though some of our communications suggest otherwise (“Samantha is a great student because she published in Nature”), <span class="citation" data-cites="lawrence2003politics">(Lawrence 2003)</span> we do well not only if we publish in Cell, Nature, Science, or other “vanity outlets.” We also do well if we build software that is used and that powers a full line of other research (think of the impact Python, Numpy, RDKit, Pymatgen, and similar tools had on our work). We also do well if we curate datasets that enable new discoveries. <span class="citation" data-cites="abbott2020mind">(Abbott et al. 2020)</span> Ultimately, AlphaFold would not have been possible without the Protein Data Bank. While the systems with which we evaluate scientists only slowly evolve to reflect this reality, <span class="citation" data-cites="hicks2015bibliometrics">(Hicks et al. 2015)</span> it is important to remember that great work will ultimately pay off and lead to much more satisfaction. After all, those who decide on funding and career moves benefit from hindsight that editors do not have. <span class="citation" data-cites="lawrence2003politics">(Lawrence 2003)</span></p>
<p>In a world filled with one-hit wonders and short-lived trends, it’s more important than ever to focus on creating meaningful, lasting contributions to your field. You want to be known for something great, not just a fleeting moment of recognition. Moreover, slowing down to a sustainable pace can help you avoid the pitfalls of academic burnout. It’s not about the number of publications or the speed at which you produce them; it’s about the quality of your work, the depth of your understanding, and the passion you bring to your research. To our knowledge, being stressed has never helped anyone think clearly. <span class="citation" data-cites="mcewen2007physiology">(McEwen 2007)</span></p>
<p>So, if you are feeling the pressure to constantly and rapidly produce, remember that taking it slow can lead to great things. Embrace the process, pursue your ideas with curiosity and dedication, and, most importantly, take the time to enjoy the journey. This journey isn’t a sprint — it’s a marathon. It will be more rewarding if you allow yourself the freedom to explore at your own pace. So, take a deep breath, and relax. In the end, you’ll be known not for a fleeting moment of recognition but for a lasting contribution to your field—a testament to your dedication, perseverance, and passion for your work.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://kjablonka.com/blog/posts/take_it_easy/cover.png" class="img-fluid figure-img"></p>
<figcaption>DALL-E generated image for slow science</figcaption>
</figure>
</div>
<p>This text is inspired by a conversation with Alán Aspuru-Guzik. Alán also suggested the title with a reference to a song.</p>


<!-- -->



<div id="quarto-appendix" class="default"><section class="quarto-appendix-contents" id="quarto-bibliography"><h2 class="anchored quarto-appendix-heading">References</h2><div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0">
<div id="ref-abbott2020mind" class="csl-entry">
Abbott, L. F., D. D. Bock, E. M. Callaway, et al. 2020. <span>“The Mind of a Mouse.”</span> <em>Cell</em> 182 (6): 1372–76. <a href="https://doi.org/10.1016/j.cell.2020.08.010">https://doi.org/10.1016/j.cell.2020.08.010</a>.
</div>
<div id="ref-brooks2022dont" class="csl-entry">
Brooks, Arthur C. 2022. <span>“Don’t Objectify Yourself.”</span> <em>The Atlantic</em>. <a href="https://www.theatlantic.com/family/archive/2022/09/how-be-less-self-centered/671499/">https://www.theatlantic.com/family/archive/2022/09/how-be-less-self-centered/671499/</a>.
</div>
<div id="ref-dambrun2017self" class="csl-entry">
Dambrun, Michael. 2017. <span>“Self-Centeredness and Selflessness: Happiness Correlates and Mediating Psychological Processes.”</span> <em>PeerJ</em> 5: e3306. <a href="https://doi.org/10.7717/peerj.3306">https://doi.org/10.7717/peerj.3306</a>.
</div>
<div id="ref-drucker1999knowledge" class="csl-entry">
Drucker, Peter F. 1999. <span>“Knowledge-Worker Productivity: The Biggest Challenge.”</span> <em>California Management Review</em> 41 (2): 79–94. <a href="https://doi.org/10.2307/41165987">https://doi.org/10.2307/41165987</a>.
</div>
<div id="ref-hicks2015bibliometrics" class="csl-entry">
Hicks, Diana, Paul Wouters, Ludo Waltman, Sarah de Rijcke, and Ismael Rafols. 2015. <span>“Bibliometrics: The Leiden Manifesto for Research Metrics.”</span> <em>Nature</em> 520 (7548): 429–31. <a href="https://doi.org/10.1038/520429a">https://doi.org/10.1038/520429a</a>.
</div>
<div id="ref-lawrence2003politics" class="csl-entry">
Lawrence, Peter A. 2003. <span>“The Politics of Publication.”</span> <em>Nature</em> 422 (6929): 259–61. <a href="https://doi.org/10.1038/422259a">https://doi.org/10.1038/422259a</a>.
</div>
<div id="ref-mcewen2007physiology" class="csl-entry">
McEwen, Bruce S. 2007. <span>“Physiology and Neurobiology of Stress and Adaptation: Central Role of the Brain.”</span> <em>Physiological Reviews</em> 87 (3): 873–904. <a href="https://doi.org/10.1152/physrev.00041.2006">https://doi.org/10.1152/physrev.00041.2006</a>.
</div>
<div id="ref-newport2024slow" class="csl-entry">
Newport, Cal. 2024. <em>Slow Productivity: The Lost Art of Accomplishment Without Burnout</em>. Penguin Books Limited.
</div>
<div id="ref-shiota2007nature" class="csl-entry">
Shiota, Michelle N., Dacher Keltner, and Amanda Mossman. 2007. <span>“The Nature of Awe: Elicitors, Appraisals, and Effects on Self-Concept.”</span> <em>Cognition and Emotion</em> 21 (5): 944–63. <a href="https://doi.org/10.1080/02699930600923668">https://doi.org/10.1080/02699930600923668</a>.
</div>
<div id="ref-stanley2015why" class="csl-entry">
Stanley, Kenneth O., and Joel Lehman. 2015. <em>Why Greatness Cannot Be Planned: The Myth of the Objective</em>. Cham: Springer International Publishing. <a href="https://doi.org/10.1007/978-3-319-15524-1">https://doi.org/10.1007/978-3-319-15524-1</a>.
</div>
<div id="ref-thiel2014competition" class="csl-entry">
Thiel, Peter. 2014. <span>“Competition Is for Losers.”</span> <em>Wall Street Journal</em>. <a href="http://online.wsj.com/articles/peter-thiel-competition-is-for-losers-1410535536">http://online.wsj.com/articles/peter-thiel-competition-is-for-losers-1410535536</a>.
</div>
</div></section></div> ]]></description>
  <category>life</category>
  <category>academia</category>
  <guid>https://kjablonka.com/blog/posts/take_it_easy/</guid>
  <pubDate>Sun, 01 Dec 2024 23:00:00 GMT</pubDate>
</item>
</channel>
</rss>
