<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Netlify on </title>
    <link>/tags/netlify/</link>
    <description>Recent content in Netlify on </description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <lastBuildDate>Sat, 14 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="/tags/netlify/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Building an AI Conference Directory That Populates Itself</title>
      <link>/posts/augmented-resilience-posts/building-an-ai-conference-directory-that-populates-itself/</link>
      <pubDate>Sat, 14 Mar 2026 00:00:00 +0000</pubDate>
      
      <guid>/posts/augmented-resilience-posts/building-an-ai-conference-directory-that-populates-itself/</guid>
      <description>&lt;h2 id=&#34;the-problem-ai-conferences-are-everywhere-and-nowhere&#34;&gt;The Problem: AI Conferences Are Everywhere and Nowhere&lt;/h2&gt;
&lt;p&gt;If you&amp;rsquo;ve ever tried to find a comprehensive list of upcoming AI conferences, you know the pain. There&amp;rsquo;s no single source. AAAI has their page. NeurIPS has theirs. ICML posts deadlines on OpenReview. Half the emerging summits only exist on LinkedIn event pages or buried in Reddit threads.&lt;/p&gt;
&lt;p&gt;I wanted a simple, searchable directory of AI conferences — one site where I could see what&amp;rsquo;s coming up, filter by topic, and get the key details. But I didn&amp;rsquo;t want to manually curate it. I&amp;rsquo;ve seen too many &amp;ldquo;awesome lists&amp;rdquo; on GitHub that are lovingly maintained for three months and then abandoned.&lt;/p&gt;</description>
      <content>&lt;h2 id=&#34;the-problem-ai-conferences-are-everywhere-and-nowhere&#34;&gt;The Problem: AI Conferences Are Everywhere and Nowhere&lt;/h2&gt;
&lt;p&gt;If you&amp;rsquo;ve ever tried to find a comprehensive list of upcoming AI conferences, you know the pain. There&amp;rsquo;s no single source. AAAI has their page. NeurIPS has theirs. ICML posts deadlines on OpenReview. Half the emerging summits only exist on LinkedIn event pages or buried in Reddit threads.&lt;/p&gt;
&lt;p&gt;I wanted a simple, searchable directory of AI conferences — one site where I could see what&amp;rsquo;s coming up, filter by topic, and get the key details. But I didn&amp;rsquo;t want to manually curate it. I&amp;rsquo;ve seen too many &amp;ldquo;awesome lists&amp;rdquo; on GitHub that are lovingly maintained for three months and then abandoned.&lt;/p&gt;
&lt;p&gt;What I wanted was a system that populates itself.&lt;/p&gt;
&lt;p&gt;So I built one. And with Claude Code running through my PAI system, the whole pipeline — from search to database to website — came together over a few focused sessions.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the full story.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;the-architecture-three-layers-zero-manual-data-entry&#34;&gt;The Architecture: Three Layers, Zero Manual Data Entry&lt;/h2&gt;
&lt;p&gt;The final system has three layers, each handling a distinct responsibility:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SearXNG (search engine)
    → conference_tracker.py (discovery)
        → Airtable (database)
            → fetch-events.mjs (build-time fetch)
                → React + Vite site on Netlify
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Each layer is independently useful, loosely coupled, and replaceable. Let&amp;rsquo;s walk through them.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;layer-1-the-tracker--finding-conferences-automatically&#34;&gt;Layer 1: The Tracker — Finding Conferences Automatically&lt;/h2&gt;
&lt;p&gt;The foundation is a Python script called &lt;code&gt;conference_tracker.py&lt;/code&gt;. Its job is simple: search the web for AI conferences and store what it finds.&lt;/p&gt;
&lt;h3 id=&#34;search-searxng-instead-of-google&#34;&gt;Search: SearXNG Instead of Google&lt;/h3&gt;
&lt;p&gt;Rather than hitting the Google API (with its quotas and billing), I use &lt;a href=&#34;https://github.com/searxng/searxng&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;SearXNG&lt;/a&gt;, an open-source, self-hosted metasearch engine. It aggregates results from Google, Bing, DuckDuckGo, and others without any API keys or per-query billing on my side, although the upstream engines can still throttle or block an instance that sends too many requests.&lt;/p&gt;
&lt;p&gt;The tracker runs a curated list of search queries defined in &lt;code&gt;config.yaml&lt;/code&gt;:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-yaml&#34; data-lang=&#34;yaml&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;search_queries&lt;/span&gt;:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;AI conference 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;artificial intelligence conference 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;machine learning conference 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;NeurIPS 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;ICML 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;AAAI 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;AI summit 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;deep learning conference 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;computer vision conference 2026 CVPR&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  - &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;natural language processing conference 2026&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Each query returns up to 10 results. The tracker extracts the title, URL, and snippet from each result, deduplicates against what&amp;rsquo;s already in the database, and stores new finds.&lt;/p&gt;
&lt;h3 id=&#34;storage-airtable-as-the-source-of-truth&#34;&gt;Storage: Airtable as the Source of Truth&lt;/h3&gt;
&lt;p&gt;Why Airtable? Because it gives me a hosted database with a REST API plus a spreadsheet-like UI for manual review. When you&amp;rsquo;re building a pipeline that discovers data automatically, you want a way to eyeball the results and clean up noise, and Airtable makes that easy.&lt;/p&gt;
&lt;p&gt;The tracker writes five fields per record: &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;websiteUrl&lt;/code&gt;, &lt;code&gt;description&lt;/code&gt;, &lt;code&gt;Source Query&lt;/code&gt;, and &lt;code&gt;Date Found&lt;/code&gt;. That&amp;rsquo;s it. Just the raw discovery data. The structured details come later.&lt;/p&gt;
&lt;p&gt;The deduplication is URL-based — normalized and lowercased. If we&amp;rsquo;ve already stored &lt;code&gt;neurips.cc/2026&lt;/code&gt;, we don&amp;rsquo;t store it again even if it appears in a different search query.&lt;/p&gt;
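&lt;p&gt;As a rough illustration of URL-based deduplication, here is a minimal sketch. The helper names &lt;code&gt;normalize_url&lt;/code&gt; and &lt;code&gt;filter_new&lt;/code&gt; are hypothetical, and the exact normalization rules (dropping the scheme, a leading &lt;code&gt;www.&lt;/code&gt;, the query string, and any trailing slash) are assumptions rather than the tracker&amp;rsquo;s actual logic.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Reduce a URL to a lowercase host + path key for duplicate checks."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    # Scheme, query string, and fragment are ignored (an assumption).
    path = parts.path.rstrip("/").lower()
    return host + path


def filter_new(results, known_urls):
    """Return only the results whose normalized URL has not been seen yet."""
    seen = {normalize_url(u) for u in known_urls}
    fresh = []
    for result in results:
        key = normalize_url(result["url"])
        if key in seen:
            continue
        seen.add(key)  # also catches duplicates within the same batch
        fresh.append(result)
    return fresh
```

&lt;p&gt;Keeping the normalized keys in a set makes each lookup cheap, so the same check can run across every search query in a single pass.&lt;/p&gt;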
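&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; datetime &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; datetime, timezone
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;extract_conference_info&lt;/span&gt;(result, source_query):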
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;title&amp;#34;&lt;/span&gt;: result[&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;title&amp;#34;&lt;/span&gt;][:&lt;span style=&#34;color:#ae81ff&#34;&gt;200&lt;/span&gt;],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;websiteUrl&amp;#34;&lt;/span&gt;: result[&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;url&amp;#34;&lt;/span&gt;],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;description&amp;#34;&lt;/span&gt;: result[&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;snippet&amp;#34;&lt;/span&gt;][:&lt;span style=&#34;color:#ae81ff&#34;&gt;1000&lt;/span&gt;],
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Source Query&amp;#34;&lt;/span&gt;: source_query,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;Date Found&amp;#34;&lt;/span&gt;: datetime&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;now(timezone&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;utc)&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;strftime(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;%Y-%m-&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;%d&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;),
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    }
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After one run, the tracker had stored 87 unique conference records. The major venues (NeurIPS, ICML, CVPR, AAAI) appeared alongside smaller but interesting events such as the Quantum AI and NLP Conference, Deep Learning Indaba, and the Wharton Human-AI Research summit.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;layer-2-the-website--react--vite-on-netlify&#34;&gt;Layer 2: The Website — React + Vite on Netlify&lt;/h2&gt;
&lt;p&gt;The directory itself is a React app built with Vite and deployed on Netlify. It&amp;rsquo;s a single-page app with search, tag filtering, and individual event pages.&lt;/p&gt;
&lt;p&gt;The key architectural decision: &lt;strong&gt;data is fetched at build time, not runtime.&lt;/strong&gt; A prebuild script (&lt;code&gt;fetch-events.mjs&lt;/code&gt;) pulls conference data from the database and writes it to a &lt;code&gt;data.ts&lt;/code&gt; file that Vite bundles into the site. This means:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;No API keys exposed in the browser&lt;/li&gt;
&lt;li&gt;No CORS issues&lt;/li&gt;
&lt;li&gt;Instant page loads (data is already in the bundle)&lt;/li&gt;
&lt;li&gt;The deployed site keeps working even if Airtable is temporarily down&lt;/li&gt;
&lt;/ul&gt;
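&lt;p&gt;The &lt;code&gt;prebuild&lt;/code&gt; hook in &lt;code&gt;package.json&lt;/code&gt; makes this automatic, because npm-style script runners execute a script named &lt;code&gt;prebuild&lt;/code&gt; before &lt;code&gt;build&lt;/code&gt;:&lt;/p&gt;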
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-json&#34; data-lang=&#34;json&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;{
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;scripts&amp;#34;&lt;/span&gt;: {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;fetch-events&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bun scripts/fetch-events.mjs&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;prebuild&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;bun scripts/fetch-events.mjs&amp;#34;&lt;/span&gt;,
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#f92672&#34;&gt;&amp;#34;build&amp;#34;&lt;/span&gt;: &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;vite build&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  }
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Every time Netlify builds the site, it automatically fetches the latest data from Airtable. Fresh data on every deploy.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;the-middleman-problem-cutting-google-sheets&#34;&gt;The Middleman Problem: Cutting Google Sheets&lt;/h2&gt;
&lt;p&gt;Here&amp;rsquo;s where the story gets interesting.&lt;/p&gt;
&lt;p&gt;The original pipeline had an extra step: Airtable → Google Sheets → website. The &lt;code&gt;fetch-events.mjs&lt;/code&gt; script was pulling from a published Google Sheet CSV. Why? Because when I first prototyped the site, I started with a spreadsheet. It was quick and easy.&lt;/p&gt;
&lt;p&gt;But once the conference tracker was writing directly to Airtable, Google Sheets became a middleman with no purpose. Data had to be synced from Airtable to Sheets (manually or via Zapier), and that sync was another thing that could break.&lt;/p&gt;
&lt;p&gt;The fix was straightforward: teach &lt;code&gt;fetch-events.mjs&lt;/code&gt; to talk directly to the Airtable API.&lt;/p&gt;
&lt;h3 id=&#34;airtables-rest-api&#34;&gt;Airtable&amp;rsquo;s REST API&lt;/h3&gt;
&lt;p&gt;The Airtable API is clean. A single GET request returns records as JSON:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;url&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;URL&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;`https://api.airtable.com/v0/&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;baseId&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;tableId&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;`&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;fetch&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;url&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;toString&lt;/span&gt;(), {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#a6e22e&#34;&gt;headers&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;Authorization&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;`Bearer &lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;pat&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;`&lt;/span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;});
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;data&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;json&lt;/span&gt;();
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#75715e&#34;&gt;// data.records = [{ id, fields: { title, date, ... } }]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The one gotcha: Airtable returns at most 100 records per request. When more remain, the response includes an &lt;code&gt;offset&lt;/code&gt; token that you pass back on the next request:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-javascript&#34; data-lang=&#34;javascript&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;async&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;function&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;fetchFromAirtable&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;pat&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;baseId&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;tableId&lt;/span&gt;) {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;allRecords&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [];
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;let&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;null&lt;/span&gt;;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;do&lt;/span&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;url&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;URL&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;`https://api.airtable.com/v0/&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;baseId&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;/&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;tableId&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;`&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt;) &lt;span style=&#34;color:#a6e22e&#34;&gt;url&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;searchParams&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;set&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#39;offset&amp;#39;&lt;/span&gt;, &lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;fetch&lt;/span&gt;(&lt;span style=&#34;color:#a6e22e&#34;&gt;url&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;toString&lt;/span&gt;(), {
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;      &lt;span style=&#34;color:#a6e22e&#34;&gt;headers&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; { &lt;span style=&#34;color:#a6e22e&#34;&gt;Authorization&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;:&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;`Bearer &lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;pat&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;`&lt;/span&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    });
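&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; (&lt;span style=&#34;color:#f92672&#34;&gt;!&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;ok&lt;/span&gt;) &lt;span style=&#34;color:#66d9ef&#34;&gt;throw&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;new&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;Error&lt;/span&gt;(&lt;span style=&#34;color:#e6db74&#34;&gt;`Airtable request failed: HTTP &lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;${&lt;/span&gt;&lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;status&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;}&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;`&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;const&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;data&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;await&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;resp&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;json&lt;/span&gt;();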
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;allRecords&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;push&lt;/span&gt;(...&lt;span style=&#34;color:#a6e22e&#34;&gt;data&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;records&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;data&lt;/span&gt;.&lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;||&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;null&lt;/span&gt;;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  } &lt;span style=&#34;color:#66d9ef&#34;&gt;while&lt;/span&gt; (&lt;span style=&#34;color:#a6e22e&#34;&gt;offset&lt;/span&gt;);
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;  &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;allRecords&lt;/span&gt;;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;}
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;graceful-fallback&#34;&gt;Graceful Fallback&lt;/h3&gt;
&lt;p&gt;I kept the Google Sheets path as a fallback. The &lt;code&gt;main()&lt;/code&gt; function uses a priority chain:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Airtable&lt;/strong&gt; — if &lt;code&gt;AIRTABLE_PAT&lt;/code&gt;, &lt;code&gt;AIRTABLE_BASE_ID&lt;/code&gt;, &lt;code&gt;AIRTABLE_TABLE_ID&lt;/code&gt; are set&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Google Sheets&lt;/strong&gt; — if &lt;code&gt;GOOGLE_SHEET_CSV_URL&lt;/code&gt; is set&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fallback events&lt;/strong&gt; — hardcoded sample data so the build never fails&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This means you can&amp;rsquo;t break the site by misconfiguring a data source. The build always succeeds.&lt;/p&gt;
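&lt;p&gt;The selection order above can be sketched in a few lines. The real logic lives in &lt;code&gt;fetch-events.mjs&lt;/code&gt;; this Python version only illustrates the priority chain, and the function name &lt;code&gt;choose_data_source&lt;/code&gt; and its return labels are hypothetical.&lt;/p&gt;

```python
import os


def choose_data_source(env=None):
    """Pick the first configured source: Airtable, then Google Sheets, then samples."""
    env = os.environ if env is None else env
    airtable_keys = ("AIRTABLE_PAT", "AIRTABLE_BASE_ID", "AIRTABLE_TABLE_ID")
    # Airtable is used only when all three variables are present and non-empty.
    if all(env.get(key) for key in airtable_keys):
        return "airtable"
    if env.get("GOOGLE_SHEET_CSV_URL"):
        return "google_sheets"
    # Nothing configured: fall back to the hardcoded sample events.
    return "fallback"
```

&lt;p&gt;Passing the environment in as a plain dict keeps the decision easy to test without touching real credentials.&lt;/p&gt;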
&lt;hr&gt;
&lt;h2 id=&#34;layer-3-the-enrichment--ai-powered-data-extraction&#34;&gt;Layer 3: The Enrichment — AI-Powered Data Extraction&lt;/h2&gt;
&lt;p&gt;This is where things got really interesting.&lt;/p&gt;
&lt;p&gt;After cutting Google Sheets, I had 87 conference records in Airtable. But they only had three useful fields: title, description, and URL. No dates. No locations. No tags. The site worked, but every event card was sparse — no way to filter by date or location, no tags to browse by topic.&lt;/p&gt;
&lt;p&gt;Filling in 87 records by hand? No thanks.&lt;/p&gt;
&lt;h3 id=&#34;the-idea-visit-each-url-and-ask-ai-to-extract-the-data&#34;&gt;The Idea: Visit Each URL and Ask AI to Extract the Data&lt;/h3&gt;
&lt;p&gt;The approach: for each conference record, fetch its web page, extract the text content, and use AI inference to pull out structured fields like date, location, organizer, and tags.&lt;/p&gt;
&lt;p&gt;I built an enrichment script — &lt;code&gt;enrich_conferences.py&lt;/code&gt; — that sits alongside the tracker in the same project.&lt;/p&gt;
&lt;h3 id=&#34;step-1-fetch-and-clean-the-page&#34;&gt;Step 1: Fetch and Clean the Page&lt;/h3&gt;
&lt;p&gt;Each conference URL gets fetched with &lt;code&gt;requests&lt;/code&gt;, then cleaned with BeautifulSoup. Navigation, footers, scripts, and styling get stripped, leaving just the text content:&lt;/p&gt;
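&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; requests
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;from&lt;/span&gt; bs4 &lt;span style=&#34;color:#f92672&#34;&gt;import&lt;/span&gt; BeautifulSoup
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;fetch_page_text&lt;/span&gt;(url, timeout&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;15&lt;/span&gt;):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#75715e&#34;&gt;# `headers` is assumed to be a dict of request headers defined elsewhere in the script.&lt;/span&gt;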
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    resp &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; requests&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;get(url, headers&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;headers, timeout&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;timeout)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    soup &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; BeautifulSoup(resp&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;text, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;html.parser&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; tag &lt;span style=&#34;color:#f92672&#34;&gt;in&lt;/span&gt; soup([&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;script&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;style&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;nav&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;footer&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;header&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;aside&amp;#34;&lt;/span&gt;]):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        tag&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;decompose()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    text &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; soup&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;get_text(separator&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;\n&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;, strip&lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;True&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    lines &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; [line&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;strip() &lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; line &lt;span style=&#34;color:#f92672&#34;&gt;in&lt;/span&gt; text&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;splitlines() &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; line&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;strip()]
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#ae81ff&#34;&gt;\n&lt;/span&gt;&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&lt;/span&gt;&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;join(lines)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;step-2-ai-extraction-via-pai-inference&#34;&gt;Step 2: AI Extraction via PAI Inference&lt;/h3&gt;
&lt;p&gt;The cleaned text gets sent to Claude (via PAI&amp;rsquo;s Inference tool) with a structured extraction prompt. The prompt is specific about what to extract and what format to use:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;Given text from a conference web page, extract these fields as JSON:
{
  &amp;#34;date&amp;#34;: &amp;#34;human-readable date like &amp;#39;May 5-6, 2026&amp;#39;&amp;#34;,
  &amp;#34;endDate&amp;#34;: &amp;#34;ISO end date like &amp;#39;2026-05-06&amp;#39;&amp;#34;,
  &amp;#34;location&amp;#34;: &amp;#34;City, State/Country&amp;#34;,
  &amp;#34;venue&amp;#34;: &amp;#34;venue name&amp;#34;,
  &amp;#34;price&amp;#34;: &amp;#34;ticket price or &amp;#39;Free&amp;#39;&amp;#34;,
  &amp;#34;organizer&amp;#34;: &amp;#34;organizing body&amp;#34;,
  &amp;#34;tags&amp;#34;: &amp;#34;comma-separated topic tags (max 4)&amp;#34;
}
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;One critical addition: if the page is a &lt;strong&gt;list of conferences&lt;/strong&gt; (like &amp;ldquo;Top 10 AI Conferences of 2026&amp;rdquo;), the AI returns &lt;code&gt;{&amp;quot;is_list_page&amp;quot;: true}&lt;/code&gt; and the script skips it. This was essential — about 15% of our URLs were aggregator pages, not individual conference pages.&lt;/p&gt;
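&lt;p&gt;Handling the model&amp;rsquo;s reply might look like the sketch below. The function name &lt;code&gt;parse_extraction&lt;/code&gt; is hypothetical, and the assumption that the reply may wrap the JSON object in extra prose is mine; the real enrichment script may parse the response differently.&lt;/p&gt;

```python
import json


def parse_extraction(reply):
    """Parse the model reply; return None for list pages or unusable output."""
    # Tolerate text around the JSON by slicing from the first "{" to the last "}".
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end == -1:
        return None
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    # Aggregator pages are flagged by the prompt and skipped by the caller.
    if not isinstance(data, dict) or data.get("is_list_page"):
        return None
    return data
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for both list pages and malformed replies gives the caller a single &amp;ldquo;skip this record&amp;rdquo; signal.&lt;/p&gt;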
&lt;h3 id=&#34;step-3-write-back-to-airtable&#34;&gt;Step 3: Write Back to Airtable&lt;/h3&gt;
&lt;p&gt;Non-empty extracted fields get PATCHed back to Airtable. The script only writes fields that actually exist in the table schema — a lesson learned the hard way when &lt;code&gt;venue&lt;/code&gt; and &lt;code&gt;imageUrl&lt;/code&gt; threw 422 errors because those columns hadn&amp;rsquo;t been created yet.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; style=&#34;color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;&#34;&gt;&lt;code class=&#34;language-python&#34; data-lang=&#34;python&#34;&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;&lt;span style=&#34;color:#66d9ef&#34;&gt;def&lt;/span&gt; &lt;span style=&#34;color:#a6e22e&#34;&gt;build_patch_fields&lt;/span&gt;(extracted, allowed_fields):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; extracted&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;get(&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;is_list_page&amp;#34;&lt;/span&gt;):
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;None&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    patch &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; {}
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;for&lt;/span&gt; key &lt;span style=&#34;color:#f92672&#34;&gt;in&lt;/span&gt; [&lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;date&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;endDate&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;location&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;venue&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;price&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;organizer&amp;#34;&lt;/span&gt;, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;tags&amp;#34;&lt;/span&gt;]:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; key &lt;span style=&#34;color:#f92672&#34;&gt;not&lt;/span&gt; &lt;span style=&#34;color:#f92672&#34;&gt;in&lt;/span&gt; allowed_fields:
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            &lt;span style=&#34;color:#66d9ef&#34;&gt;continue&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        val &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; extracted&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;get(key, &lt;span style=&#34;color:#e6db74&#34;&gt;&amp;#34;&amp;#34;&lt;/span&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;        &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; isinstance(val, str) &lt;span style=&#34;color:#f92672&#34;&gt;and&lt;/span&gt; val&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;strip():
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;            patch[key] &lt;span style=&#34;color:#f92672&#34;&gt;=&lt;/span&gt; val&lt;span style=&#34;color:#f92672&#34;&gt;.&lt;/span&gt;strip()
&lt;/span&gt;&lt;/span&gt;&lt;span style=&#34;display:flex;&#34;&gt;&lt;span&gt;    &lt;span style=&#34;color:#66d9ef&#34;&gt;return&lt;/span&gt; patch &lt;span style=&#34;color:#66d9ef&#34;&gt;if&lt;/span&gt; patch &lt;span style=&#34;color:#66d9ef&#34;&gt;else&lt;/span&gt; &lt;span style=&#34;color:#66d9ef&#34;&gt;None&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id=&#34;the-results&#34;&gt;The Results&lt;/h3&gt;
&lt;p&gt;Running the enrichment script across all 87 records:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Outcome&lt;/th&gt;
          &lt;th&gt;Count&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Records enriched&lt;/td&gt;
          &lt;td&gt;48&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;List/aggregator pages (correctly skipped)&lt;/td&gt;
          &lt;td&gt;12&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;No extractable fields (social media, OpenReview, etc.)&lt;/td&gt;
          &lt;td&gt;11&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Errors (timeouts, HTTP 403s)&lt;/td&gt;
          &lt;td&gt;16&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;After enrichment:&lt;/p&gt;
&lt;table&gt;
  &lt;thead&gt;
      &lt;tr&gt;
          &lt;th&gt;Field&lt;/th&gt;
          &lt;th&gt;Records populated&lt;/th&gt;
      &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
      &lt;tr&gt;
          &lt;td&gt;Date&lt;/td&gt;
          &lt;td&gt;42&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Location&lt;/td&gt;
          &lt;td&gt;41&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Tags&lt;/td&gt;
          &lt;td&gt;47&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Organizer&lt;/td&gt;
          &lt;td&gt;27&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
          &lt;td&gt;Price&lt;/td&gt;
          &lt;td&gt;4&lt;/td&gt;
      &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;From zero structured data to a directory where roughly half the events have dates, locations, and topic tags, all without opening a single conference website manually.&lt;/p&gt;
&lt;p&gt;Some highlights from the extraction:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;NeurIPS 2026:&lt;/strong&gt; December 6-12, Sydney, Australia — Deep Learning, Research, Algorithms, LLMs&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CVPR 2026:&lt;/strong&gt; June 3-7, Denver, CO — Computer Vision, Deep Learning, Research&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ICML 2026:&lt;/strong&gt; July 6-11, Seoul, South Korea — LLMs, Computer Vision, NLP, Robotics&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AI Council 2026:&lt;/strong&gt; May 12-14, San Francisco, CA — Generative AI, ML Ops, AI Safety&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;MIDL 2026:&lt;/strong&gt; July 8-10, Taipei — Deep Learning, Healthcare AI, Computer Vision&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;the-pipeline-today&#34;&gt;The Pipeline Today&lt;/h2&gt;
&lt;p&gt;Here&amp;rsquo;s what the full system looks like now:&lt;/p&gt;
&lt;pre tabindex=&#34;0&#34;&gt;&lt;code&gt;SearXNG (self-hosted search)
  → conference_tracker.py (Python — discovers conferences)
    → Airtable (source of truth — 87 records)
      → enrich_conferences.py (Python — AI-powered field extraction)
        → Airtable (now with dates, locations, tags)
          → fetch-events.mjs (Node — build-time data fetch)
            → data.ts (bundled into the site)
              → React + Vite app on Netlify
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The tracker discovers. The enricher structures. The fetcher delivers. The site displays. Each piece runs independently and can be re-run at any time.&lt;/p&gt;
&lt;p&gt;The enrichment script is idempotent: it only processes records where the &lt;code&gt;date&lt;/code&gt; field is empty, so running it again only touches new or previously failed records.&lt;/p&gt;
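&lt;p&gt;The selection rule can be sketched as a small filter. This assumes records arrive in the Airtable REST shape, with an &lt;code&gt;id&lt;/code&gt; and a &lt;code&gt;fields&lt;/code&gt; dict from which empty cells are simply absent; the function names are hypothetical:&lt;/p&gt;

```python
def needs_enrichment(record):
    """True when the record has no usable date yet."""
    # Assumption: empty cells are missing from "fields" rather than set to "".
    value = record.get("fields", {}).get("date", "")
    return not (isinstance(value, str) and value.strip())


def pending_records(records):
    """Keep only the records a re-run should touch."""
    return [r for r in records if needs_enrichment(r)]
```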
&lt;hr&gt;
&lt;h2 id=&#34;what-id-do-differently-and-whats-next&#34;&gt;What I&amp;rsquo;d Do Differently (And What&amp;rsquo;s Next)&lt;/h2&gt;
&lt;h3 id=&#34;the-timeout-problem&#34;&gt;The Timeout Problem&lt;/h3&gt;
&lt;p&gt;Sixteen records failed, either by hitting the 25-second inference timeout or by returning HTTP 403. The fast tier (Haiku) is quick but occasionally chokes on pages with dense, complex content. A retry mechanism using the standard tier (Sonnet) for timed-out records would likely catch many of the timeouts; the 403s need a different fix.&lt;/p&gt;
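&lt;p&gt;One way such a retry could look, as a sketch only. &lt;code&gt;run_inference&lt;/code&gt; is a hypothetical callable standing in for whatever PAI exposes, and the tier names are placeholders:&lt;/p&gt;

```python
def extract_with_fallback(text, tiers, run_inference):
    """Try each tier in order; return (tier, result) for the first success."""
    last_error = None
    for tier in tiers:
        try:
            return tier, run_inference(text, tier)
        except TimeoutError as err:
            # Only timeouts escalate to the next tier; other errors propagate.
            last_error = err
    raise TimeoutError(f"all tiers timed out: {list(tiers)}") from last_error
```

&lt;p&gt;Calling it with &lt;code&gt;[&amp;quot;fast&amp;quot;, &amp;quot;standard&amp;quot;]&lt;/code&gt; keeps the cheap tier as the default and pays for the slower one only on records that need it.&lt;/p&gt;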
&lt;h3 id=&#34;missing-table-columns&#34;&gt;Missing Table Columns&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;venue&lt;/code&gt; and &lt;code&gt;imageUrl&lt;/code&gt; fields don&amp;rsquo;t exist in the Airtable table yet. The enrichment script extracts venue names beautifully (The Venetian for Ai4, COEX Convention Center for ICML, Dongguk University for AAAI Summer), but the data gets dropped because the columns aren&amp;rsquo;t there. A quick table schema update in the Airtable UI fixes this.&lt;/p&gt;
&lt;h3 id=&#34;scheduled-runs&#34;&gt;Scheduled Runs&lt;/h3&gt;
&lt;p&gt;Right now, both the tracker and enricher are manual. The natural next step is scheduling — run the tracker daily to discover new conferences, the enricher on new records, and trigger a Netlify deploy afterward. The Netlify build hook is already configured; it just needs a cron job or GitHub Action to call it.&lt;/p&gt;
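&lt;p&gt;The last step of a scheduled run could be as small as the sketch below. Netlify build hooks are triggered by a POST to a hook URL; the hook ID is a placeholder to be read from a secret, and the helper names are hypothetical:&lt;/p&gt;

```python
import urllib.request


def build_hook_request(hook_id):
    """Build (but do not send) the POST that triggers a Netlify deploy."""
    url = f"https://api.netlify.com/build_hooks/{hook_id}"
    # An empty JSON object as the body; the POST itself is what triggers the build.
    return urllib.request.Request(url, data=b"{}", method="POST")


def trigger_deploy(hook_id, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_hook_request(hook_id), timeout=timeout) as resp:
        return resp.status
```

&lt;p&gt;A cron job or GitHub Action would run the tracker, then the enricher, and call &lt;code&gt;trigger_deploy&lt;/code&gt; only if both succeeded.&lt;/p&gt;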
&lt;h3 id=&#34;data-quality&#34;&gt;Data Quality&lt;/h3&gt;
&lt;p&gt;Some records are noise — Reddit discussion threads, Amazon Science blog posts, Twitter/X profiles. A quality filter (either rule-based on URL patterns or AI-powered) would clean the dataset before enrichment runs.&lt;/p&gt;
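&lt;p&gt;A rule-based version of that filter might look like this. The blocklist is illustrative, drawn from the kinds of noise mentioned above, and is not the list the pipeline actually uses:&lt;/p&gt;

```python
from urllib.parse import urlparse

# Hypothetical blocklist of hosts that rarely serve individual event pages.
NOISE_HOSTS = {"reddit.com", "twitter.com", "x.com", "amazon.science"}


def is_noise_url(url):
    """True when the URL belongs to a blocked host or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Require an exact match or a dot boundary so lookalike domains pass through.
    return any(host == h or host.endswith("." + h) for h in NOISE_HOSTS)
```

&lt;p&gt;Running this before enrichment saves an inference call per filtered record, which matters more once the tracker runs daily.&lt;/p&gt;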
&lt;hr&gt;
&lt;h2 id=&#34;lessons-learned&#34;&gt;Lessons Learned&lt;/h2&gt;
&lt;h3 id=&#34;1-eliminate-middlemen-early&#34;&gt;1. Eliminate Middlemen Early&lt;/h3&gt;
&lt;p&gt;Google Sheets added zero value once Airtable was in the picture. But it lingered because it was the &amp;ldquo;original&amp;rdquo; approach. Every extra hop in a pipeline is a thing that can break, a thing that needs syncing, and a thing that slows you down. Cut it.&lt;/p&gt;
&lt;h3 id=&#34;2-build-time-data-fetching-is-underrated&#34;&gt;2. Build-Time Data Fetching Is Underrated&lt;/h3&gt;
&lt;p&gt;Pulling data at build time instead of runtime means no API keys in the browser, no loading spinners, and no CORS headaches. For data that changes daily (not per-second), this is the right architecture.&lt;/p&gt;
&lt;h3 id=&#34;3-ai-extraction-beats-manual-curation&#34;&gt;3. AI Extraction Beats Manual Curation&lt;/h3&gt;
&lt;p&gt;Using AI to extract structured data from unstructured web pages isn&amp;rsquo;t perfect — we got 48 out of 87 records enriched, not 87 out of 87. But it took 20 minutes of runtime versus what would have been hours of manual work. And the script is re-runnable. Improvement is incremental.&lt;/p&gt;
&lt;h3 id=&#34;4-detect-your-datas-shape-before-writing&#34;&gt;4. Detect Your Data&amp;rsquo;s Shape Before Writing&lt;/h3&gt;
&lt;p&gt;The Airtable 422 errors on &lt;code&gt;venue&lt;/code&gt; were entirely preventable. The enrichment script now probes the table schema at startup and only writes to fields that exist. Defensive coding at system boundaries saves debugging time.&lt;/p&gt;
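&lt;p&gt;A sketch of the schema step, assuming the table list has already been fetched as JSON in the shape shown in the comment (a list of tables, each with a list of named fields). How the real script probes the schema is not shown in this post, and the table name &lt;code&gt;Conferences&lt;/code&gt; used in the usage note is a placeholder:&lt;/p&gt;

```python
def allowed_fields_from_schema(schema, table_name):
    """Return the set of field names for one table in a base-schema response."""
    # Assumed shape: {"tables": [{"name": ..., "fields": [{"name": ...}, ...]}, ...]}
    for table in schema.get("tables", []):
        if table.get("name") == table_name:
            return {field["name"] for field in table.get("fields", [])}
    raise KeyError(f"table not found in schema: {table_name}")
```

&lt;p&gt;The resulting set, for example from &lt;code&gt;allowed_fields_from_schema(schema, &amp;quot;Conferences&amp;quot;)&lt;/code&gt;, can be passed straight to &lt;code&gt;build_patch_fields&lt;/code&gt; as &lt;code&gt;allowed_fields&lt;/code&gt;.&lt;/p&gt;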
&lt;h3 id=&#34;5-list-page-detection-is-essential-for-web-scraping-pipelines&#34;&gt;5. List Page Detection Is Essential for Web Scraping Pipelines&lt;/h3&gt;
&lt;p&gt;When you&amp;rsquo;re scraping URLs from search results, a significant percentage will be aggregator pages (&amp;ldquo;Top 10 Best AI Conferences&amp;rdquo;) rather than individual event pages. If you don&amp;rsquo;t detect and skip these, you&amp;rsquo;ll corrupt your dataset with merged data from multiple events. The &lt;code&gt;is_list_page&lt;/code&gt; flag in the AI extraction prompt was one of the highest-value additions to the whole pipeline.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;the-bigger-picture&#34;&gt;The Bigger Picture&lt;/h2&gt;
&lt;p&gt;This project is a miniature version of a pattern I keep coming back to: &lt;strong&gt;systems that compound.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The tracker runs once and discovers 87 conferences. The enricher runs once and structures 48 of them. The next time the tracker runs, it discovers only &lt;em&gt;new&lt;/em&gt; conferences (deduplication handles the rest). The next time the enricher runs, it only processes records it hasn&amp;rsquo;t touched yet.&lt;/p&gt;
&lt;p&gt;Every run makes the dataset better without redoing previous work. That&amp;rsquo;s the whole point of building infrastructure instead of doing things manually — you invest upfront so the system improves over time with minimal additional effort.&lt;/p&gt;
&lt;p&gt;Working with Claude through PAI made each layer come together faster than I expected. The tracker, the Airtable integration, the Google Sheets elimination, the enrichment script — each was a focused session where the AI handled the implementation details while I focused on architecture decisions.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s the augmented part of Augmented Resilience. Not replacing the thinking — amplifying it.&lt;/p&gt;
</content>
    </item>
    
  </channel>
</rss>
