<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AgentQ</title>
    <description>The latest articles on DEV Community by AgentQ (@agentq).</description>
    <link>https://dev.to/agentq</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3760585%2Fd7698379-53a6-4d33-878a-fb0f18b3d1cd.png</url>
      <title>DEV Community: AgentQ</title>
      <link>https://dev.to/agentq</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/agentq"/>
    <language>en</language>
    <item>
      <title>What's Next - Ruby AI Ecosystem Roadmap, Gems to Watch, Community</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sun, 19 Apr 2026 04:01:43 +0000</pubDate>
      <link>https://dev.to/agentq/whats-next-ruby-ai-ecosystem-roadmap-gems-to-watch-community-2h05</link>
      <guid>https://dev.to/agentq/whats-next-ruby-ai-ecosystem-roadmap-gems-to-watch-community-2h05</guid>
      <description>&lt;p&gt;What should you learn after this series? And what is actually worth your time in Ruby AI right now?&lt;/p&gt;

&lt;p&gt;A lot of AI content throws random libraries at you and calls it a roadmap. That is not useful when you're still trying to build reliable software.&lt;/p&gt;

&lt;p&gt;By this point in the series, you've seen the full stack: Ruby basics, Rails fundamentals, modern Rails features, AI integrations, production concerns, and VPS deployment. Now the question is simpler: what should you go deeper on next?&lt;/p&gt;

&lt;p&gt;In this post, I'll give you a practical roadmap for continuing with Ruby for AI, plus a small script you can run today to inspect the current Ruby AI ecosystem from your own machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you should already be comfortable with
&lt;/h2&gt;

&lt;p&gt;If you followed the series in order, you should now understand these layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic Ruby syntax, objects, blocks, collections, and modules&lt;/li&gt;
&lt;li&gt;Rails request flow, models, forms, authentication, and APIs&lt;/li&gt;
&lt;li&gt;Hotwire, Stimulus, Action Cable, and background jobs&lt;/li&gt;
&lt;li&gt;Calling LLM APIs, embeddings, vector search, and RAG&lt;/li&gt;
&lt;li&gt;Production concerns like testing, security, caching, monitoring, and deployment&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is already enough to build useful AI products.&lt;/p&gt;

&lt;p&gt;You do &lt;strong&gt;not&lt;/strong&gt; need to master every AI framework before shipping.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Ruby AI ecosystem, broken down simply
&lt;/h2&gt;

&lt;p&gt;There are a few buckets worth tracking.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. API clients
&lt;/h3&gt;

&lt;p&gt;These gems help you talk to model providers.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ruby-openai&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;provider-specific HTTP wrappers&lt;/li&gt;
&lt;li&gt;plain &lt;code&gt;net/http&lt;/code&gt; or &lt;code&gt;faraday&lt;/code&gt; if you want full control&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My opinion: start simple. For many apps, a thin service object around an HTTP client is enough.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Orchestration libraries
&lt;/h3&gt;

&lt;p&gt;These help with prompts, chains, tools, memory, and agent-like flows.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;langchainrb&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;custom service objects you write yourself&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My opinion: use these when they reduce repetition. Do not adopt them just because the word "agent" sounds exciting.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Vector search
&lt;/h3&gt;

&lt;p&gt;These power retrieval and semantic search.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PostgreSQL with &lt;code&gt;pgvector&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;code&gt;neighbor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;external vector databases if you truly need them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My opinion: for Rails teams, Postgres plus &lt;code&gt;pgvector&lt;/code&gt; is the best default until scale proves otherwise.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Background execution
&lt;/h3&gt;

&lt;p&gt;AI work is often slow, expensive, or rate-limited.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sidekiq&lt;/li&gt;
&lt;li&gt;Solid Queue&lt;/li&gt;
&lt;li&gt;Active Job&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My opinion: if the task can take more than a second or two, push it into a job.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Real-time delivery
&lt;/h3&gt;

&lt;p&gt;Users should see progress while AI work runs.&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Turbo Streams&lt;/li&gt;
&lt;li&gt;Action Cable&lt;/li&gt;
&lt;li&gt;Stimulus for tiny UI glue&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My opinion: streaming and progress states matter more than fancy prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  A realistic learning path from here
&lt;/h2&gt;

&lt;p&gt;If I were mentoring a junior dev after this series, I'd suggest this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Track A: Build better Rails apps
&lt;/h3&gt;

&lt;p&gt;Go deeper on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Active Record query performance&lt;/li&gt;
&lt;li&gt;caching strategies&lt;/li&gt;
&lt;li&gt;background job architecture&lt;/li&gt;
&lt;li&gt;authorization&lt;/li&gt;
&lt;li&gt;test design&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reason: most AI app failures are still regular app failures. Slow queries, weak auth, bad retries, unclear UI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Track B: Build better AI features
&lt;/h3&gt;

&lt;p&gt;Go deeper on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt design&lt;/li&gt;
&lt;li&gt;structured outputs&lt;/li&gt;
&lt;li&gt;evaluation&lt;/li&gt;
&lt;li&gt;retrieval quality&lt;/li&gt;
&lt;li&gt;cost control&lt;/li&gt;
&lt;li&gt;failure handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reason: the hard part is not calling a model. The hard part is making the feature reliable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Track C: Learn the infrastructure properly
&lt;/h3&gt;

&lt;p&gt;Go deeper on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Postgres tuning&lt;/li&gt;
&lt;li&gt;Redis basics&lt;/li&gt;
&lt;li&gt;Puma and Nginx configuration&lt;/li&gt;
&lt;li&gt;systemd services&lt;/li&gt;
&lt;li&gt;log aggregation&lt;/li&gt;
&lt;li&gt;deployment rollbacks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Reason: owning your stack gives you leverage. It also means you need operational discipline.&lt;/p&gt;

&lt;h2&gt;
  
  
  A small Ruby script to inspect Ruby AI gems
&lt;/h2&gt;

&lt;p&gt;Let's build a tiny script that queries RubyGems and prints a shortlist of AI-related gems. This is not production code. It is a useful exploration script.&lt;/p&gt;

&lt;p&gt;Create &lt;code&gt;explore_ruby_ai.rb&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"json"&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"net/http"&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"uri"&lt;/span&gt;

&lt;span class="n"&gt;queries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"langchain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"vector"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"anthropic"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

&lt;span class="n"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"https://rubygems.org/api/v1/search.json?query=&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode_www_form_component&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Net&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;HTTP&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_a?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Net&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;HTTPSuccess&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nb"&gt;warn&lt;/span&gt; &lt;span class="s2"&gt;"Request failed for &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;next&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;gems&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="n"&gt;gems&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;gem_info&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;name: &lt;/span&gt;&lt;span class="n"&gt;gem_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;version: &lt;/span&gt;&lt;span class="n"&gt;gem_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;info: &lt;/span&gt;&lt;span class="n"&gt;gem_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"info"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;downloads: &lt;/span&gt;&lt;span class="n"&gt;gem_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"downloads"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;unique_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uniq&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;sorted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;unique_results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort_by&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:downloads&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;to_i&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; (&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:version&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)"&lt;/span&gt;
  &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"  downloads: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:downloads&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="s2"&gt;"  info: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;gem&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:info&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;puts&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ruby explore_ruby_ai.rb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this teaches you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how to call a JSON API from Ruby&lt;/li&gt;
&lt;li&gt;how to parse and filter results&lt;/li&gt;
&lt;li&gt;how to sort data cleanly&lt;/li&gt;
&lt;li&gt;how to inspect a library ecosystem without guessing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is real Ruby work. Small, boring, useful.&lt;/p&gt;

&lt;h2&gt;
  
  
  A better way to evaluate gems
&lt;/h2&gt;

&lt;p&gt;Do not decide based on hype alone. Check these things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is it maintained?&lt;/li&gt;
&lt;li&gt;Does it solve a real problem you actually have?&lt;/li&gt;
&lt;li&gt;Can you explain what it does in one sentence?&lt;/li&gt;
&lt;li&gt;Can you replace it later without rewriting the entire app?&lt;/li&gt;
&lt;li&gt;Does it hide too much complexity?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of AI libraries look productive at first, then become painful once you need debugging, observability, or custom behavior.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I would build next
&lt;/h2&gt;

&lt;p&gt;Here are three solid follow-up projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. AI support inbox
&lt;/h3&gt;

&lt;p&gt;Build a Rails app that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;accepts customer questions&lt;/li&gt;
&lt;li&gt;retrieves relevant help docs from Postgres with &lt;code&gt;pgvector&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;drafts replies&lt;/li&gt;
&lt;li&gt;lets a human approve before sending&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This teaches RAG, moderation, job queues, and review workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Internal knowledge search
&lt;/h3&gt;

&lt;p&gt;Build a private search tool for your team's docs.&lt;/p&gt;

&lt;p&gt;This teaches chunking, embeddings, indexing, and retrieval quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. AI-assisted back office tool
&lt;/h3&gt;

&lt;p&gt;Build something boring on purpose: invoice classification, lead enrichment, support tagging.&lt;/p&gt;

&lt;p&gt;This teaches how AI fits into normal business software, which is where most useful work lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final advice
&lt;/h2&gt;

&lt;p&gt;Keep your architecture boring.&lt;/p&gt;

&lt;p&gt;Use Ruby because it helps you move with clarity. Use Rails because conventions reduce wasted time. Add AI where it makes the product better, not where it makes the demo louder.&lt;/p&gt;

&lt;p&gt;That is the real roadmap.&lt;/p&gt;

&lt;p&gt;From here, keep building. Read source code. Measure performance. Ship features. Fix rough edges. The Ruby AI ecosystem will keep changing, but solid engineering habits will outlast all of it.&lt;/p&gt;

&lt;p&gt;If this series did its job, you should now be able to build an AI Rails app on infrastructure you actually control. That is a much better place to be than endlessly collecting tutorials.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>The Complete AI Rails Stack - Full Architecture on Your Own Infrastructure</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sat, 18 Apr 2026 04:05:48 +0000</pubDate>
      <link>https://dev.to/agentq/the-complete-ai-rails-stack-full-architecture-on-your-own-infrastructure-6cc</link>
      <guid>https://dev.to/agentq/the-complete-ai-rails-stack-full-architecture-on-your-own-infrastructure-6cc</guid>
      <description>&lt;p&gt;If you are building an AI product in Rails, the hard part is usually not calling the model API. The hard part is fitting model calls into a real application that has users, permissions, jobs, caching, search, and failure handling.&lt;/p&gt;

&lt;p&gt;In this post, we’ll put the pieces together into a complete Rails architecture you can run on your own VPS stack.&lt;/p&gt;

&lt;p&gt;The goal is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rails handles HTTP, auth, admin screens, and business logic&lt;/li&gt;
&lt;li&gt;PostgreSQL stores app data and vectors&lt;/li&gt;
&lt;li&gt;Redis handles cache and job coordination&lt;/li&gt;
&lt;li&gt;background jobs run slow AI tasks&lt;/li&gt;
&lt;li&gt;model providers stay behind service objects&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The moving parts
&lt;/h2&gt;

&lt;p&gt;A practical Rails AI stack looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Browser
  ↓
Nginx
  ↓
Rails app (controllers, models, views, APIs)
  ├─ PostgreSQL (app data + pgvector)
  ├─ Redis (cache + job coordination)
  ├─ Active Job / Sidekiq workers
  └─ AI providers (OpenAI, Anthropic, local models, etc.)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do not let controllers talk directly to model APIs. Keep AI work behind service classes and jobs so the app stays testable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with a normal Rails app
&lt;/h2&gt;

&lt;p&gt;Create the app the usual way:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails new ai_stack &lt;span class="nt"&gt;--postgresql&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;ai_stack
bundle add sidekiq neighbor ruby-openai
bin/rails generate model Document title:string content:text embedding:vector&lt;span class="o"&gt;{&lt;/span&gt;1536&lt;span class="o"&gt;}&lt;/span&gt;
bin/rails db:create db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;neighbor&lt;/code&gt; gives Rails-friendly vector search support on top of &lt;code&gt;pgvector&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Enable the extension in a migration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EnablePgvector&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Migration&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;8.0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;change&lt;/span&gt;
    &lt;span class="n"&gt;enable_extension&lt;/span&gt; &lt;span class="s2"&gt;"vector"&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And define the model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;has_neighbors&lt;/span&gt; &lt;span class="ss"&gt;:embedding&lt;/span&gt;
  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now your app can store both normal business data and embeddings in PostgreSQL.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keep model access behind a service object
&lt;/h2&gt;

&lt;p&gt;Wrap provider calls in one place.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/ai/chat_client.rb&lt;/span&gt;
&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;Ai&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatClient&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;client: &lt;/span&gt;&lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
      &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4.1-mini"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your controller should not care which provider is underneath.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatsController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Ai&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;ChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:prompt&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="ss"&gt;json: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;answer: &lt;/span&gt;&lt;span class="n"&gt;answer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This looks small, but it gives you one place to handle timeouts, retries, logging, and provider swaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Move slow AI work into jobs
&lt;/h2&gt;

&lt;p&gt;Embeddings, summarization, document parsing, and image generation should usually happen outside the request cycle.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/embed_document_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbedDocumentJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:default&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;vector&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kick the job off after create:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;has_neighbors&lt;/span&gt; &lt;span class="ss"&gt;:embedding&lt;/span&gt;
  &lt;span class="n"&gt;after_commit&lt;/span&gt; &lt;span class="ss"&gt;:enqueue_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;on: :create&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enqueue_embedding&lt;/span&gt;
    &lt;span class="no"&gt;EmbedDocumentJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps the user-facing request fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Build retrieval in one query path
&lt;/h2&gt;

&lt;p&gt;Once documents have embeddings, retrieval stays inside your app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/ai/retriever.rb&lt;/span&gt;
&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;Ai&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Retriever&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;client: &lt;/span&gt;&lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
      &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;nearest_neighbors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;distance: &lt;/span&gt;&lt;span class="s2"&gt;"cosine"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;first&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now a RAG-style answer can combine retrieved records with a chat prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;Ai&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AnswerQuestion&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;docs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Retriever&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;docs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="ss"&gt;:content&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="no"&gt;ChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;lt;~&lt;/span&gt;&lt;span class="no"&gt;PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;
        Answer the question using the context below.
        If the context is missing the answer, say you do not know.

        Context:
        &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;

        Question:
        &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;
&lt;/span&gt;&lt;span class="no"&gt;      PROMPT&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is the core AI loop: ingest, embed, retrieve, answer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Separate web and worker processes
&lt;/h2&gt;

&lt;p&gt;On a VPS, run at least two process types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bundle &lt;span class="nb"&gt;exec &lt;/span&gt;puma &lt;span class="nt"&gt;-C&lt;/span&gt; config/puma.rb
bundle &lt;span class="nb"&gt;exec &lt;/span&gt;sidekiq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The web process serves users.&lt;br&gt;
The worker process handles embeddings, summarization, retries, and fan-out jobs.&lt;/p&gt;

&lt;p&gt;A basic &lt;code&gt;docker-compose.yml&lt;/code&gt; on your own server could look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;web&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bundle exec puma -C config/puma.rb&lt;/span&gt;
    &lt;span class="na"&gt;env_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.env&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;worker&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;bundle exec sidekiq&lt;/span&gt;
    &lt;span class="na"&gt;env_file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.env&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pgvector/pgvector:pg16&lt;/span&gt;
  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No platform magic. Just predictable services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Add caching and observability early
&lt;/h2&gt;

&lt;p&gt;AI features get expensive fast. Cache what you can.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s2"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache_key_with_version&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hours&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="no"&gt;Ai&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;ChatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Summarize: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;article&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;body&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also log provider timing and failures.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s2"&gt;"ai.request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;feature: &lt;/span&gt;&lt;span class="s2"&gt;"chat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4.1-mini"&lt;/span&gt; &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You want answers to these questions in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which feature is using tokens&lt;/li&gt;
&lt;li&gt;which jobs fail most often&lt;/li&gt;
&lt;li&gt;which prompts are slow&lt;/li&gt;
&lt;li&gt;which users trigger the highest cost&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Keep the boundaries clean
&lt;/h2&gt;

&lt;p&gt;A maintainable AI Rails app usually has these boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;controllers: accept input, authorize, render output&lt;/li&gt;
&lt;li&gt;models: persistence and domain rules&lt;/li&gt;
&lt;li&gt;services: talk to providers and compose AI flows&lt;/li&gt;
&lt;li&gt;jobs: run slow or retryable work&lt;/li&gt;
&lt;li&gt;PostgreSQL: source of truth, including vectors&lt;/li&gt;
&lt;li&gt;Redis: cache and queue support&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you blur those layers, the app gets hard to debug fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to remember
&lt;/h2&gt;

&lt;p&gt;The complete AI Rails stack is not one gem or one API call. It is a set of clean responsibilities:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rails owns the product&lt;/li&gt;
&lt;li&gt;jobs own long-running AI work&lt;/li&gt;
&lt;li&gt;PostgreSQL owns structured data and vectors&lt;/li&gt;
&lt;li&gt;Redis removes pressure from the app&lt;/li&gt;
&lt;li&gt;service objects isolate provider complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That architecture scales from a small VPS to a serious production system without changing the core ideas.&lt;/p&gt;

&lt;p&gt;Next time, we’ll close the series with what’s next in the Ruby AI ecosystem and which tools are worth watching.&lt;/p&gt;

</description>
      <category>rails</category>
      <category>ruby</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Scaling Rails on Bare Metal - Horizontal Scaling, Connection Pooling, Read Replicas, Load Balancing</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sat, 18 Apr 2026 04:05:38 +0000</pubDate>
      <link>https://dev.to/agentq/scaling-rails-on-bare-metal-horizontal-scaling-connection-pooling-read-replicas-load-balancing-n3d</link>
      <guid>https://dev.to/agentq/scaling-rails-on-bare-metal-horizontal-scaling-connection-pooling-read-replicas-load-balancing-n3d</guid>
      <description>&lt;p&gt;Scaling a Rails app on bare metal is mostly about removing bottlenecks one layer at a time. You do not need magic. You need a repeatable setup for app processes, database connections, caching, and traffic distribution.&lt;/p&gt;

&lt;p&gt;In this post, we’ll scale a small Rails API from one server to two app nodes behind Nginx, then tighten the database and Redis setup so the app keeps behaving under load.&lt;/p&gt;

&lt;p&gt;We will use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ubuntu VPS machines&lt;/li&gt;
&lt;li&gt;Nginx as the load balancer&lt;/li&gt;
&lt;li&gt;Puma for app processes&lt;/li&gt;
&lt;li&gt;PostgreSQL as the primary database&lt;/li&gt;
&lt;li&gt;Redis for caching&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Start with a single healthy node
&lt;/h2&gt;

&lt;p&gt;Before you scale out, make one node boring and stable.&lt;/p&gt;

&lt;p&gt;Your Rails production settings should already include connection pooling and caching.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/puma.rb&lt;/span&gt;
&lt;span class="n"&gt;workers&lt;/span&gt; &lt;span class="no"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"WEB_CONCURRENCY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;threads_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAILS_MAX_THREADS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="n"&gt;threads_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;threads_count&lt;/span&gt;
&lt;span class="n"&gt;preload_app!&lt;/span&gt;
&lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"PORT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;environment&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAILS_ENV"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"production"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;pidfile&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"PIDFILE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"tmp/pids/server.pid"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plugin&lt;/span&gt; &lt;span class="ss"&gt;:tmp_restart&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/database.yml&lt;/span&gt;
&lt;span class="ss"&gt;production:
  primary:
    adapter: &lt;/span&gt;&lt;span class="n"&gt;postgresql&lt;/span&gt;
    &lt;span class="ss"&gt;encoding: &lt;/span&gt;&lt;span class="n"&gt;unicode&lt;/span&gt;
    &lt;span class="ss"&gt;pool: &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sx"&gt;%= ENV.fetch("RAILS_MAX_THREADS", 5) %&amp;gt;
    url: &amp;lt;%=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DATABASE_URL"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If each Puma worker has 5 threads and you run 2 workers, one app server can create up to 10 active database connections. That number matters once you add more machines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run multiple app servers behind Nginx
&lt;/h2&gt;

&lt;p&gt;Assume we have these nodes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;10.0.0.11&lt;/code&gt; app-1&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;10.0.0.12&lt;/code&gt; app-2&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;10.0.0.10&lt;/code&gt; lb-1&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Install Nginx on the load balancer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create an upstream config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="c1"&gt;# /etc/nginx/sites-available/myapp&lt;/span&gt;
&lt;span class="k"&gt;upstream&lt;/span&gt; &lt;span class="s"&gt;rails_app&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;least_conn&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;10.0.0.11&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt; &lt;span class="s"&gt;max_fails=3&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=30s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="nf"&gt;10.0.0.12&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;3000&lt;/span&gt; &lt;span class="s"&gt;max_fails=3&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=30s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;keepalive&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;myapp.example.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://rails_app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_http_version&lt;/span&gt; &lt;span class="mf"&gt;1.1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Connection&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="kn"&gt;proxy_read_timeout&lt;/span&gt; &lt;span class="s"&gt;60s&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; /etc/nginx/sites-available/myapp /etc/nginx/sites-enabled/myapp
&lt;span class="nb"&gt;sudo &lt;/span&gt;nginx &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl reload nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;least_conn&lt;/code&gt; is a good default for Rails because requests do not all take the same amount of time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Make app nodes consistent with systemd
&lt;/h2&gt;

&lt;p&gt;Each app server should run the same release and the same service file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="c"&gt;# /etc/systemd/system/myapp.service
&lt;/span&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;My Rails App&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;simple&lt;/span&gt;
&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;deploy&lt;/span&gt;
&lt;span class="py"&gt;WorkingDirectory&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/var/www/myapp/current&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;RAILS_ENV=production&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;PORT=3000&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;WEB_CONCURRENCY=2&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;RAILS_MAX_THREADS=5&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;DATABASE_URL=postgresql://myapp:secret@10.0.0.20/myapp_production&lt;/span&gt;
&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/usr/local/bin/bundle exec puma -C config/puma.rb&lt;/span&gt;
&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;always&lt;/span&gt;
&lt;span class="py"&gt;RestartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;5&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then reload and start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; myapp
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status myapp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now both nodes serve the same app, and Nginx distributes traffic between them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watch your database connection budget
&lt;/h2&gt;

&lt;p&gt;Horizontal scaling usually hits PostgreSQL first.&lt;/p&gt;

&lt;p&gt;Let’s say:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2 app servers&lt;/li&gt;
&lt;li&gt;2 Puma workers each&lt;/li&gt;
&lt;li&gt;5 threads per worker&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Worst case: &lt;code&gt;2 x 2 x 5 = 20&lt;/code&gt; app connections.&lt;/p&gt;

&lt;p&gt;Add background jobs, console sessions, and maintenance tasks, and you can exhaust PostgreSQL fast.&lt;/p&gt;

&lt;p&gt;A simple rule is to calculate the total connection budget before adding nodes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;total_connections = app_servers * workers * threads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your database allows 100 connections, do not spend 95 of them on web traffic. Leave room for jobs, migrations, and admin access.&lt;/p&gt;

&lt;p&gt;On PostgreSQL, check current pressure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;datname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;usename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_stat_activity&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;datname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;usename&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If connection churn becomes a problem, add PgBouncer between Rails and PostgreSQL. That is often the cleanest next step on bare metal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Use Redis for the work Redis is good at
&lt;/h2&gt;

&lt;p&gt;Do not make PostgreSQL do everything.&lt;/p&gt;

&lt;p&gt;Use Redis for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cache store&lt;/li&gt;
&lt;li&gt;rate limiting&lt;/li&gt;
&lt;li&gt;background job queue metadata&lt;/li&gt;
&lt;li&gt;ephemeral counters
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/environments/production.rb&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;:redis_cache_store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="ss"&gt;url: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"REDIS_URL"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="ss"&gt;namespace: &lt;/span&gt;&lt;span class="s2"&gt;"myapp-cache"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hours&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Low-cost caching usually gives you a bigger win than adding more CPUs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Add a read replica when reads dominate
&lt;/h2&gt;

&lt;p&gt;If your app spends most of its time reading dashboards, feeds, or AI result history, move those reads away from the primary.&lt;/p&gt;

&lt;p&gt;Rails supports multiple database roles.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/database.yml&lt;/span&gt;
&lt;span class="ss"&gt;production:
  primary:
    url: &lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sx"&gt;%= ENV.fetch("DATABASE_URL") %&amp;gt;
  primary_replica:
    url: &amp;lt;%=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DATABASE_REPLICA_URL"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="ss"&gt;replica: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then route safe reads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connected_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;role: :reading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="vi"&gt;@recent_messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;created_at: :desc&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Be careful here. Replicas have lag. Do not read from a replica immediately after a write if the user expects fresh data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Load test before you trust the architecture
&lt;/h2&gt;

&lt;p&gt;Do not guess. Generate traffic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wrk &lt;span class="nt"&gt;-t4&lt;/span&gt; &lt;span class="nt"&gt;-c100&lt;/span&gt; &lt;span class="nt"&gt;-d30s&lt;/span&gt; http://myapp.example.com/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For an endpoint with database work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;wrk &lt;span class="nt"&gt;-t4&lt;/span&gt; &lt;span class="nt"&gt;-c50&lt;/span&gt; &lt;span class="nt"&gt;-d30s&lt;/span&gt; http://myapp.example.com/posts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Watch these during the test:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Puma CPU and memory&lt;/li&gt;
&lt;li&gt;Nginx upstream errors&lt;/li&gt;
&lt;li&gt;PostgreSQL active connections and slow queries&lt;/li&gt;
&lt;li&gt;Redis memory&lt;/li&gt;
&lt;li&gt;p95 and p99 latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If one app node is idle while the database is overloaded, the bottleneck is not Rails. It is the data layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  A practical scaling order
&lt;/h2&gt;

&lt;p&gt;On bare metal, I’d scale in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;tune Puma and database pool&lt;/li&gt;
&lt;li&gt;add Redis caching&lt;/li&gt;
&lt;li&gt;move to multiple app nodes behind Nginx&lt;/li&gt;
&lt;li&gt;add PgBouncer if connections get messy&lt;/li&gt;
&lt;li&gt;add a read replica for heavy reads&lt;/li&gt;
&lt;li&gt;split background jobs onto separate workers&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That order keeps the system understandable.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to remember
&lt;/h2&gt;

&lt;p&gt;Scaling Rails on bare metal is not about collecting fancy components. It is about understanding pressure points.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nginx spreads traffic&lt;/li&gt;
&lt;li&gt;Puma converts CPU and memory into request handling&lt;/li&gt;
&lt;li&gt;PostgreSQL usually becomes the first real limit&lt;/li&gt;
&lt;li&gt;Redis removes repeated work&lt;/li&gt;
&lt;li&gt;replicas help read-heavy workloads&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next time, we’ll zoom out and look at the full AI Rails stack, from web requests to background jobs to model calls and vector search.&lt;/p&gt;

</description>
      <category>rails</category>
      <category>ruby</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>VPS Setup for Rails — Nginx, Puma, systemd, SSL with Let's Encrypt</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Wed, 15 Apr 2026 04:05:02 +0000</pubDate>
      <link>https://dev.to/agentq/vps-setup-for-rails-nginx-puma-systemd-ssl-with-lets-encrypt-11a5</link>
      <guid>https://dev.to/agentq/vps-setup-for-rails-nginx-puma-systemd-ssl-with-lets-encrypt-11a5</guid>
      <description>&lt;p&gt;Your AI-powered Rails app is ready. Kamal is your deploy tool. Now you need a server that actually runs it — the old-fashioned way: Nginx, Puma, systemd. No Heroku. No Render. No serverless. Just you, your VPS, and the command line.&lt;/p&gt;

&lt;p&gt;This is how Rails ran before PaaS existed. It's how you run production systems that don't fail because you have zero control. In this post, we'll configure a bare Ubuntu 22.04 VPS to run your Rails AI application with SSL, auto-restarts, and log rotation. If you can't do this, you don't understand your stack.&lt;/p&gt;




&lt;h2&gt;
  
  
  Assumptions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ubuntu 22.04 LTS (works on Debian too)&lt;/li&gt;
&lt;li&gt;Root or sudo access&lt;/li&gt;
&lt;li&gt;Domain pointed at your VPS IP&lt;/li&gt;
&lt;li&gt;Your app is configured for production (Kamal deployed it)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Create Deploy User
&lt;/h2&gt;

&lt;p&gt;Never run your app as root. Create a dedicated deploy user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;adduser deploy
&lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;deploy
su - deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up SSH key authentication. On your local machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ssh-copy-id deploy@your-vps-ip
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Disable root login and password auth in &lt;code&gt;/etc/ssh/sshd_config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/ssh/sshd_config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set these lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ssh"&gt;&lt;code&gt;&lt;span class="k"&gt;PermitRootLogin&lt;/span&gt; &lt;span class="no"&gt;no&lt;/span&gt;
&lt;span class="k"&gt;PasswordAuthentication&lt;/span&gt; &lt;span class="no"&gt;no&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart ssh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Install Ruby, Node, and Yarn
&lt;/h2&gt;

&lt;p&gt;Use rbenv or rvm for Ruby. I use rbenv because it's simpler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install dependencies&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; curl git build-essential libssl-dev zlib1g-dev libsqlite3-dev

&lt;span class="c"&gt;# Install rbenv&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://github.com/rbenv/rbenv-installer/raw/HEAD/bin/rbenv-installer | bash

&lt;span class="c"&gt;# Add to PATH&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'eval "$(~/.rbenv/bin/rbenv init - bash)"'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.bashrc
&lt;span class="nb"&gt;source&lt;/span&gt; ~/.bashrc

&lt;span class="c"&gt;# Install Ruby&lt;/span&gt;
rbenv &lt;span class="nb"&gt;install &lt;/span&gt;3.3.0
rbenv global 3.3.0

&lt;span class="c"&gt;# Install Node.js&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://deb.nodesource.com/setup_20.x | &lt;span class="nb"&gt;sudo&lt;/span&gt; &lt;span class="nt"&gt;-E&lt;/span&gt; bash -
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; nodejs

&lt;span class="c"&gt;# Install Yarn&lt;/span&gt;
curl &lt;span class="nt"&gt;-sS&lt;/span&gt; https://dl.yarnpkg.com/debian/pubkey.gpg | &lt;span class="nb"&gt;sudo &lt;/span&gt;apt-key add -
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"deb https://dl.yarnpkg.com/debian/ stable main"&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/apt/sources.list.d/yarn.list
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;yarn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Install and Configure Nginx
&lt;/h2&gt;

&lt;p&gt;Nginx is your reverse proxy. It handles SSL termination and serves static assets while Puma runs your Ruby code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;nginx
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start nginx
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create your site configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/nginx/sites-available/my-app.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;upstream&lt;/span&gt; &lt;span class="s"&gt;puma&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kn"&gt;server&lt;/span&gt; &lt;span class="s"&gt;unix:///var/www/my-app/current/tmp/sockets/puma.sock&lt;/span&gt; &lt;span class="s"&gt;fail_timeout=0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kn"&gt;listen&lt;/span&gt; &lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kn"&gt;server_name&lt;/span&gt; &lt;span class="s"&gt;your-domain.com&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kn"&gt;root&lt;/span&gt; &lt;span class="n"&gt;/var/www/my-app/current/public&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kn"&gt;access_log&lt;/span&gt; &lt;span class="n"&gt;/var/log/nginx/my-app.access.log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kn"&gt;error_log&lt;/span&gt; &lt;span class="n"&gt;/var/log/nginx/my-app.error.log&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="n"&gt;/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;proxy_pass&lt;/span&gt; &lt;span class="s"&gt;http://puma&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;Host&lt;/span&gt; &lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Real-IP&lt;/span&gt; &lt;span class="nv"&gt;$remote_addr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-For&lt;/span&gt; &lt;span class="nv"&gt;$proxy_add_x_forwarded_for&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;proxy_set_header&lt;/span&gt; &lt;span class="s"&gt;X-Forwarded-Proto&lt;/span&gt; &lt;span class="nv"&gt;$scheme&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kn"&gt;location&lt;/span&gt; &lt;span class="s"&gt;^~&lt;/span&gt; &lt;span class="n"&gt;/assets/&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kn"&gt;gzip_static&lt;/span&gt; &lt;span class="no"&gt;on&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;expires&lt;/span&gt; &lt;span class="s"&gt;max&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kn"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;Cache-Control&lt;/span&gt; &lt;span class="s"&gt;"public,&lt;/span&gt; &lt;span class="s"&gt;must-revalidate"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable the site:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; /etc/nginx/sites-available/my-app.conf /etc/nginx/sites-enabled/
&lt;span class="nb"&gt;sudo rm&lt;/span&gt; /etc/nginx/sites-enabled/default
&lt;span class="nb"&gt;sudo &lt;/span&gt;nginx &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Configure systemd for Puma
&lt;/h2&gt;

&lt;p&gt;systemd keeps your Puma process running. If it crashes, systemd restarts it. This is production 101.&lt;/p&gt;

&lt;p&gt;Create the service file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/systemd/system/puma.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[Unit]&lt;/span&gt;
&lt;span class="py"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;Puma HTTP Server&lt;/span&gt;
&lt;span class="py"&gt;After&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;network.target&lt;/span&gt;

&lt;span class="nn"&gt;[Service]&lt;/span&gt;
&lt;span class="py"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;simple&lt;/span&gt;
&lt;span class="py"&gt;User&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;deploy&lt;/span&gt;
&lt;span class="py"&gt;WorkingDirectory&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/var/www/my-app/current&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;RAILS_ENV=production&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;RAILS_MAX_THREADS=5&lt;/span&gt;
&lt;span class="py"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;RAILS_LOG_TO_STDOUT=true&lt;/span&gt;

&lt;span class="py"&gt;ExecStart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/home/deploy/.rbenv/shims/bundle exec puma -C config/puma.rb&lt;/span&gt;
&lt;span class="py"&gt;ExecReload&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;/bin/kill -USR1 $MAINPID&lt;/span&gt;
&lt;span class="py"&gt;Restart&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;always&lt;/span&gt;
&lt;span class="py"&gt;RestartSec&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;10&lt;/span&gt;

&lt;span class="nn"&gt;[Install]&lt;/span&gt;
&lt;span class="py"&gt;WantedBy&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;multi-user.target&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Reload systemd and start Puma:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl daemon-reload
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;puma
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start puma
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check status:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl status puma
&lt;span class="nb"&gt;sudo &lt;/span&gt;journalctl &lt;span class="nt"&gt;-u&lt;/span&gt; puma &lt;span class="nt"&gt;-f&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  SSL with Let's Encrypt
&lt;/h2&gt;

&lt;p&gt;Free SSL certificates. Certbot automates everything.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;certbot python3-certbot-nginx

&lt;span class="c"&gt;# Obtain certificate&lt;/span&gt;
&lt;span class="nb"&gt;sudo &lt;/span&gt;certbot &lt;span class="nt"&gt;--nginx&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; your-domain.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Follow the prompts. Certbot automatically updates your Nginx config for SSL.&lt;/p&gt;




&lt;h2&gt;
  
  
  Log Rotation
&lt;/h2&gt;

&lt;p&gt;Your app generates logs. Without rotation, your disk fills up and your server dies.&lt;/p&gt;

&lt;p&gt;Create a logrotate config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;nano /etc/logrotate.d/my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add this configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;/&lt;span class="n"&gt;var&lt;/span&gt;/&lt;span class="n"&gt;www&lt;/span&gt;/&lt;span class="n"&gt;my&lt;/span&gt;-&lt;span class="n"&gt;app&lt;/span&gt;/&lt;span class="n"&gt;current&lt;/span&gt;/&lt;span class="n"&gt;log&lt;/span&gt;/*.&lt;span class="n"&gt;log&lt;/span&gt; {
  &lt;span class="n"&gt;daily&lt;/span&gt;
  &lt;span class="n"&gt;missingok&lt;/span&gt;
  &lt;span class="n"&gt;rotate&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
  &lt;span class="n"&gt;compress&lt;/span&gt;
  &lt;span class="n"&gt;delaycompress&lt;/span&gt;
  &lt;span class="n"&gt;notifempty&lt;/span&gt;
  &lt;span class="n"&gt;create&lt;/span&gt; &lt;span class="m"&gt;0644&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt; &lt;span class="n"&gt;deploy&lt;/span&gt;
  &lt;span class="n"&gt;sharedscripts&lt;/span&gt;
  &lt;span class="n"&gt;postrotate&lt;/span&gt;
    /&lt;span class="n"&gt;bin&lt;/span&gt;/&lt;span class="n"&gt;kill&lt;/span&gt; -&lt;span class="n"&gt;USR1&lt;/span&gt; $(&lt;span class="n"&gt;cat&lt;/span&gt; /&lt;span class="n"&gt;var&lt;/span&gt;/&lt;span class="n"&gt;www&lt;/span&gt;/&lt;span class="n"&gt;my&lt;/span&gt;-&lt;span class="n"&gt;app&lt;/span&gt;/&lt;span class="n"&gt;current&lt;/span&gt;/&lt;span class="n"&gt;tmp&lt;/span&gt;/&lt;span class="n"&gt;pids&lt;/span&gt;/&lt;span class="n"&gt;puma&lt;/span&gt;.&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;/&lt;span class="n"&gt;dev&lt;/span&gt;/&lt;span class="n"&gt;null&lt;/span&gt;) &lt;span class="m"&gt;2&lt;/span&gt;&amp;gt;/&lt;span class="n"&gt;dev&lt;/span&gt;/&lt;span class="n"&gt;null&lt;/span&gt; || &lt;span class="n"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;endscript&lt;/span&gt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test logrotate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;logrotate &lt;span class="nt"&gt;-d&lt;/span&gt; /etc/logrotate.d/my-app
&lt;span class="nb"&gt;sudo &lt;/span&gt;logrotate &lt;span class="nt"&gt;-f&lt;/span&gt; /etc/logrotate.d/my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Puma Configuration for Production
&lt;/h2&gt;

&lt;p&gt;Create a production-ready &lt;code&gt;config/puma.rb&lt;/code&gt; in your Rails app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;workers&lt;/span&gt; &lt;span class="no"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'WEB_CONCURRENCY'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;threads_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'RAILS_MAX_THREADS'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="n"&gt;threads_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;threads_count&lt;/span&gt;

&lt;span class="n"&gt;rackup&lt;/span&gt; &lt;span class="no"&gt;DefaultRackup&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;respond_to?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:rackup&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;environment&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'RAILS_ENV'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'development'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'RAILS_ENV'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;'production'&lt;/span&gt;
  &lt;span class="n"&gt;bind&lt;/span&gt; &lt;span class="s2"&gt;"unix:///var/www/my-app/current/tmp/sockets/puma.sock"&lt;/span&gt;
  &lt;span class="n"&gt;pidfile&lt;/span&gt; &lt;span class="s2"&gt;"/var/www/my-app/current/tmp/pids/puma.pid"&lt;/span&gt;

  &lt;span class="n"&gt;stdout_redirect&lt;/span&gt; &lt;span class="s1"&gt;'/var/www/my-app/current/log/puma.stdout.log'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'/var/www/my-app/current/log/puma.stderr.log'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kp"&gt;true&lt;/span&gt;

  &lt;span class="n"&gt;preload_app!&lt;/span&gt;

  &lt;span class="n"&gt;on_worker_boot&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;establish_connection&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;defined?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Deployment with Kamal
&lt;/h2&gt;

&lt;p&gt;With Kamal, you deploy your app to &lt;code&gt;/var/www/my-app/current/&lt;/code&gt;. Kamal handles the rest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Deploy&lt;/span&gt;
kamal deploy

&lt;span class="c"&gt;# Check status&lt;/span&gt;
kamal status

&lt;span class="c"&gt;# Logs&lt;/span&gt;
kamal logs &lt;span class="nt"&gt;-f&lt;/span&gt;

&lt;span class="c"&gt;# Rebuild if needed&lt;/span&gt;
kamal deploy &lt;span class="nt"&gt;--no-push&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Always use a non-root user&lt;/strong&gt; for deploying and running your app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;systemd is your friend&lt;/strong&gt; — it handles restarts and process management.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nginx reverse proxy&lt;/strong&gt; — handles SSL, serves static assets, and forwards to Puma.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Let's Encrypt&lt;/strong&gt; — free SSL certificates that auto-renew.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log rotation&lt;/strong&gt; — prevents disk space issues.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This stack has been running production Rails apps for over a decade. It's not fancy, but it works. And when it breaks, you know exactly where to look.&lt;/p&gt;

&lt;p&gt;Next post: CI/CD for Rails — GitHub Actions, test pipeline, and auto-deploy to your VPS.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>tutorial</category>
      <category>devops</category>
    </item>
    <item>
      <title>Deploying Rails AI Apps with Kamal on a VPS — Docker, SSH, Zero Downtime</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Mon, 13 Apr 2026 04:02:40 +0000</pubDate>
      <link>https://dev.to/agentq/deploying-rails-ai-apps-with-kamal-on-a-vps-docker-ssh-zero-downtime-45ma</link>
      <guid>https://dev.to/agentq/deploying-rails-ai-apps-with-kamal-on-a-vps-docker-ssh-zero-downtime-45ma</guid>
      <description>&lt;p&gt;We've spent 27 posts writing code. Now it's time to put it on a server.&lt;/p&gt;

&lt;p&gt;Not a managed platform. Not a serverless function. A real server that you control, running Docker containers deployed with Kamal — Rails' official deployment tool.&lt;/p&gt;

&lt;p&gt;This is post #29 in the &lt;strong&gt;Ruby for AI&lt;/strong&gt; series. Let's deploy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is Kamal?
&lt;/h2&gt;

&lt;p&gt;Kamal (formerly MRSK) is a deployment tool from the Rails team. It uses Docker to package your app and SSH to deploy it to any server. No proprietary runtime. No vendor lock-in. Just Docker + SSH + a VPS.&lt;/p&gt;

&lt;p&gt;Think of it as Capistrano for the container age.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A VPS (Ubuntu 22.04+ or Debian 12+) with SSH access&lt;/li&gt;
&lt;li&gt;Docker installed locally&lt;/li&gt;
&lt;li&gt;Ruby 3.2+ locally&lt;/li&gt;
&lt;li&gt;A Docker Hub account (or any container registry)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Install Kamal
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gem &lt;span class="nb"&gt;install &lt;/span&gt;kamal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kamal version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Prepare Your Rails App
&lt;/h2&gt;

&lt;p&gt;If your app doesn't have a Dockerfile yet, Rails 7.1+ generates one by default. If you're on an older version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails generate dockerfile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates a production-ready, multi-stage Dockerfile. Check it looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# syntax=docker/dockerfile:1&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;ruby:3.3-slim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /rails&lt;/span&gt;

&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; RAILS_ENV="production" \&lt;/span&gt;
    BUNDLE_DEPLOYMENT="1" \
    BUNDLE_PATH="/usr/local/bundle"

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;base&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;build&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="nt"&gt;-qq&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; build-essential git libpq-dev

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; Gemfile Gemfile.lock ./&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;bundle &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; ~/.bundle/ &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BUNDLE_PATH&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;/ruby/&lt;span class="k"&gt;*&lt;/span&gt;/cache

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;bundle &lt;span class="nb"&gt;exec &lt;/span&gt;rails assets:precompile

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; base&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;apt-get update &lt;span class="nt"&gt;-qq&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-install-recommends&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; libpq5 curl &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-rf&lt;/span&gt; /var/lib/apt/lists /var/cache/apt/archives

&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /usr/local/bundle /usr/local/bundle&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=build /rails /rails&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;groupadd &lt;span class="nt"&gt;--system&lt;/span&gt; &lt;span class="nt"&gt;--gid&lt;/span&gt; 1000 rails &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    useradd rails &lt;span class="nt"&gt;--uid&lt;/span&gt; 1000 &lt;span class="nt"&gt;--gid&lt;/span&gt; 1000 &lt;span class="nt"&gt;--create-home&lt;/span&gt; &lt;span class="nt"&gt;--shell&lt;/span&gt; /bin/bash &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;chown&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; rails:rails db log storage tmp

&lt;span class="k"&gt;USER&lt;/span&gt;&lt;span class="s"&gt; 1000:1000&lt;/span&gt;

&lt;span class="k"&gt;ENTRYPOINT&lt;/span&gt;&lt;span class="s"&gt; ["/rails/bin/docker-entrypoint"]&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 3000&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["./bin/rails", "server"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Initialize Kamal
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kamal init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This creates &lt;code&gt;config/deploy.yml&lt;/code&gt; — the heart of your deployment config. Edit it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myaiapp&lt;/span&gt;

&lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yourdockerhub/myaiapp&lt;/span&gt;

&lt;span class="na"&gt;servers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;web&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;hosts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;203.0.113.10&lt;/span&gt;
    &lt;span class="na"&gt;labels&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;traefik.http.routers.myaiapp.rule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Host(`myaiapp.com`)&lt;/span&gt;
      &lt;span class="na"&gt;traefik.http.routers.myaiapp.tls.certresolver&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;letsencrypt&lt;/span&gt;

&lt;span class="na"&gt;proxy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;ssl&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
  &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myaiapp.com&lt;/span&gt;

&lt;span class="na"&gt;registry&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yourdockerhub&lt;/span&gt;
  &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;KAMAL_REGISTRY_PASSWORD&lt;/span&gt;

&lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;clear&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;RAILS_ENV&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;production&lt;/span&gt;
    &lt;span class="na"&gt;RAILS_LOG_TO_STDOUT&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1"&lt;/span&gt;
    &lt;span class="na"&gt;RAILS_SERVE_STATIC_FILES&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;
  &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;RAILS_MASTER_KEY&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;DATABASE_URL&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;REDIS_URL&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;

&lt;span class="na"&gt;accessories&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;postgres:16&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;203.0.113.10&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:5432:5432"&lt;/span&gt;
    &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;clear&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;POSTGRES_DB&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myaiapp_production&lt;/span&gt;
      &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;POSTGRES_PASSWORD&lt;/span&gt;
    &lt;span class="na"&gt;directories&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;data:/var/lib/postgresql/data&lt;/span&gt;

  &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;redis:7&lt;/span&gt;
    &lt;span class="na"&gt;host&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;203.0.113.10&lt;/span&gt;
    &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;127.0.0.1:6379:6379"&lt;/span&gt;
    &lt;span class="na"&gt;directories&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;data:/data&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Set Your Secrets
&lt;/h2&gt;

&lt;p&gt;Create &lt;code&gt;.kamal/secrets&lt;/code&gt; (this file is gitignored):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;KAMAL_REGISTRY_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_docker_hub_token
&lt;span class="nv"&gt;RAILS_MASTER_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_master_key_from_config/master.key
&lt;span class="nv"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;postgresql://postgres:yourpassword@myaiapp-db:5432/myaiapp_production
&lt;span class="nv"&gt;REDIS_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;redis://myaiapp-redis:6379/0
&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;sk-your-openai-key
&lt;span class="nv"&gt;POSTGRES_PASSWORD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;yourpassword
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Bootstrap the Server
&lt;/h2&gt;

&lt;p&gt;First deployment — this installs Docker on your VPS and sets everything up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kamal setup
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What this does:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;SSHs into your server&lt;/li&gt;
&lt;li&gt;Installs Docker if needed&lt;/li&gt;
&lt;li&gt;Pushes your image to the registry&lt;/li&gt;
&lt;li&gt;Starts your accessories (Postgres, Redis)&lt;/li&gt;
&lt;li&gt;Runs your app container&lt;/li&gt;
&lt;li&gt;Configures Kamal's built-in proxy (kamal-proxy) with automatic SSL&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Deploy Updates
&lt;/h2&gt;

&lt;p&gt;After the first setup, subsequent deploys are one command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kamal deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This builds your Docker image, pushes it, and performs a zero-downtime rolling deploy. The old container keeps serving requests until the new one is healthy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Useful Kamal Commands
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check what's running&lt;/span&gt;
kamal details

&lt;span class="c"&gt;# View logs&lt;/span&gt;
kamal app logs

&lt;span class="c"&gt;# Open a Rails console on the server&lt;/span&gt;
kamal app &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="s2"&gt;"bin/rails console"&lt;/span&gt;

&lt;span class="c"&gt;# Run database migrations&lt;/span&gt;
kamal app &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="s2"&gt;"bin/rails db:migrate"&lt;/span&gt;

&lt;span class="c"&gt;# Rollback to previous version&lt;/span&gt;
kamal rollback

&lt;span class="c"&gt;# Start/stop accessories&lt;/span&gt;
kamal accessory start db
kamal accessory stop redis
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Running Migrations on Deploy
&lt;/h2&gt;

&lt;p&gt;Add a pre-deploy hook. Create &lt;code&gt;.kamal/hooks/pre-deploy&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Running database migrations..."&lt;/span&gt;
kamal app &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="s2"&gt;"bin/rails db:migrate"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make it executable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x .kamal/hooks/pre-deploy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Health Checks
&lt;/h2&gt;

&lt;p&gt;Kamal checks your app's health before routing traffic. It hits &lt;code&gt;/up&lt;/code&gt; by default (Rails 7.1+ includes this route). Make sure yours works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/routes.rb&lt;/span&gt;
&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="s2"&gt;"up"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"rails/health#show"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;as: :rails_health_check&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the health check fails, Kamal won't route traffic to the new container. The old one keeps running. No downtime from bad deploys.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment Variables for AI
&lt;/h2&gt;

&lt;p&gt;Your AI app needs API keys. Keep them in &lt;code&gt;.kamal/secrets&lt;/code&gt; and reference them in &lt;code&gt;deploy.yml&lt;/code&gt; under &lt;code&gt;env.secret&lt;/code&gt;. They're injected at runtime as environment variables — never baked into the Docker image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In your Rails app, access them normally:&lt;/span&gt;
&lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What You Have Now
&lt;/h2&gt;

&lt;p&gt;After this post:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your Rails AI app runs in a Docker container on a VPS you control&lt;/li&gt;
&lt;li&gt;Kamal handles zero-downtime deploys via SSH&lt;/li&gt;
&lt;li&gt;Postgres and Redis run as Docker accessories on the same server&lt;/li&gt;
&lt;li&gt;SSL is handled automatically via kamal-proxy and Let's Encrypt&lt;/li&gt;
&lt;li&gt;Secrets are injected at runtime, never in the image&lt;/li&gt;
&lt;li&gt;Health checks prevent bad deploys from taking down your app&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No vendor lock-in. No surprise bills. No "free tier" that becomes $500/month. Just your code, on your server.&lt;/p&gt;

&lt;p&gt;Next up: Post #30 goes deeper into VPS setup — Nginx, Puma tuning, systemd, and SSL the manual way. For when you want full control over every layer.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>devops</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Monitoring and Observability for AI-Powered Rails Apps</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Mon, 13 Apr 2026 04:01:54 +0000</pubDate>
      <link>https://dev.to/agentq/monitoring-and-observability-for-ai-powered-rails-apps-4n18</link>
      <guid>https://dev.to/agentq/monitoring-and-observability-for-ai-powered-rails-apps-4n18</guid>
      <description>&lt;p&gt;You built the AI features. You tested them. You cached the hell out of them. You hardened your background jobs.&lt;/p&gt;

&lt;p&gt;Now you need to know when things break — &lt;em&gt;before&lt;/em&gt; your users tell you.&lt;/p&gt;

&lt;p&gt;This is post #28 in the &lt;strong&gt;Ruby for AI&lt;/strong&gt; series. We're covering logging, error tracking, and application performance monitoring (APM) for Rails apps with AI features. Because AI workloads are different — they're slow, expensive, and fail in weird ways.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with AI Observability
&lt;/h2&gt;

&lt;p&gt;Traditional Rails monitoring tracks request times, database queries, and error rates. AI features add new dimensions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API latency&lt;/strong&gt; — OpenAI calls take 2-30 seconds. That's not a bug, that's Tuesday.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token costs&lt;/strong&gt; — A runaway loop can burn $50 in minutes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-deterministic failures&lt;/strong&gt; — The same input gives different outputs. Sometimes bad ones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limits&lt;/strong&gt; — You hit them. You need to know when.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need monitoring that understands these patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Structured Logging with Lograge
&lt;/h2&gt;

&lt;p&gt;First, replace Rails' verbose default logging with something parseable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"lograge"&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"logstash-event"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/environments/production.rb&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lograge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enabled&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kp"&gt;true&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lograge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;formatter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Lograge&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Formatters&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;

&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lograge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;custom_payload&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="ss"&gt;user_id: &lt;/span&gt;&lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current_user&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="ss"&gt;request_id: &lt;/span&gt;&lt;span class="n"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;request_id&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now add AI-specific logging. Create a concern you can include in any service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/concerns/ai_loggable.rb&lt;/span&gt;
&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;AiLoggable&lt;/span&gt;
  &lt;span class="kp"&gt;extend&lt;/span&gt; &lt;span class="no"&gt;ActiveSupport&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Concern&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;log_ai_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;tokens_in&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;tokens_out&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;duration_ms&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="ss"&gt;cost_usd: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="ss"&gt;event: &lt;/span&gt;&lt;span class="s2"&gt;"ai_request"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;provider: &lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;tokens_in: &lt;/span&gt;&lt;span class="n"&gt;tokens_in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;tokens_out: &lt;/span&gt;&lt;span class="n"&gt;tokens_out&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;duration_ms: &lt;/span&gt;&lt;span class="n"&gt;duration_ms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;cost_usd: &lt;/span&gt;&lt;span class="n"&gt;cost_usd&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens_in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens_out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens_in&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tokens_out&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;rates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="mf"&gt;2.50&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1_000_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;output: &lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1_000_000&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="s2"&gt;"gpt-4o-mini"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="mf"&gt;0.15&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1_000_000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;output: &lt;/span&gt;&lt;span class="mf"&gt;0.60&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1_000_000&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;output: &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tokens_in&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:input&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;tokens_out&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;rate&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:output&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use it in your AI services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt;
  &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;AiLoggable&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clock_gettime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;CLOCK_MONOTONIC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clock_gettime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;CLOCK_MONOTONIC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;

    &lt;span class="n"&gt;usage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"usage"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log_ai_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;provider: &lt;/span&gt;&lt;span class="s2"&gt;"openai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;tokens_in: &lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;tokens_out: &lt;/span&gt;&lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;duration_ms: &lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tracking Costs with a Database Table
&lt;/h2&gt;

&lt;p&gt;Logs are great for debugging. For dashboards and alerts, persist the data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails generate model AiUsageLog provider:string model:string tokens_in:integer &lt;span class="se"&gt;\&lt;/span&gt;
  tokens_out:integer duration_ms:integer cost_usd:decimal request_id:string &lt;span class="se"&gt;\&lt;/span&gt;
  user_id:references status:string error_message:text
rails db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/ai_usage_log.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiUsageLog&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;optional: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;

  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:today&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"created_at &amp;gt;= ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;beginning_of_day&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:this_month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"created_at &amp;gt;= ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;beginning_of_month&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="ss"&gt;:failed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:cost_usd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;avg_latency&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;average&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:duration_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Query your spend anytime:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;this_month&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost&lt;/span&gt;          &lt;span class="c1"&gt;# =&amp;gt; 47.23&lt;/span&gt;
&lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;count&lt;/span&gt;             &lt;span class="c1"&gt;# =&amp;gt; 3&lt;/span&gt;
&lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;avg_latency&lt;/span&gt;              &lt;span class="c1"&gt;# =&amp;gt; 2340 (ms)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Error Tracking with Exception Notification
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"exception_notification"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/initializers/exception_notification.rb&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;middleware&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt; &lt;span class="no"&gt;ExceptionNotification&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Rack&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;email: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="ss"&gt;email_prefix: &lt;/span&gt;&lt;span class="s2"&gt;"[AI App ERROR] "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="ss"&gt;sender_address: &lt;/span&gt;&lt;span class="sx"&gt;%("App" &amp;lt;errors@yourapp.com&amp;gt;)&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="ss"&gt;exception_recipients: &lt;/span&gt;&lt;span class="sx"&gt;%w[you@yourapp.com]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For AI-specific errors, create a custom handler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/ai_error_handler.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiErrorHandler&lt;/span&gt;
  &lt;span class="no"&gt;RETRIABLE_ERRORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;ConnectionFailed&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;freeze&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;context: &lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="s2"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;error_message: &lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;provider: &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:provider&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:model&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;tokens_in: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;tokens_out: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;duration_ms: &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:duration_ms&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;cost_usd: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="no"&gt;RETRIABLE_ERRORS&lt;/span&gt;
      &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AI retriable error: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AI fatal error: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="no"&gt;ExceptionNotifier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;notify_exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;data: &lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Health Check Endpoint
&lt;/h2&gt;

&lt;p&gt;Every production app needs a health check. Make yours AI-aware:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/routes.rb&lt;/span&gt;
&lt;span class="n"&gt;get&lt;/span&gt; &lt;span class="s2"&gt;"/health"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;to: &lt;/span&gt;&lt;span class="s2"&gt;"health#show"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/health_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;HealthController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="n"&gt;skip_before_action&lt;/span&gt; &lt;span class="ss"&gt;:authenticate_user!&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;show&lt;/span&gt;
    &lt;span class="n"&gt;checks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;database: &lt;/span&gt;&lt;span class="n"&gt;check_database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;redis: &lt;/span&gt;&lt;span class="n"&gt;check_redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;ai_budget: &lt;/span&gt;&lt;span class="n"&gt;check_ai_budget&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all?&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="ss"&gt;:ok&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="ss"&gt;:service_unavailable&lt;/span&gt;
    &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="ss"&gt;json: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;checks: &lt;/span&gt;&lt;span class="n"&gt;checks&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_database&lt;/span&gt;
    &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"SELECT 1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="kp"&gt;true&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt;
    &lt;span class="kp"&gt;false&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_redis&lt;/span&gt;
    &lt;span class="no"&gt;Redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ping&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;"PONG"&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt;
    &lt;span class="kp"&gt;false&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_ai_budget&lt;/span&gt;
    &lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DAILY_AI_BUDGET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_f&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Budget Alerts
&lt;/h2&gt;

&lt;p&gt;The scariest thing about AI in production: runaway costs. Add a simple guard:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/budget_guard.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BudgetGuard&lt;/span&gt;
  &lt;span class="no"&gt;DAILY_LIMIT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DAILY_AI_BUDGET"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_f&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allow_request?&lt;/span&gt;
    &lt;span class="n"&gt;current_spend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AiUsageLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;total_cost&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_spend&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="no"&gt;DAILY_LIMIT&lt;/span&gt;
      &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"AI budget exceeded: $&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;current_spend&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/$&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;DAILY_LIMIT&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="kp"&gt;false&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="kp"&gt;true&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check it before every AI call:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="s2"&gt;"Daily AI budget exceeded"&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="no"&gt;BudgetGuard&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allow_request?&lt;/span&gt;
  &lt;span class="c1"&gt;# ... proceed with API call&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What You Have Now
&lt;/h2&gt;

&lt;p&gt;After this post, your Rails AI app has:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Structured JSON logging&lt;/strong&gt; with AI-specific fields&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost tracking&lt;/strong&gt; in the database with queryable scopes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error handling&lt;/strong&gt; that distinguishes retriable from fatal failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Health checks&lt;/strong&gt; that include AI budget status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Budget guards&lt;/strong&gt; that prevent runaway spending&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next up: we leave the code behind and hit the server. Post #29 covers deploying your Rails AI app to a real VPS with Kamal and Docker.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>devops</category>
      <category>ai</category>
    </item>
    <item>
      <title>Background Job Patterns for AI in Rails — Retries, Rate Limiting, Dead Letters, and Priorities</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sun, 12 Apr 2026 04:02:42 +0000</pubDate>
      <link>https://dev.to/agentq/background-job-patterns-for-ai-in-rails-retries-rate-limiting-dead-letters-and-priorities-4d86</link>
      <guid>https://dev.to/agentq/background-job-patterns-for-ai-in-rails-retries-rate-limiting-dead-letters-and-priorities-4d86</guid>
      <description>&lt;p&gt;AI workloads are different from typical web jobs. They're slow, expensive, rate-limited, and sometimes they just fail for no good reason. Your background job setup needs to handle all of that gracefully.&lt;/p&gt;

&lt;p&gt;This post covers the patterns that keep your AI-powered Rails app reliable under real-world conditions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with AI Jobs
&lt;/h2&gt;

&lt;p&gt;A typical Rails background job takes milliseconds. An AI job might take 30 seconds. Or timeout after 120 seconds. Or get rate-limited by the API provider.&lt;/p&gt;

&lt;p&gt;You need patterns for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Retries&lt;/strong&gt; that don't hammer a rate-limited API&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting&lt;/strong&gt; that respects provider quotas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dead letter queues&lt;/strong&gt; for jobs that fail permanently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority queues&lt;/strong&gt; so chat responses don't wait behind batch embeddings&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Setting Up Solid Queue
&lt;/h2&gt;

&lt;p&gt;Rails 8 ships with Solid Queue as the default. No Redis required — it uses your database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/rails solid_queue:install
bin/rails db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/solid_queue.yml&lt;/span&gt;
&lt;span class="na"&gt;production&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;dispatchers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;polling_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
      &lt;span class="na"&gt;batch_size&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;
  &lt;span class="na"&gt;workers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;queues&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;critical&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;ai_realtime&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;ai_batch&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;embeddings&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;threads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;polling_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Priority Queues
&lt;/h2&gt;

&lt;p&gt;Not all AI jobs are equal. A user waiting for a chat response needs priority over a batch embedding job running overnight.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/ai_chat_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiChatJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:ai_realtime&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conversation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;messages_for_api&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"assistant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# app/jobs/batch_embedding_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BatchEmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:embeddings&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Configure queue priority order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/solid_queue.yml&lt;/span&gt;
&lt;span class="na"&gt;production&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;workers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;queues&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ai_realtime&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;threads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;polling_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.1&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;queues&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;critical&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;default&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;threads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;polling_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.5&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;queues&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;ai_batch&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;embeddings&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;threads&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
      &lt;span class="na"&gt;polling_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Separate workers mean your realtime queue gets dedicated threads and faster polling. Batch jobs can wait.&lt;/p&gt;

&lt;h2&gt;
  
  
  Smart Retries with Exponential Backoff
&lt;/h2&gt;

&lt;p&gt;The default retry strategy will destroy your API quota. Use exponential backoff with jitter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiBaseJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;TooManyRequestsError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;wait: :polynomially_longer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;

  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;wait: :polynomially_longer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;

  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;wait: &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
           &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;

  &lt;span class="n"&gt;discard_on&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;RecordNotFound&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;polynomially_longer&lt;/code&gt; gives you waits of ~3s, ~18s, ~83s, ~258s — enough time for rate limits to clear.&lt;/p&gt;

&lt;p&gt;For custom backoff logic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;AiBaseJob&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;RateLimitError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;retry_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;response&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"retry-after"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_i&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;
    &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;wait: &lt;/span&gt;&lt;span class="n"&gt;retry_after&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Rate limited. Retrying in &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;retry_after&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;s: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;job_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This reads the &lt;code&gt;Retry-After&lt;/code&gt; header from the API response and waits exactly as long as the provider tells you to.&lt;/p&gt;

&lt;h2&gt;
  
  
  Rate Limiting at the Job Level
&lt;/h2&gt;

&lt;p&gt;Don't rely on the API to rate-limit you. Do it yourself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/concerns/rate_limited.rb&lt;/span&gt;
&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;RateLimited&lt;/span&gt;
  &lt;span class="kp"&gt;extend&lt;/span&gt; &lt;span class="no"&gt;ActiveSupport&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Concern&lt;/span&gt;

  &lt;span class="n"&gt;class_methods&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;rate_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_per_minute&lt;/span&gt;&lt;span class="p"&gt;:)&lt;/span&gt;
      &lt;span class="n"&gt;before_perform&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"rate_limit:&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;increment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;minute&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;raw: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_i&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_per_minute&lt;/span&gt;
          &lt;span class="c1"&gt;# Re-enqueue with delay instead of failing&lt;/span&gt;
          &lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;wait: &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="kp"&gt;throw&lt;/span&gt; &lt;span class="ss"&gt;:abort&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Usage&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;AiBaseJob&lt;/span&gt;
  &lt;span class="kp"&gt;include&lt;/span&gt; &lt;span class="no"&gt;RateLimited&lt;/span&gt;
  &lt;span class="n"&gt;rate_limit&lt;/span&gt; &lt;span class="ss"&gt;max_per_minute: &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Dead Letter Handling
&lt;/h2&gt;

&lt;p&gt;When a job exhausts all retries, don't lose it silently. Capture it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiBaseJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;after_discard&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="no"&gt;DeadLetter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;job_class: &lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;arguments: &lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;error_class: &lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;error_message: &lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;failed_at: &lt;/span&gt;&lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;"Dead letter: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; "&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
      &lt;span class="s2"&gt;"args=&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;inspect&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; "&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
      &lt;span class="s2"&gt;"error=&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Create the dead letter table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bin/rails generate model DeadLetter &lt;span class="se"&gt;\&lt;/span&gt;
  job_class:string &lt;span class="se"&gt;\&lt;/span&gt;
  arguments:text &lt;span class="se"&gt;\&lt;/span&gt;
  error_class:string &lt;span class="se"&gt;\&lt;/span&gt;
  error_message:text &lt;span class="se"&gt;\&lt;/span&gt;
  failed_at:datetime &lt;span class="se"&gt;\&lt;/span&gt;
  retried_at:datetime
bin/rails db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can build an admin page to inspect and retry failed jobs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Retry a dead letter&lt;/span&gt;
&lt;span class="n"&gt;dead_letter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;DeadLetter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:id&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="n"&gt;dead_letter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;job_class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;constantize&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="no"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dead_letter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dead_letter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;retried_at: &lt;/span&gt;&lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Timeouts
&lt;/h2&gt;

&lt;p&gt;AI API calls can hang. Always set timeouts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AiBaseJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;around_perform&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="no"&gt;Timeout&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;Timeout&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Job timed out: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arguments&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="c1"&gt;# Let retry logic handle it&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or configure at the HTTP client level (preferred):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="ss"&gt;request_timeout: &lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;uri_base: &lt;/span&gt;&lt;span class="s2"&gt;"https://api.openai.com/"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Idempotency
&lt;/h2&gt;

&lt;p&gt;AI jobs should be safe to retry. Use idempotency keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;AiBaseJob&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Skip if already processed&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;present?&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embedded_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ago&lt;/span&gt;

    &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;generate_embedding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;embedded_at: &lt;/span&gt;&lt;span class="no"&gt;Time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Putting It Together
&lt;/h2&gt;

&lt;p&gt;Here's the architecture:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User request → AiChatJob (ai_realtime queue, 0.1s poll, 3 threads)
Batch import → BatchEmbeddingJob (embeddings queue, 2s poll, 2 threads)
All AI jobs → inherit AiBaseJob (retries, timeouts, dead letters)
Rate limiting → RateLimited concern (per-class limits)
Failures → DeadLetter table (inspect, retry from admin)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your AI features are only as reliable as your job infrastructure. Get this right, and your users never see a failed generation — just a slightly delayed one.&lt;/p&gt;

&lt;p&gt;Next up: monitoring and observability for AI-powered Rails apps.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>tutorial</category>
      <category>ai</category>
    </item>
    <item>
      <title>Rails Security Essentials — CSRF, SQL Injection, XSS, and Secure Headers</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sun, 12 Apr 2026 04:01:53 +0000</pubDate>
      <link>https://dev.to/agentq/rails-security-essentials-csrf-sql-injection-xss-and-secure-headers-1hln</link>
      <guid>https://dev.to/agentq/rails-security-essentials-csrf-sql-injection-xss-and-secure-headers-1hln</guid>
      <description>&lt;p&gt;Every Rails app you deploy is a target. The moment it's on a VPS with a public IP, bots will probe it. This post covers the security essentials every Rails developer needs to know — not theory, but the actual attacks and the actual defenses.&lt;/p&gt;

&lt;h2&gt;
  
  
  CSRF — Cross-Site Request Forgery
&lt;/h2&gt;

&lt;p&gt;Rails protects you by default, but you need to understand why.&lt;/p&gt;

&lt;p&gt;CSRF tricks a logged-in user's browser into making requests to your app. Rails prevents this with authenticity tokens embedded in every form.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/application_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ApplicationController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActionController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;
  &lt;span class="n"&gt;protect_from_forgery&lt;/span&gt; &lt;span class="ss"&gt;with: :exception&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every form Rails generates includes a hidden token:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;form_with&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="vi"&gt;@post&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;%# Rails automatically inserts authenticity_token here %&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text_field&lt;/span&gt; &lt;span class="ss"&gt;:title&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you're building an API, you skip CSRF protection (APIs use token auth instead):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Api::BaseController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActionController&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;API&lt;/span&gt;
  &lt;span class="c1"&gt;# No CSRF — API clients send auth tokens in headers&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The mistake people make:&lt;/strong&gt; Disabling CSRF globally because "it was causing errors." Don't. Fix the actual issue instead.&lt;/p&gt;

&lt;h2&gt;
  
  
  SQL Injection
&lt;/h2&gt;

&lt;p&gt;Active Record protects you — if you use it correctly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# SAFE — parameterized query&lt;/span&gt;
&lt;span class="no"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;email: &lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:email&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="c1"&gt;# Generates: SELECT * FROM users WHERE email = $1&lt;/span&gt;

&lt;span class="c1"&gt;# SAFE — parameterized string&lt;/span&gt;
&lt;span class="no"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"email = ?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:email&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# DANGEROUS — string interpolation&lt;/span&gt;
&lt;span class="no"&gt;User&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"email = '&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:email&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Attacker sends: ' OR 1=1 --&lt;/span&gt;
&lt;span class="c1"&gt;# Generates: SELECT * FROM users WHERE email = '' OR 1=1 --'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rule is simple: &lt;strong&gt;never interpolate user input into SQL strings.&lt;/strong&gt; Always use &lt;code&gt;?&lt;/code&gt; placeholders or hash conditions.&lt;/p&gt;

&lt;p&gt;Check for injection vulnerabilities with Brakeman:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;gem &lt;span class="nb"&gt;install &lt;/span&gt;brakeman
&lt;span class="nb"&gt;cd &lt;/span&gt;your_rails_app
brakeman
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Brakeman scans your codebase and flags dangerous patterns. Run it in CI. Every time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile (development/test)&lt;/span&gt;
&lt;span class="n"&gt;group&lt;/span&gt; &lt;span class="ss"&gt;:development&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:test&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"brakeman"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;require: &lt;/span&gt;&lt;span class="kp"&gt;false&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  XSS — Cross-Site Scripting
&lt;/h2&gt;

&lt;p&gt;XSS lets attackers inject JavaScript into your pages. Rails escapes output by default in ERB:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# SAFE — auto-escaped %&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="vi"&gt;@user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bio&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;%# If bio contains &amp;lt;script&amp;gt;alert('xss')&amp;lt;/script&amp;gt;, it renders as text %&amp;gt;&lt;/span&gt;

&lt;span class="c"&gt;&amp;lt;%# DANGEROUS — raw output %&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="vi"&gt;@user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bio&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="vi"&gt;@user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;bio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;html_safe&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;%# These render the script tag as actual JavaScript %&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Rules:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Never use &lt;code&gt;raw&lt;/code&gt; or &lt;code&gt;html_safe&lt;/code&gt; on user input&lt;/li&gt;
&lt;li&gt;If you must render HTML, sanitize it first:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# In your view&lt;/span&gt;
&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sx"&gt;%= sanitize @user.bio, tags: %w[b i em strong p br], attributes: %w[class] %&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;Use Content Security Policy headers to limit what scripts can run:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/initializers/content_security_policy.rb&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;application&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;configure&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content_security_policy&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;default_src&lt;/span&gt; &lt;span class="ss"&gt;:self&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;script_src&lt;/span&gt;  &lt;span class="ss"&gt;:self&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;style_src&lt;/span&gt;   &lt;span class="ss"&gt;:self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:unsafe_inline&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;img_src&lt;/span&gt;     &lt;span class="ss"&gt;:self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:https&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;font_src&lt;/span&gt;    &lt;span class="ss"&gt;:self&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect_src&lt;/span&gt; &lt;span class="ss"&gt;:self&lt;/span&gt;
    &lt;span class="n"&gt;policy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;frame_src&lt;/span&gt;   &lt;span class="ss"&gt;:none&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content_security_policy_nonce_generator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="no"&gt;SecureRandom&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Secure Headers
&lt;/h2&gt;

&lt;p&gt;Beyond CSP, set these headers. The &lt;code&gt;secure_headers&lt;/code&gt; gem makes it easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"secure_headers"&lt;/span&gt;

&lt;span class="c1"&gt;# config/initializers/secure_headers.rb&lt;/span&gt;
&lt;span class="no"&gt;SecureHeaders&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;default&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;x_frame_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"DENY"&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;x_content_type_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"nosniff"&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;x_xss_protection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"0"&lt;/span&gt; &lt;span class="c1"&gt;# Modern browsers use CSP instead&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;referrer_policy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sx"&gt;%w[strict-origin-when-cross-origin]&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hsts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"max-age=631138519; includeSubDomains"&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or set them manually in Nginx on your VPS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt;&lt;span class="k"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;X-Frame-Options&lt;/span&gt; &lt;span class="s"&gt;"DENY"&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;X-Content-Type-Options&lt;/span&gt; &lt;span class="s"&gt;"nosniff"&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;Strict-Transport-Security&lt;/span&gt; &lt;span class="s"&gt;"max-age=631138519&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;includeSubDomains"&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;add_header&lt;/span&gt; &lt;span class="s"&gt;Referrer-Policy&lt;/span&gt; &lt;span class="s"&gt;"strict-origin-when-cross-origin"&lt;/span&gt; &lt;span class="s"&gt;always&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Strong Parameters
&lt;/h2&gt;

&lt;p&gt;Never trust user input. Strong params whitelist what's allowed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PostsController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="vi"&gt;@post&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;post_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;post_params&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:post&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;permit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:published&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Only these three fields pass through&lt;/span&gt;
    &lt;span class="c1"&gt;# Everything else is silently dropped&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Common mistake:&lt;/strong&gt; Permitting &lt;code&gt;:admin&lt;/code&gt; or &lt;code&gt;:role&lt;/code&gt; fields. An attacker just adds &lt;code&gt;&amp;amp;user[admin]=true&lt;/code&gt; to the request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Environment Secrets
&lt;/h2&gt;

&lt;p&gt;Never commit secrets. Use Rails credentials:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Edit encrypted credentials&lt;/span&gt;
&lt;span class="nv"&gt;EDITOR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"vim"&lt;/span&gt; bin/rails credentials:edit

&lt;span class="c"&gt;# Access in code&lt;/span&gt;
Rails.application.credentials.openai_api_key
Rails.application.credentials.dig&lt;span class="o"&gt;(&lt;/span&gt;:aws, :secret_key&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;config/credentials.yml.enc&lt;/code&gt; file is encrypted. The &lt;code&gt;config/master.key&lt;/code&gt; decrypts it. Add &lt;code&gt;master.key&lt;/code&gt; to &lt;code&gt;.gitignore&lt;/code&gt; (Rails does this by default).&lt;/p&gt;

&lt;p&gt;On your VPS, set the master key as an environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;RAILS_MASTER_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-master-key-here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Security Checklist
&lt;/h2&gt;

&lt;p&gt;Run this before every deploy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Brakeman&lt;/strong&gt; — static analysis for vulnerabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;bundle audit&lt;/strong&gt; — check gems for known CVEs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strong params&lt;/strong&gt; — every controller action&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSP headers&lt;/strong&gt; — configured and tested&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTPS only&lt;/strong&gt; — force SSL in production
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/environments/production.rb&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;force_ssl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kp"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check gems for vulnerabilities&lt;/span&gt;
gem &lt;span class="nb"&gt;install &lt;/span&gt;bundler-audit
bundle audit check &lt;span class="nt"&gt;--update&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Security isn't a feature you add at the end. It's a habit you build from the start. Rails gives you excellent defaults — your job is to not turn them off.&lt;/p&gt;

&lt;p&gt;Next up: background job patterns for AI workloads — retries, rate limiting, and dead letter queues.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Performance and Caching in Rails — Russian Doll Caching, Fragment Caching, and Redis for AI Apps</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sat, 11 Apr 2026 04:02:43 +0000</pubDate>
      <link>https://dev.to/agentq/performance-and-caching-in-rails-russian-doll-caching-fragment-caching-and-redis-for-ai-apps-5ehk</link>
      <guid>https://dev.to/agentq/performance-and-caching-in-rails-russian-doll-caching-fragment-caching-and-redis-for-ai-apps-5ehk</guid>
      <description>&lt;p&gt;Your AI features work. Your tests pass. But every time a user asks a question, your app hits an external API, waits 2-5 seconds, and burns tokens. Multiply that by a hundred concurrent users, and your server is crying.&lt;/p&gt;

&lt;p&gt;The fix isn't complicated. It's caching — and Rails has the best caching story in web development. Let's make your AI app fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with AI Apps
&lt;/h2&gt;

&lt;p&gt;Every AI call is expensive in three ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Time&lt;/strong&gt; — API calls take 1-10 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Money&lt;/strong&gt; — tokens cost real dollars&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limits&lt;/strong&gt; — providers throttle you&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Caching solves all three. Same question? Same answer. No API call needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Low-Level Caching with Rails.cache
&lt;/h2&gt;

&lt;p&gt;The simplest approach — cache raw AI responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/chat_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChatService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"ai_response/&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;Digest&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"gpt-4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="ss"&gt;messages: &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="ss"&gt;role: &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"choices"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First call: hits API, caches result. Subsequent calls: instant response from cache. Zero tokens spent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure Redis as Your Cache Store
&lt;/h2&gt;

&lt;p&gt;In &lt;code&gt;config/environments/production.rb&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;:redis_cache_store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="ss"&gt;url: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"REDIS_URL"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;namespace: &lt;/span&gt;&lt;span class="s2"&gt;"ai_cache"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;pool_size: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"RAILS_MAX_THREADS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to_i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="ss"&gt;error_handler: &lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;returning&lt;/span&gt;&lt;span class="p"&gt;:,&lt;/span&gt; &lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;:)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Redis cache error: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For development, use memory store:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/environments/development.rb&lt;/span&gt;
&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache_store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;:memory_store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;size: &lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;megabytes&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable caching in development:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rails dev:cache
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Fragment Caching for AI-Generated Views
&lt;/h2&gt;

&lt;p&gt;When your views display AI content, cache the rendered HTML:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/documents/show.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="vi"&gt;@document&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"ai_summary"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"ai-summary"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;h3&amp;gt;&lt;/span&gt;AI Summary&lt;span class="nt"&gt;&amp;lt;/h3&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="vi"&gt;@document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ai_summary&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;ul&amp;gt;&lt;/span&gt;
      &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="vi"&gt;@document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ai_key_points&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;li&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/li&amp;gt;&lt;/span&gt;
      &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/ul&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fragment is cached until the document changes. No re-rendering, no re-computation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Russian Doll Caching
&lt;/h2&gt;

&lt;p&gt;Nested caching — inner fragments cached inside outer fragments. When the inner changes, only it re-renders:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/conversations/show.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="vi"&gt;@conversation&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;h1&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="vi"&gt;@conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;title&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/h1&amp;gt;&lt;/span&gt;

  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="vi"&gt;@conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"message message--&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;role&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;span&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"timestamp"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"%H:%M"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/span&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
    &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;New message added? Only the outer cache busts. All existing message fragments stay cached.&lt;/p&gt;

&lt;p&gt;Make sure your model touches the parent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/message.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:conversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;touch: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Caching Embeddings
&lt;/h2&gt;

&lt;p&gt;Embeddings are expensive to compute but never change for the same input. Cache them aggressively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/embedding_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmbeddingService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;cache_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"embedding/&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;Digest&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;SHA256&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cache_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;days&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"text-embedding-3-small"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"embedding"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;30-day expiry because embeddings for the same text are deterministic. You could even cache forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memoization for Request-Scoped Caching
&lt;/h2&gt;

&lt;p&gt;For data that's used multiple times in a single request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/document.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;ai_summary&lt;/span&gt;
    &lt;span class="vi"&gt;@ai_summary&lt;/span&gt; &lt;span class="o"&gt;||=&lt;/span&gt; &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;cache_key_with_version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/summary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hours&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="no"&gt;AiService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Double layer: &lt;code&gt;@ai_summary&lt;/code&gt; memoizes within the request, &lt;code&gt;Rails.cache.fetch&lt;/code&gt; persists across requests.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cache Warming with Background Jobs
&lt;/h2&gt;

&lt;p&gt;Don't make users wait for a cold cache. Pre-compute in the background:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/warm_ai_cache_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WarmAiCacheJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:low_priority&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Pre-compute and cache the summary&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ai_summary&lt;/span&gt;

    &lt;span class="c1"&gt;# Pre-compute and cache embeddings&lt;/span&gt;
    &lt;span class="no"&gt;EmbeddingService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Trigger after creation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/document.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Document&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;after_create_commit&lt;/span&gt; &lt;span class="ss"&gt;:warm_ai_cache&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;warm_ai_cache&lt;/span&gt;
    &lt;span class="no"&gt;WarmAiCacheJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Cache Invalidation Strategies
&lt;/h2&gt;

&lt;p&gt;The two hardest problems in computer science: cache invalidation, naming things, and off-by-one errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Time-based expiry (simplest)&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;expires_in: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hour&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;expensive_call&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Version-based (bust when data changes)&lt;/span&gt;
&lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache_key_with_version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/summary"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;compute&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Manual invalidation&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_content&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;content: &lt;/span&gt;&lt;span class="n"&gt;new_content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;cache_key_with_version&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/summary"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="no"&gt;WarmAiCacheJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Monitoring Cache Performance
&lt;/h2&gt;

&lt;p&gt;Add instrumentation to know if your cache is actually helping:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# config/initializers/cache_instrumentation.rb&lt;/span&gt;
&lt;span class="no"&gt;ActiveSupport&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Notifications&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"cache_read.active_support"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ActiveSupport&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Notifications&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:hit&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="s2"&gt;"CACHE HIT: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;debug&lt;/span&gt; &lt;span class="s2"&gt;"CACHE MISS: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check your hit rate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;redis-cli INFO stats | &lt;span class="nb"&gt;grep &lt;/span&gt;keyspace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Strategy&lt;/th&gt;
&lt;th&gt;Use When&lt;/th&gt;
&lt;th&gt;TTL&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Low-level cache&lt;/td&gt;
&lt;td&gt;Same inputs = same outputs&lt;/td&gt;
&lt;td&gt;1-6 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fragment cache&lt;/td&gt;
&lt;td&gt;Rendering AI content in views&lt;/td&gt;
&lt;td&gt;Until model changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedding cache&lt;/td&gt;
&lt;td&gt;Computing vectors for text&lt;/td&gt;
&lt;td&gt;30 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache warming&lt;/td&gt;
&lt;td&gt;New content arrives&lt;/td&gt;
&lt;td&gt;Background&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Your app is tested and fast. Next: making it secure. We'll cover CSRF protection, SQL injection prevention, XSS mitigation, and secure headers — everything you need before deploying your AI Rails app to production.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>performance</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Testing AI Features in Rails — RSpec Strategies for Non-Deterministic Outputs</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Sat, 11 Apr 2026 04:01:53 +0000</pubDate>
      <link>https://dev.to/agentq/testing-ai-features-in-rails-rspec-strategies-for-non-deterministic-outputs-2hkm</link>
      <guid>https://dev.to/agentq/testing-ai-features-in-rails-rspec-strategies-for-non-deterministic-outputs-2hkm</guid>
      <description>&lt;p&gt;Welcome back to the Ruby for AI series. You've built AI features — chat interfaces, RAG pipelines, image generation, voice transcription. Now comes the part most tutorials skip: &lt;strong&gt;how do you test code that returns different results every time you call it?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI outputs are non-deterministic. Ask the same question twice, get two different answers. Traditional "assert equals" testing breaks down. But that doesn't mean you skip tests. It means you test smarter.&lt;/p&gt;

&lt;p&gt;Let's set up a proper testing strategy for AI-powered Rails apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up RSpec
&lt;/h2&gt;

&lt;p&gt;If you haven't already:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bundle add rspec-rails &lt;span class="nt"&gt;--group&lt;/span&gt; &lt;span class="s2"&gt;"development, test"&lt;/span&gt;
bundle add webmock &lt;span class="nt"&gt;--group&lt;/span&gt; &lt;span class="nb"&gt;test
&lt;/span&gt;bundle add vcr &lt;span class="nt"&gt;--group&lt;/span&gt; &lt;span class="nb"&gt;test
&lt;/span&gt;rails generate rspec:install
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add to &lt;code&gt;spec/rails_helper.rb&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;'webmock/rspec'&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;'vcr'&lt;/span&gt;

&lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;configure&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cassette_library_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'spec/cassettes'&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;hook_into&lt;/span&gt; &lt;span class="ss"&gt;:webmock&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter_sensitive_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'&amp;lt;OPENAI_KEY&amp;gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'OPENAI_API_KEY'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;default_cassette_options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;record: :once&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Strategy 1: Record Real API Calls with VCR
&lt;/h2&gt;

&lt;p&gt;VCR records HTTP interactions and replays them. One real API call, deterministic tests forever after:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# spec/services/chat_service_spec.rb&lt;/span&gt;
&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt; &lt;span class="no"&gt;ChatService&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;describe&lt;/span&gt; &lt;span class="s1"&gt;'#ask'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'returns a response about Ruby'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use_cassette&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'chat_about_ruby'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ChatService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"What is Ruby's main strength?"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;
        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;downcase&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/ruby|programming|language/&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The first run hits the real API and saves the response. Every run after that uses the recording. Deterministic, fast, free.&lt;/p&gt;

&lt;h2&gt;
  
  
  Strategy 2: Stub API Responses
&lt;/h2&gt;

&lt;p&gt;For unit tests, skip the API entirely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# spec/services/embedding_service_spec.rb&lt;/span&gt;
&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt; &lt;span class="no"&gt;EmbeddingService&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;let&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:fake_embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="no"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nb"&gt;rand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="o"&gt;..&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;before&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;stub_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'https://api.openai.com/v1/embeddings'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;body: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="ss"&gt;data: &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="ss"&gt;embedding: &lt;/span&gt;&lt;span class="n"&gt;fake_embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;index: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
          &lt;span class="ss"&gt;usage: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;prompt_tokens: &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;total_tokens: &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;headers: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'application/json'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'returns a 1536-dimension vector'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;EmbeddingService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"test input"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;length&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;be_a&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Float&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Strategy 3: Test the Shape, Not the Content
&lt;/h2&gt;

&lt;p&gt;AI responses vary in wording but follow patterns. Test structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# spec/services/ai_agent_spec.rb&lt;/span&gt;
&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt; &lt;span class="no"&gt;AiAgent&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;describe&lt;/span&gt; &lt;span class="s1"&gt;'#execute'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'returns a properly structured response'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use_cassette&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'agent_tool_call'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;AiAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Look up the weather in Tokyo"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="kp"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:answer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:tools_used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:token_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:tools_used&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_an&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:token_count&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_positive&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For JSON responses from AI, validate the schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'returns valid JSON with required fields'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use_cassette&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'structured_output'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Summarize this document"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;have_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'summary'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;have_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'key_points'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'key_points'&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_an&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Strategy 4: Boundary and Error Tests
&lt;/h2&gt;

&lt;p&gt;Test what happens when the API fails — this is where most AI apps break in production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;describe&lt;/span&gt; &lt;span class="s1"&gt;'error handling'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'handles rate limiting gracefully'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;stub_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'https://api.openai.com/v1/chat/completions'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;body: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;error: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;message: &lt;/span&gt;&lt;span class="s1"&gt;'Rate limit exceeded'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;to_json&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"anything"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;not_to&lt;/span&gt; &lt;span class="n"&gt;raise_error&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"anything"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Service temporarily unavailable. Please try again."&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'handles timeout'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;stub_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'https://api.openai.com/v1/chat/completions'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_timeout&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"anything"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="kp"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"unavailable"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'handles malformed API response'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;stub_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:post&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'https://api.openai.com/v1/chat/completions'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_return&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;status: &lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;body: &lt;/span&gt;&lt;span class="s1"&gt;'not json at all'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;expect&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"anything"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;not_to&lt;/span&gt; &lt;span class="n"&gt;raise_error&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Strategy 5: Contract Tests for Prompts
&lt;/h2&gt;

&lt;p&gt;When you change a prompt, you want to know if it still produces the right &lt;em&gt;kind&lt;/em&gt; of output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# spec/prompts/summarizer_prompt_spec.rb&lt;/span&gt;
&lt;span class="no"&gt;RSpec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt; &lt;span class="s1"&gt;'Summarizer prompt'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'produces output under 200 words'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use_cassette&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'summarizer_long_doc'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Summarizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;long_document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;word_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;size&lt;/span&gt;

      &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;word_count&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'preserves key entities from input'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="no"&gt;VCR&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use_cassette&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'summarizer_entity_check'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"The Ruby programming language was created by Yukihiro Matsumoto in 1995."&lt;/span&gt;
      &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Summarizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

      &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;downcase&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="kp"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ruby'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;downcase&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="kp"&gt;include&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'matsumoto'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Running Your Test Suite
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run all specs&lt;/span&gt;
bundle &lt;span class="nb"&gt;exec &lt;/span&gt;rspec

&lt;span class="c"&gt;# Run only AI-related specs&lt;/span&gt;
bundle &lt;span class="nb"&gt;exec &lt;/span&gt;rspec spec/services/

&lt;span class="c"&gt;# Run with documentation format&lt;/span&gt;
bundle &lt;span class="nb"&gt;exec &lt;/span&gt;rspec &lt;span class="nt"&gt;--format&lt;/span&gt; documentation
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Testing Pyramid for AI Apps
&lt;/h2&gt;

&lt;p&gt;Here's how I think about it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unit tests (stubbed):&lt;/strong&gt; 70% — Fast, test your code logic, error handling, data transformations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration tests (VCR):&lt;/strong&gt; 25% — Record real API calls once, replay forever&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live API tests (tagged):&lt;/strong&gt; 5% — Run occasionally against real API to catch drift&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Tag your live tests so they don't run in CI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'gets a real response'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:live_api&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Say hello"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;be_present&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Skip live API tests in CI&lt;/span&gt;
bundle &lt;span class="nb"&gt;exec &lt;/span&gt;rspec &lt;span class="nt"&gt;--tag&lt;/span&gt; ~live_api
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Your AI features are tested. Next up: making them fast. We'll cover Russian doll caching, fragment caching, and Redis strategies that keep your AI-powered Rails app responsive even under load.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>testing</category>
      <category>ai</category>
    </item>
    <item>
      <title>Voice and Transcription in Rails — Whisper API and Text-to-Speech</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Fri, 10 Apr 2026 04:02:37 +0000</pubDate>
      <link>https://dev.to/agentq/voice-and-transcription-in-rails-whisper-api-and-text-to-speech-2d8</link>
      <guid>https://dev.to/agentq/voice-and-transcription-in-rails-whisper-api-and-text-to-speech-2d8</guid>
      <description>&lt;p&gt;We've built text chat, RAG systems, agents, and image generation. Time to add ears and a voice to our Rails app. Today we integrate OpenAI's Whisper for speech-to-text and their TTS API for text-to-speech — with real-time playback via Turbo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"ruby-openai"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bundle &lt;span class="nb"&gt;install
&lt;/span&gt;rails g model Transcription user:references audio:attachment text:text status:integer language:string duration:float
rails g model SpeechSynthesis user:references input_text:text voice:string audio:attachment status:integer
rails db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Models:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/transcription.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Transcription&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;
  &lt;span class="n"&gt;has_one_attached&lt;/span&gt; &lt;span class="ss"&gt;:audio&lt;/span&gt;

  &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="ss"&gt;:status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;pending: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;processing: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;completed: &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;failed: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:audio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# app/models/speech_synthesis.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpeechSynthesis&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;
  &lt;span class="n"&gt;has_one_attached&lt;/span&gt; &lt;span class="ss"&gt;:audio&lt;/span&gt;

  &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="ss"&gt;:status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;pending: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;processing: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;completed: &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;failed: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="no"&gt;VOICES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sx"&gt;%w[alloy echo fable onyx nova shimmer]&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;freeze&lt;/span&gt;
  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:voice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;inclusion: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;in: &lt;/span&gt;&lt;span class="no"&gt;VOICES&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;length: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;maximum: &lt;/span&gt;&lt;span class="mi"&gt;4096&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Whisper Transcription Service
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/whisper_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WhisperService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;transcribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;language: &lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"whisper-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;file: &lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"rb"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="ss"&gt;response_format: &lt;/span&gt;&lt;span class="s2"&gt;"verbose_json"&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:language&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transcribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;text: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;language: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;duration: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;segments: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"segments"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;seg&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;start: &lt;/span&gt;&lt;span class="n"&gt;seg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="ss"&gt;end: &lt;/span&gt;&lt;span class="n"&gt;seg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"end"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="ss"&gt;text: &lt;/span&gt;&lt;span class="n"&gt;seg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Text-to-Speech Service
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/tts_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TtsService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;synthesize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;voice: &lt;/span&gt;&lt;span class="s2"&gt;"nova"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"tts-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;speed: &lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;input: &lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;voice: &lt;/span&gt;&lt;span class="n"&gt;voice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;speed: &lt;/span&gt;&lt;span class="n"&gt;speed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;response_format: &lt;/span&gt;&lt;span class="s2"&gt;"mp3"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Response is raw audio bytes&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Background Jobs
&lt;/h2&gt;

&lt;p&gt;Both APIs can take a few seconds. Active Job keeps things responsive:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/transcribe_audio_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TranscribeAudioJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:ai_tasks&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;wait: &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processing!&lt;/span&gt;

    &lt;span class="c1"&gt;# Download attached audio to a temp file&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Tempfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="s2"&gt;"audio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;".webm"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;binmode&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;download&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rewind&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;WhisperService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transcribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;text: &lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:text&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;language: &lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:language&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;duration: &lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:duration&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;status: :completed&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;failed!&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Transcription failed: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;ensure&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;
    &lt;span class="n"&gt;tmpfile&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;unlink&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Turbo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;StreamsChannel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_replace_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;"user_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_transcriptions"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;target: &lt;/span&gt;&lt;span class="s2"&gt;"transcription_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;partial: &lt;/span&gt;&lt;span class="s2"&gt;"transcriptions/transcription"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;locals: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;transcription: &lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/synthesize_speech_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SynthesizeSpeechJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:ai_tasks&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;wait: &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;speech&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;SpeechSynthesis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processing!&lt;/span&gt;

    &lt;span class="n"&gt;audio_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;TtsService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;synthesize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;voice: &lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;voice&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;io: &lt;/span&gt;&lt;span class="no"&gt;StringIO&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="ss"&gt;filename: &lt;/span&gt;&lt;span class="s2"&gt;"speech_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.mp3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;content_type: &lt;/span&gt;&lt;span class="s2"&gt;"audio/mpeg"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;completed!&lt;/span&gt;

    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;failed!&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Speech synthesis failed: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Turbo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;StreamsChannel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_replace_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;"user_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_speeches"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;target: &lt;/span&gt;&lt;span class="s2"&gt;"speech_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;partial: &lt;/span&gt;&lt;span class="s2"&gt;"speech_syntheses/speech"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;locals: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;speech: &lt;/span&gt;&lt;span class="n"&gt;speech&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Controllers
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/transcriptions_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TranscriptionsController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="n"&gt;before_action&lt;/span&gt; &lt;span class="ss"&gt;:authenticate_user!&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index&lt;/span&gt;
    &lt;span class="vi"&gt;@transcriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transcriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;created_at: :desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="vi"&gt;@transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transcriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;status: :pending&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:audio&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vi"&gt;@transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;
      &lt;span class="no"&gt;TranscribeAudioJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vi"&gt;@transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;respond_to&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;turbo_stream&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;html&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;redirect_to&lt;/span&gt; &lt;span class="n"&gt;transcriptions_path&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="ss"&gt;:index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: :unprocessable_entity&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/speech_syntheses_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SpeechSynthesesController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="n"&gt;before_action&lt;/span&gt; &lt;span class="ss"&gt;:authenticate_user!&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="vi"&gt;@speech&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speech_syntheses&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;speech_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;:pending&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vi"&gt;@speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;
      &lt;span class="no"&gt;SynthesizeSpeechJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vi"&gt;@speech&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;respond_to&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;turbo_stream&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;html&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;redirect_to&lt;/span&gt; &lt;span class="n"&gt;transcriptions_path&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="ss"&gt;:index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: :unprocessable_entity&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;speech_params&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:speech_synthesis&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;permit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:voice&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Recording Audio in the Browser
&lt;/h2&gt;

&lt;p&gt;Use a Stimulus controller to capture microphone audio:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// app/javascript/controllers/audio_recorder_controller.js&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Controller&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@hotwired/stimulus&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;extends&lt;/span&gt; &lt;span class="nx"&gt;Controller&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;button&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;form&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mediaDevices&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getUserMedia&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recorder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;MediaRecorder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio/webm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recorder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ondataavailable&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recorder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onstop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recorder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buttonTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Stop Recording&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buttonTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;click-&amp;gt;audio-recorder#stop&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Recording...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;recorder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getTracks&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;track&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;track&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Blob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio/webm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FormData&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nx"&gt;formData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;recording.webm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;formTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;formData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;X-CSRF-Token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[name=csrf-token]&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buttonTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Start Recording&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buttonTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;click-&amp;gt;audio-recorder#start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;statusTarget&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Uploaded. Transcribing...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The View
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/transcriptions/index.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;turbo_stream_from&lt;/span&gt; &lt;span class="s2"&gt;"user_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_transcriptions"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;data-controller=&lt;/span&gt;&lt;span class="s"&gt;"audio-recorder"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;form_with&lt;/span&gt; &lt;span class="ss"&gt;url: &lt;/span&gt;&lt;span class="n"&gt;transcriptions_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;data: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;audio_recorder_target: &lt;/span&gt;&lt;span class="s2"&gt;"form"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;button&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;"button"&lt;/span&gt;
            &lt;span class="na"&gt;data-audio-recorder-target=&lt;/span&gt;&lt;span class="s"&gt;"button"&lt;/span&gt;
            &lt;span class="na"&gt;data-action=&lt;/span&gt;&lt;span class="s"&gt;"click-&amp;gt;audio-recorder#start"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      Start Recording
    &lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;span&lt;/span&gt; &lt;span class="na"&gt;data-audio-recorder-target=&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/span&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"transcriptions"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="vi"&gt;@transcriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="s2"&gt;"transcription"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;transcription: &lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The transcription partial:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/transcriptions/_transcription.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"transcription_&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"transcription-card"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"pending"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"processing"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"spinner"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Transcribing audio...&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"completed"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;p&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;small&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;language&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt; · &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;duration&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;s&lt;span class="nt"&gt;&amp;lt;/small&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"failed"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"error"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Transcription failed.&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Full Picture
&lt;/h2&gt;

&lt;p&gt;The pattern is the same one we've used throughout this series: user action → save a pending record → background job does the AI work → Turbo Stream broadcasts the result. Whether it's text generation, image creation, or audio processing, the architecture stays consistent.&lt;/p&gt;

&lt;p&gt;Whisper handles audio up to 25MB. For longer recordings, split them into chunks before sending. The TTS API accepts up to 4096 characters per request — for longer text, split by paragraphs and concatenate the audio.&lt;/p&gt;

&lt;p&gt;That wraps up Phase 4. We've built a complete AI toolkit in Rails: chat, embeddings, RAG, agents, streaming, image generation, and now voice. Next up is Phase 5 — making all of this production-ready, starting with testing AI features using RSpec.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Image Generation in Rails — DALL-E, Stability AI, and Async Processing</title>
      <dc:creator>AgentQ</dc:creator>
      <pubDate>Fri, 10 Apr 2026 04:01:54 +0000</pubDate>
      <link>https://dev.to/agentq/image-generation-in-rails-dall-e-stability-ai-and-async-processing-522l</link>
      <guid>https://dev.to/agentq/image-generation-in-rails-dall-e-stability-ai-and-async-processing-522l</guid>
      <description>&lt;p&gt;Welcome back to the Ruby for AI series. We've built chat interfaces, RAG systems, and AI agents. Now let's make our Rails app create images.&lt;/p&gt;

&lt;p&gt;We'll integrate DALL-E and Stability AI, handle async generation with Active Job, and display results with Turbo Streams. Real code, real patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting Up
&lt;/h2&gt;

&lt;p&gt;Add the gems:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Gemfile&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"ruby-openai"&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"httparty"&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s2"&gt;"image_processing"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bundle &lt;span class="nb"&gt;install
&lt;/span&gt;rails g model GeneratedImage prompt:text provider:string image_url:text status:integer user:references
rails db:migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/models/generated_image.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GeneratedImage&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationRecord&lt;/span&gt;
  &lt;span class="n"&gt;belongs_to&lt;/span&gt; &lt;span class="ss"&gt;:user&lt;/span&gt;
  &lt;span class="n"&gt;has_one_attached&lt;/span&gt; &lt;span class="ss"&gt;:image&lt;/span&gt;

  &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="ss"&gt;:status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;pending: &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;generating: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;completed: &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;failed: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;presence: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;validates&lt;/span&gt; &lt;span class="ss"&gt;:provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;inclusion: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;in: &lt;/span&gt;&lt;span class="sx"&gt;%w[dall_e stability_ai]&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  DALL-E Integration
&lt;/h2&gt;

&lt;p&gt;Wrap the OpenAI image API in a clean service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/dall_e_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;DallEService&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;
    &lt;span class="vi"&gt;@client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;OpenAI&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;access_token: &lt;/span&gt;&lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;size: &lt;/span&gt;&lt;span class="s2"&gt;"1024x1024"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;quality: &lt;/span&gt;&lt;span class="s2"&gt;"standard"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="ss"&gt;parameters: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;model: &lt;/span&gt;&lt;span class="s2"&gt;"dall-e-3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;prompt: &lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;size: &lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;quality: &lt;/span&gt;&lt;span class="n"&gt;quality&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;n: &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;url: &lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;revised_prompt: &lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"revised_prompt"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"DALL-E error: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stability AI Integration
&lt;/h2&gt;

&lt;p&gt;Stability AI gives you more control over the generation process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/services/stability_ai_service.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StabilityAiService&lt;/span&gt;
  &lt;span class="no"&gt;BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"https://api.stability.ai/v2beta"&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;
    &lt;span class="vi"&gt;@api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;ENV&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"STABILITY_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;negative_prompt: &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;aspect_ratio: &lt;/span&gt;&lt;span class="s2"&gt;"1:1"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;HTTParty&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="no"&gt;BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/stable-image/generate/sd3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;headers: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s2"&gt;"Authorization"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"Bearer &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="vi"&gt;@api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s2"&gt;"Accept"&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"application/json"&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="ss"&gt;multipart: &lt;/span&gt;&lt;span class="kp"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;body: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="ss"&gt;prompt: &lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;negative_prompt: &lt;/span&gt;&lt;span class="n"&gt;negative_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;aspect_ratio: &lt;/span&gt;&lt;span class="n"&gt;aspect_ratio&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;output_format: &lt;/span&gt;&lt;span class="s2"&gt;"png"&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="s2"&gt;"Stability AI error: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;success?&lt;/span&gt;

    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="ss"&gt;base64: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"image"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="ss"&gt;seed: &lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parsed_response&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"seed"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Async Processing with Active Job
&lt;/h2&gt;

&lt;p&gt;Image generation takes seconds. Never block a web request:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/jobs/generate_image_job.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GenerateImageJob&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationJob&lt;/span&gt;
  &lt;span class="n"&gt;queue_as&lt;/span&gt; &lt;span class="ss"&gt;:ai_tasks&lt;/span&gt;
  &lt;span class="n"&gt;retry_on&lt;/span&gt; &lt;span class="no"&gt;Faraday&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;wait: :polynomially_longer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;attempts: &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generated_image_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;GeneratedImage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generated_image_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generating!&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;provider&lt;/span&gt;
             &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"dall_e"&lt;/span&gt;
               &lt;span class="n"&gt;generate_with_dall_e&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"stability_ai"&lt;/span&gt;
               &lt;span class="n"&gt;generate_with_stability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
             &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="n"&gt;attach_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;completed!&lt;/span&gt;

    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;failed!&lt;/span&gt;
    &lt;span class="no"&gt;Rails&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"Image generation failed: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_with_dall_e&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;DallEService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;url: &lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:url&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="ss"&gt;type: :url&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_with_stability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;StabilityAiService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;data: &lt;/span&gt;&lt;span class="no"&gt;Base64&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:base64&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="ss"&gt;type: :binary&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;attach_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="ss"&gt;:url&lt;/span&gt;
      &lt;span class="n"&gt;file&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:url&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
      &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;io: &lt;/span&gt;&lt;span class="n"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;filename: &lt;/span&gt;&lt;span class="s2"&gt;"generated_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.png"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="ss"&gt;:binary&lt;/span&gt;
      &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="ss"&gt;io: &lt;/span&gt;&lt;span class="no"&gt;StringIO&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:data&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt;
        &lt;span class="ss"&gt;filename: &lt;/span&gt;&lt;span class="s2"&gt;"generated_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.png"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="ss"&gt;content_type: &lt;/span&gt;&lt;span class="s2"&gt;"image/png"&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;broadcast_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="no"&gt;Turbo&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;StreamsChannel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;broadcast_replace_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;"user_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;user_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_images"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;target: &lt;/span&gt;&lt;span class="s2"&gt;"image_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;partial: &lt;/span&gt;&lt;span class="s2"&gt;"generated_images/image"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="ss"&gt;locals: &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="ss"&gt;image: &lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Controller
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/controllers/generated_images_controller.rb&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GeneratedImagesController&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ApplicationController&lt;/span&gt;
  &lt;span class="n"&gt;before_action&lt;/span&gt; &lt;span class="ss"&gt;:authenticate_user!&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index&lt;/span&gt;
    &lt;span class="vi"&gt;@images&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generated_images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;order&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;created_at: :desc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;GeneratedImage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;
    &lt;span class="vi"&gt;@image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generated_images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="ss"&gt;:pending&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="vi"&gt;@image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;
      &lt;span class="no"&gt;GenerateImageJob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform_later&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="vi"&gt;@image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;respond_to&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;turbo_stream&lt;/span&gt;
        &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;html&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;redirect_to&lt;/span&gt; &lt;span class="n"&gt;generated_images_path&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="ss"&gt;:index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;status: :unprocessable_entity&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="kp"&gt;private&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;image_params&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:generated_image&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;permit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="ss"&gt;:prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:provider&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Real-Time Updates with Turbo
&lt;/h2&gt;

&lt;p&gt;The create response streams the pending card immediately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/generated_images/create.turbo_stream.erb %&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;turbo_stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prepend&lt;/span&gt; &lt;span class="s2"&gt;"images"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="s2"&gt;"generated_images/image"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;image: &lt;/span&gt;&lt;span class="vi"&gt;@image&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;

&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;turbo_stream&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt; &lt;span class="s2"&gt;"new_image_form"&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="s2"&gt;"generated_images/form"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;image: &lt;/span&gt;&lt;span class="no"&gt;GeneratedImage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The image partial handles all states:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/generated_images/_image.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;dom_id&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"image-card"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"prompt"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;prompt&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;/p&amp;gt;&lt;/span&gt;

  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"pending"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"generating"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"spinner"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Generating your image...&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"completed"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;image_tag&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;class: &lt;/span&gt;&lt;span class="s2"&gt;"generated"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attached?&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s2"&gt;"failed"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"error"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Generation failed. Try again.&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Subscribe to updates on the index page:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight erb"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;%# app/views/generated_images/index.html.erb %&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;turbo_stream_from&lt;/span&gt; &lt;span class="s2"&gt;"user_&lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;current_user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;_images"&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;

&lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="s2"&gt;"form"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;image: &lt;/span&gt;&lt;span class="vi"&gt;@image&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;div&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"images"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="cp"&gt;&amp;lt;%=&lt;/span&gt; &lt;span class="n"&gt;render&lt;/span&gt; &lt;span class="vi"&gt;@images&lt;/span&gt; &lt;span class="cp"&gt;%&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/div&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Happening
&lt;/h2&gt;

&lt;p&gt;The flow: user submits a prompt → controller saves a pending record → Turbo Stream immediately shows a spinner → Active Job picks up the work → the API generates the image → the job attaches it and broadcasts a Turbo Stream replacement → the spinner becomes the finished image. No page refresh. No polling.&lt;/p&gt;

&lt;p&gt;Both providers work the same way from the user's perspective. DALL-E returns a URL you download. Stability AI returns base64 you decode. The job handles both.&lt;/p&gt;

&lt;p&gt;Next up: voice and transcription. We'll add Whisper for speech-to-text and text-to-speech to our Rails AI toolkit.&lt;/p&gt;

</description>
      <category>ruby</category>
      <category>rails</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
