
Introducing PlanLlama: Reliable Programming at Scale

product-updates
Matt Sergeant · 5 min read

I'm personally very excited to introduce PlanLlama, a cloud-based job scheduling platform built for developers who need reliable, scalable workflows without the operational overhead.

Why PlanLlama?

I've spent the past 30 years of my career building products for startups that have grown to scale. Every one of those products has needed some form of queue to support reliable programming. They've also needed scheduled jobs (cron jobs) that work at scale, and more recently, as I dabbled in modern LLMs like every other developer, I identified the need for a simple, easy-to-use reliability layer.

Recently I've used PostgreSQL as the queue storage layer, for all the reasons people are now reaching for Postgres as the tool of choice for everything they do. "Just Use Postgres" is the motto of the era. That direction led me to pg-boss, and more specifically to the use of SKIP LOCKED as a way of building scalable queue infrastructure on Postgres.
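The SKIP LOCKED pattern works by letting each worker claim the next unclaimed row while concurrent workers skip past locked rows instead of blocking on them, so workers never serialize behind one another. A minimal sketch of such a dequeue query (the `jobs` table and its columns are illustrative, not pg-boss's actual schema):

```typescript
// Illustrative dequeue query using FOR UPDATE SKIP LOCKED.
// Each worker atomically claims one pending job; rows already
// locked by other workers are skipped rather than waited on.
const dequeueSql = `
  UPDATE jobs
  SET state = 'active'
  WHERE id = (
    SELECT id FROM jobs
    WHERE state = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
  )
  RETURNING id, payload;
`;
```

Because the inner SELECT and the UPDATE happen in one statement, two workers can never claim the same job, and a crashed worker's lock is released when its transaction ends.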

Pg-boss already solves many problems for many people. It reliably runs cron jobs across distributed infrastructure. It handles large numbers of items in the queue without breaking a sweat (something anyone who has used queue software has run into problems with). It provides 100% reliability backed by Postgres guarantees. But something was still missing:

  • Pg-boss relies on polling your database periodically.
  • Other queues built on Postgres' LISTEN/NOTIFY have significant scalability issues (they require a global lock).
  • There were no client libraries for other languages, leaving pg-boss locked to the JS/TS ecosystem.
  • There was a lack of commercial support offerings.

PlanLlama fills those gaps.

And finally, my philosophy: PlanLlama has a whole bunch of competitors, and competition is healthy. My philosophy is simplicity. Developers crave simple APIs, and PlanLlama delivers that: getting started is trivial, and I build products to scale and last.

AI Boom

And then the AI boom happened. One thing that quickly became clear is that AI models can often fail and become unreliable. While I'm sure some of that instability will resolve itself over time, developers need a reliability layer now. Furthermore, the PlanLlama dashboard makes an excellent place to inspect the health of your AI workflows.

How does PlanLlama help with this? The simplicity of the PlanLlama API meant I could add a reliability layer to Vercel's excellent AI SDK in just a few lines of code. This allows you to build AI workflows that cross multiple languages and will reliably execute the workflow even in the event of any failures in the system.
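In spirit, a reliability layer like this is a retry wrapper around each model call. A generic sketch of the idea, not the actual PlanLlama/AI SDK integration (the function name and the doubling backoff are assumptions for illustration):

```typescript
// Generic retry wrapper: re-run a flaky async call (such as an LLM
// request) up to `retries` extra times, doubling the wait between
// attempts, and rethrow the last error if every attempt fails.
async function withRetries<T>(
  fn: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Exponential backoff: baseDelayMs, 2x, 4x, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// Example: a call that fails twice before succeeding.
let calls = 0;
const result = await withRetries(async () => {
  calls++;
  if (calls < 3) throw new Error("model unavailable");
  return "ok";
}, 3, 1);
```

A managed queue goes further than this in-process sketch: retries survive process crashes because the job state lives in the database, not in memory.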

Key Features

1. Simple SDK Integration

Getting started takes just a few lines of code:

import { PlanLlama } from "planllama";

const client = new PlanLlama({
  apiToken: process.env.PLANLLAMA_TOKEN,
});

await client.start();

// Register a worker for the "send-email" queue
await client.work("send-email", async ({ data }) => {
  // send email based on data.to and data.subject
});

// Publish a one-time job
await client.publish("send-email", {
  to: "user@example.com",
  subject: "Welcome!",
});

// Schedule a recurring job (cron syntax: "0 9 * * *" = daily at 09:00)
await client.schedule("daily-reports", "0 9 * * *", {
  reportType: "analytics",
});

2. WebSocket-Based Real-Time Processing

Unlike polling-based solutions, PlanLlama uses WebSockets for instant job delivery to your workers. This means:

  • Lower latency - Jobs start executing immediately
  • Reduced load - No constant polling required
  • Better scalability - Workers connect once and stay connected
  • Browser usage - PlanLlama works equally well in the browser as it does on the server
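The push-versus-poll difference above can be sketched in-process: with push delivery, the worker registers a handler once and jobs run the moment they are delivered, with no poll loop or wasted queries. Here a plain EventEmitter stands in for the WebSocket connection (an illustration, not PlanLlama's wire protocol):

```typescript
import { EventEmitter } from "node:events";

// In-process stand-in for a WebSocket connection to the job server.
const socket = new EventEmitter();
const processed: string[] = [];

// Push model: register the handler once; every delivered job
// executes immediately, with no polling interval in between.
socket.on("job", (payload: { id: string }) => {
  processed.push(payload.id);
});

// The "server" delivers a job; the handler fires right away.
socket.emit("job", { id: "job-1" });
```

A polling worker would instead wake on a timer and query for work, paying up to one full poll interval of latency per job plus a steady query load on the database even when the queue is empty.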

3. Built-in Retry Logic

Configure retry behavior per job:

await client.publish(
  "process-payment",
  { amount: 100, currency: "USD" },
  {
    retryLimit: 3, // up to 3 retries after the first failure
    retryDelay: 5, // base delay between attempts, in seconds
    retryBackoff: true, // grow the delay on each successive retry
  }
);
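Assuming exponential backoff doubles the base delay on each attempt (an illustration; the actual backoff curve is service-defined), the configuration above would space its three retries roughly like this:

```typescript
// Sketch: delays produced by a doubling backoff schedule for
// retryLimit retries starting from a base delay in seconds.
function backoffDelays(retryLimit: number, baseDelaySec: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retryLimit; attempt++) {
    delays.push(baseDelaySec * 2 ** attempt);
  }
  return delays;
}

const schedule = backoffDelays(3, 5); // 5s, 10s, 20s between attempts
```

Spreading retries out like this gives a transient failure (a rate limit, a deploy, a brief outage) time to clear instead of hammering the downstream service.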

4. Comprehensive Monitoring

Track job execution with detailed metrics:

  • Success/failure rates
  • Execution duration
  • Queue depths
  • Worker health

Architecture Overview

PlanLlama consists of three main components:

  1. Client SDK - Lightweight Node.js library for publishing jobs and registering workers
  2. Backend API - Service handling job orchestration
  3. Dashboard - Web application for monitoring and configuration

The system uses pg-boss under the hood for reliable job queueing on PostgreSQL, ensuring ACID guarantees and excellent performance, plus our own secret sauce to remove the need for polling.

Real-World Use Cases

We've seen teams using PlanLlama for:

  • Email campaigns - Schedule newsletters and drip campaigns
  • Data processing - Run ETL jobs and report generation
  • API integrations - Sync data between systems
  • Cleanup tasks - Archive old records and prune caches
  • Health checks - Monitor external services

Getting Started

  1. Sign up at planllama.com
  2. Create a project and get your API token
  3. Install the SDK: npm install planllama
  4. Start scheduling jobs!

Check out our documentation for detailed guides and examples.

What's Next?

We're actively working on:

  • Support for more programming languages (currently we support JS/TS, Python, Go, and Ruby)
  • Advanced scheduling patterns (intervals, one-time delays)
  • Webhook notifications for job lifecycle events
  • Enhanced analytics and alerting

We'd love to hear your feedback! Join our community on Discord and let us know what features you'd like to see.

Conclusion

PlanLlama takes the pain out of job scheduling so you can focus on building great features. Whether you're a solo developer or part of a large team, we've designed the platform to scale with your needs.

Try it today and experience the difference of truly reliable job scheduling.

Have questions or feedback?

Join our community on Discord to discuss this post and connect with other developers.
