19 May 2026

5 Ways to Reduce AI Tokens and Avoid Rate Limits

Every time Standard Time® AI Chat® analyzes a project, it sends a payload of task names, assignments, dates, time logs, and notes to the AI model. On large projects, that payload can burn through your Tokens Per Minute (TPM) allowance in a single request, triggering an HTTP 429 error that blocks the session. These five steps cut the tokens each request consumes, or raise the limit itself, so you get more analyses per minute and fewer interruptions.

  1. Enable Exclusion Flags in stdata.AiModelSet

    This is the highest-impact change you can make. Standard Time® stores each AI model configuration in the stdata.AiModelSet table, and each row includes a set of Exclusion Flags that strip data categories from the AI payload before the request is sent. The AI never receives the excluded fields — which means fewer tokens are consumed on every request, automatically, without changing how anyone uses the chat window.

    Start with these two flags: ExcludeTimeLogs removes every time log entry from the payload (typically 200–800 tokens per project). ExcludeCompletedTasks drops tasks whose status is Completed or Closed, which is often the largest category on a mid-project workload, saving 300–2,000 tokens per project. Together these two flags typically cut payload size by 60–80% with little loss of useful context for most questions.

    Other available flags: ExcludeNotes, ExcludeAssignments, ExcludeBudget, and ExcludeCustomFields. Each removes a data category that may be irrelevant to the question you're asking. Enable a flag only for data you don't need, and disable it again before you ask a question that requires that data.
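    As a rough illustration of what the flags do, the sketch below filters a project payload before it would be sent. The flag names come from this article; the payload keys (`time_logs`, `notes`, and so on) and the function `apply_exclusion_flags` are hypothetical and are not Standard Time®'s actual internals.

```python
from typing import Any

# Hypothetical mapping from the article's flag names to payload keys.
# The real payload layout is an assumption made for this sketch.
FLAG_TO_KEY = {
    "ExcludeTimeLogs": "time_logs",
    "ExcludeNotes": "notes",
    "ExcludeAssignments": "assignments",
    "ExcludeBudget": "budget",
    "ExcludeCustomFields": "custom_fields",
}

CLOSED_STATUSES = {"Completed", "Closed"}


def apply_exclusion_flags(project: dict[str, Any], flags: set[str]) -> dict[str, Any]:
    """Return a trimmed copy of the payload; the input is left untouched."""
    excluded_keys = {FLAG_TO_KEY[f] for f in flags if f in FLAG_TO_KEY}
    payload = {k: v for k, v in project.items() if k not in excluded_keys}
    if "ExcludeCompletedTasks" in flags and "tasks" in payload:
        # Keep only tasks that are still open.
        payload["tasks"] = [
            t for t in payload["tasks"] if t.get("status") not in CLOSED_STATUSES
        ]
    return payload


project = {
    "name": "Work Order 1042",
    "tasks": [
        {"name": "Cut stock", "status": "Completed"},
        {"name": "Weld frame", "status": "In Progress"},
    ],
    "time_logs": [{"task": "Cut stock", "hours": 3.5}],
    "notes": ["Customer asked for a rush."],
}

trimmed = apply_exclusion_flags(project, {"ExcludeTimeLogs", "ExcludeCompletedTasks"})
print(sorted(trimmed))                       # remaining payload categories
print([t["name"] for t in trimmed["tasks"]]) # remaining tasks
```

    The point of the sketch is that the excluded categories never reach the model at all, so the savings apply to every request without anyone changing their prompts.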

  2. Analyze One Project at a Time

    When you click a single project row on the Projects page before opening AI Chat, Standard Time® sends only that project's data. Asking for a portfolio-wide summary ("how are all my projects doing?") forces every active project into the payload at once. In a shop with 20 active work orders, that can multiply token consumption by 10x or more in a single request.

    The fix is simple: click the specific project you care about before typing in AI Chat. The AI already knows which project you have selected — you don't need to name it in your prompt. If you genuinely need a portfolio view, ask about two or three projects in separate follow-up turns rather than combining them into one request.

    Saves: 1,000–10,000+ tokens per session depending on how many projects are active.
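    The arithmetic behind this step can be sketched in a few lines. The 500-token average per project is an assumed figure for illustration only, and the 6,000 TPM limit is the low end of the free-tier range quoted later in this article; substitute your own numbers.

```python
def payload_tokens(tokens_per_project: int, project_count: int) -> int:
    """Estimate payload size when every project costs about the same."""
    return tokens_per_project * project_count


def fits_in_tpm(request_tokens: int, tpm_limit: int) -> bool:
    """True if a single request stays within the per-minute allowance."""
    return request_tokens <= tpm_limit


TOKENS_PER_PROJECT = 500  # assumption: average payload per project
TPM_LIMIT = 6_000         # low end of the free-tier range quoted in this article

single = payload_tokens(TOKENS_PER_PROJECT, 1)
portfolio = payload_tokens(TOKENS_PER_PROJECT, 20)

print(single, fits_in_tpm(single, TPM_LIMIT))
print(portfolio, fits_in_tpm(portfolio, TPM_LIMIT))
```

    Under these assumptions, the single-project request fits comfortably, while the 20-project request exceeds the per-minute limit on its own.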

  3. Write Short, Direct Prompts

    Prompt tokens count against your TPM limit just like data payload tokens. A long, descriptive question costs more than a short one — and the AI already has the project context, so you don't need to explain it.

    Instead of: "I'd like you to take a careful look at the project I currently have open and give me a thorough analysis of which tasks are at risk of running over deadline and why that might be the case."

    Write: "Which tasks are at risk of missing the deadline?"

    Both prompts ask essentially the same question. The short version is 9 words against 38, so it uses roughly 75% fewer prompt tokens. Over a session with five or six follow-up questions, that difference adds up to hundreds of tokens saved, which can be enough to stay inside the TPM window on a tight plan.

    Saves: 50–300 tokens per message, compounding across a long session.
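    To compare two phrasings before sending them, a word count is a workable stand-in. It is only a rough proxy: real token counts depend on the model's tokenizer, and the helper names below are illustrative.

```python
LONG_PROMPT = (
    "I'd like you to take a careful look at the project I currently have open "
    "and give me a thorough analysis of which tasks are at risk of running "
    "over deadline and why that might be the case."
)
SHORT_PROMPT = "Which tasks are at risk of missing the deadline?"


def word_count(text: str) -> int:
    """Count whitespace-separated words as a rough proxy for tokens."""
    return len(text.split())


def reduction_percent(long_text: str, short_text: str) -> int:
    """Percentage by which the short text is smaller, by word count."""
    return round(100 * (1 - word_count(short_text) / word_count(long_text)))


print(word_count(LONG_PROMPT), word_count(SHORT_PROMPT))
print(reduction_percent(LONG_PROMPT, SHORT_PROMPT), "% fewer words")
```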

  4. Keep Sessions Focused; Start Fresh When the Topic Changes

    Each message in an AI Chat session carries the full conversation history as part of the prompt. After a 10-turn conversation, message 11 is sent with all 10 earlier turns attached as context, so every follow-up question in a long session costs more tokens than the one before it, even if the question itself is short.

    When you shift to a new topic — moving from a scheduling question to a budget question, or switching from one project to another — close the current session and start a new one. A fresh session has no accumulated context, so the first question in it costs the minimum possible tokens.

    This is especially important late in the day, when you are close to a Tokens Per Day (TPD) ceiling. Ending one focused session and starting another keeps each session's total manageable and helps you avoid exhausting the daily allowance, which would lock out everyone on the team until the limit resets at midnight UTC.
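    The growth described above can be sketched with simple arithmetic. The model below is deliberately simplified: it assumes every turn adds the same number of tokens (question plus reply) and ignores the project payload, and the 100-token figure is an assumption rather than a measurement.

```python
def message_costs(turns: int, tokens_per_turn: int) -> list[int]:
    """Tokens sent with each message when all earlier turns are resent.

    Message k carries k turns' worth of tokens (its own plus k - 1 of history).
    """
    return [tokens_per_turn * k for k in range(1, turns + 1)]


TOKENS_PER_TURN = 100  # assumption: average question-plus-reply size

one_long_session = sum(message_costs(11, TOKENS_PER_TURN))
two_short_sessions = (
    sum(message_costs(6, TOKENS_PER_TURN)) + sum(message_costs(5, TOKENS_PER_TURN))
)

print(message_costs(11, TOKENS_PER_TURN)[-1])  # cost of message 11 alone
print(one_long_session, two_short_sessions)
```

    Because the total grows with the square of the session length, splitting the same 11 questions across two sessions sends noticeably fewer tokens overall in this model.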

  5. Upgrade Your AI Provider Plan

    If your team runs more than a handful of AI Chat sessions per day, or if your projects are large enough that even a single-project analysis pushes the TPM limit, a paid AI plan is the most durable fix. The free on-demand tier on Groq caps at 200,000 tokens per day and 6,000–30,000 tokens per minute depending on the model. Paid tiers raise those numbers by 5x to 20x, which effectively removes the ceiling for most shop-floor workloads.
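    To judge whether an upgrade is worth it, you can estimate how many analyses fit under a given limit. The sketch below uses the free-tier figures quoted above and a 5x multiplier from the paid range; the 2,000-token request size is an assumption, so measure your own before deciding.

```python
def analyses_per_window(token_limit: int, tokens_per_request: int) -> int:
    """Whole requests that fit inside a token limit for one window."""
    return token_limit // tokens_per_request


TOKENS_PER_REQUEST = 2_000  # assumption: payload plus prompt for one analysis
FREE_TPM = 6_000            # low end of the free-tier per-minute range
FREE_TPD = 200_000          # free-tier daily cap quoted above
PAID_MULTIPLIER = 5         # low end of the 5x to 20x range

free_per_minute = analyses_per_window(FREE_TPM, TOKENS_PER_REQUEST)
free_per_day = analyses_per_window(FREE_TPD, TOKENS_PER_REQUEST)
paid_per_minute = analyses_per_window(FREE_TPM * PAID_MULTIPLIER, TOKENS_PER_REQUEST)
paid_per_day = analyses_per_window(FREE_TPD * PAID_MULTIPLIER, TOKENS_PER_REQUEST)

print(free_per_minute, free_per_day)
print(paid_per_minute, paid_per_day)
```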

    In Standard Time®, upgrading is a configuration change: get a new API key from your provider, open the AI model settings, paste the key, and save. No software update required. The higher limits take effect immediately on the next AI Chat request.

    Alternatively, switch to Ollama, a tool that runs AI models locally on your own hardware. Ollama has no API rate limits because there is no external service to limit you. Standard Time® supports Ollama as a model source; contact Scoutwest for setup guidance.

Start With Steps 1 and 2

Enabling ExcludeTimeLogs and ExcludeCompletedTasks and switching to single-project analysis will cut most teams' token consumption by 60–80% immediately, with almost no change to how anyone uses the chat window. Do those two things first. Add the remaining steps only if you still hit limits after that.

Getting Rate Limit Errors in AI Chat?

Scoutwest can configure Exclusion Flags and AI model settings so your team gets the most out of Standard Time® AI Chat without hitting limits. Start a free trial or contact us to get set up.