
72 contributions to Assistable.ai
Feature Release: Chat History Token Optimization
So, when using your own OpenAI key (and even for us as a business), you notice that with an agent stack (tools, prompt, conversation history, RAG, etc.) token usage starts to stack up quickly, especially if you have a really involved process. We implemented a token optimization model that runs before our chat completions to make sure you get the cost savings, and I'll share some data at the end :)

First, we are now truncating and summarizing conversation history. We noticed large chat completions coming through with 300-400+ message histories. This becomes expensive over time if it's a lead you've been working or following up with for a while, so we are reducing that number and summarizing the history to keep the intelligence the same while the token consumption goes way down (98% decrease on larger runs).

Second, we are truncating large tool call outputs within the window that are not relevant to the current task. Meaning, if there are tool calls with large outputs (like get_availability) that are not relevant to the task at hand, we truncate the response so the agent still sees that the action happened, but the context is shorter. This saw a huge reduction in token consumption as well (96% decrease on larger runs).

Here is the before and after. This is the exact same conversation history, assistant ID, tools, custom fields, knowledge base, etc. - but see the speed and cost difference, and the output was the exact same message:

Differences:
- 35 seconds faster
- 95.95% cheaper

Before:
  "error_type": null,
  "usage_cost": {
    "notes": null,
    "tokens": {
      "output": 211,
      "input_total": 175948,
      "input_cached": 0,
      "input_noncached": 175948
    },
    "total_cost": 0.353584,
    "model_normalized": "gpt-4o",
    "models_encountered": ["gpt-4o"],
    "price_used_per_million": {
      "input": 2.5,
      "cached_input": 1.25,
      "output": 10
    }
  },
  "error_message": null,
  "run_time_seconds": 32.692,
  "returned_an_error": false

After:
  "run_time_seconds": 2.618,
  "returned_an_error": false
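The two optimizations above can be sketched roughly as follows. This is a minimal illustration assuming an OpenAI-style message list; the thresholds, field names, and the `summarize()` helper are assumptions for illustration, not Assistable.ai's actual implementation.

```python
KEEP_RECENT = 20         # tail messages kept verbatim
RECENT_TOOL_KEEP = 5     # tool outputs among the last N messages stay intact
TOOL_OUTPUT_LIMIT = 200  # max chars kept for a stale tool output

def summarize(messages):
    # Stand-in for an LLM summarization call over the older turns.
    return f"[summary of {len(messages)} earlier messages]"

def optimize_history(messages):
    """Summarize old turns, then truncate stale, oversized tool outputs."""
    # 1) Truncate + summarize conversation history beyond the recent window.
    if len(messages) > KEEP_RECENT:
        old, messages = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
        messages = [{"role": "system", "content": summarize(old)}] + messages

    # 2) Truncate large tool outputs that are not part of the current task.
    optimized = []
    for i, msg in enumerate(messages):
        stale = i < len(messages) - RECENT_TOOL_KEEP
        content = msg.get("content", "")
        if msg.get("role") == "tool" and stale and len(content) > TOOL_OUTPUT_LIMIT:
            # Keep a stub so the agent still sees the action happened.
            msg = {**msg, "content": content[:TOOL_OUTPUT_LIMIT] + " ...[truncated]"}
        optimized.append(msg)
    return optimized
```

A 300-message history would shrink to one summary message plus the 20 most recent turns before the completion call, which is where the bulk of the input-token savings comes from.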
0 likes • 2d
@Jorden Williams Volume is high, but 58 Million tokens in input is crazy, I have like 400 conversations daily
1 like • 2d
@Brandon Duncan No, every conversation is like 10-15 messages and that’s it, and I have a limit for 20 messages in each conversation
Double Messages | PATCHED
So, I saw some reports come in about duplicate / double messages coming through - it started as just FB, but it seems to affect Meta and custom conversation providers (WhatsApp, etc.). Triaged it and found there was a timeout-retry happening for long wait times and/or large tool stacks - the server would think the request was taking too long and try again, only to find that the original request was just finishing up. I added a proxy before the chat completion run which allows us to queue the run so this never happens again :) Let me know if you see differently. Happy Monday!
0 likes • 10d
Can you help me with the API consumption? Support hasn't said anything
0 likes • 10d
@Jorden Williams thanks bro
API consumption again...
@Jorden Williams I need help again with the token usage, it consumed 47M tokens today with over 6000 requests, I'm using 4.1 mini
0 likes • 12d
@Jorden Williams @Mike Copeland @Assistable Ai 15 usd in just one day with 4.1 mini??? How am I getting 54M tokens in the input?
0 likes • 12d
Please help
OAuth Issue
We're aware of reports of an OAuth issue linked to Google Cloud. We're investigating and will keep you posted. Thanks!
0 likes • 14d
AI is not answering again
0 likes • 14d
@Mike Copeland
Follow up action fixed
Heard something about the follow up action not working - went to look and GHL wasn't passing a url parameter to us. So, we rebuilt it using metadata - should be good now. I'll look at other actions this may affect as I go through my current dev sprint
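The rebuild described above boils down to falling back to request metadata when the expected `url` parameter is missing. A minimal sketch, assuming hypothetical metadata field names (`location_base_url`, `action_path`) purely for illustration:

```python
def resolve_action_url(params, metadata):
    """Prefer the explicit url parameter; fall back to rebuilding from metadata."""
    url = params.get("url")
    if url:
        return url
    # GHL did not pass the url parameter - reconstruct it from metadata.
    base = metadata.get("location_base_url")
    action = metadata.get("action_path")
    if base and action:
        return base.rstrip("/") + "/" + action.lstrip("/")
    return None  # nothing to rebuild from; caller should surface an error
```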
0 likes • 16d
@Jorden Williams yeah, I’m using the endpoint so when I use it I get the conversation and it generates the message, but I want it to not send the message in some specific cases
0 likes • 14d
@Jorden Williams The endpoint is not working now