Is Running Your Own LLM More Economical than OpenAI?

Atharv Donadkar

Apr 15, 2024

FREQUENT QUERIES FROM STARTUPS

𝐎𝐩𝐞𝐧𝐀𝐈 𝐀𝐏𝐈 𝐏𝐫𝐢𝐜𝐢𝐧𝐠:

Charges are calculated per token. As a rule of thumb, 1,000 tokens is approximately 750 words.
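As a rough sketch, the words-to-tokens rule of thumb can be written as a small helper. The function name `words_to_tokens` is made up for illustration, and the 1,000-tokens-per-750-words ratio is only the approximation used in this article; real counts depend on the tokenizer and the text.

```python
def words_to_tokens(words: int) -> int:
    """Estimate token count from word count (1,000 tokens per 750 words).

    Uses integer arithmetic and rounds up, so a partial token counts as one.
    """
    return (words * 1000 + 749) // 750


if __name__ == "__main__":
    for words in (150, 600, 750):
        print(words, "words is roughly", words_to_tokens(words), "tokens")
```

For exact billing figures, count tokens with the model's actual tokenizer instead of this estimate.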

Cost by model:

1. 𝐆𝐏𝐓-𝟒

𝐈𝐧𝐩𝐮𝐭 𝐂𝐨𝐬𝐭: $0.03 / 1K tokens
𝐎𝐮𝐭𝐩𝐮𝐭 𝐂𝐨𝐬𝐭: $0.06 / 1K tokens

2. 𝐆𝐏𝐓-𝟑.𝟓 𝐓𝐮𝐫𝐛𝐨

𝐈𝐧𝐩𝐮𝐭 𝐂𝐨𝐬𝐭: $0.0010 / 1K tokens
𝐎𝐮𝐭𝐩𝐮𝐭 𝐂𝐨𝐬𝐭: $0.0020 / 1K tokens

𝐅𝐨𝐫 𝐒𝐡𝐨𝐫𝐭 𝐜𝐨𝐧𝐭𝐞𝐧𝐭 𝐚𝐩𝐩 𝐒𝐨𝐥𝐮𝐭𝐢𝐨𝐧𝐬:

Suppose the app writes marketing copy for users, taking around 100–150 words as input and producing about 600 words of output.

That means one email uses roughly 1,000 tokens per request: about 200 input tokens plus 800 output tokens.

1. GPT-4 Model

Let's break down the cost calculation based on the given information for GPT-4:

Input Cost:

Input tokens = (150 words) * (1000 tokens / 750 words) = 200 tokens
Input Cost = (200 tokens) * ($0.03 / 1K tokens) = $0.006

Output Cost:

Output tokens = (600 words) * (1000 tokens / 750 words) = 800 tokens
Output Cost = (800 tokens) * ($0.06 / 1K tokens) = $0.048

Total Cost:

Total Cost per request = Input Cost + Output Cost = $0.006 + $0.048 = $0.054
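The GPT-4 arithmetic above can be reproduced with a few lines of Python. This is only a sketch: `request_cost` and its parameter names are illustrative, and the prices are the per-1K-token figures quoted in this article, which may not match current pricing.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Cost in USD of one request, given token counts and per-1K-token prices."""
    input_cost = input_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost


# GPT-4 figures from this article: 200 input tokens, 800 output tokens.
gpt4_cost = request_cost(200, 800, input_price_per_1k=0.03, output_price_per_1k=0.06)

if __name__ == "__main__":
    print(f"GPT-4 cost per request: ${gpt4_cost:.3f}")
```

Swapping in another model's prices, or your own measured token counts, gives the per-request cost for that setup.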

If we receive 1,000 user requests per day to write copywriting emails, the monthly cost would be approximately:

Monthly Total Cost = ($0.054 per request) x (1,000 requests per day) x (30 days)
𝐌𝐨𝐧𝐭𝐡 𝐓𝐨𝐭𝐚𝐥 𝐂𝐨𝐬𝐭 = $1,620

Therefore, with the GPT-4 model, serving 1,000 requests per day for 30 days would cost $1,620 in total.

2. GPT-3.5 Turbo

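Using the same 200 input tokens and 800 output tokens:

Input Cost = (200 tokens) * ($0.0010 / 1K tokens) = $0.0002
Output Cost = (800 tokens) * ($0.0020 / 1K tokens) = $0.0016
Total Cost per request = $0.0002 + $0.0016 = $0.0018

Monthly Total Cost = ($0.0018 per request) x (1,000 requests per day) x (30 days)
𝐌𝐨𝐧𝐭𝐡 𝐓𝐨𝐭𝐚𝐥 𝐂𝐨𝐬𝐭 = $54

Therefore, with the GPT-3.5 Turbo model, serving 1,000 requests per day for 30 days would cost $54 in total.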
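To check both monthly figures side by side, here is a minimal sketch. The helper name `monthly_cost` is hypothetical, and the per-request costs ($0.054 for GPT-4, $0.0018 for GPT-3.5 Turbo) are the article's estimates for a 200-token input and an 800-token output.

```python
def monthly_cost(cost_per_request: float, requests_per_day: int = 1000,
                 days: int = 30) -> float:
    """Monthly API spend in USD for a steady daily request volume."""
    return cost_per_request * requests_per_day * days


# Per-request estimates taken from the calculations in this article.
COST_PER_REQUEST = {
    "GPT-4": 0.054,
    "GPT-3.5 Turbo": 0.0018,
}

if __name__ == "__main__":
    for model, cost in COST_PER_REQUEST.items():
        print(f"{model}: ${monthly_cost(cost):,.2f} per month")
```

Changing `requests_per_day` shows how the API bill grows linearly with traffic.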

Host Your Own LLM Pricing:

Hosting on AWS: The Llama-2 7B Scenario

For those considering AWS hosting of the 7-billion-parameter Llama-2 7B model, the minimum server requirement is an EC2 g5.2xlarge instance, priced at around $850 per month. Connecting the model to an API via AWS API Gateway and AWS Lambda adds extra cost, but at a modest load of 1,000 requests per day this stays below $100 per month.
Overall, the estimated monthly cost for AWS hosting, including server and API usage, is around $1,000.

Scaling with AWS: Handling Increased Demand

An advantage of the AWS setup is that you pay for the instance rather than for each request. Should daily requests rise to 2,000, a per-token API bill would double (for GPT-4 in the example above, from $1,620 to $3,240), whereas the AWS bill stays at roughly $1,000 per month. As long as the single instance can absorb the extra load, performance remains stable and no additional scaling effort is needed.
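One way to compare the two pricing models is to estimate the daily request volume at which a fixed hosting bill equals the per-request API bill. The sketch below assumes the article's figures (about $1,000 per month on AWS, $0.054 per GPT-4 request, $0.0018 per GPT-3.5 Turbo request) and further assumes that a single instance can serve the whole load. The function name is illustrative, and the comparison ignores differences in output quality between models.

```python
def break_even_requests_per_day(fixed_monthly_cost: float,
                                api_cost_per_request: float,
                                days: int = 30) -> float:
    """Daily request volume at which the API bill equals the fixed hosting bill."""
    return fixed_monthly_cost / (api_cost_per_request * days)


AWS_MONTHLY_COST = 1000.0  # article's estimate for Llama-2 7B on AWS

if __name__ == "__main__":
    for model, cost in (("GPT-4", 0.054), ("GPT-3.5 Turbo", 0.0018)):
        volume = break_even_requests_per_day(AWS_MONTHLY_COST, cost)
        print(f"{model}: break-even at about {volume:,.0f} requests per day")
```

Below the break-even volume the API is cheaper; above it, the fixed-cost server wins, provided it can still keep up with the traffic.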

Upgrade Considerations: Moving to Llama-2 13B

Although we started with Llama-2 7B, concerns about output quality led us to try Llama-2 13B, which gave notably better results. However, Llama-2 13B needs a more powerful server, which drives monthly expenses up to approximately $5,000, more than $3,000 above the $1,620 GPT-4 API estimate at 1,000 requests per day.

Conclusion:

Experiment with different models to identify optimal results.
Assess expected input and output text volumes for each model.
If text volume is consistent and low, and data security isn’t paramount, the OpenAI API may be preferable.
Otherwise, run a cost analysis for AWS hosting against your expected request volume to tailor the solution to your specific needs.