
Different Ways to Implement AI in Software

AI Jul 20, 2024

There's hardly a single piece of software available today that hasn't made a big deal out of adding AI functionality. On top of adding AI features to existing software, startups are born daily based on a new idea of how to implement AI in a way that tries to improve our lives. Or at least claims to do so well enough to get some funding.

Some AI implementations are genuinely useful, some are pure marketing, and some are somewhere in the middle still trying to find their place. As companies feel forced to add AI functionality just to seem relevant, those AI features may not bring much value. Understanding how a company chose to implement AI can help you ask the right questions about how your data will be used, what kind of quality you can expect, and how well differentiated their solution might be.

The various ways to implement AI really come down to 5 broad categories:

1. Operating System AI

Apple announced their AI features last month, and gave insight into how those features can be used by developers through official APIs. Apple was the last holdout for AI in the operating system, and now macOS, Windows, iOS, and Android all have different AI-related functionality.

Just as app developers can use your phone camera right in their mobile app, they can now use the operating system's AI functionality right in their app too. This is probably the least mature category of software AI, and there will certainly be heavy evolution here over the next few years. In theory your device can offer very powerful AI functionality to apps because your device knows everything about you; in practice this may prove hard to do in a way that maintains your privacy.

Imagine being able to build an app where you can make a cartoon of a boy that resembles family members right on your device, without reaching out to another service, because your device has all his photos saved.

Or imagine an app that can scan your text messages to see when tension is getting a little high with your partner and automatically schedule you a date night and send a calendar invite over to your partner.

Sure, these are contrived examples and probably have too many privacy concerns to ever be a reality, but as this functionality matures we will certainly start to see apps that take advantage of this approach.

2. GPT Wrappers

GPT wrappers are the simplest apps on this list. That's not to say they aren't useful, but their simplicity makes them easy to replicate. These apps simply take a request, send it to an external API (such as OpenAI's), and present the results back to you.

For example: There are plenty of "song lyrics writer" apps that ask you to describe the song you want, and then give you that generated song. The entirety of their magic is handled by OpenAI, but the app can still build an incredible user experience around that magic - or implement some new twist that makes the app compelling beyond the generated lyrics themselves.
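The whole wrapper pattern fits in a few lines. Here's a minimal sketch in Python against OpenAI's chat completions endpoint; the model name, system prompt, and function names are illustrative, not taken from any particular app:

```python
# Minimal sketch of a "GPT wrapper" app: all of the generation is
# delegated to an external API. Names here are illustrative.
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"

def build_request(song_description: str) -> dict:
    """Turn the user's description into a chat-completion payload."""
    return {
        "model": "gpt-4o-mini",  # any chat model would do
        "messages": [
            {"role": "system", "content": "You write original song lyrics."},
            {"role": "user", "content": song_description},
        ],
    }

def generate_lyrics(song_description: str, api_key: str) -> str:
    """Send the request to the external API and return the lyrics."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(song_description)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Everything the user sees comes straight back from the API; the app's own code is little more than plumbing and presentation.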

Plenty of useful software implements AI-related features with a simple approach like this. When that's what the feature calls for, the simplicity is great.

3. Retrieval Augmented Generation (RAG)

Many software companies find that the sweet spot for their AI applications or features lies in the use of various RAG techniques. But what does this mean?

In a nutshell, Retrieval Augmented Generation refers to the approach of feeding extra context to an AI model such as ChatGPT to supplement the prompt. This can help the AI model tailor its output in a way that's more useful.

For a very simplistic example, let's take our "song lyrics writer" app from before, and add some RAG techniques. When someone submits a prompt such as "Write a song about how much I love hanging out with my boyfriend", the app would look at your music library and see what your top played songs are.

For one user the app turns that prompt into "Write a song about how much I love hanging out with my boyfriend, after the style of Billie Eilish, Taylor Swift, and Sabrina Carpenter" before sending it into the AI model. For another user the prompt might be "Write a song about how much I love hanging out with my boyfriend, after the style of Metallica, Green Day, and the Foo Fighters". The generated lyrics would be different, specifically tailored to that user even though they both typed the same prompt into the app.
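That retrieval step can be sketched with a simple lookup. The table of most-played artists below is a hypothetical stand-in; the real app would query the device's music library:

```python
# Toy "retrieval" source: each user's most-played artists. In a real
# app this would come from the user's music library, not a dict.
TOP_ARTISTS = {
    "user_a": ["Billie Eilish", "Taylor Swift", "Sabrina Carpenter"],
    "user_b": ["Metallica", "Green Day", "the Foo Fighters"],
}

def augment_prompt(user_id: str, prompt: str) -> str:
    """Retrieve the user's listening history and splice it into the
    prompt before it is sent to the AI model."""
    artists = TOP_ARTISTS.get(user_id, [])
    if not artists:
        return prompt  # nothing retrieved: fall back to the raw prompt
    return f"{prompt}, after the style of {', '.join(artists)}"
```

The same typed prompt produces two different augmented prompts, which is exactly the per-user tailoring described above.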

Apps that take a more sophisticated approach to RAG might query customer history, search for similar scenarios in a company database, and so on, using the results to supplement the prompt. A whole class of data stores (vector databases) has become popular because they make this kind of retrieval easier when working with LLMs.
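At their core, vector databases rank stored text by embedding similarity. Here's a toy sketch of that retrieval step, with made-up three-dimensional vectors standing in for real model-generated embeddings (which typically have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Tiny stand-in for a vector database: (text, embedding) pairs.
# These embeddings are invented for illustration only.
DOCUMENTS = [
    ("Refund policy: 30 days with receipt", [0.9, 0.1, 0.0]),
    ("Shipping times: 3-5 business days", [0.1, 0.9, 0.1]),
    ("Warranty covers manufacturing defects", [0.2, 0.2, 0.9]),
]

def retrieve(query_embedding, k=1):
    """Return the k stored texts most similar to the query embedding,
    ready to be pasted into the prompt as extra context."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: cosine_similarity(query_embedding, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

A real system embeds the user's question with the same model used to embed the documents, retrieves the top matches, and prepends them to the prompt.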

4. Fine-Tuned Open Models

There are quite a few open source AI models out there. Industry titans such as Meta have made models available for anybody to download and use.

The advantage those models have is that they can be "fine-tuned". If we have an open model that has been trained on millions or billions of customer service chats, the model will be able to handle the basics of how to interact with a customer. But, the model probably won't know the details of your products or services, or know the details about how you like to interact with your customers.

Training a large model costs millions of dollars, but fine-tuning one can be done for a tiny fraction of that amount. By taking an open model as a starting point and then feeding it a few years of your own chat history logs, you can end up with a fine-tuned model that understands exactly how to interact with your customers.
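The training itself happens inside a framework, but much of the work is shaping your own data. Here's a sketch of converting a hypothetical support-chat log into the chat-style JSONL format (one JSON object per line) that many fine-tuning tools accept; the log contents and company name are invented:

```python
import json

# Hypothetical raw support-chat log; in practice this would be years
# of your own chat history.
chat_log = [
    {"customer": "Where is my order #1234?",
     "agent": "It shipped yesterday - tracking is in your email."},
    {"customer": "Can I change my plan?",
     "agent": "Absolutely, I can switch you today with no fee."},
]

def to_training_examples(log):
    """Convert raw customer/agent pairs into chat-format training
    examples, one per conversation turn."""
    for turn in log:
        yield {
            "messages": [
                {"role": "system",
                 "content": "You are a support agent for Acme Co."},
                {"role": "user", "content": turn["customer"]},
                {"role": "assistant", "content": turn["agent"]},
            ]
        }

# Write one JSON object per line - the JSONL layout most tools expect.
with open("train.jsonl", "w") as f:
    for example in to_training_examples(chat_log):
        f.write(json.dumps(example) + "\n")
```

The resulting file is what you'd hand to the fine-tuning step: the assistant turns teach the model your tone and policies.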

For many use cases fine-tuning has fallen out of favor compared to RAG, but when you have a large amount of data and the infrastructure to run your own models, this approach can still give you an edge.

5. Custom Models

Building an AI model from scratch costs hundreds of millions of dollars, and that's on top of paying highly specialized engineers and scientists millions of dollars in salary. You can count on your hands the number of companies that have the resources to actually do this.

There are other techniques that fall more into what is considered Machine Learning (ML), and those do not require the major up-front investment. Most of the buzz around AI in the last year or two has been around generative AI where a new creation (text, image, etc.) is synthesized by the model, but other types of models have been just as useful. Running financial transactions against a model to rate them for the likelihood of fraud is an example of this type of model, and these techniques have been in use for many years longer than generative AI.
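A fraud-rating model of that kind can be as simple as logistic regression. The sketch below uses hand-set weights and invented feature names purely for illustration; a real system would learn the weights from labeled historical transactions:

```python
import math

# Hand-set weights for a toy logistic fraud scorer. In a real ML
# system these would be learned from labeled transaction history;
# the features and values here are invented for illustration.
WEIGHTS = {"amount_usd": 0.002, "is_foreign": 1.5, "odd_hour": 0.8}
BIAS = -4.0

def fraud_score(txn: dict) -> float:
    """Return a probability-like fraud score between 0 and 1
    (the logistic function applied to a weighted feature sum)."""
    z = BIAS + sum(WEIGHTS[k] * txn.get(k, 0.0) for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

small_local = {"amount_usd": 25, "is_foreign": 0, "odd_hour": 0}
big_foreign = {"amount_usd": 2000, "is_foreign": 1, "odd_hour": 1}
```

Nothing is generated here: the model just scores each transaction, which is why this style of ML predates the current generative wave by many years.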

Unless the company you are getting your software from is worth a few billion dollars, it's highly unlikely that they are building their own models for "AI-powered chat" or "marketing content suggestion" or other such features.

Takeaways

So, does it really matter which approach a company uses to implement AI in their software? The results matter, and what you get out of the feature is much more important than how it is done.

Once you understand that 99% of AI implementations are either simple GPT wrappers or use some type of straightforward RAG approach to customize the context available to a standard AI model, hopefully the magic is a little bit less, well, magical. What OpenAI (and Google, and Meta, and others) have built is magical. Most AI features out there are solid implementations built on their magic, nothing more.

To be clear, this doesn't take away from a good implementation. All software is built on the shoulders of giants: open source libraries, architecture lessons from 40 years of experience, and things we just take for granted like REST and all the layers underneath it (HTTP, TCP, IP, etc.).

Most importantly, don't just buy hype. When AI functionality is done well, it's an amazing addition. Make sure that you evaluate any AI functionality on the actual value it brings you, not any theoretical value that is being marketed to you.
