By Manuel Aparicio

The OpenAI Release of GPT4

22 Mar 2023

An AI model scoring in the top 10% of bar exam takers sounds like a sci-fi fantasy, doesn't it? Yet it's happening right now! The next-generation large language model, GPT-4, can analyze multiple pages of complex data and deliver a recap in seconds. According to its technical report, the new model landed at around the 90th percentile of all test takers on a simulated bar exam. That's a huge improvement over previous models like GPT-3.5, which scored in the bottom 10%! The report also shows how the OpenAI GPT-4 model can analyze images and spot when something doesn't make sense. The example used was a VGA connector charging a phone. That's as exciting and frightening as Artificial Intelligence (AI) gets in the movies!

You must be thinking that something like this shouldn't be possible. Not this soon, at least! If you have been following ChatGPT news since its release in November 2022, you know how fast it changed the world. Among its triumphs, it passed gold-standard US medical exams and a Google coding interview for a Level 3 engineer, a role paying $183K a year. It was so proficient that some hospitals began using it to rewrite complex reports!

These are only a few things the previous model, GPT-3.5, accomplished in a few months (or even less). And now OpenAI has taken that power to a new level of greatness. We told you it was as scary as AI gets in sci-fi movies! So, what is GPT-4 all about?

Source: GPT-4 Technical Report, page 5.

What is GPT-4?

GPT-4 has been available since March 14th and has already blown everyone away. Just like its previous versions, GPT-4 is a generative AI tool, which means it can create new content from user prompts. As you may know, GPT is an acronym for Generative Pre-trained Transformer, a technology that processes and learns from massive amounts of data.

Think of GPT-4 as a much-refined version of GPT-3.5 that significantly improves on its predecessor's performance. GPT models are famous for answering questions, summarizing text, and writing code. People also use them to write blog posts, social media posts, songs, and even poems. The first version, GPT-1, had 117 million parameters, and the family scaled up to 175 billion with GPT-3, the base of the 3.5 series. While the number of parameters in GPT-4 is still unknown, we can only expect it to be astronomical, as it is the most advanced of the GPT language models.

Source: GPT-4 Technical Report, page 7.

Likewise, it has more advanced capabilities in arithmetic, math, and science. And, as if all that wasn't enough, it's now much better at writing Python code, so there's no doubt it will improve software development to a large extent. Apart from being much more powerful, it has a few new functionalities. As mentioned, it can now process images to generate text output, which is why OpenAI describes it as a multimodal model.
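
To make the "generate text and code from a prompt" idea more concrete, here is a minimal sketch of how one could call GPT-4 programmatically. It assumes the openai Python package (the 0.27-era ChatCompletion interface) and an API key exported as OPENAI_API_KEY; the prompt and parameters are purely illustrative.

```python
# A minimal sketch, assuming the "openai" Python package (0.27-era API)
# and an API key exported as OPENAI_API_KEY. Prompt and settings are
# illustrative only.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.ChatCompletion.create(
    model="gpt-4",  # the model name at launch; access may require a waitlist
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a short Python function that reverses a string."},
    ],
    temperature=0.2,  # lower values give more deterministic answers
    max_tokens=300,
)

print(response["choices"][0]["message"]["content"])
```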

What is New in GPT-4?

When you compare it to GPT-3.5 in casual use, they look almost identical. However, GPT-4 has a much larger context window, so it understands prompts better and gives more targeted results. Let's review the most popular features that got everyone talking!

GPT-4 and Visual Inputs

Yes, GPT-4 accepts images and other visual inputs as prompts. This feature must be the most incredible tech enhancement we've seen in decades! Sadly, it's not yet publicly available, as OpenAI mentioned during its March 14, 2023 Developer Livestream. At the event, Greg Brockman highlighted the challenges of making this feature fully available.

Yet, it goes beyond simply accepting image inputs! We already mentioned that GPT-4 could understand the irony of charging a phone with a VGA connector. It also explained a meme of chicken nuggets arranged to resemble a map of the world. It seems like GPT-4 is getting close to understanding sarcasm. Bazinga! Look at the response it gave to the prompt, "Can you explain this meme?"

The model gave a detailed answer explaining the reasoning behind the meme step by step. That is an enormous advancement, as it is the first GPT model to accept image inputs! GPT-4 also demonstrated its capability to read text from a photo and do math with it. Not only did it give an accurate answer, but it also explained its step-by-step reasoning. The prompt was, "What is the sum of average daily meat consumption for Georgia and Western Asia? Provide a step-by-step reasoning before providing your answer." Again, it gave an excellent response, one that required both image and text processing.

There is a wide range of examples in the technical report on OpenAI's website. One shows how it can recognize the absurdity of a man ironing clothes while strapped to the roof of a moving taxi. In another, it identifies the components of an image: a picture of ingredients like eggs, milk, butter, and flour. With that input, GPT-4 gave the user great options like pancakes, waffles, and muffins. Awesome, right?

The most mind-blowing example was Greg Brockman using GPT-4 to create a Discord bot in minutes. He took a screenshot of his Discord chat and gave it to the bot. Since the bot was connected to GPT-4's API, it processed the screenshot and described it thoroughly. The description included the number of users, tags, and locations.

Source: GPT-4 Technical Report, page 33, table 14.
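
For readers curious about how a bot like the one in the demo could be wired up, below is a minimal sketch assuming the discord.py and openai packages and two environment variables, OPENAI_API_KEY and DISCORD_BOT_TOKEN. It is not Brockman's demo code, and the screenshot-reading part is omitted because image input was not yet publicly available.

```python
# A minimal GPT-4-backed Discord bot sketch (not the livestream demo code).
# Assumes the discord.py and openai packages; image input is omitted since
# it was not publicly available at the time of writing.
import os
import discord
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

intents = discord.Intents.default()
intents.message_content = True  # required to read user messages
client = discord.Client(intents=intents)

@client.event
async def on_message(message):
    if message.author == client.user:
        return  # ignore the bot's own messages
    completion = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a concise Discord assistant."},
            {"role": "user", "content": message.content},
        ],
    )
    await message.channel.send(completion["choices"][0]["message"]["content"])

client.run(os.environ["DISCORD_BOT_TOKEN"])
```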

GPT-4 and Creativity

OpenAI has promised that GPT-4 is a much more creative model than GPT-3.5. During the stream, Greg Brockman asked GPT-3.5 to summarize a few paragraphs into a single sentence using only words that begin with the letter "G." GPT-3.5's answer was quite poor. Brockman then asked GPT-4 to do the same thing with the letter "Q." The model's only mistake was using the word "AI." Its response was: "GPT-4 quintessentially quickens quality quantifications, quelling questionable quandaries." If that isn't jaw-dropping, we don't know what is!

GPT-4 and Accuracy

OpenAI claims the new model is 82% less likely to respond to requests for disallowed content. The team fine-tuned the model so it receives feedback and handles nuanced instructions better than previous GPT models, using Reinforcement Learning from Human Feedback (RLHF). Engineers also used safety-relevant RLHF training prompts to further encourage appropriate behavior.

Let's take that word puzzle from before as an example. When told the word "AI" wasn't allowed, GPT-4 rewrote its answer and nailed it, showing that it can improve its results when you tell it how. Likewise, when the presenter built the Discord bot with GPT-4, the generated code had some bugs: it needed updating because GPT-4's knowledge only covers data up to 2021. The presenter copied and pasted the code into GPT-4's context window, sent it, and GPT-4 fixed it. In other words, you can improve results by giving GPT-4 feedback. Although results have improved significantly, it can still make mistakes. Moreover, it still "hallucinates." That's why OpenAI encourages human review and recommends avoiding high-stakes use cases. There's still a need for double-checking and proofreading. Nonetheless, the chart below shows how GPT-4 improved its safety responses and factual accuracy.

Source: GPT-4 Technical Report, page 10, figure 6.
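
The "paste it back in and let GPT-4 fix it" workflow can be expressed as a simple conversation loop. The sketch below assumes the openai Python package; the task and the error message are placeholders, not the code from the livestream.

```python
# A minimal sketch of the feedback loop described above, assuming the
# "openai" package. The task and error message are placeholders, not the
# actual code from the livestream.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

messages = [
    {"role": "user", "content": "Write a Python function that parses an ISO 8601 date string."},
]

first = openai.ChatCompletion.create(model="gpt-4", messages=messages)
draft = first["choices"][0]["message"]["content"]

# Keep the model's first attempt in the conversation, then feed the failure back in.
messages.append({"role": "assistant", "content": draft})
messages.append({
    "role": "user",
    "content": (
        "Running that code raised: TypeError: fromisoformat: argument must be str. "
        "Please fix it and return only the corrected function."
    ),
})

fixed = openai.ChatCompletion.create(model="gpt-4", messages=messages)
print(fixed["choices"][0]["message"]["content"])
```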

GPT-4 Drawbacks

GPT-4 Information

GPT-4 still doesn't have information about events after September 2021. Many people were disappointed, as this doesn't represent an upgrade: its answers are just as outdated as GPT-3.5's. As a result, it will be of little help with problems that arose after that date. Yet, as seen above, you can feed it updated data and let it work from there. Here is where the larger context window plays a key role. GPT-4 can now accept around 25,000 words of input, and the variant Brockman used in his presentation handles a 32,000-token context. We expect that larger window to become widely available in the foreseeable future.
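
Because the context window is measured in tokens rather than words or characters, it can help to count tokens before sending a long document. Here is a minimal sketch assuming the tiktoken package; the 8,192 and 32,768 limits are the figures OpenAI published for the two GPT-4 variants at launch.

```python
# A minimal sketch for checking whether text fits in GPT-4's context window,
# assuming the "tiktoken" package. 8,192 and 32,768 are the published token
# limits for the two GPT-4 variants at launch.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")

def fits_in_context(text: str, limit: int = 8192) -> bool:
    """Return True if the text alone stays within the given token limit."""
    return len(encoding.encode(text)) <= limit

long_document = "word " * 10000  # stand-in for a long report
print(fits_in_context(long_document))                # standard 8K variant
print(fits_in_context(long_document, limit=32768))   # larger 32K variant
```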

GPT-4 Hallucinations

That term may sound weird if you are unfamiliar with GPT-3.5's limitations. It means that GPT-4 sometimes makes things up, or "hallucinates." It can also reason confidently from incorrect premises. Hence, you must fact-check the information it gives you, and that's even more critical when handling matters that require reliability.

GPT-4 Availability

For now, people must pay a monthly subscription to access GPT-4. Its most impressive feature, accepting visual input, is not yet available to the public. On top of that, the technical report lacks critical information: the number of parameters used to train the model was not disclosed, which even led people to criticize OpenAI for a lack of transparency.

"How can I try GPT-4?" We're sure you're asking yourself that question. GPT-4 has been available since March 14th, and you can use it through ChatGPT Plus, the premium subscription, which costs around $20 a month. Additionally, Microsoft confirmed that the new Bing is running on GPT-4.

For now, it's hard to tell if it will ever be free to use through ChatGPT. It's no secret that OpenAI needs to cover the expenses of millions of people using it somehow. It's worth noting that running ChatGPT on GPT-3.5 reportedly cost OpenAI around $3 million a month, a number that helps explain the need to start charging for the service.

Final Thoughts

OpenAI did outstanding work improving on the capabilities of GPT-3.5. Although the new model is much closer to perfection, it can still make mistakes, so overreliance is a potential issue. As adoption increases, it gets easier to forget that it might give you a wrong answer.

Despite its unresolved issues, it's clear that GPT-4 is a big step up in capability. Further, many companies have already integrated it into their products so users can gain hands-on experience with the technology. Some of them are Stripe, Duolingo, Be My Eyes, Morgan Stanley, and even the government of Iceland. These technological enhancements also come with potential safety challenges, including harmful content, privacy issues, economic impact, and cybersecurity risks. Even OpenAI's CEO, Sam Altman, has admitted he fears powerful AI models! Undoubtedly, GPT-4, deep learning, Large Language Models (LLMs), and large multimodal models are disrupting the present and will revolutionize the future.

We can only wait to see how this powerful tool evolves and hope for the best!