The moment we've been waiting for is almost here. GPT-4 is expected to launch officially next week, and this time it's not limited to text outputs. It's bigger than what we had imagined. GPT-4 could be the boost AI has been looking for since its early experimental phases.
OpenAI's GPT-4 is about to give Google tough competition through its multimodal nature and accuracy that will be difficult to beat. But how do we know it's going to be multimodal? This isn't something we made up; it comes from Andreas Braun, CTO of Microsoft Germany, himself. He has hinted that the official launch can be expected around March 15. The news has spread around the globe, and people are now wondering what this new multimodal AI software will bring with it.
Based on this news, speculation suggests that GPT-4's multimodal nature will allow users to input images, and perhaps even videos, to generate a wider range of outputs. If that proves true, we're about to witness something entirely new. The previous versions of GPT (3.0 and 3.5) allowed only textual inputs and outputs. As for GPT-4, some German reports are claiming support for up to four modalities.
This means GPT-4 might be able to support four different types of input: text, video, audio, and images. That opens a gateway to a vast range of possibilities and outcomes. Alongside this, Microsoft is also working on metrics aimed at improving the reliability of its AI software. This should help the tech giant improve the software's accuracy, earn users' confidence, and attract a larger base of loyal users.