In response to numerous requests from our followers and readers, we are pleased to announce the launch of a new segment on Digital Pakistan, entitled “Ask the Expert”. This segment will feature renowned experts from various fields, who will discuss and analyze trending topics and news. The objective of this series is to simplify complex research, theories, and discoveries, making them more accessible to our readers.
We are excited to begin the series with an in-depth discussion of the rapidly evolving topic of artificial intelligence (AI). ChatGPT has changed the way people interact with AI, and its emergence has raised many questions among the general public. Our experts will provide insight and clear up misconceptions about this much-discussed topic.
What is generative AI?
Generative AI is a type of artificial intelligence that can create various forms of content, including text, imagery, audio, and synthetic data. Recent advances in user interfaces have made the technology easier to use, so that people can now produce high-quality output within seconds. Although generative AI has been around since the 1960s, it was not until 2014, with the introduction of generative adversarial networks, that it could create convincingly authentic images, videos, and audio of real people. While this technology opens up new opportunities such as better movie dubbing and rich educational content, it also raises concerns about deepfakes and cybersecurity attacks. Recent advances in transformers and breakthrough language models have played a critical role in bringing generative AI to the forefront, but we are still in the early stages of its development. Early implementations have had problems with accuracy and bias, and have produced hallucinations and odd responses. Nevertheless, progress so far indicates that generative AI has the potential to fundamentally change businesses.
How does it work?
A generative AI model works by using machine learning algorithms, deep neural networks, and large language models to generate new content based on a given input or prompt. At its core, generative AI starts with a prompt, which can be text, an image, a video, musical notes, or any other input that the AI system can process. The system then processes the input and generates new content in response. To do this, the algorithms use statistical models to analyze vast amounts of data, learn patterns, and identify relationships between different inputs and outputs. Deep neural networks play a crucial role in generative AI because they allow the system to learn from the data and generate new content based on the learned patterns.
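To make the idea of learning patterns and then generating from a prompt more concrete, here is a minimal sketch in Python. It is a toy illustration and not how ChatGPT or any real system is built: instead of a deep neural network it simply records which word follows which in a tiny made-up training text, then extends a prompt by picking from those recorded followers. The function names, the sample text, and the seed are all assumptions made for this example.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Record, for every word, the words that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(model, prompt, max_words=10, seed=0):
    """Extend the prompt one word at a time by sampling a learned follower."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    output = prompt.split()
    for _ in range(max_words):
        candidates = model.get(output[-1])
        if not candidates:  # the last word was never followed by anything
            break
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical training text; real models learn from billions of pages.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
print(generate(model, "the dog", max_words=6))
```

Every word the sketch produces is one that followed the previous word somewhere in the training text, which is the simplest possible version of "learn patterns, then generate". Real generative models replace the lookup table with a neural network that considers far more context than a single preceding word.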
One of the recent breakthroughs in generative AI has been the introduction of transformers and the attention-based large language models built on them. Transformers made it possible for researchers to train ever-larger models without having to label all the data in advance. New models could thus be trained on billions of pages of text, resulting in answers with more depth. Transformers also introduced a mechanism called attention, which enables models to track the connections between words across pages, chapters, and books, rather than just within individual sentences. And not just words: transformers can use the same ability to track connections to analyze code, proteins, chemicals, and DNA.
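The attention idea can be sketched in a few lines of Python. This is a simplified illustration of scaled dot-product attention, under the assumption that each word is already represented by a short list of numbers. The three two-number vectors below are invented for the example, and real transformers additionally learn separate query, key, and value projections, which are omitted here.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product: larger when two vectors point in similar directions."""
    return sum(x * y for x, y in zip(a, b))


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # How strongly this word should attend to every word in the sequence.
        weights = softmax([dot(q, k) / scale for k in keys])
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical vectors standing in for three words; words 1 and 3 are alike.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = attention(vectors, vectors, vectors)
print(weights[0])
```

In this sketch the first word gives equal weight to itself and to the third word, which has the same vector, and less weight to the second. That is the sense in which attention tracks connections: each word's new representation is a blend of the words most related to it, however far apart they sit in the text.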
Types of Generative AI (Models)
Generative AI can be applied to many types of data, including text, video, images, audio, and more. Some examples of generative models for different types of data are given below:
Text : Generative AI language tools such as ChatGPT, Cohere, and Copy.ai are built on complex neural network architectures that rely on vast amounts of training data. These models generate new text based on a given prompt or input, with a level of sophistication and realism that can often rival human-written text. The models are trained on large datasets of human-written text, which provide them with the knowledge and vocabulary necessary to generate high-quality responses. Many of these models are fine-tuned for specific tasks, such as chatbots or content creation. Fine-tuning involves further training a model on a smaller dataset tailored to the specific task, allowing it to learn the nuances and requirements of that task more effectively.
Image : Generative Adversarial Networks (GANs) are a specific type of neural network architecture used to generate new images that are similar to real images. A GAN pairs two networks: a generator that produces images and a discriminator that tries to tell them apart from real ones, so that each improves by competing with the other. GANs have been used to create impressive new images that look like they were created by humans. Projects like Midjourney and OpenAI’s DALL-E are among the leading tools in visual content generation, although DALL-E is built on other generative architectures rather than GANs. These models have numerous potential uses, such as creating realistic content for training purposes, generating new artwork or designs, or even creating realistic 3D models of objects.
Video : GANs can be trained on large datasets of videos to predict the next frames of a video sequence based on the preceding frames. One example of a generative video model is Runway, a tool that allows users to create video models and explore different generative techniques. Another example is Synthesia, which uses generative models to create photorealistic video content without the need for expensive video production equipment. These models have the potential to revolutionize the way video content is created and shared, opening up new possibilities for creative expression and communication.
Audio : Generative AI models have made significant strides in recent years in the field of audio synthesis, enabling the creation of original music and synthetic speech that can be difficult to distinguish from human-produced audio. For example, Murf AI is a speech synthesis application that uses generative models to produce realistic-sounding speech. It uses deep learning algorithms to analyze large amounts of speech data and then generates new speech based on that analysis. The resulting output can be used in a wide range of applications, from virtual assistants to audiobooks. Generative AI models can also be used for music generation. These models learn from large datasets of existing music, analyzing elements like melody, rhythm, and harmony, and then generate new music that fits within those parameters. This can be particularly useful in areas like film scoring, where composers can use generative AI to produce music that fits the tone and style of a particular scene.
Art : Generative AI has opened up exciting possibilities for digital art creation. Style transfer, which involves combining the style of one image with the content of another, has become a popular technique for generating new digital art. OpenArt AI is an example of a generative AI platform that uses GANs to generate new art pieces. The platform allows users to choose a style or theme and then generates unique artwork based on that input. Similarly, NightCafe Studio uses a combination of GANs and style transfer to create digital art pieces that are both unique and visually stunning.
Code : Generative AI can be used to automate parts of the coding process by generating code based on specific criteria or inputs. Code generation can be useful in a variety of applications, such as generating code for specific programming languages, automating repetitive coding tasks, and even generating code that can optimize itself based on certain criteria. Replit, for example, is a platform that provides an interactive coding environment where users can generate and share code snippets. Tabnine, on the other hand, is an AI-powered code completion tool that suggests code snippets based on context, saving programmers time and increasing their productivity. In addition to generating code, AI-assisted tools can help detect and fix bugs. CodeSonar, for instance, is a static analysis tool that detects errors and vulnerabilities in code, helping it run smoothly and securely.
About the writer: Dr. Usman Zia is an Assistant Professor at the School of Interdisciplinary Engineering and Sciences, National University of Sciences and Technology (NUST), Pakistan. His research interests are Natural Language Processing and Machine Learning. He has authored numerous publications on language generation and machine learning. As an AI enthusiast, he is actively involved in a number of projects related to generative AI and NLP.