Toon then delves into the Azure OpenAI Service, a cloud-based AI platform that gives developers access to large language models for building and deploying applications. He explains that the service can process large amounts of data and generate human-like text, making it useful for applications such as classification, automation, and data extraction.
One of the advantages of the Azure OpenAI Service is its multi-language capability. Toon mentions that the language models are typically language-agnostic, meaning that a solution that works in one language typically works in another as well. This is particularly useful for organizations that operate in multiple languages.
Toon also discusses the different models and versions available in the Azure OpenAI Service. He notes that the models are constantly improving, and choosing the right one depends on factors such as the number of tokens it can process, the price, and the performance.
For instance, GPT-3.5 Turbo is faster but less capable at reasoning, while GPT-4 is slower but can make more logical decisions based on data. GPT-4o, which was announced during the presentation, is even more powerful and can recognize objects on a table or screen, making it useful for applications such as remote assistance and education.
Toon then introduces the concept of retrieval-augmented generation, which involves retrieving relevant information from documents or a knowledge base to answer a question. He explains that this approach combines the strengths of traditional search and large language models, resulting in more accurate and contextually relevant answers.
The process involves searching for relevant information using vector search, which is based on a mathematical representation of text. Each piece of text is transformed into a vector, and cosine similarity is used to measure how close two vectors are. The nearest neighbor of the question's vector is then selected as the most relevant information.
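The nearest-neighbor lookup described above can be illustrated with a minimal sketch. The three-dimensional vectors and the document names below are made up for illustration; real embeddings come from an embedding model and have many more dimensions, and a search service would use an index rather than comparing against every vector.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def nearest_neighbor(query, documents):
    """Return the key of the document vector most similar to the query."""
    return max(documents, key=lambda name: cosine_similarity(query, documents[name]))


# Hypothetical toy embeddings (assumption: not taken from the talk).
documents = {
    "dotnet_cv": [0.9, 0.1, 0.0],
    "java_cv": [0.1, 0.9, 0.0],
    "chef_cv": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]

best = nearest_neighbor(query, documents)
print(best)
```

Cosine similarity compares direction rather than length, so a short text and a long text on the same topic can still score as close.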
Toon also discusses the importance of chunking, which involves breaking down large documents into smaller pieces to make them more manageable for the language model. He notes that this process can be automated using libraries such as LangChain.
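To make the idea of chunking concrete, here is a minimal sketch in plain Python. It is a stand-in for what a library text splitter does, not the library's own implementation, and the function name, the word-based splitting, and the default sizes are all assumptions chosen for illustration. Consecutive chunks overlap so that a sentence cut at a boundary still appears whole in one of them.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 12-word document, split into 5-word chunks with 2 words of overlap.
sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, chunk_size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

In practice chunk sizes are usually measured in tokens or characters rather than words, and splitters often try to break on paragraph or sentence boundaries first.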
Toon then demonstrates a job matcher copilot that he and his team built for one of their customers. The copilot matches job seekers with relevant job openings based on their CVs.
Toon explains that the CVs are stored in SharePoint in Word and PDF formats. They are then chunked, enriched with metadata, and embedded using Azure OpenAI Service. The resulting vectors are stored in Azure Cognitive Search, which allows for fast and accurate retrieval of relevant information.
Toon then shows how the job matcher copilot can be used to find a .NET developer with at least five years of experience. The copilot uses vector search to find the most relevant CVs and then uses the GPT-3.5 Turbo model to rank the candidates based on their relevance to the job opening.
Toon also demonstrates how the copilot can handle hard requirements, such as a specific number of years of experience. He notes that in such cases, traditional search can be used in combination with the language model to ensure accurate results.
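One way to picture the combination is a two-step match: an exact filter enforces the hard requirement, and similarity ranking orders whoever remains. The sketch below is an assumption about how such a step could look, not the copilot's actual implementation; the Candidate class, the field names, and the toy vectors are hypothetical, and the final ranking by the language model is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    years_experience: int  # metadata extracted from the CV (hypothetical field)
    embedding: list        # vector representation of the CV text


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match(query_embedding, candidates, min_years, top_k=3):
    """Filter on the hard requirement first, then rank by similarity."""
    eligible = [c for c in candidates if c.years_experience >= min_years]
    ranked = sorted(
        eligible,
        key=lambda c: cosine_similarity(query_embedding, c.embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical data: Bob is the closest match by vector but lacks the experience.
candidates = [
    Candidate("Ann", 7, [0.9, 0.1]),
    Candidate("Bob", 3, [1.0, 0.0]),
    Candidate("Cem", 6, [0.2, 0.9]),
]
results = match([1.0, 0.0], candidates, min_years=5)
print([c.name for c in results])
```

Filtering before ranking means a candidate who fails the hard requirement can never appear in the results, however similar the CV text is to the job description.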
Toon concludes his presentation by sharing some lessons learned and best practices for working with generative AI and retrieval-augmented generation. He emphasizes the importance of getting out of one’s comfort zone, being prepared for non-deterministic results, and keeping up with the rapid evolution of the technology.
He also notes that integration is a critical component of generative AI projects, as they often involve accessing data from different applications and systems. Toon encourages integration developers to invest in generative AI and start experimenting with it, as they are well-positioned to be the Swiss Army knife of the AI world.