Building an MCQ App with Wikipedia and RAG: A Step-by-Step Guide
The article outlines the development of a multiple-choice question (MCQ) app that generates questions on user-defined topics by combining Wikipedia content with retrieval-augmented generation (RAG). Here is a concise summary of how the app works, its key components, and potential enhancements.

App Demo

The app's user interface is straightforward and intuitive. It begins with a start screen where users enter the context for the MCQs they want to generate, such as "Ask me anything about stars and planets." Upon submission, the app searches Wikipedia for articles related to the query. The next screen displays a question with four answer choices. Users can select an answer, skip the question if it doesn't meet their expectations, or end the session. If an answer is submitted, the app provides immediate feedback, indicating whether the answer was correct and offering an explanation. At the end of a session, the app summarizes the user's performance, including the number of correct answers and rejected questions. Users can then start a new session with a different context.

Concept

The primary goal of the app is to create high-quality, up-to-date MCQs tailored to the user's interests. User feedback is crucial: it helps refine the content to better align with the user's expectations and learning goals.

Context Retrieval Workflow

1. User Query: Users provide a query describing the context for the MCQs.
2. Keyword Extraction: The query is converted into relevant keywords, such as "Stars," "Planets," "Astronomy," "Solar System," and "Galaxy."
3. Wikipedia Search: A search is run for each keyword, and the top three pages per keyword are selected.
4. Irrelevant Page Filtering: The relevance of each page is assessed via vector similarity between the user query and the page excerpts; low-similarity pages are filtered out.
5. Section Splitting: The remaining pages are read and divided into sections.
6. Section Filtering: Each section is scored on its similarity to the user query, and low-similarity sections are excluded.
7. Scoring: Each remaining section is assigned a score that combines its similarity to the user query with the number of times it has been rejected. This score is used to sample sections for question generation.

Question Generation Workflow

1. Section Sampling: A section is chosen based on its score.
2. Chat Model Invocation: The selected section and the user query are combined into a prompt for the chat model, which returns a JSON response containing the question, answer choices, and an explanation of the correct answer.
3. Answer Evaluation: After the user submits an answer, the app determines whether it was correct and shows the explanation.
4. Feedback Loop: If a question is rejected, the section's score is downgraded to reduce the likelihood of it being chosen again.

Key Components

- Wikipedia API: The app uses the Wikipedia API to search for and retrieve articles; keywords extracted from the user query are used to find the most relevant pages.
- Vector Similarity: The core of the filtering process, ensuring that only relevant sections are used to generate questions. Section scores are updated based on user feedback and similarity metrics.
- Prompt Engineering: Two types of prompts are used: a keyword generation prompt, which derives search keywords from the user query, and an MCQ generation prompt, which instructs the chat model to return questions and explanations in a specific JSON format. Both are sketched below.
- Streamlit App Framework: The app is built with Streamlit, a Python framework that simplifies the creation of web applications and handles user interactions and state management efficiently (a minimal sketch appears at the end of the article).
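To make the keyword extraction step concrete, here is a minimal sketch of how a user query might be turned into search keywords. The article does not name the chat model or give the prompt wording, so the OpenAI client, the model name, and the prompt below are all assumptions.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_keywords(user_query: str, n: int = 5) -> list[str]:
    """Ask a chat model for Wikipedia search keywords, one per line."""
    prompt = (
        f"Generate {n} Wikipedia search keywords for the topic below. "
        "Return one keyword per line, with no numbering.\n\n"
        f"Topic: {user_query}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any capable chat model would do
        messages=[{"role": "user", "content": prompt}],
        temperature=0.2,
    )
    lines = resp.choices[0].message.content.splitlines()
    return [line.strip() for line in lines if line.strip()]

print(extract_keywords("Ask me anything about stars and planets"))
# e.g. ['Stars', 'Planets', 'Astronomy', 'Solar System', 'Galaxy']
```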
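The Wikipedia search and page-filtering steps could then look like the sketch below. The MediaWiki search endpoint and its parameters are real; the sentence-transformers embedding model and the 0.3 similarity threshold are assumptions, since the article names neither.

```python
import requests
from sentence_transformers import SentenceTransformer, util

API_URL = "https://en.wikipedia.org/w/api.php"
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def search_wikipedia(keyword: str, limit: int = 3) -> list[dict]:
    """Return the top `limit` search hits (title plus excerpt) for one keyword."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    data = requests.get(API_URL, params=params, timeout=10).json()
    return data["query"]["search"]

def relevant_pages(user_query: str, keywords: list[str],
                   threshold: float = 0.3) -> set[str]:
    """Keep pages whose search excerpt is similar enough to the user query."""
    hits = [hit for kw in keywords for hit in search_wikipedia(kw)]
    excerpts = [hit["snippet"] for hit in hits]  # short HTML-flavored excerpts
    q_emb = model.encode(user_query, convert_to_tensor=True)
    e_embs = model.encode(excerpts, convert_to_tensor=True)
    sims = util.cos_sim(q_emb, e_embs)[0]  # cosine similarity, one value per hit
    return {hit["title"] for hit, sim in zip(hits, sims) if float(sim) >= threshold}

print(relevant_pages("stars and planets", ["Stars", "Planets", "Astronomy"]))
```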
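The MCQ generation prompt and its JSON handling might look like the following. The exact schema is not given in the article, so the field names (question, choices, answer_index, explanation) are illustrative; the response_format option asks the model for strict JSON.

```python
import json
from openai import OpenAI

client = OpenAI()

MCQ_PROMPT = """You write multiple-choice questions.
Using only the context below, write one question relevant to: {query}
Avoid repeating these earlier questions: {previous}
Respond with JSON of exactly this shape:
{{"question": "...", "choices": ["...", "...", "...", "..."],
  "answer_index": 0, "explanation": "..."}}

Context:
{section}"""

def generate_mcq(user_query: str, section_text: str,
                 previous: list[str]) -> dict:
    """Return a dict with the question, four choices, the answer, and why."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        messages=[{"role": "user",
                   "content": MCQ_PROMPT.format(query=user_query,
                                                previous=previous,
                                                section=section_text)}],
        response_format={"type": "json_object"},  # request well-formed JSON
    )
    return json.loads(resp.choices[0].message.content)
```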
Context Retrieval

The context retrieval process involves:

1. Search Request: A search request is sent to the Wikipedia API with the search query and a specified number of results.
2. Page Import: The app uses the wikipediaapi library to fetch the content of the selected pages.
3. Section Dictionary: A function splits the pages into sections and returns a dictionary mapping sections to their text (see the sketch at the end of the article).

Context Scoring

The score for each section is calculated to prioritize relevant sections. The formula combines the rejection frequency with the semantic similarity:

\[ s_{section} = w_{rejection} \, s_{rejection} + (1 - w_{rejection}) \, s_{sim} \]

where \( s_{rejection} \) is derived from the number of rejections of the section and its page, and \( s_{sim} \) is the similarity score. This ensures that sections frequently rejected by users have a lower chance of being selected again (this formula is also sketched at the end of the article).

MCQ Prompt Engineering

The MCQ prompt combines the user query with the selected section text and instructs the chat model to generate a question, answer choices, and an explanation in a standardized JSON format. Including previous questions and rejected questions in the prompt helps prevent repetition and improve question relevance.

Enhancements

Several potential enhancements could elevate the app:

- Custom Document Uploads: Allow users to upload their own PDFs, such as lecture notes or textbooks, to generate questions. This would cater to specific learning needs and personal study materials.
- Adaptive Question Sequencing: Train a machine learning model to predict the likelihood of a question being rejected, based on features such as its similarity to previously accepted and rejected questions. This would optimize the question selection process.
- Repeating Sessions: Save generated questions and use an algorithm that emphasizes previously missed questions in future sessions, focusing on areas where the user needs improvement.

Evaluation and Industry Insights

The app's use of RAG and feedback-driven scoring represents a significant step forward for automated learning tools. Industry experts such as Shaw Talebi and Avishek Biswas have highlighted the effectiveness of RAG in enhancing the quality and relevance of generated content, and Harrison Hoffman's work on embeddings and vector databases further supports the app's approach to context filtering. The app's reliance on Streamlit simplifies development and deployment, making it a valuable tool for educators, students, and developers. With planned enhancements such as custom document uploads and adaptive question sequencing, the app has the potential to become a comprehensive and personalized learning resource.

Technology Stack

The app leverages several open-source technologies:

- Wikipedia API: Provides access to a vast repository of educational content.
- wikipediaapi library: Facilitates efficient retrieval and parsing of Wikipedia articles.
- Streamlit: Streamlines app development with easy-to-use Python functions for building web UIs.

By integrating these tools, the app offers a dynamic and interactive learning experience, adapting to user feedback and optimizing question quality.
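Code Sketches

The three sketches referenced earlier follow. First, Streamlit state management: st.session_state persists values across Streamlit's script reruns, which is how an app like this can keep scores and past questions between button clicks. The widgets and counters below are illustrative, not taken from the article.

```python
import streamlit as st

# st.session_state survives the script rerun Streamlit performs on every
# interaction, so counters accumulate across clicks.
if "correct" not in st.session_state:
    st.session_state.correct = 0
    st.session_state.rejected = 0

st.title("MCQ Generator")
query = st.text_input("What would you like to be quizzed on?")
if st.button("Skip question"):
    st.session_state.rejected += 1  # a skip counts as a rejected question
st.write(f"Correct: {st.session_state.correct} | "
         f"Rejected: {st.session_state.rejected}")
```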
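Next, the section-dictionary function built on the wikipediaapi library. The library calls (Wikipedia, page, sections) are its real API; the key format and the recursion into subsections are assumptions about how the article's function might work.

```python
import wikipediaapi

# wikipediaapi requires a descriptive user agent; this one is a placeholder
wiki = wikipediaapi.Wikipedia(user_agent="mcq-app/0.1 (example@example.com)",
                              language="en")

def sections_to_dict(title: str) -> dict[str, str]:
    """Map 'Page / Section / Subsection' keys to their plain text."""
    page = wiki.page(title)
    out: dict[str, str] = {}

    def walk(sections, prefix):
        for sec in sections:
            key = f"{prefix} / {sec.title}"
            if sec.text:
                out[key] = sec.text
            walk(sec.sections, key)  # recurse into subsections

    if page.exists():
        out[title] = page.summary  # the lead section has no heading
        walk(page.sections, title)
    return out

print(list(sections_to_dict("Planet"))[:5])
```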
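Finally, the scoring formula from the Context Scoring section as runnable code. The article only states that the score combines a rejection term with a similarity term, so the weight value and the exponential-decay form of s_rejection below are assumptions.

```python
import math
import random

W_REJECTION = 0.5  # assumed weight; the article does not state a value

def rejection_score(section_rejections: int, page_rejections: int) -> float:
    """Decays toward 0 as rejections accumulate (assumed functional form)."""
    return math.exp(-(section_rejections + 0.5 * page_rejections))

def section_score(similarity: float, section_rejections: int,
                  page_rejections: int) -> float:
    """s_section = w_rejection * s_rejection + (1 - w_rejection) * s_sim."""
    s_rej = rejection_score(section_rejections, page_rejections)
    return W_REJECTION * s_rej + (1 - W_REJECTION) * similarity

def sample_section(scores: dict[str, float]) -> str:
    """Pick a section key with probability proportional to its (clamped) score."""
    keys = list(scores)
    weights = [max(scores[k], 0.0) for k in keys]
    return random.choices(keys, weights=weights, k=1)[0]
```

Sampling in proportion to the score, rather than always taking the top section, keeps questions varied while still favoring relevant, rarely rejected material.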