One year ago

Yumou Wei Paulo Carvalho John Stamper


Abstract

GPT has become nearly synonymous with large language models (LLMs), an increasingly popular term in the proceedings of the Artificial Intelligence in Education (AIED) conference. A simple keyword-based search reveals that 61% of the 76 long and short papers presented at AIED 2024 describe novel solutions that use LLMs to address some of the long-standing challenges in education, and 43% mention GPT explicitly. Although LLMs popularized by GPT create exciting opportunities to strengthen the impact of AI on education, we argue that the field's predominant focus on GPT and other resource-intensive LLMs (with more than 10 billion parameters) risks neglecting the potential impact that small language models (SLMs) can make in providing resource-constrained institutions with equitable and affordable access to high-quality AI tools. Supported by positive results on knowledge component (KC) discovery, a critical challenge in AIED, we demonstrate that SLMs such as Phi-2 can produce an effective solution without elaborate prompting strategies. Hence, we call for more attention to developing SLM-based approaches in AIED.

One-sentence Summary

Demonstrating that the small language model Phi-2 effectively solves knowledge component discovery without elaborate prompting, the authors advocate for SLMs as a resource-efficient alternative to large language models to advance equitable access in AIED.

Key Contributions

This work highlights Phi-2, a small language model trained on curated textbook-quality data, which requires only 5.4 GB of memory and therefore supports local inference on consumer-grade hardware in resource-constrained educational settings.
Empirical evaluations on GSM8K, HumanEval, MBPP, and MMLU demonstrate that Phi-2 matches or exceeds the performance of significantly larger architectures such as Llama-2 and Mistral across mathematical reasoning, coding, and broad academic knowledge tasks.
A knowledge component discovery algorithm is developed that uses the model's token probability estimates, rather than prompted label generation, to outperform instructional experts and GPT-based baselines without relying on elaborate prompting strategies.
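
As a rough sanity check on the 5.4 GB figure, the sketch below estimates weight memory from a parameter count. It assumes Phi-2 has about 2.7 billion parameters stored at 2 bytes each (16-bit precision). Neither number is stated in this summary, and the helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: about 2.7 billion parameters at 16-bit (2-byte) precision.
fp16_gb = weight_memory_gb(2.7e9, 2)
print(f"{fp16_gb:.1f} GB")
```

Under these assumptions the estimate covers the weights only; activations and any cache used during inference would add to it.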

Introduction

The rapid integration of large language models into educational technology promises advanced AI-driven tutoring and assessment capabilities, yet their substantial computational requirements and reliance on third-party cloud APIs create significant barriers for underfunded institutions and raise critical student privacy concerns. The community-wide preference for resource-heavy architectures often ignores the practical constraints of classroom deployment, where limited budgets, modest hardware, and data sovereignty dictate technology adoption. The authors use small language models such as Phi-2 to demonstrate that prioritizing data quality over parameter count yields highly capable tools that run efficiently on consumer-grade hardware. By repurposing Phi-2 as a probabilistic similarity engine for knowledge component discovery, they show that smaller models can outperform both human experts and larger GPT systems while delivering a more accessible, affordable, and privacy-preserving solution for educational settings.

Method

The authors leverage the intrinsic probabilistic capabilities of a language model to develop a novel approach for knowledge component (KC) discovery, moving beyond conventional text generation methods. Rather than relying on prompting large language models (LLMs) to generate KC labels directly, the method treats the language model as a "probability machine" that can estimate the likelihood of textual sequences. This allows the authors to define a measure of question similarity based on the concept of question congruity, which is mathematically equivalent to pointwise mutual information (PMI) between two questions. The core idea is that if the presence of one question increases the probability of another question appearing in a given context, the two questions are considered congruent and likely to share a common knowledge component.
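
The congruity idea can be illustrated with a minimal sketch, assuming the score takes the standard PMI form log P(q2 | q1) - log P(q2). The exact formula used by the authors is not reproduced in this summary. The function names and the per-token log-probabilities below are hypothetical stand-ins for values a language model would supply.

```python
import math
from typing import Sequence


def sequence_logprob(token_logprobs: Sequence[float]) -> float:
    """Sum per-token log-probabilities into one log-probability for the sequence."""
    return sum(token_logprobs)


def congruity(logp_q2_given_q1: float, logp_q2: float) -> float:
    """PMI-style score: log P(q2 | q1) - log P(q2).

    Positive values mean q1 makes q2 more likely than it is on its own.
    """
    return logp_q2_given_q1 - logp_q2


# Hypothetical per-token log-probabilities for question B, scored twice:
# once with question A as preceding context, once without any context.
tokens_b_with_a = [math.log(0.5), math.log(0.4)]
tokens_b_alone = [math.log(0.25), math.log(0.2)]

score = congruity(
    sequence_logprob(tokens_b_with_a),
    sequence_logprob(tokens_b_alone),
)
print(score > 0)  # True here: A raises the probability of B
```

In this toy case the conditional probability of B is four times its unconditional probability, so the score equals log 4 and the two questions would count as congruent.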

To operationalize this, the authors use Phi-2, a small language model (SLM) trained on textbook-quality data, to compute the probabilities required by the congruity formula. The model is configured to use top-1 sampling (greedy decoding), so token selection at each step is deterministic and the estimated conditional probabilities are reproducible. For each pair of multiple-choice questions (MCQs), the framework calculates a congruity score that reflects how strongly the two questions are related in terms of their underlying KCs. This similarity measure is then fed into a clustering algorithm to group questions that are likely to share the same KC.
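
The grouping step can be sketched as follows. The summary does not name the clustering algorithm, so this example swaps in a simple stand-in: link every pair of questions whose congruity exceeds a threshold, then take connected components as clusters. The function `cluster_by_congruity`, the threshold of 0.0, and the scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def cluster_by_congruity(
    n: int,
    scores: Dict[Tuple[int, int], float],
    threshold: float,
) -> List[List[int]]:
    """Group question indices 0..n-1 into connected components.

    Two questions are linked when their pairwise score exceeds the threshold.
    This is a stand-in for whatever clustering method the authors use.
    """
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), s in scores.items():
        if s > threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical congruity scores for four multiple-choice questions.
pair_scores = {(0, 1): 1.2, (1, 2): -0.3, (2, 3): 0.9, (0, 3): -0.1}
clusters = cluster_by_congruity(4, pair_scores, threshold=0.0)
print(clusters)  # [[0, 1], [2, 3]]
```

Each resulting group would then be treated as a candidate knowledge component. A threshold on connected components is only one option, and a method that accepts a precomputed similarity matrix could be substituted.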


Small but Significant: On the Promise of Small Language Models for Accessible AIED
