
Partial-Correlation Learning for Large Language Models with Skip-Tuning

Venue: ICLR 2026 (withdrawn)
Authors: Yuheng Lu, Zuhe Song, Caixia Yuan, Xiaojie Wang
OpenReview: https://openreview.net/forum?id=21j00qo8P4

Relevance

LLM score: 1/3 — Skip-Tuning reduces fine-tuning data by using noncontiguous segments, potentially offering efficiency benefits, but the paper's primary focus is on preventing overfitting and catastrophic forgetting rather than energy-efficient training or data movement optimization.

Keyword hits: sparse

TLDR

(none provided)

Abstract

Large Language Models (LLMs) require post-training to adapt to specific applications, with Supervised Fine-Tuning (SFT) crucial for injecting emerging or domain-specific knowledge. Conventional SFT using complete sequential text risks causing a distribution shift from pretraining corpora due to large volumes of common-style text, potentially leading to overfitting and catastrophic forgetting. We introduce Skip-Tuning, a novel fine-tuning strategy that utilizes noncontinuous text segments instead. Skip-Tuning performs skipped language modeling on text segments and enables a paradigm of partial-correlation learning, where the model learns from sparse but meaningful text fragments. By excluding common-style texts and using only knowledge-intensive text for fine-tuning, Skip-Tuning demonstrates improvements in fine-tuning effectiveness and generalization in the knowledge editing setting. Furthermore, we demonstrate the effectiveness of partial-correlation learning in a system-prompt following task, which illustrates the broad application of Skip-Tuning across various NLP scenarios.
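The abstract's core idea — computing the language-modeling loss only on selected, knowledge-intensive text fragments while ignoring the surrounding common-style text — can be illustrated with a loss mask. The sketch below is not the authors' implementation; it assumes a causal LM trained with PyTorch-style cross-entropy, where positions outside the chosen spans are excluded from the loss via an ignore index. The span boundaries (`keep_spans`) are hypothetical inputs standing in for whatever segment-selection procedure the paper uses.

```python
# Minimal sketch of skipped language modeling via loss masking.
# Illustrative only: assumes a causal LM and hand-picked "knowledge-intensive"
# spans; all other token positions are masked out of the loss.
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # positions labeled with this value contribute no loss


def build_skip_labels(input_ids: torch.Tensor,
                      keep_spans: list[tuple[int, int]]) -> torch.Tensor:
    """Copy input_ids as labels, keeping only the chosen [start, end) spans.

    keep_spans marks the knowledge-intensive fragments; everything else
    (e.g. common-style filler text) is excluded from the training signal.
    """
    labels = torch.full_like(input_ids, IGNORE_INDEX)
    for start, end in keep_spans:
        labels[start:end] = input_ids[start:end]
    return labels


def skip_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Next-token cross-entropy restricted to the unmasked positions."""
    shift_logits = logits[:-1, :]   # position t predicts token t+1
    shift_labels = labels[1:]
    return F.cross_entropy(shift_logits, shift_labels,
                           ignore_index=IGNORE_INDEX)


if __name__ == "__main__":
    vocab, seq_len = 50, 12
    input_ids = torch.randint(0, vocab, (seq_len,))
    logits = torch.randn(seq_len, vocab)  # stand-in for model output
    labels = build_skip_labels(input_ids, keep_spans=[(3, 6), (9, 12)])
    print(skip_lm_loss(logits, labels))
```

Under these assumptions, the only change relative to standard SFT is which positions carry labels; the forward pass and optimizer are untouched.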

Keywords

Large Language Models, Supervised Fine-Tuning