
Yingfa Chen

chen-yingfa

https://chen-yingfa.github.io

AI & ML interests

Long-context modeling, continual learning, architectures

Organizations

None yet

Recent Activity

authored a paper 7 days ago

StateX: Enhancing RNN Recall via Post-training State Expansion

Paper • 2509.22630 • Published 9 days ago • 2

upvoted a paper 7 days ago

StateX: Enhancing RNN Recall via Post-training State Expansion

Paper • 2509.22630 • Published 9 days ago • 2

commented on a paper 7 days ago

StateX: Enhancing RNN Recall via Post-training State Expansion

Paper • 2509.22630 • Published 9 days ago • 2

updated a collection 22 days ago

Cost-Optimal GQA Models

Collection

2 items • Updated 22 days ago

published 2 models 22 days ago

chen-yingfa/cogqa-19m

Updated 22 days ago

chen-yingfa/cogqa-3m

Updated 22 days ago

updated a collection 25 days ago

MLP

Collection

2 items • Updated 25 days ago

New activity in HuggingFaceFW/finepdfs 27 days ago

what is the distribution of context length in this dataset?

#5 opened 28 days ago by chen-yingfa

liked a model 29 days ago

openbmb/MiniCPM4.1-8B

Text Generation • 8B • Updated 6 days ago • 4.37k • 309

authored a paper 3 months ago

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

Paper • 2507.08771 • Published Jul 11 • 9

upvoted a paper 3 months ago

BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

Paper • 2507.08771 • Published Jul 11 • 9

upvoted an article 4 months ago

Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub

Article • and 6 others • Jun 12 • 144

liked 2 models 6 months ago

Test-Time-Training/ttt-mlp-350m-books-2k

Updated Jul 28, 2024 • 1

Test-Time-Training/ttt-mlp-1.3b-pile-8k

Updated Jul 28, 2024 • 1

authored a paper 7 months ago

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Paper • 2503.09579 • Published Mar 12 • 5

upvoted a paper 7 months ago

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Paper • 2503.09579 • Published Mar 12 • 5

commented on a paper 7 months ago

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Paper • 2503.09579 • Published Mar 12 • 5

updated a dataset 10 months ago

chen-yingfa/CFDBench-raw

Viewer • Updated Dec 12, 2024 • 5.13B • 525 • 1

upvoted a paper 11 months ago

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Paper • 2411.10438 • Published Nov 15, 2024 • 13

authored a paper 11 months ago

Sparsing Law: Towards Large Language Models with Greater Activation Sparsity

Paper • 2411.02335 • Published Nov 4, 2024 • 11
