💬 MDLM AR Model (Korean) - Hanbin42

This model is an autoregressive Korean language model based on the MDLM (Masked Diffusion Language Model) architecture.
Hanbin42/my-mdlm-ar-model was trained with the skt/kogpt2-base-v2 tokenizer on the parkseongjun/psjkodata Korean dataset.


🧠 Model Details

  • Backbone: Autoregressive (AR)
  • Diffusion Type: Absorbing State
  • Input Length: 1024 tokens
  • Vocab Size: 51200 (from the KoGPT2 tokenizer)
  • Training Steps: 50,000
  • Sampling Steps: 128 (DDPM-style)
  • Precision: bfloat16
  • EMA: Enabled (0.9999)
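
The "Absorbing State" entry refers to a forward process in which tokens are replaced by a mask token and, once masked, never change back. The following is a minimal pure-Python sketch of that idea, not this model's training code. The function name `absorb`, the `MASK_ID` value, and the rule "mask each token independently with probability t" are illustrative assumptions and are not taken from config.yaml.

```python
import random

MASK_ID = 51200  # hypothetical: one id past the 51200-entry vocabulary


def absorb(token_ids, t, rng=None):
    """Mask each token independently with probability t (0 <= t <= 1).

    A position that already holds MASK_ID stays masked: that is the
    "absorbing" property of the forward process.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    rng = rng or random.Random()
    return [MASK_ID if tok == MASK_ID or rng.random() < t else tok
            for tok in token_ids]


if __name__ == "__main__":
    tokens = [11, 42, 7, 300, 5]
    # t = 0.0 leaves the sequence untouched, t = 1.0 masks every position.
    for t in (0.0, 0.5, 1.0):
        print(t, absorb(tokens, t, random.Random(0)))
```

At t = 0.0 the sequence is returned unchanged and at t = 1.0 every position is masked. Intermediate values give a partially masked sequence.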

📦 Files

File         Description
best.ckpt    PyTorch Lightning model checkpoint
config.yaml  Hyperparameter settings used for training
README.md    Model description document

🚀 How to Use

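import torch
from lightning.pytorch import LightningModule
from diffusion import Diffusion  # defined in this project's codebase

# config and tokenizer are placeholders: pass the training configuration
# and the tokenizer that was used for training.
model = Diffusion.load_from_checkpoint("best.ckpt", config=..., tokenizer=...)
model.eval()  # switch to inference mode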
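
One possible way to fill in the `config` and `tokenizer` placeholders is sketched below. It rests on several assumptions that this card does not confirm: the files are fetched with `hf_hub_download` from `huggingface_hub`, config.yaml can be parsed with OmegaConf, `Diffusion` accepts the resulting object as `config`, and the project's `diffusion` module is on the import path. The helper name `load_model` is hypothetical.

```python
REPO_ID = "Hanbin42/my-mdlm-ar-model"
TOKENIZER_ID = "skt/kogpt2-base-v2"


def load_model(device="cpu"):
    """Download the checkpoint and config, then rebuild the model.

    Assumptions: huggingface_hub, transformers and omegaconf are installed,
    the project's diffusion.py is importable, and Diffusion accepts an
    OmegaConf object as `config`.
    """
    # Imported inside the function so the constants above can be reused
    # without the heavy dependencies being installed.
    from huggingface_hub import hf_hub_download
    from omegaconf import OmegaConf
    from transformers import AutoTokenizer

    from diffusion import Diffusion  # project-local module

    # Both files are listed in the Files section of this card.
    ckpt_path = hf_hub_download(repo_id=REPO_ID, filename="best.ckpt")
    config_path = hf_hub_download(repo_id=REPO_ID, filename="config.yaml")

    config = OmegaConf.load(config_path)
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)

    model = Diffusion.load_from_checkpoint(
        ckpt_path, config=config, tokenizer=tokenizer, map_location=device
    )
    model.eval()
    return model

# Example call (requires network access and the project code):
# model = load_model()
```

If `Diffusion` expects a different configuration type, only the `OmegaConf.load` line should need to change.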