---
license: mit
pipeline_tag: text-generation
library_name: transformers
tags:
  - code-generation
  - formal-verification
  - reinforcement-learning
  - dafny
---

# Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny

This repository hosts `sft_0.5B`, the 0.5B-parameter supervised fine-tuned (SFT) checkpoint of the [Re:Form](https://arxiv.org/abs/2507.16331) framework, which uses Large Language Models (LLMs) and Reinforcement Learning (RL) to scale formal software verification.

Re:Form addresses the limitations of informal-language-based LLMs by grounding them in a rigorous formal system such as Dafny, so that the models' reasoning processes and outcomes can be verified automatically and with mathematical guarantees, which is essential for large-scale, reliable formal software verification. The framework introduces an automatic, scalable data curation pipeline and RL designs that take feedback from a formal-language verifier, reducing reliance on human priors.
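
To make the verifier-in-the-loop idea concrete, below is a minimal sketch of a binary reward signal driven by the Dafny verifier. This is an illustration only, not the Re:Form training code: the `dafny_reward` helper is hypothetical, and it assumes a Dafny 4.x CLI is installed and on `PATH`.

```python
import os
import subprocess
import tempfile

def dafny_reward(program: str, timeout_s: int = 60) -> float:
    """Hypothetical binary reward: 1.0 if the Dafny verifier accepts the
    program, 0.0 on any verification failure, parse error, or timeout."""
    with tempfile.NamedTemporaryFile(mode="w", suffix=".dfy", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        # `dafny verify` (Dafny 4.x CLI) exits with code 0 when all proof
        # obligations are discharged.
        result = subprocess.run(
            ["dafny", "verify", path],
            capture_output=True, text=True, timeout=timeout_s,
        )
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0
    finally:
        os.unlink(path)
```

An RL trainer would maximize this reward over sampled programs; the actual reward design used by Re:Form is described in the paper.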

## Overall Pipeline

*Overall pipeline of Re:Form.*

## Usage

You can load and use the Re:Form checkpoints with the `transformers` library for Dafny code generation:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Example checkpoint; other available checkpoints include sft_1.5B, sft_3B,
# sft_7B, sft_14B, and 14B-RL-entropy.
model_id = "Veri-Code/sft_0.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use bfloat16 for best performance where supported
    device_map="auto",
    trust_remote_code=True,
)

# Example prompt: ask the model to complete a simple Max method in Dafny.
prompt = """method Max(a: int, b: int) returns (m: int)
  ensures m == a || m == b
  ensures m >= a && m >= b
{
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the Dafny method body.
generated_ids = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)

generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)
```
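
Because the model emits Dafny, the output can be checked mechanically rather than by inspection. Assuming the hypothetical `dafny_reward` helper from the sketch above (and an installed Dafny CLI), one might gate on verification as follows; note that a raw sample may need light post-processing, e.g. trimming any text after the closing brace:

```python
# Mechanically check the sampled program with the Dafny verifier
# (uses the hypothetical dafny_reward helper sketched above).
if dafny_reward(generated_text) == 1.0:
    print("Dafny verified the generated program.")
else:
    print("Verification failed; consider sampling again.")
```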

## Citation

If you use this work in your research, please cite:

```bibtex
@misc{yan2025reformreducinghuman,
      title={Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny},
      author={Chuanhao Yan and Fengdi Che and Xuhan Huang and Xu Xu and Xin Li and Yizhi Li and Xingwei Qu and Jingzhe Shi and Zhuangzhuang He and Chenghua Lin and Yaodong Yang and Binhang Yuan and Hang Zhao and Yu Qiao and Bowen Zhou and Jie Fu},
      year={2025},
      eprint={2507.16331},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2507.16331},
}
```