---
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
inference: true
widget:
- text: "public class HelloWorld {\n    public static void main(String[] args) {"
  example_title: Hello world
  group: Java
---

# JavaCoder

## Table of Contents

1. [Model Summary](#model-summary)
2. [Use](#use)
3. [Limitations](#limitations)
4. [Training](#training)
5. [License](#license)
6. [Citation](#citation)

## Model Summary

The JavaCoder models are 1B-parameter models trained on 80+ programming languages from [The Stack (v1.2)](https://huggingface.co/datasets/bigcode/the-stack), with opt-out requests excluded. The model uses [Multi Query Attention](https://arxiv.org/abs/1911.02150), [a context window of 8192 tokens](https://arxiv.org/abs/2205.14135), and was trained on 1 trillion tokens using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255).

- **Repository:**
- **Project Website:**
- **Paper:**
- **Point of Contact:**
- **Languages:** 80+ programming languages

## Use

### Intended use

The model was trained on GitHub code. As such it is _not_ an instruction model, and commands like "Write a function that computes the square root." do not work well. However, by using the [Tech Assistant prompt](https://huggingface.co/datasets/bigcode/ta-prompt) you can turn it into a capable technical assistant.

**Feel free to share your generations in the Community tab!**

### Generation

```python
# pip install -q transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "infosys/javacoder-1b"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

inputs = tokenizer.encode("public class HelloWorld {\n    public static void main(String[] args) {", return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```

### Fill-in-the-middle

Fill-in-the-middle uses special tokens to identify the prefix/middle/suffix parts of the input and output:

```python
# Note: the FIM token names below follow the StarCoder convention
# (<fim_prefix>, <fim_suffix>, <fim_middle>); verify them against
# tokenizer.special_tokens_map for this checkpoint.
input_text = "<fim_prefix>public class HelloWorld {\n    public static void main(String[] args) {<fim_suffix>}\n}<fim_middle>"
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0]))
```
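In practice you usually want only the infilled span, not the full decoded sequence. The sketch below is a minimal example of slicing it out, assuming the StarCoder-style `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` token names used above; verify the exact names in `tokenizer.special_tokens_map` before relying on them.

```python
# Minimal sketch: recover only the generated middle from a FIM completion.
# Assumes StarCoder-style FIM token names; check tokenizer.special_tokens_map.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "infosys/javacoder-1b"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

prefix = "public class HelloWorld {\n    public static void main(String[] args) {\n        "
suffix = "\n    }\n}"
input_text = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=64)
decoded = tokenizer.decode(outputs[0])

# Everything after <fim_middle> is the infill; drop the end-of-text marker if present.
middle = decoded.split("<fim_middle>", 1)[-1]
if tokenizer.eos_token and tokenizer.eos_token in middle:
    middle = middle.split(tokenizer.eos_token, 1)[0]

# Stitch prefix + infill + suffix back into a complete snippet.
print(prefix + middle + suffix)
```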