---
language: en
tags:
- bert
- pytorch
- tensorflow-converted
- uncased
license: apache-2.0
model-index:
- name: bert-uncased_L-10_H-256_A-4
results: []
---
# BERT bert-uncased_L-10_H-256_A-4
This model is a PyTorch conversion of the original TensorFlow BERT checkpoint.
## Model Details
- **Model Type**: BERT (Bidirectional Encoder Representations from Transformers)
- **Language**: English (uncased)
- **Architecture**:
  - Layers: 10
  - Hidden Size: 256
  - Attention Heads: 4
  - Vocabulary Size: 30522
  - Max Position Embeddings: 512
## Model Configuration
```json
{
"hidden_size": 256,
"hidden_act": "gelu",
"initializer_range": 0.02,
"vocab_size": 30522,
"hidden_dropout_prob": 0.1,
"num_attention_heads": 4,
"type_vocab_size": 2,
"max_position_embeddings": 512,
"num_hidden_layers": 10,
"intermediate_size": 1024,
"attention_probs_dropout_prob": 0.1
}
```
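The same values can be reproduced programmatically with `BertConfig` if you want to build the architecture without downloading the checkpoint. The snippet below is a minimal sketch: it instantiates a randomly initialized model from this configuration and is not needed for normal use of the pretrained weights.
```python
from transformers import BertConfig, BertForPreTraining

# Configuration values copied from the config.json above
config = BertConfig(
    vocab_size=30522,
    hidden_size=256,
    num_hidden_layers=10,
    num_attention_heads=4,
    intermediate_size=1024,
    max_position_embeddings=512,
    type_vocab_size=2,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    initializer_range=0.02,
)

# Builds a model with this architecture and freshly initialized (random) weights
model = BertForPreTraining(config)
```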
## Usage
```python
from transformers import BertForPreTraining, BertTokenizer

# Load the model and tokenizer
model = BertForPreTraining.from_pretrained('bansalaman18/bert-uncased_L-10_H-256_A-4')
tokenizer = BertTokenizer.from_pretrained('bansalaman18/bert-uncased_L-10_H-256_A-4')

# Example usage
text = "Hello, this is a sample text for BERT."
inputs = tokenizer(text, return_tensors='pt')
outputs = model(**inputs)

# outputs.prediction_logits holds masked-language-model scores over the vocabulary;
# outputs.seq_relationship_logits holds next-sentence-prediction scores.
```
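As a small follow-up sketch, the pre-training heads can be used directly, for example to fill in a `[MASK]` token (the sentence below is only illustrative; as a compact model, its predictions will be rougher than full-size BERT's):
```python
import torch
from transformers import BertForPreTraining, BertTokenizer

model = BertForPreTraining.from_pretrained('bansalaman18/bert-uncased_L-10_H-256_A-4')
tokenizer = BertTokenizer.from_pretrained('bansalaman18/bert-uncased_L-10_H-256_A-4')

# Mask one word and let the masked-language-model head predict it
text = "Paris is the [MASK] of France."
inputs = tokenizer(text, return_tensors='pt')

with torch.no_grad():
    outputs = model(**inputs)

# Locate the [MASK] position and take the highest-scoring vocabulary entry
mask_index = (inputs['input_ids'][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = outputs.prediction_logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```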
## Training Data
This model was originally trained on the same data as the standard BERT models:
- English Wikipedia (2,500M words)
- BookCorpus (800M words)
## Conversion Details
This model was converted from the original TensorFlow checkpoint to PyTorch format using a custom conversion script with the Hugging Face Transformers library.
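The exact script is not included in this repository. As an illustration only, an equivalent conversion can be done with the Transformers library itself by loading the original TensorFlow checkpoint with `from_tf=True` and re-saving it in PyTorch format (the local paths below are placeholders, and TensorFlow must be installed):
```python
from transformers import BertConfig, BertForPreTraining

# Placeholder paths matching the layout of the original Google Research checkpoint
config = BertConfig.from_json_file('./uncased_L-10_H-256_A-4/bert_config.json')

# Load the TF weights into the PyTorch model class, then save in PyTorch format
model = BertForPreTraining.from_pretrained(
    './uncased_L-10_H-256_A-4/bert_model.ckpt.index',
    from_tf=True,
    config=config,
)
model.save_pretrained('./bert-uncased_L-10_H-256_A-4-pytorch')
```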
## Citation
```bibtex
@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
```