bansalaman18 committed
Commit 576f027 · verified · Parent(s): c1d7c4e

Upload README.md with huggingface_hub

Files changed (1): README.md ADDED (+81 -0)
---
language: en
tags:
- bert
- pytorch
- tensorflow-converted
- uncased
license: apache-2.0
model-index:
- name: uncased_L-10_H-256_A-4
  results: []
---

# BERT uncased_L-10_H-256_A-4

This model is a PyTorch conversion of the original TensorFlow BERT checkpoint.

## Model Details

- **Model Type**: BERT (Bidirectional Encoder Representations from Transformers)
- **Language**: English (uncased)
- **Architecture**:
  - Layers: 10
  - Hidden Size: 256
  - Attention Heads: 4
  - Vocabulary Size: 30522
  - Max Position Embeddings: 512

## Model Configuration

```json
{
  "hidden_size": 256,
  "hidden_act": "gelu",
  "initializer_range": 0.02,
  "vocab_size": 30522,
  "hidden_dropout_prob": 0.1,
  "num_attention_heads": 4,
  "type_vocab_size": 2,
  "max_position_embeddings": 512,
  "num_hidden_layers": 10,
  "intermediate_size": 1024,
  "attention_probs_dropout_prob": 0.1
}
```
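
The same values can also be passed to `BertConfig` directly if you want to instantiate the architecture without downloading the hosted config. The snippet below is a minimal sketch (not part of the original card); it simply mirrors the JSON shown above and builds a randomly initialized model.

```python
from transformers import BertConfig, BertModel

# Rebuild the configuration shown above (values copied from the JSON block).
config = BertConfig(
    vocab_size=30522,
    hidden_size=256,
    num_hidden_layers=10,
    num_attention_heads=4,
    intermediate_size=1024,
    hidden_act="gelu",
    hidden_dropout_prob=0.1,
    attention_probs_dropout_prob=0.1,
    max_position_embeddings=512,
    type_vocab_size=2,
    initializer_range=0.02,
)

# A BERT encoder with this architecture, randomly initialized (no pretrained weights).
model = BertModel(config)
print(sum(p.numel() for p in model.parameters()))  # rough parameter count
```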

## Usage

```python
from transformers import BertForPreTraining, BertTokenizer

# Load the model and tokenizer
model = BertForPreTraining.from_pretrained('bansalaman18/uncased_L-10_H-256_A-4')
tokenizer = BertTokenizer.from_pretrained('bansalaman18/uncased_L-10_H-256_A-4')

# Example usage
text = "Hello, this is a sample text for BERT."
inputs = tokenizer(text, return_tensors='pt')
outputs = model(**inputs)
```
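
`BertForPreTraining` returns the raw pre-training heads (`prediction_logits` for masked-token prediction and `seq_relationship_logits` for next-sentence prediction). As a hedged illustration not taken from the original card, the same checkpoint can be loaded under `BertForMaskedLM` for a quick sanity check of the converted weights; the example sentence below is made up.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

# Load only the MLM head; the unused next-sentence-prediction weights are skipped with a warning.
model = BertForMaskedLM.from_pretrained('bansalaman18/uncased_L-10_H-256_A-4')
tokenizer = BertTokenizer.from_pretrained('bansalaman18/uncased_L-10_H-256_A-4')

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors='pt')

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

# Find the [MASK] position and take the highest-scoring token for it.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```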

## Training Data

This model was originally trained on the same data as the standard BERT models:
- English Wikipedia (2,500M words)
- BookCorpus (800M words)

## Conversion Details

This model was converted from the original TensorFlow checkpoint to PyTorch format using a custom conversion script based on the Hugging Face Transformers library.
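
The exact script used for this conversion is not included in the repository. As a rough sketch, the standard Transformers utilities can perform the same conversion, assuming the original TensorFlow checkpoint and its `bert_config.json` have been downloaded locally (the paths below are placeholders); reading the TF checkpoint also requires TensorFlow to be installed.

```python
import torch
from transformers import BertConfig, BertForPreTraining, load_tf_weights_in_bert

# Placeholder paths to a locally downloaded TensorFlow checkpoint.
tf_checkpoint_path = "uncased_L-10_H-256_A-4/bert_model.ckpt"
bert_config_file = "uncased_L-10_H-256_A-4/bert_config.json"
pytorch_dump_path = "uncased_L-10_H-256_A-4/pytorch_model.bin"

# Build an empty PyTorch model from the original config, then copy the TF weights into it.
config = BertConfig.from_json_file(bert_config_file)
model = BertForPreTraining(config)
load_tf_weights_in_bert(model, config, tf_checkpoint_path)

# Save the converted weights in PyTorch format.
torch.save(model.state_dict(), pytorch_dump_path)
```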

## Citation

```bibtex
@article{devlin2018bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
```