SyedA5688 committed on
Commit a95e6b5 · verified · 1 Parent(s): 5477e5c

Updated cell and gene numbers for diverse task Pythia C2S model

Files changed (1)
  1. README.md +4 -2
README.md CHANGED
@@ -24,8 +24,10 @@ This model was trained on over 57 million human and mouse cells gathered from ov
 datasets from CellxGene and the Human Cell Atlas. This dataset covers a broad range of cell types and conditions
 from multiple tissues in both human and mouse.
 
-This model was trained with a variable number of genes per cell sentence. For multi cell samples, each sample
-contained between 5 and 20 cells, with the same number of genes for each of the cells in the same sample.
+This model was trained with a variable number of genes per cell sentence, with a maximum context length of 8192 tokens.
+The context length of the default Pythia model was extended using rotary positional embeddings prior to C2S training.
+- Cells: For multi cell samples, each training sample contained between 5 and 20 cells, with the same number of genes for each of the cells in the same sample.
+- Genes: For single cell samples, each cell sentence contained between 100 and 2048 genes. For multi cell samples, each cell sentence per cell contained between 100 and 400 genes.
 
 # Tasks
 This model is designed for the following tasks:
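
The cell and gene ranges added to the README in this commit can be sketched in code. The following is a minimal illustration, not the authors' training pipeline: the names `sample_shape` and `build_cell_sentence` are hypothetical, drawing the counts uniformly is an assumption (the README states only the ranges), and a cell sentence is assumed to be a space-separated list of gene names already ranked by expression.

```python
import random

# Ranges taken from the updated README text.
MAX_CONTEXT_TOKENS = 8192        # maximum context length, in tokens
CELLS_PER_SAMPLE = (5, 20)       # multi-cell samples only
SINGLE_CELL_GENES = (100, 2048)  # genes per cell sentence, single-cell samples
MULTI_CELL_GENES = (100, 400)    # genes per cell sentence, multi-cell samples


def sample_shape(multi_cell, rng):
    """Pick (num_cells, genes_per_cell) inside the README ranges.

    Uniform sampling is an assumption; the README gives only the bounds.
    Every cell in a multi-cell sample shares the same gene count.
    """
    if multi_cell:
        num_cells = rng.randint(*CELLS_PER_SAMPLE)
        genes_per_cell = rng.randint(*MULTI_CELL_GENES)
    else:
        num_cells = 1
        genes_per_cell = rng.randint(*SINGLE_CELL_GENES)
    return num_cells, genes_per_cell


def build_cell_sentence(ranked_genes, num_genes):
    """Keep the first num_genes names of an expression-ranked gene list."""
    return " ".join(ranked_genes[:num_genes])


rng = random.Random(0)
num_cells, genes_per_cell = sample_shape(multi_cell=True, rng=rng)
```

The sketch does not check the 8192-token limit, because the number of tokens per gene name depends on the tokenizer; a real pipeline would need to tokenize the assembled prompt to confirm it fits.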