Data and filtering models of our financial open-source YiZhao Dataset.
Organization Card
Text Machine Group (TMG) from Harbin Institute of Technology (Shenzhen). 🔥
- KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model
  Paper • 2501.01028 • Published • 17
- KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model
  Paper • 2506.20923 • Published • 4
- HIT-TMG/KaLM-embedding-multilingual-mini-v1
  Sentence Similarity • 0.5B • Updated • 1.35k • 27
- HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1
  Sentence Similarity • 0.5B • Updated • 197 • 32
models 23

HIT-TMG/EviOmni-nq_train-1.5B
Question Answering • 2B • Updated • 165 • 5

HIT-TMG/EviOmni-nq_train-7B
Question Answering • 8B • Updated • 720 • 2

HIT-TMG/CIGEval-Qwen2.5-VL-7B-Instruct-sft
8B • Updated • 5

HIT-TMG/CIGEval-Qwen2-VL-7B-Instruct-sft
8B • Updated • 6

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v2
Feature Extraction • 0.5B • Updated • 557 • 26

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1-GGUF
Sentence Similarity • 0.5B • Updated • 32

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5-GGUF
Sentence Similarity • 0.5B • Updated • 27

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5
Sentence Similarity • 0.5B • Updated • 2.02k • 61

HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1
Sentence Similarity • 0.5B • Updated • 197 • 32

HIT-TMG/KaLM-embedding-multilingual-mini-unsupervised
0.5B • Updated • 4
datasets 6

HIT-TMG/CIGEval_sft_data
Viewer • Updated • 6.63k • 127

HIT-TMG/YiZhao
Viewer • Updated • 47.5M • 667 • 6

HIT-TMG/KaLM-embedding-pretrain-data
Viewer • Updated • 23.7M • 651 • 5

HIT-TMG/MultiSkill
Viewer • Updated • 1k • 7

HIT-TMG/TruthReader_RAG_train
Viewer • Updated • 7.16k • 27 • 6

HIT-TMG/Hansel
Viewer • Updated • 7.81M • 3.57k • 8