---
license: cc-by-4.0
datasets:
- gtfintechlab/subjectiveqa
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---
# SubjECTiveQA-CLEAR Model
- **Model Name:** SubjECTiveQA-CLEAR
- **Model Type:** Text Classification
- **Language:** English
- **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
- **Base Model:** [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased)
- **Dataset Used for Training:** [gtfintechlab/SubjECTive-QA](https://huggingface.co/datasets/gtfintechlab/SubjECTive-QA)
## Model Overview
SubjECTiveQA-CLEAR is a fine-tuned BERT-based model designed to classify text data according to the 'CLEAR' attribute. The 'CLEAR' attribute is one of several subjective attributes annotated in the SubjECTive-QA dataset, which focuses on subjective question-answer pairs in financial contexts.
## Intended Use
This model is intended for researchers and practitioners working on subjective text classification, particularly within financial domains. It is specifically designed to assess the 'CLEAR' attribute in question-answer pairs, aiding in the analysis of subjective content in financial communications.
## How to Use
To utilize this model, you can load it using the Hugging Face `transformers` library:
```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification, AutoConfig
# Load the tokenizer, model, and configuration
tokenizer = AutoTokenizer.from_pretrained("gtfintechlab/SubjECTiveQA-CLEAR", do_lower_case=True, do_basic_tokenize=True)
model = AutoModelForSequenceClassification.from_pretrained("gtfintechlab/SubjECTiveQA-CLEAR", num_labels=3)
config = AutoConfig.from_pretrained("gtfintechlab/SubjECTiveQA-CLEAR")
# Initialize the text classification pipeline
classifier = pipeline('text-classification', model=model, tokenizer=tokenizer, config=config, framework="pt")
# Classify the 'CLEAR' attribute in your question-answer pairs
qa_pairs = [
"Question: What are your company's projections for the next quarter? Answer: We anticipate a 10% increase in revenue due to the launch of our new product line.",
"Question: Can you explain the recent decline in stock prices? Answer: Market fluctuations are normal, and we are confident in our long-term strategy."
]
results = classifier(qa_pairs, batch_size=128, truncation="only_first")
print(results)
```
## Label Interpretation
- **LABEL_0:** Negatively Demonstrative of 'CLEAR' (0)
Indicates that the response lacks clarity.
- **LABEL_1:** Neutral Demonstration of 'CLEAR' (1)
Indicates that the response has an average level of clarity.
- **LABEL_2:** Positively Demonstrative of 'CLEAR' (2)
Indicates that the response is clear and transparent.
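
The mapping above can be applied directly to the pipeline's output. Below is a minimal sketch; the `interpret` helper and the mock results are illustrative, not part of the model or the `transformers` API:

```python
# Map the model's raw output labels to the 'CLEAR' ratings described above.
LABEL_MEANINGS = {
    "LABEL_0": "Negatively Demonstrative of 'CLEAR' (0)",
    "LABEL_1": "Neutral Demonstration of 'CLEAR' (1)",
    "LABEL_2": "Positively Demonstrative of 'CLEAR' (2)",
}

def interpret(results):
    """Convert pipeline output ({'label': ..., 'score': ...} dicts)
    into human-readable (meaning, confidence) tuples."""
    return [(LABEL_MEANINGS[r["label"]], round(r["score"], 3)) for r in results]

# Example with mock pipeline output (scores are made up for illustration):
mock_results = [
    {"label": "LABEL_2", "score": 0.91},
    {"label": "LABEL_0", "score": 0.67},
]
for meaning, score in interpret(mock_results):
    print(f"{meaning} — confidence {score}")
```

In practice, `results` from the classifier in the snippet above can be passed to `interpret` as-is.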
## Training Data
The model was trained on the SubjECTive-QA dataset, which comprises question-answer pairs from financial contexts, annotated with various subjective attributes, including 'CLEAR'. The dataset is divided into training, validation, and test sets, facilitating robust model training and evaluation.
## Citation
If you use this model in your research, please cite the SubjECTive-QA dataset:
```bibtex
@article{SubjECTiveQA,
title={SubjECTive-QA: Measuring Subjectivity in Earnings Call Transcripts’ QA Through Six-Dimensional Feature Analysis},
  author={Huzaifa Pardawala and Siddhant Sukhani and Agam Shah and Veer Kejriwal and Abhishek Pillai and Rohan Bhasin and Andrew DiBiasio and Tarun Mandapati and Dhruv Adha and Sudheer Chava},
journal={arXiv preprint arXiv:2410.20651},
year={2024}
}
```
For more details, refer to the [SubjECTive-QA dataset documentation](https://huggingface.co/datasets/gtfintechlab/SubjECTive-QA).
## Contact
For any SubjECTive-QA related issues and questions, please contact:
- Huzaifa Pardawala: huzaifahp7[at]gatech[dot]edu
- Siddhant Sukhani: ssukhani3[at]gatech[dot]edu
- Agam Shah: ashah482[at]gatech[dot]edu