MidiyaZhu/KVWEFFER

Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data

Applications

We provide three example applications:

  • Emotion Recognition
  • Sentiment Analysis
  • Question Answering

Datasets, knowledge bases, and corresponding learning network checkpoint files are available in the following directories:

  • /data/dataset/: Contains datasets for the applications.
  • /data/knowledgebase/: Stores the knowledge bases.
  • /mappingmodel/: Includes the learning network checkpoint files.

Custom Processing

To run custom processing, use the scripts matching /code/basemodel_path/*kefPL.py, with basemodel_path standing in for the directory of the base model you choose. The following base models are available:

  • Bi-LSTM att
  • DualCL
  • Kil
  • LCL
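To see which scripts match the *kefPL.py pattern in a local checkout, a short listing helper may be convenient. This is a minimal sketch, not part of the repository: the function name `find_kefpl_scripts` is hypothetical, and it assumes each base model has its own subdirectory directly under code/, which the README does not confirm.

```python
from pathlib import Path


def find_kefpl_scripts(code_dir="code"):
    """List scripts matching *kefPL.py one directory level below code_dir."""
    # Assumption: base model directories are direct children of code_dir.
    return sorted(Path(code_dir).glob("*/*kefPL.py"))


# Example usage from the repository root:
# for script in find_kefpl_scripts():
#     print(script)
```

If the scripts sit deeper in the tree, the pattern would need to be changed to "**/*kefPL.py".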

Knowledge Base Collection

To collect a knowledge base for your own downstream tasks, run the following scripts in /code/knowledgecollection/ in sequence:

  1. extreackWords.py
  2. augKnowledge.py
  3. select_unique_token.py
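The three steps above could be chained with a small driver, sketched below. It assumes that each script runs without command-line arguments and is launched from the repository root; the README states neither, so check each script's configuration first. The helper names `build_commands` and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Directory and script names exactly as listed in this README.
SCRIPT_DIR = Path("code") / "knowledgecollection"
SCRIPTS = ["extreackWords.py", "augKnowledge.py", "select_unique_token.py"]


def build_commands(script_dir=SCRIPT_DIR, scripts=SCRIPTS):
    """Return one interpreter command per script, in the documented order."""
    return [[sys.executable, str(Path(script_dir) / name)] for name in scripts]


def run_pipeline(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a later step never runs on incomplete output.
        subprocess.run(cmd, check=True)


# Example usage (assumption: the scripts need no arguments):
# run_pipeline(build_commands())
```

Stopping at the first failure matters here because the steps are meant to run in sequence, so a later script presumably depends on the output of the one before it.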

Experiment Preparation

Dependencies

Ensure the following dependencies are installed:

  • python>=3.6
  • torch>=1.7.1
  • datasets>=1.12.1
  • transformers>=4.9.2 (Hugging Face)

Alternatively, install all required dependencies using:

pip install -r requirements.txt
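After installing, the minimum versions listed above can be checked programmatically. The following is a minimal sketch under stated assumptions: it uses `importlib.metadata`, which is in the standard library only from Python 3.8 (newer than the README's stated minimum of 3.6), and the helper names `version_tuple` and `check_requirements` are hypothetical. Versions are compared as integer tuples so that, for example, 4.10.0 is correctly treated as newer than 4.9.2.

```python
import importlib.metadata as md  # standard library from Python 3.8
import re

# Minimum versions as listed in this README.
MINIMUMS = {"torch": "1.7.1", "datasets": "1.12.1", "transformers": "4.9.2"}


def version_tuple(text):
    """Turn a version string such as '1.7.1+cu110' into (1, 7, 1)."""
    parts = []
    # Drop any local build tag after '+', then read leading digits per field.
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def check_requirements(minimums=MINIMUMS):
    """Return a list of problems; an empty list means all minimums are met."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = md.version(name)
        except md.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


# Example usage:
# for problem in check_requirements():
#     print(problem)
```

This simplified parser ignores pre-release markers such as "rc1"; for strict comparisons a dedicated version-parsing library would be a better fit.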

Citation

@misc{zhu2025domainlexicalknowledgebasedword,
      title={Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data}, 
      author={Zixiao Zhu and Kezhi Mao},
      year={2025},
      eprint={2506.01621},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.01621}, 
}

About

knowledge-enhanced text classification
