hq_wei 24c590bb04 Ner task (#148 ) * update ner standard code format * add pytest * fix pre-commit * Annotate the dataset section * fix pre-commit for dataset * rm big files and add comments in dataset * rename configs for ner task * minor changes if metric * Note modification * fix pre-commit * detail modification * rm transform * rm magic number * fix warnings in pylint * fix pre-commit * correct help info * rename model files * rename err fixed * 428_tag * Adjust to more general pipline * update unit test rate * update * Unit test coverage over 90% and add Readme * modify details * fix precommit * update * fix pre-commit * update * update * update * update result * update readme * update baseline config * update config and small minor changes * minor changes in readme and etc. * back to original * update toy config * upload model and log * fix pytest * Modify the notes. * fix readme * Delete Chinese punctuation * add demo and fix some logic and naming problems * add To_tensor transformer for ner and load pretrained model in config * delete extra lines * split ner loss to MaskedCrossEntropyLoss and MaskedFocalLoss * update config * fix err * updata * modify noqa * update new model report * fix err in ner demo * Update ner_dataset.py * Update test_ner_dataset.py * Update ner_dataset.py * Update ner_transforms.py * rm toy config and data * add comment * add empty * fix conflict * fix precommit * fix pytest * fix pytest err * Update ner_dataset.py * change dataset name to cluener2020 * move the postprocess in metric to convertor * rm __init__ etc. * precommit * add discription in loss * add auto download * add http * update * remove some 'issert' * replace unsqueeze * update config * update doc and bert.py * update * update demo code Co-authored-by: weihuaqiang <weihuaqiang@sensetime.com> Co-authored-by: Hongbin Sun <hongbin306@gmail.com>	2021-05-18 11:33:51 +08:00
README.md	Ner task (#148 )	2021-05-18 11:33:51 +08:00
bert_softmax_cluener_18e.py	Ner task (#148 )	2021-05-18 11:33:51 +08:00

README.md

Chinese Named Entity Recognition using BERT + Softmax

Introduction

[ALGORITHM]

@article{devlin2018bert,
  title={Bert: Pre-training of deep bidirectional transformers for language understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2018}
}
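
As a rough illustration of what a "BERT + Softmax" tagger does at prediction time, the sketch below applies a softmax to per-token scores and picks the most probable label for each token. It is a minimal sketch under stated assumptions, not this repository's implementation: the label set, the hard-coded logits, and the helper names `softmax` and `decode_tags` are all hypothetical, and the encoder that would normally produce the logits is left out.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_tags(token_logits, labels):
    """Pick the most probable label independently for every token."""
    tags = []
    for logits in token_logits:
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        tags.append(labels[best])
    return tags


# Hypothetical label set and logits, for illustration only.
LABELS = ["O", "B-name", "I-name"]
TOKEN_LOGITS = [
    [2.0, 0.1, -1.0],
    [0.2, 3.1, 0.0],
    [-0.5, 0.3, 2.4],
]

PREDICTED = decode_tags(TOKEN_LOGITS, LABELS)
print(PREDICTED)
```

Because each token is classified independently, this kind of head has no explicit constraint on tag transitions; that is the usual trade-off of a plain softmax layer compared with a structured decoder.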

Dataset

Train Dataset

trainset	text_num	entity_num
CLUENER2020	10748	23338

Test Dataset

testset	text_num	entity_num
CLUENER2020	1343	2982

Results and models

Method	Pretrain	Precision	Recall	F1-Score	Download
bert_softmax	pretrain	0.7885	0.7998	0.7941	model | log
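
The reported F1-Score is consistent with the harmonic mean of the reported precision and recall. The short check below assumes the standard definition F1 = 2PR / (P + R); the README does not state how the metric was aggregated, and the helper name `f1_score` is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table above.
PRECISION = 0.7885
RECALL = 0.7998

F1 = f1_score(PRECISION, RECALL)
print(f"{F1:.4f}")  # prints 0.7941
```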