zoukankan      html  css  js  c++  java
  • OpenKiwi学习笔记

    Python 打印调用栈:

    import traceback
    traceback.print_stack()

    For CLI usage, the general command is:

    kiwi (train|pretrain|predict|evaluate|search) CONFIG_FILE

    导入bert模型:

    from transformers import (
        BERT_PRETRAINED_MODEL_ARCHIVE_LIST,
        DISTILBERT_PRETRAINED_MODEL_ARCHIVE_LIST,
        AutoTokenizer,
        BertConfig,
        BertModel,
    )
    

    qe system 的组成:

    Furthermore, all of our QE systems share a similar architecture. They are composed of
    
    - Encoder
    - Decoder
    - Output
    - (Optionally) TLM Output
    
    Encoder: Embedding and creating features to be used for downstream tasks. i.e. Predictor,
    BERT, etc
    
    Decoder: Responsible for learning feature transformations better suited for the
    downstream task. i.e. MLP, LSTM, etc
    
    Output: Simple feedforwards that take decoder features and transform them into the
    prediction required by the downstream task. Something in the same line as the common
    "classification heads" being used with transformers.
    
    TLM Output: A simple output layer that trains for the specific TLM objective. It can be
    useful to continue finetuning the predictor during training of the complete QE system.
    

    qe system 继承关系:

    All QE systems inherit from :class:`kiwi.systems.qe_system.QESystem`.
    
    Use ``kiwi train`` to train these systems.
    
    Currently available are:
    
    +--------------------------------------------------------------+
    | :class:`kiwi.systems.nuqe.NuQE`                              |
    +--------------------------------------------------------------+
    | :class:`kiwi.systems.predictor_estimator.PredictorEstimator` |
    +--------------------------------------------------------------+
    | :class:`kiwi.systems.bert.Bert`                              |
    +--------------------------------------------------------------+
    | :class:`kiwi.systems.xlm.XLM`                                |
    +--------------------------------------------------------------+
    | :class:`kiwi.systems.xlmroberta.XLMRoberta`                  |
    +--------------------------------------------------------------+
    

    TLM 继承关系:

    TLM --- :mod:`kiwi.systems.tlm_system`
    --------------------------------------
    
    All TLM systems inherit from :class:`kiwi.systems.tlm_system.TLMSystem`.
    
    Use ``kiwi pretrain`` to train these systems. These systems can then be used as the
    encoder part in QE systems by using the `load_encoder` flag.
    
    Currently available are:
    
    +-------------------------------------------+
    | :class:`kiwi.systems.predictor.Predictor` |
    

    Configuration class

    kiwi.lib.train.Configuration
    kiwi.lib.train.RunConfig
    kiwi.lib.train.TrainerConfig
    kiwi.data.datasets.wmt_qe_dataset.WMTQEDataset.Config
    kiwi.systems.qe_system.QESystem.Config
    

    encoder 采用 bert 模型,模型整体结构为:

    Bert(
      (encoder): BertEncoder(
        (bert): BertModel(
          (embeddings): BertEmbeddings(
            (word_embeddings): Embedding(119547, 768, padding_idx=0)
            (position_embeddings): Embedding(512, 768)
            (token_type_embeddings): Embedding(2, 768)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (encoder): BertEncoder(
            (layer): ModuleList(
              (0): BertLayer(
                (attention): BertAttention(
                  (self): BertSelfAttention(
                    (query): Linear(in_features=768, out_features=768, bias=True)
                    (key): Linear(in_features=768, out_features=768, bias=True)
                    (value): Linear(in_features=768, out_features=768, bias=True)
                    (dropout): Dropout(p=0.1, inplace=False)
                  )
                  (output): BertSelfOutput(
                    (dense): Linear(in_features=768, out_features=768, bias=True)
                    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                    (dropout): Dropout(p=0.1, inplace=False)
                  )
                )
                (intermediate): BertIntermediate(
                  (dense): Linear(in_features=768, out_features=3072, bias=True)
                )
                (output): BertOutput(
                  (dense): Linear(in_features=3072, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
         ................
              (11): BertLayer(
                (attention): BertAttention(
                  (self): BertSelfAttention(
                    (query): Linear(in_features=768, out_features=768, bias=True)
                    (key): Linear(in_features=768, out_features=768, bias=True)
                    (value): Linear(in_features=768, out_features=768, bias=True)
                    (dropout): Dropout(p=0.1, inplace=False)
                  )
                  (output): BertSelfOutput(
                    (dense): Linear(in_features=768, out_features=768, bias=True)
                    (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                    (dropout): Dropout(p=0.1, inplace=False)
                  )
                )
                (intermediate): BertIntermediate(
                  (dense): Linear(in_features=768, out_features=3072, bias=True)
                )
                (output): BertOutput(
                  (dense): Linear(in_features=3072, out_features=768, bias=True)
                  (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                  (dropout): Dropout(p=0.1, inplace=False)
                )
              )
            )
          )
          (pooler): BertPooler(
            (dense): Linear(in_features=768, out_features=768, bias=True)
            (activation): Tanh()
          )
        )
        (scalar_mix): ScalarMixWithDropout(
          (scalar_parameters): ParameterList(
              (0): Parameter containing: [torch.FloatTensor of size 1]
              (1): Parameter containing: [torch.FloatTensor of size 1]
              (2): Parameter containing: [torch.FloatTensor of size 1]
              (3): Parameter containing: [torch.FloatTensor of size 1]
              (4): Parameter containing: [torch.FloatTensor of size 1]
              (5): Parameter containing: [torch.FloatTensor of size 1]
              (6): Parameter containing: [torch.FloatTensor of size 1]
              (7): Parameter containing: [torch.FloatTensor of size 1]
              (8): Parameter containing: [torch.FloatTensor of size 1]
              (9): Parameter containing: [torch.FloatTensor of size 1]
              (10): Parameter containing: [torch.FloatTensor of size 1]
              (11): Parameter containing: [torch.FloatTensor of size 1]
              (12): Parameter containing: [torch.FloatTensor of size 1]
          )
        )
        (output_embeddings): Embedding(119547, 768, padding_idx=0)
      )
      (decoder): LinearDecoder(
        (linear_outs): ModuleDict(
          (target): Sequential(
            (0): Linear(in_features=768, out_features=768, bias=True)
            (1): Tanh()
          )
          (source): Sequential(
            (0): Linear(in_features=768, out_features=768, bias=True)
            (1): Tanh()
          )
          (target_sentence): Sequential(
            (0): Linear(in_features=768, out_features=768, bias=True)
            (1): Tanh()
            (2): Dropout(p=0.1, inplace=False)
            (3): Linear(in_features=768, out_features=768, bias=True)
            (4): Tanh()
            (5): Dropout(p=0.1, inplace=False)
          )
        )
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (outputs): QEOutputs(
        (word_outputs): ModuleDict(
          (target_tags): WordLevelOutput(
            (linear): Linear(in_features=768, out_features=2, bias=True)
            (loss_fn): CrossEntropyLoss()
          )
          (gap_tags): GapTagsOutput(
            (linear): Linear(in_features=1536, out_features=2, bias=True)
            (loss_fn): CrossEntropyLoss()
          )
          (source_tags): WordLevelOutput(
            (linear): Linear(in_features=768, out_features=2, bias=True)
            (loss_fn): CrossEntropyLoss()
          )
        )
        (sentence_outputs): ModuleDict(
          (sentence_scores): SentenceScoreRegression(
            (sentence_pred): Sequential(
              (linear_0): Linear(in_features=768, out_features=384, bias=True)
              (activation_0): Tanh()
              (dropout_0): Dropout(p=0.0, inplace=False)
              (linear_1): Linear(in_features=384, out_features=1, bias=True)
            )
            (loss_fn): MSELoss()
          )
        )
      )
      (tlm_outputs): TLMOutputs(
        (masked_word_outputs): ModuleDict()
      )
    )
    

    cli.py:print(arguments):

    {'--example': False,
     '--help': False,
     '--quiet': False,
     '--verbose': False,
     '--version': False,
     'CONFIG_FILE': 'config/bert.yaml',
     'OVERWRITES': [],
     'evaluate': False,
     'predict': False,
     'pretrain': False,
     'search': False,
     'train': True}
    

    cli.py:config_dict:

    {
    	'run': {
    		'experiment_name': 'BERT WMT20 EN-ZH',
    		'seed': 42,
    		'use_mlflow': False
    	},
    	'trainer': {
    		'deterministic': True,
    		'gpus': 1,
    		'epochs': 10,
    		'main_metric': ['WMT19_MCC', 'PEARSON'],
    		'gradient_max_norm': 1.0,
    		'gradient_accumulation_steps': 1,
    		'amp_level': 'O2',
    		'precision': 16,
    		'log_interval': 100,
    		'checkpoint': {
    			'validation_steps': 0.2,
    			'early_stop_patience': 10
    		}
    	},
    	'system': {
    		'class_name': 'Bert',
    		'batch_size': 2,
    		'num_data_workers': 1,
    		'model': {
    			'encoder': {
    				'model_name': 'bert-base-multilingual-cased',
    				'use_mlp': False,
    				'freeze': False
    			},
    			'decoder': {
    				'hidden_size': 768,
    				'bottleneck_size': 768,
    				'dropout': 0.1
    			},
    			'outputs': {
    				'word_level': {
    					'target': True,
    					'gaps': True,
    					'source': True,
    					'class_weights': {
    						'target_tags': {
    							'BAD': 3.0
    						},
    						'gap_tags': {
    							'BAD': 5.0
    						},
    						'source_tags': {
    							'BAD': 3.0
    						}
    					}
    				},
    				'sentence_level': {
    					'hter': True,
    					'use_distribution': False,
    					'binary': False
    				},
    				'n_layers_output': 2,
    				'sentence_loss_weight': 1
    			},
    			'tlm_outputs': {
    				'fine_tune': False
    			}
    		},
    		'optimizer': {
    			'class_name': 'adamw',
    			'learning_rate': 1e-05,
    			'warmup_steps': 0.1,
    			'training_steps': 12000
    		},
    		'data_processing': {
    			'share_input_fields_encoders': True
    		}
    	},
    	'data': {
    		'train': {
    			'input': {
    				'source': 'data/WMT20/en-zh/train/train.src',
    				'target': 'data/WMT20/en-zh/train/train.mt',
    				'alignments': 'data/WMT20/en-zh/train/train.src-mt.alignments',
    				'post_edit': 'data/WMT20/en-zh/train/train.pe'
    			},
    			'output': {
    				'source_tags': 'data/WMT20/en-zh/train/train.source_tags',
    				'target_tags': 'data/WMT20/en-zh/train/train.tags',
    				'sentence_scores': 'data/WMT20/en-zh/train/train.hter'
    			}
    		},
    		'valid': {
    			'input': {
    				'source': 'data/WMT20/en-zh/dev/dev.src',
    				'target': 'data/WMT20/en-zh/dev/dev.mt',
    				'alignments': 'data/WMT20/en-zh/dev/dev.src-mt.alignments',
    				'post_edit': 'data/WMT20/en-zh/dev/dev.pe'
    			},
    			'output': {
    				'source_tags': 'data/WMT20/en-zh/dev/dev.source_tags',
    				'target_tags': 'data/WMT20/en-zh/dev/dev.tags',
    				'sentence_scores': 'data/WMT20/en-zh/dev/dev.hter'
    			}
    		},
    		'test': {
    			'input': {
    				'source': 'data/WMT20/en-zh/test-blind/test.src',
    				'target': 'data/WMT20/en-zh/test-blind/test.mt',
    				'alignments': 'data/WMT20/en-zh/test-blind/test.src-mt.alignments'
    			}
    		}
    	},
    	'verbose': False,
    	'quiet': False
    }
    

    QESystem:

        def forward(self, batch_inputs):
            encoder_features = self.encoder(batch_inputs)
            features = self.decoder(encoder_features, batch_inputs)
            outputs = self.outputs(features, batch_inputs)
    
            # For fine-tuning the encoder
            if self.tlm_outputs:
                outputs.update(self.tlm_outputs(encoder_features, batch_inputs))
    
            return outputs
    
    • pytorch_lightning 执行流程:

    PL的流程很简单,生产流水线,有一个固定的顺序。

    这部分代码只执行一次。

    1. `__init__()`(初始化 LightningModule )
    2. `prepare_data()` (准备数据,包括下载数据、预处理等等)
    3. `configure_optimizers()` (配置优化器)
    

    测试 “验证代码”,提前来做的意义在于:不需要等待漫长的训练过程才发现验证代码有错。

    1. `val_dataloader()`
    2. `validation_step()`
    3. `validation_epoch_end()` 

    batch 数据:

    {
    	'source': BatchedSentence(
    	    tensor = tensor([
    		    [
    		        15846, 10491, 82978, 10226, 75312, 10571, 10105, 106095, 11942,
    			    38587, 10108, 10955, 11586, 119, 102, 0, 0, 0,
    			    0, 0, 0, 0, 0, 0, 0, 0, 0,
    			    0, 0, 0, 0, 0, 0, 0, 0, 0,
    			    0, 0
    			],
    		    [
    		        10135, 10386, 11288, 10207, 117, 17668, 11945, 39091, 10226,
    			    10751, 14310, 107, 17316, 10230, 10108, 11589, 107, 117,
    			    17846, 15736, 10188, 13209, 14951, 13240, 117, 11406, 14320,
    			    10270, 10108, 10226, 84977, 11305, 68999, 10731, 11202, 31419,
    			    119, 102
    		    ]
    		], device = 'cuda:0'), 
    		lengths = tensor([15, 38], device = 'cuda:0'), 
    		bounds = tensor([
    		    [0, 1, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
    		    [0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28, 29, 30, 34, 36, 37]
    	    ], device = 'cuda:0'), 
    	    bounds_lengths = tensor([13, 31], device = 'cuda:0'), 
    	    strict_masks = tensor([
    			[
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, False, False, False, False, False, False,
    				False, False, False, False, False, False, False, False, False, False,
    				False, False, False, False, False, False, False, False
    			],
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, False
    			]
    		], device = 'cuda:0'), 
    		number_of_tokens = tensor([12, 30], device = 'cuda:0', dtype = torch.int32)
    	),
    
    	'target': BatchedSentence(
    	    tensor = tensor([
    			[   101, 4877, 113183, 113227, 118188, 3031, 2128, 114696, 217,
    				2674, 115512, 118188, 5718, 7724, 111978, 6348, 114696, 2079,
    				5740, 115551, 2196, 5718, 2773, 111915, 2206, 119, 102,
    				0, 0, 0, 0, 0, 0, 0, 0, 0,
    				0, 0, 0, 0, 0, 0, 0, 0, 0,
    				0, 0, 0, 0, 0, 0, 0
    			],
    			[   101, 10207, 3642, 10186, 4460, 10386, 4348, 10064, 4536,
    				4580, 114286, 7735, 117459, 2196, 5718, 84977, 11305, 68999,
    				10731, 4237, 113162, 6063, 10270, 8272, 10064, 4163, 112293,
    				2146, 2196, 5718, 4333, 5718, 107, 3976, 5718, 4614,
    				113664, 107, 10064, 2460, 111870, 4461, 3228, 118573, 113826,
    				217, 4608, 5718, 4784, 112939, 1882, 102
    			]
    		], device = 'cuda:0'), 
    		lengths = tensor([27, 52], device = 'cuda:0'), 
    		bounds = tensor([
    		    [0, 1, 5, 6, 8, 9, 12, 13, 15, 17, 18, 20, 21, 22, 24, 25, 26, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 15, 19, 21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32, 33, 34, 35, 37, 38, 39, 41, 42, 45, 46,47, 48, 50, 51]
    		], device = 'cuda:0'), 
    		bounds_lengths = tensor([17, 40], device = 'cuda:0'), 
    		strict_masks = tensor([
    		    [
    		        False, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, False, False, False, False,
    			    False, False, False, False, False, False, False, False, False, False,
    			    False, False, False, False, False, False, False, False, False, False,
    			    False, False
    			],
    		    [
    		        False, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, True, True, True, True, True, True, True, True, True,
    			    True, False
    			]
    		], device = 'cuda:0'), 
    		number_of_tokens = tensor([15, 38], device = 'cuda:0', dtype = torch.int32)
    	),
    
    	'alignments': tensor([
    		    [
    			    [1, 0, 0, ..., 0, 0, 0],
    			    [0, 1, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 0, 0],
    			    ..., [0, 0, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 0, 0]
    		    ],
    		    [
    			    [1, 0, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 0, 0],
    			    ..., [0, 0, 0, ..., 0, 0, 0],
    			    [0, 0, 0, ..., 0, 1, 0],
    			    [0, 0, 0, ..., 0, 0, 0]
    		    ]
    	    ], device = 'cuda:0', dtype = torch.int32),
    
    	'pe': BatchedSentence(
    	    tensor = tensor([
    		    [
    		        101, 4877, 113183, 113227, 118188, 10060, 15846, 10061, 3031,
    		        2128, 114696, 217, 2674, 115512, 118188, 10060, 10955, 11586,
    			    10061, 3740, 117490, 2775, 5718, 2212, 114236, 2468, 7475,
    			    117244, 5740, 115551, 2196, 5718, 2773, 112165, 1882, 102,
    			    0, 0, 0, 0, 0, 0, 0, 0, 0,
    			    0, 0, 0, 0, 0
    			],
    		    [ 
    		        101, 10207, 3642, 10186, 4460, 10386, 4348, 10064, 2234,
    			    114346, 4520, 7735, 117459, 2196, 5718, 84977, 11305, 68999,
    			    10731, 4237, 113162, 6063, 10270, 8272, 8505, 114003, 2196,
    			    5718, 4333, 4447, 115521, 100, 17316, 10230, 10108, 11589,
    			    100, 10064, 4792, 114213, 5611, 13209, 14951, 13240, 2726,
    			    111847, 5162, 112652, 1882, 102
    			]
    		], device = 'cuda:0'), 
    		lengths = tensor([36, 50], device = 'cuda:0'), 
    		bounds = tensor([
    		    [0, 1, 5, 6, 7, 8, 9, 11, 12, 15, 16, 17, 18, 19, 21, 22, 23, 25, 26, 28, 30, 31, 32, 34, 35, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
    		    ],
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 13, 14, 15, 19, 21, 22, 23, 24, 26, 27, 28, 29, 31, 32, 34, 35, 36, 37, 38, 40, 41, 43, 44, 46, 48, 49
    		    ]
    		], device = 'cuda:0'), bounds_lengths = tensor([25, 37], device = 'cuda:0'), 
    		strict_masks = tensor([
    			[   False, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, False, False, False, False, False,
    				False, False, False, False, False, False, False, False, False, False
    			],
    			[   False, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, False
    			]
    		],
    		device = 'cuda:0'), 
    		number_of_tokens = tensor([23, 35], device = 'cuda:0', dtype = torch.int32)
    	),
    
    	'source_tags': BatchedSentence(
    	    tensor = tensor([
    		    [0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    		    [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1]
    		], device = 'cuda:0'), 
    		lengths = tensor([12, 30], device = 'cuda:0'), 
    		bounds = tensor([
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
    		], device = 'cuda:0'), 
    		bounds_lengths = tensor([12, 30], device = 'cuda:0'), 
    		strict_masks = tensor([
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, False, False, False, False, False, False, False, False,
    				False, False, False, False, False, False, False, False, False, False
    			],
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True
    			]
    		], device = 'cuda:0'), 
    		number_of_tokens = tensor([12, 30], device = 'cuda:0', dtype = torch.int32)
    	),
    
    	'target_tags': BatchedSentence(
    	    tensor = tensor([
    		    [1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    		    [1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    		], device = 'cuda:0'), 
    		lengths = tensor([15, 38], device = 'cuda:0'), 
    		bounds = tensor([
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]
    		], device = 'cuda:0'), 
    		bounds_lengths = tensor([15, 38], device = 'cuda:0'), 
    		strict_masks = tensor([
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, False, False, False, False, False,
    				False, False, False, False, False, False, False, False, False, False,
    				False, False, False, False, False, False, False, False
    			],
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True
    			]
    		], device = 'cuda:0'), 
    		number_of_tokens = tensor([15, 38], device = 'cuda:0', dtype = torch.int32)
    	),
    
    	'sentence_scores': tensor([0.6522, 0.5143], device = 'cuda:0'),
    
    	'gap_tags': BatchedSentence(
    	    tensor = tensor([
    		    [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    		    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
    	    ], device = 'cuda:0'), 
    	    lengths = tensor([16, 39], device = 'cuda:0'), 
    	    bounds = tensor([
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1],
    		    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38]
    		], device = 'cuda:0'), 
    		bounds_lengths = tensor([16, 39], device = 'cuda:0'), 
    		strict_masks = tensor([
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, False, False, False, False,
    				False, False, False, False, False, False, False, False, False, False,
    				False, False, False, False, False, False, False, False, False
    			],
    			[   True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True, True,
    				True, True, True, True, True, True, True, True, True
    			]
    		], device = 'cuda:0'), 
    		number_of_tokens = tensor([16, 39], device = 'cuda:0', dtype = torch.int32)),
    	
    	'binary': tensor([1, 1], device = 'cuda:0')
    }

    result example:

    2021-08-27T08:15:17Z INFO      kiwi.training.callbacks:117: Best validation so far was in epoch 0:
    	val_loss: 85.9468, 
    	val_WMT19_MCC: 0.5628, 
    	val_loss_target_tags: 30.9671,
    	val_loss_gap_tags: 22.8837, 
    
    	val_loss_source_tags: 32.0960, 
    	val_WMT19_F1_MULT: 0.5807, 
    	val_F1_BAD: 0.7120, 
    	val_F1_OK: 0.3879, 
    	val_WMT19_CORRECT: 0.7842,
    
    	val_target_tags_F1_MULT: 0.3510, 
    	val_target_tags_MCC: 0.3523,
    	val_target_tags_CORRECT: 0.6544, 
    
    	val_gap_tags_F1_MULT: 0.2005,
    	val_gap_tags_MCC: 0.1625, 
    	val_gap_tags_CORRECT: 0.9068,
    
    	val_source_tags_F1_MULT: 0.2762, 
    	val_source_tags_MCC: 0.2858,
    	val_source_tags_CORRECT: 0.6083 
    

    未完待续。。。。。。

  • 相关阅读:
    Functors in OpenCV
    绘图及注释
    矩阵操作
    图像与大数组类型
    OpenCV的数据类型
    OpenCV入门
    去掉微信公众号里面的菜单栏
    解决python语言在cmd下中文乱码的问题
    解决python无法安装mysql数据库问题
    微信分享功能出现签名错误功能导致的原因
  • 原文地址:https://www.cnblogs.com/yanghh/p/15182933.html
Copyright © 2011-2022 走看看