Creating Memory Mapped NLP-based Data

This notebook uses a fast dataset provider, an abstraction created by HazyResearch that interfaces with Hugging Face’s datasets. Its key advantage is that it uses either shared memory or memory maps in Python to accelerate the caching process. The encoded dataset is cached as one contiguous NumPy array, so it can be sliced into samples of any sequence length without re-encoding the data for each length.
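
To make the idea concrete, the following is a minimal sketch, not the provider's actual implementation, of how a flat token array stored on disk can be memory-mapped and sliced with different sequence lengths. The file name `tokens.npy`, the helper `get_window`, and the toy data are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


def get_window(flat_tokens, seq_len, index):
    """Copy the index-th non-overlapping window of `seq_len` tokens into memory."""
    start = index * seq_len
    return np.array(flat_tokens[start : start + seq_len])


# Toy stand-in for an encoded corpus: one flat, contiguous array of token ids.
tokens = np.arange(10_000, dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tokens.npy")  # hypothetical file name
    np.save(path, tokens)

    # mmap_mode="r" maps the file instead of reading it fully into memory.
    mapped = np.load(path, mmap_mode="r")

    # The same cached array serves any sequence length; nothing is re-encoded.
    window_128 = get_window(mapped, seq_len=128, index=2)
    window_512 = get_window(mapped, seq_len=512, index=2)

    # Release the mapping before the temporary directory is removed.
    del mapped

print(window_128.shape, window_512.shape)
```

Only the requested windows are copied into memory here; the rest of the array stays on disk until it is accessed.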

Instantiating the Provider

The first step is to call FastHfDatasetProvider.from_hub(), a class method that loads and encodes the dataset. It accepts the following arguments:

  • dataset_name: Name of the dataset.

  • dataset_config_name: Name of the dataset configuration.

  • data_dir: Path to the data directory.

  • tokenizer: Instance of tokenizer to use.

  • tokenizer_name: Name of the tokenizer, if tokenizer has not been passed.

  • mapping_column_name: Names of the dataset columns that should be tokenized.

  • validation_split: Fraction of the dataset to use for validation.

  • seed: Random seed.

  • num_workers: Number of workers to use for encoding.

  • use_eos_token: Whether to use EOS token to separate sequences.

  • use_shared_memory: Whether to use shared memory for caching.

  • cache_dir: Path to the cache directory.
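
As an illustration of what use_eos_token implies, here is a small sketch, assuming that each encoded sequence is followed by an EOS token before everything is concatenated into one flat array. The helper `concat_with_eos` and the toy sequences are hypothetical, and the provider's exact placement of the EOS token may differ.

```python
import numpy as np

# 50256 is the id of GPT-2's <|endoftext|> token.
EOS_TOKEN_ID = 50256


def concat_with_eos(encoded_sequences, eos_token_id=EOS_TOKEN_ID):
    """Join token-id lists into one flat array, appending EOS after each list."""
    pieces = []
    for ids in encoded_sequences:
        pieces.extend(ids)
        pieces.append(eos_token_id)
    return np.asarray(pieces, dtype=np.int32)


# Toy token ids standing in for three tokenized sentences.
toy_sequences = [[11, 12, 13], [21, 22], [31]]
flat = concat_with_eos(toy_sequences)
print(flat)
```

With the separator in place, a fixed-length window can span several sentences while the boundaries between them remain recoverable.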

[1]:
from archai.datasets.nlp.fast_hf_dataset_provider import FastHfDatasetProvider

# The provider will automatically download the dataset and tokenizer, encode
# the dataset and cache it for future use
dataset_provider = FastHfDatasetProvider.from_hub(
    "glue",
    dataset_config_name="sst2",
    tokenizer_name="gpt2",
    mapping_column_name=["sentence"],
    use_shared_memory=False,
    cache_dir="cache/glue-sst2-gpt2"
)

# (inputs, labels) can be retrieved with any sequence length
train_dataset = dataset_provider.get_train_dataset(seq_len=512)
val_dataset = dataset_provider.get_val_dataset(seq_len=512)
print(train_dataset[0], val_dataset[0])
2023-03-21 15:07:57,990 - archai.datasets.nlp.fast_hf_dataset_provider — WARNING —  Shared memory is not available in Python < 3.8.
2023-03-21 15:08:00,865 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Downloading dataset ...
Found cached dataset glue (C:/Users/gderosa/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
2023-03-21 15:08:04,789 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Encoding dataset ...
2023-03-21 15:08:04,793 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Number of workers: 1 | EOS token: True
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-02d7057c177051d2.arrow
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-bfdab1158dc66b61.arrow
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-24fdf102efed86dd.arrow
2023-03-21 15:08:05,096 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Processing dataset to memory ...
2023-03-21 15:08:05,099 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Number of workers: 1 | Shared memory: False
2023-03-21 15:08:06,767 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Saving dataset to: cache\glue-sst2-gpt2
(tensor([24717,   649,  3200,   507,   422,   262, 21694,  4991,   220, 50256,
         3642,  1299,   645, 20868,   837,   691,  2248,  1850,   308,  3775,
          220, 50256,  5562, 10408,   663,  3435,   290, 48556,  1223,  2138,
         4950,   546,  1692,  3450,   220, 50256,  2787,  1299, 15950, 11378,
          284,  3520,   262,   976,  3690,   220, 50256,   261,   262,  5290,
        15827,    12,  1659,    12,  1169,    12,  1008,  9310, 35478, 20954,
          262, 28303,   714, 47478,   469,   510,   220, 50256,  5562,   705,
           82,  1290,  1165, 15444,   284, 17004,   884, 31194,  3513,   220,
        50256, 26567,  2536,   689,   326,   262,  3437,   286,   884,   289,
        31777,  2512, 30181,   355, 29408,  1830,   460,   991,  1210,   503,
          257,  1402,   837,  2614,  2646,   351,   281,  7016,  3355,   404,
          764,   220, 50256,  1659,   473,    84,   948,   220, 50256,    64,
        19095, 17280,    12,  1941,    12,   727,   705,    82, 26781, 19518,
          220, 50256,   533,   517,  7744,  1807,   832,   621,   287,   749,
         4600,   826,    12, 28973,   705,  7328,   220, 50256,  2188,   274,
          284, 12986, 20428,   220, 50256,  1640,   883,  3807, 31006,   508,
        13121,   326,  4600,   484,   466,   299,   470,   787,  6918,   588,
          484,   973,   284,  7471,   220, 50256,  1169,   636,   810,  2147,
          705,    82,  5836,   837,   220, 50256, 43439,   703,  2089,   428,
         3807,   373,   220, 50256,    75,   437,   617, 16247,   284,   257,
        13526,  1621,   220, 50256,  1169,  6000, 17245,   220, 50256, 36673,
         3807,   220, 50256,  4480,   465,  6678,  4430,   290, 11800,   774,
          220, 50256,   445,   917,   415,  3721,   220, 50256,  2032, 27428,
          318,  2029,   477,   546,   257,  1862,  2415,   705,    82,  1986,
          837,   290,   416, 13092,   281, 14549,  3025,  1986,  4493,   326,
         2415,   705,    82, 17188,   290,   614, 23400,   837,   340, 31137,
          764,   220, 50256,  4853,   874,   262,  2656,   290,   287,   617,
         2842,   772,   731,  1010,   340,   220, 50256,   361,  1997,   837,
          766,   340,   329,   479,  5757,  2042,   837,   508, 11665,   510,
          257,  6388,   355,   257, 26821, 14314, 10086, 46410,  3706, 11841,
        19317,   764,   220, 50256,    64,  8212,   319,   534,  1986,   220,
        50256,  8988,   422,   262, 14802,   837, 26329, 44139, 13289,   220,
        50256,  1069, 32838, 46136,   306,  3684, 16948,   290,  6028, 17049,
          555,   398,  5109,   220, 50256,   268, 30486,   416,   281, 25007,
         9404,  7668,  3350,   286, 47792, 13747,   220, 50256,  4758,  2063,
          286, 10441, 12254,   318,  4785,  1058,   262,   636,   810,  2147,
          705,    82,  5836,   837,   393,   262,   636,   810,  1223,   705,
           82,  5836,   220, 50256,   259,   995, 22041,   220, 50256,   548,
          922, 11681,  5559,   220, 50256,  1169,  7110,   318,  2147,   475,
        36741,  6816, 35478, 20954,   422,   923,   284,  5461,   837,   220,
        50256,  1169,  2223,   318,   336,   346,  1513,   220, 50256,   261,
          477, 43386,   220, 50256, 10594,  1064,  1310,   286,  1393,   287,
          428,  2646,   837,   543,   318,  1690, 32950,    88,   290, 13455,
        13134,   220, 50256,  1525,  1290,   262,  5290,  3807,   286,   262,
          614,   220, 50256, 48937,   832,   837,   220, 50256,  3549,   621,
         1194,  7559,  1266,   582, 10148, 17271,   416, 44889,   257,  7505,
         3690,   428,  8258,  2646,   220, 50256,   270,   705,    82,   546,
         2428,   749,  6490,   423,   284,  1986,   287,  4845,   290,  1312,
          892,   326,   705,    82,   644,  1312,  8288,   546,   340,  1377,
          262,  1103,  2428, 29779,  1022,   262, 14397,   290, 14897, 22992,
          220, 50256, 11718,   274,   220, 50256,   672, 35260,   284,   262,
         6224,   286]), tensor([  649,  3200,   507,   422,   262, 21694,  4991,   220, 50256,  3642,
         1299,   645, 20868,   837,   691,  2248,  1850,   308,  3775,   220,
        50256,  5562, 10408,   663,  3435,   290, 48556,  1223,  2138,  4950,
          546,  1692,  3450,   220, 50256,  2787,  1299, 15950, 11378,   284,
         3520,   262,   976,  3690,   220, 50256,   261,   262,  5290, 15827,
           12,  1659,    12,  1169,    12,  1008,  9310, 35478, 20954,   262,
        28303,   714, 47478,   469,   510,   220, 50256,  5562,   705,    82,
         1290,  1165, 15444,   284, 17004,   884, 31194,  3513,   220, 50256,
        26567,  2536,   689,   326,   262,  3437,   286,   884,   289, 31777,
         2512, 30181,   355, 29408,  1830,   460,   991,  1210,   503,   257,
         1402,   837,  2614,  2646,   351,   281,  7016,  3355,   404,   764,
          220, 50256,  1659,   473,    84,   948,   220, 50256,    64, 19095,
        17280,    12,  1941,    12,   727,   705,    82, 26781, 19518,   220,
        50256,   533,   517,  7744,  1807,   832,   621,   287,   749,  4600,
          826,    12, 28973,   705,  7328,   220, 50256,  2188,   274,   284,
        12986, 20428,   220, 50256,  1640,   883,  3807, 31006,   508, 13121,
          326,  4600,   484,   466,   299,   470,   787,  6918,   588,   484,
          973,   284,  7471,   220, 50256,  1169,   636,   810,  2147,   705,
           82,  5836,   837,   220, 50256, 43439,   703,  2089,   428,  3807,
          373,   220, 50256,    75,   437,   617, 16247,   284,   257, 13526,
         1621,   220, 50256,  1169,  6000, 17245,   220, 50256, 36673,  3807,
          220, 50256,  4480,   465,  6678,  4430,   290, 11800,   774,   220,
        50256,   445,   917,   415,  3721,   220, 50256,  2032, 27428,   318,
         2029,   477,   546,   257,  1862,  2415,   705,    82,  1986,   837,
          290,   416, 13092,   281, 14549,  3025,  1986,  4493,   326,  2415,
          705,    82, 17188,   290,   614, 23400,   837,   340, 31137,   764,
          220, 50256,  4853,   874,   262,  2656,   290,   287,   617,  2842,
          772,   731,  1010,   340,   220, 50256,   361,  1997,   837,   766,
          340,   329,   479,  5757,  2042,   837,   508, 11665,   510,   257,
         6388,   355,   257, 26821, 14314, 10086, 46410,  3706, 11841, 19317,
          764,   220, 50256,    64,  8212,   319,   534,  1986,   220, 50256,
         8988,   422,   262, 14802,   837, 26329, 44139, 13289,   220, 50256,
         1069, 32838, 46136,   306,  3684, 16948,   290,  6028, 17049,   555,
          398,  5109,   220, 50256,   268, 30486,   416,   281, 25007,  9404,
         7668,  3350,   286, 47792, 13747,   220, 50256,  4758,  2063,   286,
        10441, 12254,   318,  4785,  1058,   262,   636,   810,  2147,   705,
           82,  5836,   837,   393,   262,   636,   810,  1223,   705,    82,
         5836,   220, 50256,   259,   995, 22041,   220, 50256,   548,   922,
        11681,  5559,   220, 50256,  1169,  7110,   318,  2147,   475, 36741,
         6816, 35478, 20954,   422,   923,   284,  5461,   837,   220, 50256,
         1169,  2223,   318,   336,   346,  1513,   220, 50256,   261,   477,
        43386,   220, 50256, 10594,  1064,  1310,   286,  1393,   287,   428,
         2646,   837,   543,   318,  1690, 32950,    88,   290, 13455, 13134,
          220, 50256,  1525,  1290,   262,  5290,  3807,   286,   262,   614,
          220, 50256, 48937,   832,   837,   220, 50256,  3549,   621,  1194,
         7559,  1266,   582, 10148, 17271,   416, 44889,   257,  7505,  3690,
          428,  8258,  2646,   220, 50256,   270,   705,    82,   546,  2428,
          749,  6490,   423,   284,  1986,   287,  4845,   290,  1312,   892,
          326,   705,    82,   644,  1312,  8288,   546,   340,  1377,   262,
         1103,  2428, 29779,  1022,   262, 14397,   290, 14897, 22992,   220,
        50256, 11718,   274,   220, 50256,   672, 35260,   284,   262,  6224,
          286,   428])) (tensor([  270,   705,    82,   257, 23332,   290,  1690, 13891,  7002,   764,
          220, 50256,   403,  2704,  8589,  4420, 30942,   290, 12111,   220,
        50256, 47205,   514,   284,  2911,   326,   299, 16617,   318, 24357,
          284, 21030,   257,  1688,  3451,   355,   257,  5068,  1865, 47602,
        26479,   764,   220, 50256,  1169,  7205,   837, 24138,   837,  2647,
          837, 13483, 45501,   290,  2128,   389,   477, 34328,  1813,   262,
         3227,   705,    82, 38132,   567,  1957,   274,   764,   220, 50256,
          270,   705,    82,  3105,  1377,   845,   837,   845,  3105,   764,
          220, 50256, 16670, 49699,   351, 14733,   290,   257,  1178, 39980,
         4135, 18105,   837,   262,  2646,   318,   257, 23056,   306,  2726,
          804,   379,  1862,  1466,   764,   220, 50256,    64,  3360, 32460,
         2646,   764,   220, 50256,   273,  1804,   938,   614,   705,    82,
         5704,   351,   534,   409,    12, 22095,   764,   220, 50256,  5832,
          466,   299,   470,   423,   284,   760,   546,  2647,   284,  9144,
          262,  2646,   705,    82,  2562,  5146, 13516,   286, 10997,   290,
        19661,   764,   220, 50256,   259,  3446,  9919,  2431,   837,   749,
          286,   543,  3804,   355,  6364,   355,   611,  1312,   705,    67,
          587,  5586, 12105,   319,   281, 45329, 29680,   837, 10451,  6885,
        30895,   422, 37276,   284, 13665,  2584,   284, 10517, 26876,   764,
          220, 50256,  1169, 46814,  2890, 13289,   286,   262,  5983,  1394,
          262,  2646, 22804,   290,  1394,   262,  5386, 40112,  1513,   764,
          220, 50256,   270,  2753,   257,  6283,  1611,   286, 37296,  1272,
          284,  7030,   262, 18054,   286,   686,  4835,   329,  1706,   837,
          281,   710,   285,   451,    64,   837,   304,  1018,  1734, 35783,
          837,   290,   842,  1292,    67, 11555, 30686,  1559,   477,   287,
          262,   976,  3807,   764,   220, 50256,   986,   262,  2646, 21046,
          422,   257,  3092,   286, 14733,   357,  1223,  2622,   284,  5236,
          503,   262,  3685,  1267,  2644,   220, 50256,   732,  6808,   329,
          357,   537,  3301,   290,   279,  2518,  1267,   837,   772,   588,
          606,   837,   996,  3737,   340,   705,    82,   281,  9942,  5699,
          284, 26246,   764,   220, 50256, 10197,  9961,  3296,   481,   749,
         1884,   407,  1064,   644,   484,   705,   260,  6095,   351,  5876,
          790,  1110,  2162,   262,  3807, 16523,  1111,  5636,  2171,   290,
        14733,   764,   220, 50256,    64, 18857,   837,  1029,    12, 45564,
          863, 10530,   422,   773,   544,   326, 33954,   271,  3973, 32067,
         2647,   837,  9280,   837,  3496,   837,   290,  1029, 10512,   764,
          220, 50256,  1169, 10825,   389,  8246,   290,   481,  5587,   257,
        16384,   351,  2687,   508,   705,    82,  1683,   550,  1641, 14649,
          764,   220, 50256,  3885,  4364,   256,   265,   280,   468,   257,
        47868,   329, 10868,  9176,   326,  7842,  1958,   607, 23077, 20024,
          837,   290,   287,   428,  4187,   378, 48718, 10997,   837,   673,
          705,    82,   355,  3329,    12,  4743,   652,   409, 18478,   415,
          355,   673,   373,   287,   716,  2634, 14485,   764,   220, 50256,
          986,   262,  3807,   318,   655,   257,  8631,  1468,  9234,   764,
          220, 50256,   259,   663,  1266,  7188,   837, 22960,   257,  2089,
         1029,  1524,  3227,   286, 35537,   837,  1231,  4414,   286,  3496,
          764,   220, 50256,    79,   931,  5116,  2753,   281, 37959,   804,
          379,   262, 29442,   286,  1964, 29409,   837,   475,   340,   857,
          523,   351,   884,   281, 30690,  8216,   326,   345,  1239,   760,
          618, 14733,  5645,   290, 13574,  6140,   764,   220, 50256,  1169,
         4686,  7940,   375, 20374,   329,  1528,   532,   428,   655,  2936,
          588,   340]), tensor([  705,    82,   257, 23332,   290,  1690, 13891,  7002,   764,   220,
        50256,   403,  2704,  8589,  4420, 30942,   290, 12111,   220, 50256,
        47205,   514,   284,  2911,   326,   299, 16617,   318, 24357,   284,
        21030,   257,  1688,  3451,   355,   257,  5068,  1865, 47602, 26479,
          764,   220, 50256,  1169,  7205,   837, 24138,   837,  2647,   837,
        13483, 45501,   290,  2128,   389,   477, 34328,  1813,   262,  3227,
          705,    82, 38132,   567,  1957,   274,   764,   220, 50256,   270,
          705,    82,  3105,  1377,   845,   837,   845,  3105,   764,   220,
        50256, 16670, 49699,   351, 14733,   290,   257,  1178, 39980,  4135,
        18105,   837,   262,  2646,   318,   257, 23056,   306,  2726,   804,
          379,  1862,  1466,   764,   220, 50256,    64,  3360, 32460,  2646,
          764,   220, 50256,   273,  1804,   938,   614,   705,    82,  5704,
          351,   534,   409,    12, 22095,   764,   220, 50256,  5832,   466,
          299,   470,   423,   284,   760,   546,  2647,   284,  9144,   262,
         2646,   705,    82,  2562,  5146, 13516,   286, 10997,   290, 19661,
          764,   220, 50256,   259,  3446,  9919,  2431,   837,   749,   286,
          543,  3804,   355,  6364,   355,   611,  1312,   705,    67,   587,
         5586, 12105,   319,   281, 45329, 29680,   837, 10451,  6885, 30895,
          422, 37276,   284, 13665,  2584,   284, 10517, 26876,   764,   220,
        50256,  1169, 46814,  2890, 13289,   286,   262,  5983,  1394,   262,
         2646, 22804,   290,  1394,   262,  5386, 40112,  1513,   764,   220,
        50256,   270,  2753,   257,  6283,  1611,   286, 37296,  1272,   284,
         7030,   262, 18054,   286,   686,  4835,   329,  1706,   837,   281,
          710,   285,   451,    64,   837,   304,  1018,  1734, 35783,   837,
          290,   842,  1292,    67, 11555, 30686,  1559,   477,   287,   262,
          976,  3807,   764,   220, 50256,   986,   262,  2646, 21046,   422,
          257,  3092,   286, 14733,   357,  1223,  2622,   284,  5236,   503,
          262,  3685,  1267,  2644,   220, 50256,   732,  6808,   329,   357,
          537,  3301,   290,   279,  2518,  1267,   837,   772,   588,   606,
          837,   996,  3737,   340,   705,    82,   281,  9942,  5699,   284,
        26246,   764,   220, 50256, 10197,  9961,  3296,   481,   749,  1884,
          407,  1064,   644,   484,   705,   260,  6095,   351,  5876,   790,
         1110,  2162,   262,  3807, 16523,  1111,  5636,  2171,   290, 14733,
          764,   220, 50256,    64, 18857,   837,  1029,    12, 45564,   863,
        10530,   422,   773,   544,   326, 33954,   271,  3973, 32067,  2647,
          837,  9280,   837,  3496,   837,   290,  1029, 10512,   764,   220,
        50256,  1169, 10825,   389,  8246,   290,   481,  5587,   257, 16384,
          351,  2687,   508,   705,    82,  1683,   550,  1641, 14649,   764,
          220, 50256,  3885,  4364,   256,   265,   280,   468,   257, 47868,
          329, 10868,  9176,   326,  7842,  1958,   607, 23077, 20024,   837,
          290,   287,   428,  4187,   378, 48718, 10997,   837,   673,   705,
           82,   355,  3329,    12,  4743,   652,   409, 18478,   415,   355,
          673,   373,   287,   716,  2634, 14485,   764,   220, 50256,   986,
          262,  3807,   318,   655,   257,  8631,  1468,  9234,   764,   220,
        50256,   259,   663,  1266,  7188,   837, 22960,   257,  2089,  1029,
         1524,  3227,   286, 35537,   837,  1231,  4414,   286,  3496,   764,
          220, 50256,    79,   931,  5116,  2753,   281, 37959,   804,   379,
          262, 29442,   286,  1964, 29409,   837,   475,   340,   857,   523,
          351,   884,   281, 30690,  8216,   326,   345,  1239,   760,   618,
        14733,  5645,   290, 13574,  6140,   764,   220, 50256,  1169,  4686,
         7940,   375, 20374,   329,  1528,   532,   428,   655,  2936,   588,
          340,   750]))
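
In the output above, each labels tensor is its inputs tensor shifted by one token, which is the usual arrangement for next-token prediction. The sketch below reproduces that relationship on the first few token ids printed above. The helper `make_example` and its windowing scheme are assumptions for illustration and may not match how the provider slices its cached array.

```python
import numpy as np


def make_example(flat_tokens, seq_len, index):
    """Return (inputs, labels) where labels are inputs shifted left by one token."""
    start = index * seq_len
    # One extra token is needed so the last input position has a label.
    chunk = flat_tokens[start : start + seq_len + 1]
    return chunk[:-1], chunk[1:]


# First ten token ids of the training sample printed above.
flat_tokens = np.array(
    [24717, 649, 3200, 507, 422, 262, 21694, 4991, 220, 50256], dtype=np.int64
)

inputs, labels = make_example(flat_tokens, seq_len=4, index=0)
print(inputs, labels)
```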

Loading from Cache

After the dataset has been loaded and encoded for the first time, a cache is created with a unique fingerprint (identifier) based on its configuration. The cache consists of the following files:

  • config.json: Dataset provider configuration (used to re-create the object when loaded from cache).

  • tokenizer.pkl: Tokenizer used to encode the data (also re-created when loaded from cache).

  • train.npy: Training tokens (inputs and labels).

  • validation.npy: Validation tokens (inputs and labels).

  • test.npy: Testing tokens (inputs and labels).
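
The notion of a configuration-based fingerprint can be sketched as follows. This is only an illustration of the general technique, hashing a canonical serialization of the configuration; the function `config_fingerprint`, the choice of SHA-1, and the 16-character truncation are assumptions, not the provider's actual scheme.

```python
import hashlib
import json


def config_fingerprint(config):
    """Hash a configuration dict into a short, order-independent identifier."""
    # sort_keys=True makes the serialization independent of key order.
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:16]


config_a = {"dataset_name": "glue", "dataset_config_name": "sst2", "tokenizer_name": "gpt2"}
config_b = {"tokenizer_name": "gpt2", "dataset_config_name": "sst2", "dataset_name": "glue"}
config_c = {**config_a, "tokenizer_name": "other-tokenizer"}  # hypothetical value

fingerprint_a = config_fingerprint(config_a)
fingerprint_b = config_fingerprint(config_b)
fingerprint_c = config_fingerprint(config_c)
print(fingerprint_a == fingerprint_b, fingerprint_a == fingerprint_c)
```

Identical configurations map to the same identifier, so an existing cache can be recognized, while any change in the configuration yields a different one.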

The FastHfDatasetProvider class provides a from_cache method that re-instantiates the provider from a cache directory, so the same encoded dataset can be reused elsewhere without downloading or encoding it again.

[2]:
# The caching mechanism automatically saves `config.json` and `tokenizer.pkl`,
# which are used to recreate the provider when `from_cache` is called
dataset_provider = FastHfDatasetProvider.from_cache("cache/glue-sst2-gpt2")

train_dataset = dataset_provider.get_train_dataset(seq_len=512)
val_dataset = dataset_provider.get_val_dataset(seq_len=512)
print(train_dataset[0], val_dataset[0])
2023-03-21 15:08:07,007 - archai.datasets.nlp.fast_hf_dataset_provider — INFO —  Loading dataset from: cache/glue-sst2-gpt2
(tensor([24717,   649,  3200,   507,   422,   262, 21694,  4991,   220, 50256,
         3642,  1299,   645, 20868,   837,   691,  2248,  1850,   308,  3775,
          220, 50256,  5562, 10408,   663,  3435,   290, 48556,  1223,  2138,
         4950,   546,  1692,  3450,   220, 50256,  2787,  1299, 15950, 11378,
          284,  3520,   262,   976,  3690,   220, 50256,   261,   262,  5290,
        15827,    12,  1659,    12,  1169,    12,  1008,  9310, 35478, 20954,
          262, 28303,   714, 47478,   469,   510,   220, 50256,  5562,   705,
           82,  1290,  1165, 15444,   284, 17004,   884, 31194,  3513,   220,
        50256, 26567,  2536,   689,   326,   262,  3437,   286,   884,   289,
        31777,  2512, 30181,   355, 29408,  1830,   460,   991,  1210,   503,
          257,  1402,   837,  2614,  2646,   351,   281,  7016,  3355,   404,
          764,   220, 50256,  1659,   473,    84,   948,   220, 50256,    64,
        19095, 17280,    12,  1941,    12,   727,   705,    82, 26781, 19518,
          220, 50256,   533,   517,  7744,  1807,   832,   621,   287,   749,
         4600,   826,    12, 28973,   705,  7328,   220, 50256,  2188,   274,
          284, 12986, 20428,   220, 50256,  1640,   883,  3807, 31006,   508,
        13121,   326,  4600,   484,   466,   299,   470,   787,  6918,   588,
          484,   973,   284,  7471,   220, 50256,  1169,   636,   810,  2147,
          705,    82,  5836,   837,   220, 50256, 43439,   703,  2089,   428,
         3807,   373,   220, 50256,    75,   437,   617, 16247,   284,   257,
        13526,  1621,   220, 50256,  1169,  6000, 17245,   220, 50256, 36673,
         3807,   220, 50256,  4480,   465,  6678,  4430,   290, 11800,   774,
          220, 50256,   445,   917,   415,  3721,   220, 50256,  2032, 27428,
          318,  2029,   477,   546,   257,  1862,  2415,   705,    82,  1986,
          837,   290,   416, 13092,   281, 14549,  3025,  1986,  4493,   326,
         2415,   705,    82, 17188,   290,   614, 23400,   837,   340, 31137,
          764,   220, 50256,  4853,   874,   262,  2656,   290,   287,   617,
         2842,   772,   731,  1010,   340,   220, 50256,   361,  1997,   837,
          766,   340,   329,   479,  5757,  2042,   837,   508, 11665,   510,
          257,  6388,   355,   257, 26821, 14314, 10086, 46410,  3706, 11841,
        19317,   764,   220, 50256,    64,  8212,   319,   534,  1986,   220,
        50256,  8988,   422,   262, 14802,   837, 26329, 44139, 13289,   220,
        50256,  1069, 32838, 46136,   306,  3684, 16948,   290,  6028, 17049,
          555,   398,  5109,   220, 50256,   268, 30486,   416,   281, 25007,
         9404,  7668,  3350,   286, 47792, 13747,   220, 50256,  4758,  2063,
          286, 10441, 12254,   318,  4785,  1058,   262,   636,   810,  2147,
          705,    82,  5836,   837,   393,   262,   636,   810,  1223,   705,
           82,  5836,   220, 50256,   259,   995, 22041,   220, 50256,   548,
          922, 11681,  5559,   220, 50256,  1169,  7110,   318,  2147,   475,
        36741,  6816, 35478, 20954,   422,   923,   284,  5461,   837,   220,
        50256,  1169,  2223,   318,   336,   346,  1513,   220, 50256,   261,
          477, 43386,   220, 50256, 10594,  1064,  1310,   286,  1393,   287,
          428,  2646,   837,   543,   318,  1690, 32950,    88,   290, 13455,
        13134,   220, 50256,  1525,  1290,   262,  5290,  3807,   286,   262,
          614,   220, 50256, 48937,   832,   837,   220, 50256,  3549,   621,
         1194,  7559,  1266,   582, 10148, 17271,   416, 44889,   257,  7505,
         3690,   428,  8258,  2646,   220, 50256,   270,   705,    82,   546,
         2428,   749,  6490,   423,   284,  1986,   287,  4845,   290,  1312,
          892,   326,   705,    82,   644,  1312,  8288,   546,   340,  1377,
          262,  1103,  2428, 29779,  1022,   262, 14397,   290, 14897, 22992,
          220, 50256, 11718,   274,   220, 50256,   672, 35260,   284,   262,
         6224,   286]), tensor([  649,  3200,   507,   422,   262, 21694,  4991,   220, 50256,  3642,
         1299,   645, 20868,   837,   691,  2248,  1850,   308,  3775,   220,
        50256,  5562, 10408,   663,  3435,   290, 48556,  1223,  2138,  4950,
          546,  1692,  3450,   220, 50256,  2787,  1299, 15950, 11378,   284,
         3520,   262,   976,  3690,   220, 50256,   261,   262,  5290, 15827,
           12,  1659,    12,  1169,    12,  1008,  9310, 35478, 20954,   262,
        28303,   714, 47478,   469,   510,   220, 50256,  5562,   705,    82,
         1290,  1165, 15444,   284, 17004,   884, 31194,  3513,   220, 50256,
        26567,  2536,   689,   326,   262,  3437,   286,   884,   289, 31777,
         2512, 30181,   355, 29408,  1830,   460,   991,  1210,   503,   257,
         1402,   837,  2614,  2646,   351,   281,  7016,  3355,   404,   764,
          220, 50256,  1659,   473,    84,   948,   220, 50256,    64, 19095,
        17280,    12,  1941,    12,   727,   705,    82, 26781, 19518,   220,
        50256,   533,   517,  7744,  1807,   832,   621,   287,   749,  4600,
          826,    12, 28973,   705,  7328,   220, 50256,  2188,   274,   284,
        12986, 20428,   220, 50256,  1640,   883,  3807, 31006,   508, 13121,
          326,  4600,   484,   466,   299,   470,   787,  6918,   588,   484,
          973,   284,  7471,   220, 50256,  1169,   636,   810,  2147,   705,
           82,  5836,   837,   220, 50256, 43439,   703,  2089,   428,  3807,
          373,   220, 50256,    75,   437,   617, 16247,   284,   257, 13526,
         1621,   220, 50256,  1169,  6000, 17245,   220, 50256, 36673,  3807,
          220, 50256,  4480,   465,  6678,  4430,   290, 11800,   774,   220,
        50256,   445,   917,   415,  3721,   220, 50256,  2032, 27428,   318,
         2029,   477,   546,   257,  1862,  2415,   705,    82,  1986,   837,
          290,   416, 13092,   281, 14549,  3025,  1986,  4493,   326,  2415,
          705,    82, 17188,   290,   614, 23400,   837,   340, 31137,   764,
          220, 50256,  4853,   874,   262,  2656,   290,   287,   617,  2842,
          772,   731,  1010,   340,   220, 50256,   361,  1997,   837,   766,
          340,   329,   479,  5757,  2042,   837,   508, 11665,   510,   257,
         6388,   355,   257, 26821, 14314, 10086, 46410,  3706, 11841, 19317,
          764,   220, 50256,    64,  8212,   319,   534,  1986,   220, 50256,
         8988,   422,   262, 14802,   837, 26329, 44139, 13289,   220, 50256,
         1069, 32838, 46136,   306,  3684, 16948,   290,  6028, 17049,   555,
          398,  5109,   220, 50256,   268, 30486,   416,   281, 25007,  9404,
         7668,  3350,   286, 47792, 13747,   220, 50256,  4758,  2063,   286,
        10441, 12254,   318,  4785,  1058,   262,   636,   810,  2147,   705,
           82,  5836,   837,   393,   262,   636,   810,  1223,   705,    82,
         5836,   220, 50256,   259,   995, 22041,   220, 50256,   548,   922,
        11681,  5559,   220, 50256,  1169,  7110,   318,  2147,   475, 36741,
         6816, 35478, 20954,   422,   923,   284,  5461,   837,   220, 50256,
         1169,  2223,   318,   336,   346,  1513,   220, 50256,   261,   477,
        43386,   220, 50256, 10594,  1064,  1310,   286,  1393,   287,   428,
         2646,   837,   543,   318,  1690, 32950,    88,   290, 13455, 13134,
          220, 50256,  1525,  1290,   262,  5290,  3807,   286,   262,   614,
          220, 50256, 48937,   832,   837,   220, 50256,  3549,   621,  1194,
         7559,  1266,   582, 10148, 17271,   416, 44889,   257,  7505,  3690,
          428,  8258,  2646,   220, 50256,   270,   705,    82,   546,  2428,
          749,  6490,   423,   284,  1986,   287,  4845,   290,  1312,   892,
          326,   705,    82,   644,  1312,  8288,   546,   340,  1377,   262,
         1103,  2428, 29779,  1022,   262, 14397,   290, 14897, 22992,   220,
        50256, 11718,   274,   220, 50256,   672, 35260,   284,   262,  6224,
          286,   428])) (tensor([  270,   705,    82,   257, 23332,   290,  1690, 13891,  7002,   764,
          220, 50256,   403,  2704,  8589,  4420, 30942,   290, 12111,   220,
        50256, 47205,   514,   284,  2911,   326,   299, 16617,   318, 24357,
          284, 21030,   257,  1688,  3451,   355,   257,  5068,  1865, 47602,
        26479,   764,   220, 50256,  1169,  7205,   837, 24138,   837,  2647,
          837, 13483, 45501,   290,  2128,   389,   477, 34328,  1813,   262,
         3227,   705,    82, 38132,   567,  1957,   274,   764,   220, 50256,
          270,   705,    82,  3105,  1377,   845,   837,   845,  3105,   764,
          220, 50256, 16670, 49699,   351, 14733,   290,   257,  1178, 39980,
         4135, 18105,   837,   262,  2646,   318,   257, 23056,   306,  2726,
          804,   379,  1862,  1466,   764,   220, 50256,    64,  3360, 32460,
         2646,   764,   220, 50256,   273,  1804,   938,   614,   705,    82,
         5704,   351,   534,   409,    12, 22095,   764,   220, 50256,  5832,
          466,   299,   470,   423,   284,   760,   546,  2647,   284,  9144,
          262,  2646,   705,    82,  2562,  5146, 13516,   286, 10997,   290,
        19661,   764,   220, 50256,   259,  3446,  9919,  2431,   837,   749,
          286,   543,  3804,   355,  6364,   355,   611,  1312,   705,    67,
          587,  5586, 12105,   319,   281, 45329, 29680,   837, 10451,  6885,
        30895,   422, 37276,   284, 13665,  2584,   284, 10517, 26876,   764,
          220, 50256,  1169, 46814,  2890, 13289,   286,   262,  5983,  1394,
          262,  2646, 22804,   290,  1394,   262,  5386, 40112,  1513,   764,
          220, 50256,   270,  2753,   257,  6283,  1611,   286, 37296,  1272,
          284,  7030,   262, 18054,   286,   686,  4835,   329,  1706,   837,
          281,   710,   285,   451,    64,   837,   304,  1018,  1734, 35783,
          837,   290,   842,  1292,    67, 11555, 30686,  1559,   477,   287,
          262,   976,  3807,   764,   220, 50256,   986,   262,  2646, 21046,
          422,   257,  3092,   286, 14733,   357,  1223,  2622,   284,  5236,
          503,   262,  3685,  1267,  2644,   220, 50256,   732,  6808,   329,
          357,   537,  3301,   290,   279,  2518,  1267,   837,   772,   588,
          606,   837,   996,  3737,   340,   705,    82,   281,  9942,  5699,
          284, 26246,   764,   220, 50256, 10197,  9961,  3296,   481,   749,
         1884,   407,  1064,   644,   484,   705,   260,  6095,   351,  5876,
          790,  1110,  2162,   262,  3807, 16523,  1111,  5636,  2171,   290,
        14733,   764,   220, 50256,    64, 18857,   837,  1029,    12, 45564,
          863, 10530,   422,   773,   544,   326, 33954,   271,  3973, 32067,
         2647,   837,  9280,   837,  3496,   837,   290,  1029, 10512,   764,
          220, 50256,  1169, 10825,   389,  8246,   290,   481,  5587,   257,
        16384,   351,  2687,   508,   705,    82,  1683,   550,  1641, 14649,
          764,   220, 50256,  3885,  4364,   256,   265,   280,   468,   257,
        47868,   329, 10868,  9176,   326,  7842,  1958,   607, 23077, 20024,
          837,   290,   287,   428,  4187,   378, 48718, 10997,   837,   673,
          705,    82,   355,  3329,    12,  4743,   652,   409, 18478,   415,
          355,   673,   373,   287,   716,  2634, 14485,   764,   220, 50256,
          986,   262,  3807,   318,   655,   257,  8631,  1468,  9234,   764,
          220, 50256,   259,   663,  1266,  7188,   837, 22960,   257,  2089,
         1029,  1524,  3227,   286, 35537,   837,  1231,  4414,   286,  3496,
          764,   220, 50256,    79,   931,  5116,  2753,   281, 37959,   804,
          379,   262, 29442,   286,  1964, 29409,   837,   475,   340,   857,
          523,   351,   884,   281, 30690,  8216,   326,   345,  1239,   760,
          618, 14733,  5645,   290, 13574,  6140,   764,   220, 50256,  1169,
         4686,  7940,   375, 20374,   329,  1528,   532,   428,   655,  2936,
          588,   340]), tensor([  705,    82,   257, 23332,   290,  1690, 13891,  7002,   764,   220,
        50256,   403,  2704,  8589,  4420, 30942,   290, 12111,   220, 50256,
        47205,   514,   284,  2911,   326,   299, 16617,   318, 24357,   284,
        21030,   257,  1688,  3451,   355,   257,  5068,  1865, 47602, 26479,
          764,   220, 50256,  1169,  7205,   837, 24138,   837,  2647,   837,
        13483, 45501,   290,  2128,   389,   477, 34328,  1813,   262,  3227,
          705,    82, 38132,   567,  1957,   274,   764,   220, 50256,   270,
          705,    82,  3105,  1377,   845,   837,   845,  3105,   764,   220,
        50256, 16670, 49699,   351, 14733,   290,   257,  1178, 39980,  4135,
        18105,   837,   262,  2646,   318,   257, 23056,   306,  2726,   804,
          379,  1862,  1466,   764,   220, 50256,    64,  3360, 32460,  2646,
          764,   220, 50256,   273,  1804,   938,   614,   705,    82,  5704,
          351,   534,   409,    12, 22095,   764,   220, 50256,  5832,   466,
          299,   470,   423,   284,   760,   546,  2647,   284,  9144,   262,
         2646,   705,    82,  2562,  5146, 13516,   286, 10997,   290, 19661,
          764,   220, 50256,   259,  3446,  9919,  2431,   837,   749,   286,
          543,  3804,   355,  6364,   355,   611,  1312,   705,    67,   587,
         5586, 12105,   319,   281, 45329, 29680,   837, 10451,  6885, 30895,
          422, 37276,   284, 13665,  2584,   284, 10517, 26876,   764,   220,
        50256,  1169, 46814,  2890, 13289,   286,   262,  5983,  1394,   262,
         2646, 22804,   290,  1394,   262,  5386, 40112,  1513,   764,   220,
        50256,   270,  2753,   257,  6283,  1611,   286, 37296,  1272,   284,
         7030,   262, 18054,   286,   686,  4835,   329,  1706,   837,   281,
          710,   285,   451,    64,   837,   304,  1018,  1734, 35783,   837,
          290,   842,  1292,    67, 11555, 30686,  1559,   477,   287,   262,
          976,  3807,   764,   220, 50256,   986,   262,  2646, 21046,   422,
          257,  3092,   286, 14733,   357,  1223,  2622,   284,  5236,   503,
          262,  3685,  1267,  2644,   220, 50256,   732,  6808,   329,   357,
          537,  3301,   290,   279,  2518,  1267,   837,   772,   588,   606,
          837,   996,  3737,   340,   705,    82,   281,  9942,  5699,   284,
        26246,   764,   220, 50256, 10197,  9961,  3296,   481,   749,  1884,
          407,  1064,   644,   484,   705,   260,  6095,   351,  5876,   790,
         1110,  2162,   262,  3807, 16523,  1111,  5636,  2171,   290, 14733,
          764,   220, 50256,    64, 18857,   837,  1029,    12, 45564,   863,
        10530,   422,   773,   544,   326, 33954,   271,  3973, 32067,  2647,
          837,  9280,   837,  3496,   837,   290,  1029, 10512,   764,   220,
        50256,  1169, 10825,   389,  8246,   290,   481,  5587,   257, 16384,
          351,  2687,   508,   705,    82,  1683,   550,  1641, 14649,   764,
          220, 50256,  3885,  4364,   256,   265,   280,   468,   257, 47868,
          329, 10868,  9176,   326,  7842,  1958,   607, 23077, 20024,   837,
          290,   287,   428,  4187,   378, 48718, 10997,   837,   673,   705,
           82,   355,  3329,    12,  4743,   652,   409, 18478,   415,   355,
          673,   373,   287,   716,  2634, 14485,   764,   220, 50256,   986,
          262,  3807,   318,   655,   257,  8631,  1468,  9234,   764,   220,
        50256,   259,   663,  1266,  7188,   837, 22960,   257,  2089,  1029,
         1524,  3227,   286, 35537,   837,  1231,  4414,   286,  3496,   764,
          220, 50256,    79,   931,  5116,  2753,   281, 37959,   804,   379,
          262, 29442,   286,  1964, 29409,   837,   475,   340,   857,   523,
          351,   884,   281, 30690,  8216,   326,   345,  1239,   760,   618,
        14733,  5645,   290, 13574,  6140,   764,   220, 50256,  1169,  4686,
         7940,   375, 20374,   329,  1528,   532,   428,   655,  2936,   588,
          340,   750]))
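
In the output above, the first tensor (inputs) ends with `588, 340` and the second tensor (labels) ends with `340, 750`, which is consistent with the labels being the inputs shifted by one position. The id `50256` is GPT-2's end-of-text token, which separates the encoded sentences. The snippet below is a minimal sketch of how such pairs could be sliced from a contiguous array of token ids for any sequence length. It is not Archai's implementation: the `get_window` helper and the toy `tokens` array are hypothetical and only illustrate the idea.

```python
import numpy as np


def get_window(tokens: np.ndarray, idx: int, seq_len: int):
    """Return (inputs, labels) for the idx-th non-overlapping window.

    Hypothetical helper for illustration only; labels are the inputs
    shifted by one token, as in causal language modeling.
    """
    start = idx * seq_len
    end = start + seq_len
    # One extra token is needed past the window to build the labels
    if end + 1 > len(tokens):
        raise IndexError("window exceeds the length of the token array")
    inputs = tokens[start:end]
    labels = tokens[start + 1 : end + 1]
    return inputs, labels


# Toy stand-in for the cached, contiguous array of token ids
tokens = np.arange(20, dtype=np.int64)

# The same array can be sliced with different sequence lengths
# without re-encoding the underlying data
inputs, labels = get_window(tokens, idx=0, seq_len=8)
short_inputs, short_labels = get_window(tokens, idx=1, seq_len=4)
```

Because both slices are views of the same array, changing `seq_len` only changes the slicing arithmetic, which is why the data does not need to be encoded again for each length.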