Creating Memory Mapped NLP-based Data#
In this notebook, we will use a fast dataset provider abstraction, created by HazyResearch, that interfaces with Hugging Face’s datasets library. The key advantage of this approach is that it uses either shared memory or memory maps in Python to accelerate caching. Furthermore, the dataset is cached as a contiguous NumPy array, so the data can be retrieved with any sequence length. This eliminates the need to re-encode the data for each sequence length and streamlines the data processing pipeline.
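To make the idea concrete, here is a minimal sketch of caching a token stream as a contiguous NumPy array and reading it back through a memory map. It is not the provider's actual implementation: the synthetic `tokens` array, the `get_window` helper and the temporary cache path are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np

# Hypothetical token stream standing in for an already-encoded dataset.
tokens = np.arange(10_000, dtype=np.int32)

# Store the stream once as a single contiguous array on disk.
cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "train.npy")
np.save(cache_path, tokens)

# mmap_mode="r" maps the file instead of reading it fully into memory;
# only the slices that are actually accessed get loaded.
cached = np.load(cache_path, mmap_mode="r")


def get_window(array, index, seq_len):
    """Return the index-th non-overlapping window of seq_len tokens."""
    start = index * seq_len
    return np.asarray(array[start : start + seq_len])


# The same cached array serves different sequence lengths without re-encoding.
short_window = get_window(cached, 3, seq_len=128)
long_window = get_window(cached, 3, seq_len=512)
print(short_window.shape, long_window.shape)
```

Because the sequence length is only applied when slicing, changing it does not require touching the file on disk.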
Instantiating the Provider#
The first step is to call FastHfDatasetProvider.from_hub(), which loads and encodes the dataset. The following arguments can be passed to this class method according to the user’s needs:
dataset_name: Name of the dataset.
dataset_config_name: Name of the dataset configuration.
data_dir: Path to the data directory.
tokenizer: Instance of the tokenizer to use.
tokenizer_name: Name of the tokenizer, if tokenizer has not been passed.
mapping_column_name: The columns in dataset that should be tokenized.
validation_split: Fraction of the dataset to use for validation.
seed: Random seed.
num_workers: Number of workers to use for encoding.
use_eos_token: Whether to use the EOS token to separate sequences.
use_shared_memory: Whether to use shared memory for caching.
cache_dir: Path to the cache directory.
[1]:
from archai.datasets.nlp.fast_hf_dataset_provider import FastHfDatasetProvider
# The provider will automatically download the dataset and tokenizer, encode
# the dataset and cache it for future use
dataset_provider = FastHfDatasetProvider.from_hub(
    "glue",
    dataset_config_name="sst2",
    tokenizer_name="gpt2",
    mapping_column_name=["sentence"],
    use_shared_memory=False,
    cache_dir="cache/glue-sst2-gpt2",
)
# (inputs, labels) can be retrieved with any sequence length
train_dataset = dataset_provider.get_train_dataset(seq_len=512)
val_dataset = dataset_provider.get_val_dataset(seq_len=512)
print(train_dataset[0], val_dataset[0])
2023-03-21 15:07:57,990 - archai.datasets.nlp.fast_hf_dataset_provider — WARNING — Shared memory is not available in Python < 3.8.
2023-03-21 15:08:00,865 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Downloading dataset ...
Found cached dataset glue (C:/Users/gderosa/.cache/huggingface/datasets/glue/sst2/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad)
2023-03-21 15:08:04,789 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Encoding dataset ...
2023-03-21 15:08:04,793 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Number of workers: 1 | EOS token: True
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-02d7057c177051d2.arrow
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-bfdab1158dc66b61.arrow
Loading cached processed dataset at C:\Users\gderosa\.cache\huggingface\datasets\glue\sst2\1.0.0\dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad\cache-24fdf102efed86dd.arrow
2023-03-21 15:08:05,096 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Processing dataset to memory ...
2023-03-21 15:08:05,099 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Number of workers: 1 | Shared memory: False
2023-03-21 15:08:06,767 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Saving dataset to: cache\glue-sst2-gpt2
(tensor([24717, 649, 3200, 507, 422, 262, 21694, 4991, 220, 50256,
3642, 1299, 645, 20868, 837, 691, 2248, 1850, 308, 3775,
220, 50256, 5562, 10408, 663, 3435, 290, 48556, 1223, 2138,
4950, 546, 1692, 3450, 220, 50256, 2787, 1299, 15950, 11378,
284, 3520, 262, 976, 3690, 220, 50256, 261, 262, 5290,
15827, 12, 1659, 12, 1169, 12, 1008, 9310, 35478, 20954,
262, 28303, 714, 47478, 469, 510, 220, 50256, 5562, 705,
82, 1290, 1165, 15444, 284, 17004, 884, 31194, 3513, 220,
50256, 26567, 2536, 689, 326, 262, 3437, 286, 884, 289,
31777, 2512, 30181, 355, 29408, 1830, 460, 991, 1210, 503,
257, 1402, 837, 2614, 2646, 351, 281, 7016, 3355, 404,
764, 220, 50256, 1659, 473, 84, 948, 220, 50256, 64,
19095, 17280, 12, 1941, 12, 727, 705, 82, 26781, 19518,
220, 50256, 533, 517, 7744, 1807, 832, 621, 287, 749,
4600, 826, 12, 28973, 705, 7328, 220, 50256, 2188, 274,
284, 12986, 20428, 220, 50256, 1640, 883, 3807, 31006, 508,
13121, 326, 4600, 484, 466, 299, 470, 787, 6918, 588,
484, 973, 284, 7471, 220, 50256, 1169, 636, 810, 2147,
705, 82, 5836, 837, 220, 50256, 43439, 703, 2089, 428,
3807, 373, 220, 50256, 75, 437, 617, 16247, 284, 257,
13526, 1621, 220, 50256, 1169, 6000, 17245, 220, 50256, 36673,
3807, 220, 50256, 4480, 465, 6678, 4430, 290, 11800, 774,
220, 50256, 445, 917, 415, 3721, 220, 50256, 2032, 27428,
318, 2029, 477, 546, 257, 1862, 2415, 705, 82, 1986,
837, 290, 416, 13092, 281, 14549, 3025, 1986, 4493, 326,
2415, 705, 82, 17188, 290, 614, 23400, 837, 340, 31137,
764, 220, 50256, 4853, 874, 262, 2656, 290, 287, 617,
2842, 772, 731, 1010, 340, 220, 50256, 361, 1997, 837,
766, 340, 329, 479, 5757, 2042, 837, 508, 11665, 510,
257, 6388, 355, 257, 26821, 14314, 10086, 46410, 3706, 11841,
19317, 764, 220, 50256, 64, 8212, 319, 534, 1986, 220,
50256, 8988, 422, 262, 14802, 837, 26329, 44139, 13289, 220,
50256, 1069, 32838, 46136, 306, 3684, 16948, 290, 6028, 17049,
555, 398, 5109, 220, 50256, 268, 30486, 416, 281, 25007,
9404, 7668, 3350, 286, 47792, 13747, 220, 50256, 4758, 2063,
286, 10441, 12254, 318, 4785, 1058, 262, 636, 810, 2147,
705, 82, 5836, 837, 393, 262, 636, 810, 1223, 705,
82, 5836, 220, 50256, 259, 995, 22041, 220, 50256, 548,
922, 11681, 5559, 220, 50256, 1169, 7110, 318, 2147, 475,
36741, 6816, 35478, 20954, 422, 923, 284, 5461, 837, 220,
50256, 1169, 2223, 318, 336, 346, 1513, 220, 50256, 261,
477, 43386, 220, 50256, 10594, 1064, 1310, 286, 1393, 287,
428, 2646, 837, 543, 318, 1690, 32950, 88, 290, 13455,
13134, 220, 50256, 1525, 1290, 262, 5290, 3807, 286, 262,
614, 220, 50256, 48937, 832, 837, 220, 50256, 3549, 621,
1194, 7559, 1266, 582, 10148, 17271, 416, 44889, 257, 7505,
3690, 428, 8258, 2646, 220, 50256, 270, 705, 82, 546,
2428, 749, 6490, 423, 284, 1986, 287, 4845, 290, 1312,
892, 326, 705, 82, 644, 1312, 8288, 546, 340, 1377,
262, 1103, 2428, 29779, 1022, 262, 14397, 290, 14897, 22992,
220, 50256, 11718, 274, 220, 50256, 672, 35260, 284, 262,
6224, 286]), tensor([ 649, 3200, 507, 422, 262, 21694, 4991, 220, 50256, 3642,
1299, 645, 20868, 837, 691, 2248, 1850, 308, 3775, 220,
50256, 5562, 10408, 663, 3435, 290, 48556, 1223, 2138, 4950,
546, 1692, 3450, 220, 50256, 2787, 1299, 15950, 11378, 284,
3520, 262, 976, 3690, 220, 50256, 261, 262, 5290, 15827,
12, 1659, 12, 1169, 12, 1008, 9310, 35478, 20954, 262,
28303, 714, 47478, 469, 510, 220, 50256, 5562, 705, 82,
1290, 1165, 15444, 284, 17004, 884, 31194, 3513, 220, 50256,
26567, 2536, 689, 326, 262, 3437, 286, 884, 289, 31777,
2512, 30181, 355, 29408, 1830, 460, 991, 1210, 503, 257,
1402, 837, 2614, 2646, 351, 281, 7016, 3355, 404, 764,
220, 50256, 1659, 473, 84, 948, 220, 50256, 64, 19095,
17280, 12, 1941, 12, 727, 705, 82, 26781, 19518, 220,
50256, 533, 517, 7744, 1807, 832, 621, 287, 749, 4600,
826, 12, 28973, 705, 7328, 220, 50256, 2188, 274, 284,
12986, 20428, 220, 50256, 1640, 883, 3807, 31006, 508, 13121,
326, 4600, 484, 466, 299, 470, 787, 6918, 588, 484,
973, 284, 7471, 220, 50256, 1169, 636, 810, 2147, 705,
82, 5836, 837, 220, 50256, 43439, 703, 2089, 428, 3807,
373, 220, 50256, 75, 437, 617, 16247, 284, 257, 13526,
1621, 220, 50256, 1169, 6000, 17245, 220, 50256, 36673, 3807,
220, 50256, 4480, 465, 6678, 4430, 290, 11800, 774, 220,
50256, 445, 917, 415, 3721, 220, 50256, 2032, 27428, 318,
2029, 477, 546, 257, 1862, 2415, 705, 82, 1986, 837,
290, 416, 13092, 281, 14549, 3025, 1986, 4493, 326, 2415,
705, 82, 17188, 290, 614, 23400, 837, 340, 31137, 764,
220, 50256, 4853, 874, 262, 2656, 290, 287, 617, 2842,
772, 731, 1010, 340, 220, 50256, 361, 1997, 837, 766,
340, 329, 479, 5757, 2042, 837, 508, 11665, 510, 257,
6388, 355, 257, 26821, 14314, 10086, 46410, 3706, 11841, 19317,
764, 220, 50256, 64, 8212, 319, 534, 1986, 220, 50256,
8988, 422, 262, 14802, 837, 26329, 44139, 13289, 220, 50256,
1069, 32838, 46136, 306, 3684, 16948, 290, 6028, 17049, 555,
398, 5109, 220, 50256, 268, 30486, 416, 281, 25007, 9404,
7668, 3350, 286, 47792, 13747, 220, 50256, 4758, 2063, 286,
10441, 12254, 318, 4785, 1058, 262, 636, 810, 2147, 705,
82, 5836, 837, 393, 262, 636, 810, 1223, 705, 82,
5836, 220, 50256, 259, 995, 22041, 220, 50256, 548, 922,
11681, 5559, 220, 50256, 1169, 7110, 318, 2147, 475, 36741,
6816, 35478, 20954, 422, 923, 284, 5461, 837, 220, 50256,
1169, 2223, 318, 336, 346, 1513, 220, 50256, 261, 477,
43386, 220, 50256, 10594, 1064, 1310, 286, 1393, 287, 428,
2646, 837, 543, 318, 1690, 32950, 88, 290, 13455, 13134,
220, 50256, 1525, 1290, 262, 5290, 3807, 286, 262, 614,
220, 50256, 48937, 832, 837, 220, 50256, 3549, 621, 1194,
7559, 1266, 582, 10148, 17271, 416, 44889, 257, 7505, 3690,
428, 8258, 2646, 220, 50256, 270, 705, 82, 546, 2428,
749, 6490, 423, 284, 1986, 287, 4845, 290, 1312, 892,
326, 705, 82, 644, 1312, 8288, 546, 340, 1377, 262,
1103, 2428, 29779, 1022, 262, 14397, 290, 14897, 22992, 220,
50256, 11718, 274, 220, 50256, 672, 35260, 284, 262, 6224,
286, 428])) (tensor([ 270, 705, 82, 257, 23332, 290, 1690, 13891, 7002, 764,
220, 50256, 403, 2704, 8589, 4420, 30942, 290, 12111, 220,
50256, 47205, 514, 284, 2911, 326, 299, 16617, 318, 24357,
284, 21030, 257, 1688, 3451, 355, 257, 5068, 1865, 47602,
26479, 764, 220, 50256, 1169, 7205, 837, 24138, 837, 2647,
837, 13483, 45501, 290, 2128, 389, 477, 34328, 1813, 262,
3227, 705, 82, 38132, 567, 1957, 274, 764, 220, 50256,
270, 705, 82, 3105, 1377, 845, 837, 845, 3105, 764,
220, 50256, 16670, 49699, 351, 14733, 290, 257, 1178, 39980,
4135, 18105, 837, 262, 2646, 318, 257, 23056, 306, 2726,
804, 379, 1862, 1466, 764, 220, 50256, 64, 3360, 32460,
2646, 764, 220, 50256, 273, 1804, 938, 614, 705, 82,
5704, 351, 534, 409, 12, 22095, 764, 220, 50256, 5832,
466, 299, 470, 423, 284, 760, 546, 2647, 284, 9144,
262, 2646, 705, 82, 2562, 5146, 13516, 286, 10997, 290,
19661, 764, 220, 50256, 259, 3446, 9919, 2431, 837, 749,
286, 543, 3804, 355, 6364, 355, 611, 1312, 705, 67,
587, 5586, 12105, 319, 281, 45329, 29680, 837, 10451, 6885,
30895, 422, 37276, 284, 13665, 2584, 284, 10517, 26876, 764,
220, 50256, 1169, 46814, 2890, 13289, 286, 262, 5983, 1394,
262, 2646, 22804, 290, 1394, 262, 5386, 40112, 1513, 764,
220, 50256, 270, 2753, 257, 6283, 1611, 286, 37296, 1272,
284, 7030, 262, 18054, 286, 686, 4835, 329, 1706, 837,
281, 710, 285, 451, 64, 837, 304, 1018, 1734, 35783,
837, 290, 842, 1292, 67, 11555, 30686, 1559, 477, 287,
262, 976, 3807, 764, 220, 50256, 986, 262, 2646, 21046,
422, 257, 3092, 286, 14733, 357, 1223, 2622, 284, 5236,
503, 262, 3685, 1267, 2644, 220, 50256, 732, 6808, 329,
357, 537, 3301, 290, 279, 2518, 1267, 837, 772, 588,
606, 837, 996, 3737, 340, 705, 82, 281, 9942, 5699,
284, 26246, 764, 220, 50256, 10197, 9961, 3296, 481, 749,
1884, 407, 1064, 644, 484, 705, 260, 6095, 351, 5876,
790, 1110, 2162, 262, 3807, 16523, 1111, 5636, 2171, 290,
14733, 764, 220, 50256, 64, 18857, 837, 1029, 12, 45564,
863, 10530, 422, 773, 544, 326, 33954, 271, 3973, 32067,
2647, 837, 9280, 837, 3496, 837, 290, 1029, 10512, 764,
220, 50256, 1169, 10825, 389, 8246, 290, 481, 5587, 257,
16384, 351, 2687, 508, 705, 82, 1683, 550, 1641, 14649,
764, 220, 50256, 3885, 4364, 256, 265, 280, 468, 257,
47868, 329, 10868, 9176, 326, 7842, 1958, 607, 23077, 20024,
837, 290, 287, 428, 4187, 378, 48718, 10997, 837, 673,
705, 82, 355, 3329, 12, 4743, 652, 409, 18478, 415,
355, 673, 373, 287, 716, 2634, 14485, 764, 220, 50256,
986, 262, 3807, 318, 655, 257, 8631, 1468, 9234, 764,
220, 50256, 259, 663, 1266, 7188, 837, 22960, 257, 2089,
1029, 1524, 3227, 286, 35537, 837, 1231, 4414, 286, 3496,
764, 220, 50256, 79, 931, 5116, 2753, 281, 37959, 804,
379, 262, 29442, 286, 1964, 29409, 837, 475, 340, 857,
523, 351, 884, 281, 30690, 8216, 326, 345, 1239, 760,
618, 14733, 5645, 290, 13574, 6140, 764, 220, 50256, 1169,
4686, 7940, 375, 20374, 329, 1528, 532, 428, 655, 2936,
588, 340]), tensor([ 705, 82, 257, 23332, 290, 1690, 13891, 7002, 764, 220,
50256, 403, 2704, 8589, 4420, 30942, 290, 12111, 220, 50256,
47205, 514, 284, 2911, 326, 299, 16617, 318, 24357, 284,
21030, 257, 1688, 3451, 355, 257, 5068, 1865, 47602, 26479,
764, 220, 50256, 1169, 7205, 837, 24138, 837, 2647, 837,
13483, 45501, 290, 2128, 389, 477, 34328, 1813, 262, 3227,
705, 82, 38132, 567, 1957, 274, 764, 220, 50256, 270,
705, 82, 3105, 1377, 845, 837, 845, 3105, 764, 220,
50256, 16670, 49699, 351, 14733, 290, 257, 1178, 39980, 4135,
18105, 837, 262, 2646, 318, 257, 23056, 306, 2726, 804,
379, 1862, 1466, 764, 220, 50256, 64, 3360, 32460, 2646,
764, 220, 50256, 273, 1804, 938, 614, 705, 82, 5704,
351, 534, 409, 12, 22095, 764, 220, 50256, 5832, 466,
299, 470, 423, 284, 760, 546, 2647, 284, 9144, 262,
2646, 705, 82, 2562, 5146, 13516, 286, 10997, 290, 19661,
764, 220, 50256, 259, 3446, 9919, 2431, 837, 749, 286,
543, 3804, 355, 6364, 355, 611, 1312, 705, 67, 587,
5586, 12105, 319, 281, 45329, 29680, 837, 10451, 6885, 30895,
422, 37276, 284, 13665, 2584, 284, 10517, 26876, 764, 220,
50256, 1169, 46814, 2890, 13289, 286, 262, 5983, 1394, 262,
2646, 22804, 290, 1394, 262, 5386, 40112, 1513, 764, 220,
50256, 270, 2753, 257, 6283, 1611, 286, 37296, 1272, 284,
7030, 262, 18054, 286, 686, 4835, 329, 1706, 837, 281,
710, 285, 451, 64, 837, 304, 1018, 1734, 35783, 837,
290, 842, 1292, 67, 11555, 30686, 1559, 477, 287, 262,
976, 3807, 764, 220, 50256, 986, 262, 2646, 21046, 422,
257, 3092, 286, 14733, 357, 1223, 2622, 284, 5236, 503,
262, 3685, 1267, 2644, 220, 50256, 732, 6808, 329, 357,
537, 3301, 290, 279, 2518, 1267, 837, 772, 588, 606,
837, 996, 3737, 340, 705, 82, 281, 9942, 5699, 284,
26246, 764, 220, 50256, 10197, 9961, 3296, 481, 749, 1884,
407, 1064, 644, 484, 705, 260, 6095, 351, 5876, 790,
1110, 2162, 262, 3807, 16523, 1111, 5636, 2171, 290, 14733,
764, 220, 50256, 64, 18857, 837, 1029, 12, 45564, 863,
10530, 422, 773, 544, 326, 33954, 271, 3973, 32067, 2647,
837, 9280, 837, 3496, 837, 290, 1029, 10512, 764, 220,
50256, 1169, 10825, 389, 8246, 290, 481, 5587, 257, 16384,
351, 2687, 508, 705, 82, 1683, 550, 1641, 14649, 764,
220, 50256, 3885, 4364, 256, 265, 280, 468, 257, 47868,
329, 10868, 9176, 326, 7842, 1958, 607, 23077, 20024, 837,
290, 287, 428, 4187, 378, 48718, 10997, 837, 673, 705,
82, 355, 3329, 12, 4743, 652, 409, 18478, 415, 355,
673, 373, 287, 716, 2634, 14485, 764, 220, 50256, 986,
262, 3807, 318, 655, 257, 8631, 1468, 9234, 764, 220,
50256, 259, 663, 1266, 7188, 837, 22960, 257, 2089, 1029,
1524, 3227, 286, 35537, 837, 1231, 4414, 286, 3496, 764,
220, 50256, 79, 931, 5116, 2753, 281, 37959, 804, 379,
262, 29442, 286, 1964, 29409, 837, 475, 340, 857, 523,
351, 884, 281, 30690, 8216, 326, 345, 1239, 760, 618,
14733, 5645, 290, 13574, 6140, 764, 220, 50256, 1169, 4686,
7940, 375, 20374, 329, 1528, 532, 428, 655, 2936, 588,
340, 750]))
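In the output above, each example is an (inputs, labels) pair in which the labels are the inputs shifted by one token, and sentences are separated by the GPT-2 EOS token id 50256. The sketch below reproduces that pattern on a tiny hand-made stream. The `get_example` helper, the toy token ids and the non-overlapping stride are assumptions for illustration; the provider's internals may differ.

```python
import numpy as np

EOS = 50256  # GPT-2 end-of-text token id, as seen in the output above

# Hypothetical encoded sentences (toy ids, not real GPT-2 encodings).
encoded = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]

# Concatenate all sentences into one contiguous stream, with EOS after each.
stream = np.concatenate([np.array(sentence + [EOS]) for sentence in encoded])


def get_example(stream, index, seq_len):
    """Slice seq_len + 1 tokens and split them into shifted inputs and labels."""
    start = index * seq_len
    chunk = stream[start : start + seq_len + 1]
    return chunk[:-1], chunk[1:]


inputs, labels = get_example(stream, 0, seq_len=4)
print(inputs, labels)
```

Each label is the token that follows the corresponding input position, which is the usual target layout for next-token prediction.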
Loading from Cache#
After loading and encoding the dataset for the first time, a cache is created with a unique fingerprint (identifier) based on its configuration. The cache is composed of the following files:
config.json: Dataset provider configuration (used to re-create the object when loaded from cache).
tokenizer.pkl: Tokenizer used to encode the data (also re-created when loaded from cache).
train.npy: Training tokens (inputs and labels).
validation.npy: Validation tokens (inputs and labels).
test.npy: Testing tokens (inputs and labels).
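The cache layout described above can be sketched with the standard library alone. The functions `fingerprint`, `save_cache` and `load_cache` are hypothetical helpers, the SHA-1 hash of the sorted configuration is an assumed fingerprinting scheme, and a plain dictionary stands in for the tokenizer; none of this is taken from the library's source.

```python
import hashlib
import json
import os
import pickle
import tempfile


def fingerprint(config):
    """Derive a stable identifier from a configuration dictionary (assumed scheme)."""
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:16]


def save_cache(cache_dir, config, tokenizer):
    """Write the configuration and the pickled tokenizer to the cache directory."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(os.path.join(cache_dir, "config.json"), "w") as f:
        json.dump(config, f)
    with open(os.path.join(cache_dir, "tokenizer.pkl"), "wb") as f:
        pickle.dump(tokenizer, f)


def load_cache(cache_dir):
    """Read back what save_cache wrote."""
    with open(os.path.join(cache_dir, "config.json")) as f:
        config = json.load(f)
    with open(os.path.join(cache_dir, "tokenizer.pkl"), "rb") as f:
        tokenizer = pickle.load(f)
    return config, tokenizer


config = {
    "dataset_name": "glue",
    "dataset_config_name": "sst2",
    "tokenizer_name": "gpt2",
}
tokenizer = {"vocab": {"hello": 0}}  # stand-in object, not a real tokenizer

cache_dir = os.path.join(tempfile.mkdtemp(), fingerprint(config))
save_cache(cache_dir, config, tokenizer)
loaded_config, loaded_tokenizer = load_cache(cache_dir)
print(sorted(os.listdir(cache_dir)))
```

Hashing the sorted configuration means that two providers created with the same settings resolve to the same identifier, while any changed setting yields a different one.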
The FastHfDatasetProvider class provides a from_cache method that re-instantiates the cached dataset provider, which is useful when the same provider needs to be reused in different places.
[2]:
# The caching mechanism automatically saves `config.json` and `tokenizer.pkl`,
# which are used to recreate the provider when calling the `from_cache` method
dataset_provider = FastHfDatasetProvider.from_cache("cache/glue-sst2-gpt2")
train_dataset = dataset_provider.get_train_dataset(seq_len=512)
val_dataset = dataset_provider.get_val_dataset(seq_len=512)
print(train_dataset[0], val_dataset[0])
2023-03-21 15:08:07,007 - archai.datasets.nlp.fast_hf_dataset_provider — INFO — Loading dataset from: cache/glue-sst2-gpt2
(tensor([24717, 649, 3200, 507, 422, 262, 21694, 4991, 220, 50256,
3642, 1299, 645, 20868, 837, 691, 2248, 1850, 308, 3775,
220, 50256, 5562, 10408, 663, 3435, 290, 48556, 1223, 2138,
4950, 546, 1692, 3450, 220, 50256, 2787, 1299, 15950, 11378,
284, 3520, 262, 976, 3690, 220, 50256, 261, 262, 5290,
15827, 12, 1659, 12, 1169, 12, 1008, 9310, 35478, 20954,
262, 28303, 714, 47478, 469, 510, 220, 50256, 5562, 705,
82, 1290, 1165, 15444, 284, 17004, 884, 31194, 3513, 220,
50256, 26567, 2536, 689, 326, 262, 3437, 286, 884, 289,
31777, 2512, 30181, 355, 29408, 1830, 460, 991, 1210, 503,
257, 1402, 837, 2614, 2646, 351, 281, 7016, 3355, 404,
764, 220, 50256, 1659, 473, 84, 948, 220, 50256, 64,
19095, 17280, 12, 1941, 12, 727, 705, 82, 26781, 19518,
220, 50256, 533, 517, 7744, 1807, 832, 621, 287, 749,
4600, 826, 12, 28973, 705, 7328, 220, 50256, 2188, 274,
284, 12986, 20428, 220, 50256, 1640, 883, 3807, 31006, 508,
13121, 326, 4600, 484, 466, 299, 470, 787, 6918, 588,
484, 973, 284, 7471, 220, 50256, 1169, 636, 810, 2147,
705, 82, 5836, 837, 220, 50256, 43439, 703, 2089, 428,
3807, 373, 220, 50256, 75, 437, 617, 16247, 284, 257,
13526, 1621, 220, 50256, 1169, 6000, 17245, 220, 50256, 36673,
3807, 220, 50256, 4480, 465, 6678, 4430, 290, 11800, 774,
220, 50256, 445, 917, 415, 3721, 220, 50256, 2032, 27428,
318, 2029, 477, 546, 257, 1862, 2415, 705, 82, 1986,
837, 290, 416, 13092, 281, 14549, 3025, 1986, 4493, 326,
2415, 705, 82, 17188, 290, 614, 23400, 837, 340, 31137,
764, 220, 50256, 4853, 874, 262, 2656, 290, 287, 617,
2842, 772, 731, 1010, 340, 220, 50256, 361, 1997, 837,
766, 340, 329, 479, 5757, 2042, 837, 508, 11665, 510,
257, 6388, 355, 257, 26821, 14314, 10086, 46410, 3706, 11841,
19317, 764, 220, 50256, 64, 8212, 319, 534, 1986, 220,
50256, 8988, 422, 262, 14802, 837, 26329, 44139, 13289, 220,
50256, 1069, 32838, 46136, 306, 3684, 16948, 290, 6028, 17049,
555, 398, 5109, 220, 50256, 268, 30486, 416, 281, 25007,
9404, 7668, 3350, 286, 47792, 13747, 220, 50256, 4758, 2063,
286, 10441, 12254, 318, 4785, 1058, 262, 636, 810, 2147,
705, 82, 5836, 837, 393, 262, 636, 810, 1223, 705,
82, 5836, 220, 50256, 259, 995, 22041, 220, 50256, 548,
922, 11681, 5559, 220, 50256, 1169, 7110, 318, 2147, 475,
36741, 6816, 35478, 20954, 422, 923, 284, 5461, 837, 220,
50256, 1169, 2223, 318, 336, 346, 1513, 220, 50256, 261,
477, 43386, 220, 50256, 10594, 1064, 1310, 286, 1393, 287,
428, 2646, 837, 543, 318, 1690, 32950, 88, 290, 13455,
13134, 220, 50256, 1525, 1290, 262, 5290, 3807, 286, 262,
614, 220, 50256, 48937, 832, 837, 220, 50256, 3549, 621,
1194, 7559, 1266, 582, 10148, 17271, 416, 44889, 257, 7505,
3690, 428, 8258, 2646, 220, 50256, 270, 705, 82, 546,
2428, 749, 6490, 423, 284, 1986, 287, 4845, 290, 1312,
892, 326, 705, 82, 644, 1312, 8288, 546, 340, 1377,
262, 1103, 2428, 29779, 1022, 262, 14397, 290, 14897, 22992,
220, 50256, 11718, 274, 220, 50256, 672, 35260, 284, 262,
6224, 286]), tensor([ 649, 3200, 507, 422, 262, 21694, 4991, 220, 50256, 3642,
1299, 645, 20868, 837, 691, 2248, 1850, 308, 3775, 220,
50256, 5562, 10408, 663, 3435, 290, 48556, 1223, 2138, 4950,
546, 1692, 3450, 220, 50256, 2787, 1299, 15950, 11378, 284,
3520, 262, 976, 3690, 220, 50256, 261, 262, 5290, 15827,
12, 1659, 12, 1169, 12, 1008, 9310, 35478, 20954, 262,
28303, 714, 47478, 469, 510, 220, 50256, 5562, 705, 82,
1290, 1165, 15444, 284, 17004, 884, 31194, 3513, 220, 50256,
26567, 2536, 689, 326, 262, 3437, 286, 884, 289, 31777,
2512, 30181, 355, 29408, 1830, 460, 991, 1210, 503, 257,
1402, 837, 2614, 2646, 351, 281, 7016, 3355, 404, 764,
220, 50256, 1659, 473, 84, 948, 220, 50256, 64, 19095,
17280, 12, 1941, 12, 727, 705, 82, 26781, 19518, 220,
50256, 533, 517, 7744, 1807, 832, 621, 287, 749, 4600,
826, 12, 28973, 705, 7328, 220, 50256, 2188, 274, 284,
12986, 20428, 220, 50256, 1640, 883, 3807, 31006, 508, 13121,
326, 4600, 484, 466, 299, 470, 787, 6918, 588, 484,
973, 284, 7471, 220, 50256, 1169, 636, 810, 2147, 705,
82, 5836, 837, 220, 50256, 43439, 703, 2089, 428, 3807,
373, 220, 50256, 75, 437, 617, 16247, 284, 257, 13526,
1621, 220, 50256, 1169, 6000, 17245, 220, 50256, 36673, 3807,
220, 50256, 4480, 465, 6678, 4430, 290, 11800, 774, 220,
50256, 445, 917, 415, 3721, 220, 50256, 2032, 27428, 318,
2029, 477, 546, 257, 1862, 2415, 705, 82, 1986, 837,
290, 416, 13092, 281, 14549, 3025, 1986, 4493, 326, 2415,
705, 82, 17188, 290, 614, 23400, 837, 340, 31137, 764,
220, 50256, 4853, 874, 262, 2656, 290, 287, 617, 2842,
772, 731, 1010, 340, 220, 50256, 361, 1997, 837, 766,
340, 329, 479, 5757, 2042, 837, 508, 11665, 510, 257,
6388, 355, 257, 26821, 14314, 10086, 46410, 3706, 11841, 19317,
764, 220, 50256, 64, 8212, 319, 534, 1986, 220, 50256,
8988, 422, 262, 14802, 837, 26329, 44139, 13289, 220, 50256,
1069, 32838, 46136, 306, 3684, 16948, 290, 6028, 17049, 555,
398, 5109, 220, 50256, 268, 30486, 416, 281, 25007, 9404,
7668, 3350, 286, 47792, 13747, 220, 50256, 4758, 2063, 286,
10441, 12254, 318, 4785, 1058, 262, 636, 810, 2147, 705,
82, 5836, 837, 393, 262, 636, 810, 1223, 705, 82,
5836, 220, 50256, 259, 995, 22041, 220, 50256, 548, 922,
11681, 5559, 220, 50256, 1169, 7110, 318, 2147, 475, 36741,
6816, 35478, 20954, 422, 923, 284, 5461, 837, 220, 50256,
1169, 2223, 318, 336, 346, 1513, 220, 50256, 261, 477,
43386, 220, 50256, 10594, 1064, 1310, 286, 1393, 287, 428,
2646, 837, 543, 318, 1690, 32950, 88, 290, 13455, 13134,
220, 50256, 1525, 1290, 262, 5290, 3807, 286, 262, 614,
220, 50256, 48937, 832, 837, 220, 50256, 3549, 621, 1194,
7559, 1266, 582, 10148, 17271, 416, 44889, 257, 7505, 3690,
428, 8258, 2646, 220, 50256, 270, 705, 82, 546, 2428,
749, 6490, 423, 284, 1986, 287, 4845, 290, 1312, 892,
326, 705, 82, 644, 1312, 8288, 546, 340, 1377, 262,
1103, 2428, 29779, 1022, 262, 14397, 290, 14897, 22992, 220,
50256, 11718, 274, 220, 50256, 672, 35260, 284, 262, 6224,
286, 428])) (tensor([ 270, 705, 82, 257, 23332, 290, 1690, 13891, 7002, 764,
220, 50256, 403, 2704, 8589, 4420, 30942, 290, 12111, 220,
50256, 47205, 514, 284, 2911, 326, 299, 16617, 318, 24357,
284, 21030, 257, 1688, 3451, 355, 257, 5068, 1865, 47602,
26479, 764, 220, 50256, 1169, 7205, 837, 24138, 837, 2647,
837, 13483, 45501, 290, 2128, 389, 477, 34328, 1813, 262,
3227, 705, 82, 38132, 567, 1957, 274, 764, 220, 50256,
270, 705, 82, 3105, 1377, 845, 837, 845, 3105, 764,
220, 50256, 16670, 49699, 351, 14733, 290, 257, 1178, 39980,
4135, 18105, 837, 262, 2646, 318, 257, 23056, 306, 2726,
804, 379, 1862, 1466, 764, 220, 50256, 64, 3360, 32460,
2646, 764, 220, 50256, 273, 1804, 938, 614, 705, 82,
5704, 351, 534, 409, 12, 22095, 764, 220, 50256, 5832,
466, 299, 470, 423, 284, 760, 546, 2647, 284, 9144,
262, 2646, 705, 82, 2562, 5146, 13516, 286, 10997, 290,
19661, 764, 220, 50256, 259, 3446, 9919, 2431, 837, 749,
286, 543, 3804, 355, 6364, 355, 611, 1312, 705, 67,
587, 5586, 12105, 319, 281, 45329, 29680, 837, 10451, 6885,
30895, 422, 37276, 284, 13665, 2584, 284, 10517, 26876, 764,
220, 50256, 1169, 46814, 2890, 13289, 286, 262, 5983, 1394,
262, 2646, 22804, 290, 1394, 262, 5386, 40112, 1513, 764,
220, 50256, 270, 2753, 257, 6283, 1611, 286, 37296, 1272,
284, 7030, 262, 18054, 286, 686, 4835, 329, 1706, 837,
281, 710, 285, 451, 64, 837, 304, 1018, 1734, 35783,
837, 290, 842, 1292, 67, 11555, 30686, 1559, 477, 287,
262, 976, 3807, 764, 220, 50256, 986, 262, 2646, 21046,
422, 257, 3092, 286, 14733, 357, 1223, 2622, 284, 5236,
503, 262, 3685, 1267, 2644, 220, 50256, 732, 6808, 329,
357, 537, 3301, 290, 279, 2518, 1267, 837, 772, 588,
606, 837, 996, 3737, 340, 705, 82, 281, 9942, 5699,
284, 26246, 764, 220, 50256, 10197, 9961, 3296, 481, 749,
1884, 407, 1064, 644, 484, 705, 260, 6095, 351, 5876,
790, 1110, 2162, 262, 3807, 16523, 1111, 5636, 2171, 290,
14733, 764, 220, 50256, 64, 18857, 837, 1029, 12, 45564,
863, 10530, 422, 773, 544, 326, 33954, 271, 3973, 32067,
2647, 837, 9280, 837, 3496, 837, 290, 1029, 10512, 764,
220, 50256, 1169, 10825, 389, 8246, 290, 481, 5587, 257,
16384, 351, 2687, 508, 705, 82, 1683, 550, 1641, 14649,
764, 220, 50256, 3885, 4364, 256, 265, 280, 468, 257,
47868, 329, 10868, 9176, 326, 7842, 1958, 607, 23077, 20024,
837, 290, 287, 428, 4187, 378, 48718, 10997, 837, 673,
705, 82, 355, 3329, 12, 4743, 652, 409, 18478, 415,
355, 673, 373, 287, 716, 2634, 14485, 764, 220, 50256,
986, 262, 3807, 318, 655, 257, 8631, 1468, 9234, 764,
220, 50256, 259, 663, 1266, 7188, 837, 22960, 257, 2089,
1029, 1524, 3227, 286, 35537, 837, 1231, 4414, 286, 3496,
764, 220, 50256, 79, 931, 5116, 2753, 281, 37959, 804,
379, 262, 29442, 286, 1964, 29409, 837, 475, 340, 857,
523, 351, 884, 281, 30690, 8216, 326, 345, 1239, 760,
618, 14733, 5645, 290, 13574, 6140, 764, 220, 50256, 1169,
4686, 7940, 375, 20374, 329, 1528, 532, 428, 655, 2936,
588, 340]), tensor([ 705, 82, 257, 23332, 290, 1690, 13891, 7002, 764, 220,
50256, 403, 2704, 8589, 4420, 30942, 290, 12111, 220, 50256,
47205, 514, 284, 2911, 326, 299, 16617, 318, 24357, 284,
21030, 257, 1688, 3451, 355, 257, 5068, 1865, 47602, 26479,
764, 220, 50256, 1169, 7205, 837, 24138, 837, 2647, 837,
13483, 45501, 290, 2128, 389, 477, 34328, 1813, 262, 3227,
705, 82, 38132, 567, 1957, 274, 764, 220, 50256, 270,
705, 82, 3105, 1377, 845, 837, 845, 3105, 764, 220,
50256, 16670, 49699, 351, 14733, 290, 257, 1178, 39980, 4135,
18105, 837, 262, 2646, 318, 257, 23056, 306, 2726, 804,
379, 1862, 1466, 764, 220, 50256, 64, 3360, 32460, 2646,
764, 220, 50256, 273, 1804, 938, 614, 705, 82, 5704,
351, 534, 409, 12, 22095, 764, 220, 50256, 5832, 466,
299, 470, 423, 284, 760, 546, 2647, 284, 9144, 262,
2646, 705, 82, 2562, 5146, 13516, 286, 10997, 290, 19661,
764, 220, 50256, 259, 3446, 9919, 2431, 837, 749, 286,
543, 3804, 355, 6364, 355, 611, 1312, 705, 67, 587,
5586, 12105, 319, 281, 45329, 29680, 837, 10451, 6885, 30895,
422, 37276, 284, 13665, 2584, 284, 10517, 26876, 764, 220,
50256, 1169, 46814, 2890, 13289, 286, 262, 5983, 1394, 262,
2646, 22804, 290, 1394, 262, 5386, 40112, 1513, 764, 220,
50256, 270, 2753, 257, 6283, 1611, 286, 37296, 1272, 284,
7030, 262, 18054, 286, 686, 4835, 329, 1706, 837, 281,
710, 285, 451, 64, 837, 304, 1018, 1734, 35783, 837,
290, 842, 1292, 67, 11555, 30686, 1559, 477, 287, 262,
976, 3807, 764, 220, 50256, 986, 262, 2646, 21046, 422,
257, 3092, 286, 14733, 357, 1223, 2622, 284, 5236, 503,
262, 3685, 1267, 2644, 220, 50256, 732, 6808, 329, 357,
537, 3301, 290, 279, 2518, 1267, 837, 772, 588, 606,
837, 996, 3737, 340, 705, 82, 281, 9942, 5699, 284,
26246, 764, 220, 50256, 10197, 9961, 3296, 481, 749, 1884,
407, 1064, 644, 484, 705, 260, 6095, 351, 5876, 790,
1110, 2162, 262, 3807, 16523, 1111, 5636, 2171, 290, 14733,
764, 220, 50256, 64, 18857, 837, 1029, 12, 45564, 863,
10530, 422, 773, 544, 326, 33954, 271, 3973, 32067, 2647,
837, 9280, 837, 3496, 837, 290, 1029, 10512, 764, 220,
50256, 1169, 10825, 389, 8246, 290, 481, 5587, 257, 16384,
351, 2687, 508, 705, 82, 1683, 550, 1641, 14649, 764,
220, 50256, 3885, 4364, 256, 265, 280, 468, 257, 47868,
329, 10868, 9176, 326, 7842, 1958, 607, 23077, 20024, 837,
290, 287, 428, 4187, 378, 48718, 10997, 837, 673, 705,
82, 355, 3329, 12, 4743, 652, 409, 18478, 415, 355,
673, 373, 287, 716, 2634, 14485, 764, 220, 50256, 986,
262, 3807, 318, 655, 257, 8631, 1468, 9234, 764, 220,
50256, 259, 663, 1266, 7188, 837, 22960, 257, 2089, 1029,
1524, 3227, 286, 35537, 837, 1231, 4414, 286, 3496, 764,
220, 50256, 79, 931, 5116, 2753, 281, 37959, 804, 379,
262, 29442, 286, 1964, 29409, 837, 475, 340, 857, 523,
351, 884, 281, 30690, 8216, 326, 345, 1239, 760, 618,
14733, 5645, 290, 13574, 6140, 764, 220, 50256, 1169, 4686,
7940, 375, 20374, 329, 1528, 532, 428, 655, 2936, 588,
340, 750]))