import collections.abc
from pathlib import Path
from typing import Union

import numpy as np
from typeguard import check_argument_types

from espnet2.fileio.read_text import load_num_sequence_text


class FloatRandomGenerateDataset(collections.abc.Mapping):
    """Generate random float arrays whose shapes are read from shape.txt.

    Examples:
        shape.txt
        uttA 123,83
        uttB 34,83
        >>> dataset = FloatRandomGenerateDataset("shape.txt")
        >>> array = dataset["uttA"]
        >>> assert array.shape == (123, 83)
        >>> array = dataset["uttB"]
        >>> assert array.shape == (34, 83)

    """

    def __init__(
        self,
        shape_file: Union[Path, str],
        dtype: Union[str, np.dtype] = "float32",
        loader_type: str = "csv_int",
    ):
        assert check_argument_types()
        shape_file = Path(shape_file)
        self.utt2shape = load_num_sequence_text(shape_file, loader_type)
        self.dtype = np.dtype(dtype)

    def __iter__(self):
        return iter(self.utt2shape)

    def __len__(self):
        return len(self.utt2shape)

    def __getitem__(self, item) -> np.ndarray:
        # Draw standard-normal samples with the stored shape, then cast to dtype.
        shape = self.utt2shape[item]
        return np.random.randn(*shape).astype(self.dtype)


class IntRandomGenerateDataset(collections.abc.Mapping):
    """Generate random int arrays whose shapes are read from shape.txt.

    Examples:
        shape.txt
        uttA 123,83
        uttB 34,83
        >>> dataset = IntRandomGenerateDataset("shape.txt", low=0, high=10)
        >>> array = dataset["uttA"]
        >>> assert array.shape == (123, 83)
        >>> array = dataset["uttB"]
        >>> assert array.shape == (34, 83)

    """

    def __init__(
        self,
        shape_file: Union[Path, str],
        low: int,
        high: Union[int, None] = None,
        dtype: Union[str, np.dtype] = "int64",
        loader_type: str = "csv_int",
    ):
        assert check_argument_types()
        shape_file = Path(shape_file)
        self.utt2shape = load_num_sequence_text(shape_file, loader_type)
        self.dtype = np.dtype(dtype)
        self.low = low
        self.high = high

    def __iter__(self):
        return iter(self.utt2shape)

    def __len__(self):
        return len(self.utt2shape)

    def __getitem__(self, item) -> np.ndarray:
        # Draw integers uniformly from [low, high); if high is None, from [0, low).
        shape = self.utt2shape[item]
        return np.random.randint(self.low, self.high, size=shape, dtype=self.dtype)