DiffWave audio samples

The following samples are generated from the pre-trained 22.05 kHz model available on our project page.

Seen-speaker samples

The model was trained on the LJ Speech dataset. Samples are generated from a held-out subset of the same dataset.

Columns: Text, Reference, DiffWave (T=50, C=64)
Printing, in the only sense with which we are at present concerned, differs from most if not from all the arts and crafts represented in the Exhibition
in being comparatively modern.
For although the Chinese took impressions from wood blocks engraved in relief for centuries before the wood-cutters of the Netherlands, by a similar process
produced the block books, which were the immediate predecessors of the true printed book,
the invention of movable metal letters in the middle of the fifteenth century may justly be considered as the invention of the art of printing.

Unseen-speaker samples

The model was trained on the LJ Speech dataset. Samples are generated from the VCTK dataset, whose speakers were not seen during training.

Columns: Text, Reference, DiffWave (T=50, C=64)
We lost our composure towards the interval, he said.
There were no immediate reports of injuries.
It was very full anyway.
The prime minister has a huge regard for Mo.
Since then, he has played no active part in the company.