Could there be such a thing as pic related but with audio upscaling? That would be cool.

Attached: waifu2x.png (520x500, 189K)

Other urls found in this thread:

people.xiph.org/~jm/demo/rnnoise/
github.com/andabi/deep-voice-conversion/
svp-team.com/wiki/Main_Page
files.catbox.moe/w226b5.webm
files.catbox.moe/44o7f8.webm
github.com/aaaboypop/N4A-V2/releases/tag/v0.1
github.com/aaaboypop/N4A-V2/releases/tag/v0.3
github.com/aaaboypop/N4A-V2/releases/tag/v0.5

I'm quite sure 3 letter agencies have tools for reconstructing sounds.

There could be if you made one.

It would make no practical sense because audio files usually have sample rates that cover nearly the entire human hearing range, so there's nothing to make up for. Furthermore, it's impossible to reconstruct higher frequencies by just having lower ones. Lastly, the equivalent of your anime image would most likely be 4-track chiptune music.

> it's impossible to reconstruct higher frequencies by just having lower ones.
Actually, if I understand correctly, this is a common technique in newer lossy codecs to save even more space: HE-AAC calls it Spectral Band Replication (SBR), and Opus calls it band folding.
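A toy sketch of the band-replication idea, assuming NumPy. This is not the actual HE-AAC or Opus algorithm: the band layout and the decaying energy envelope here are made up, whereas a real codec transmits the per-band envelope as side information and works on a proper filterbank rather than raw FFT bins.

```python
import numpy as np

def crude_sbr(spectrum, cutoff_bin, n_bands=4):
    """Toy spectral band replication: fill the empty bins above
    `cutoff_bin` by copying the top of the low band and rescaling
    each copied band to a target energy. The decaying 0.5**(b+1)
    envelope is invented; a real codec would transmit it."""
    out = spectrum.copy()
    n = len(spectrum)
    band = (n - cutoff_bin) // n_bands
    for b in range(n_bands):
        lo = cutoff_bin + b * band
        hi = lo + band
        src = spectrum[cutoff_bin - band:cutoff_bin]  # top of the low band
        src_energy = np.sum(np.abs(src) ** 2) + 1e-12
        target_energy = src_energy * 0.5 ** (b + 1)   # made-up envelope
        out[lo:hi] = src[:hi - lo] * np.sqrt(target_energy / src_energy)
    return out

# A "low-passed" spectrum: energy only below bin 64
spec = np.zeros(256)
spec[:64] = np.random.default_rng(0).standard_normal(64)
filled = crude_sbr(spec, cutoff_bin=64)
```

The point of the sketch is only that the decoder can fake the high band from the low band plus a few energy numbers, which is why "reconstructing higher frequencies from lower ones" is not entirely hopeless.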

I'm no expert, but think about it. Images like this follow patterns, so it's (relatively) easy to train algorithms to fill in the blanks. The sound lost to compression or low-quality equipment could basically be anything, though. You might be able to refine what you already have (but I don't really know, honestly), but I doubt you could ever train an algorithm to recover lost sounds.

Those encodes must be storing some side data to recreate the frequencies accurately. But with OP's way, there would only be downsampled data with no accompanying information about how the original source sounded. These are essentially different situations.

There are audio filters that in theory try to remove compression artifacts, but going from 8kHz to 32kHz is absolutely alien.

Exciter?

Interpolation is already standard in music players.
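That kind of interpolation can be sketched in a few lines, assuming NumPy. This is plain linear interpolation, which fills in values between known samples but invents no new frequency content; players use better filters (windowed sinc), but the idea is the same.

```python
import numpy as np

def upsample_linear(samples, factor):
    """Naive linear-interpolation upsampling: place `factor - 1`
    interpolated samples between each pair of input samples."""
    samples = np.asarray(samples, dtype=float)
    old_idx = np.arange(len(samples))
    new_idx = np.linspace(0, len(samples) - 1,
                          (len(samples) - 1) * factor + 1)
    return np.interp(new_idx, old_idx, samples)

# New samples land halfway between the originals
print(upsample_linear([0.0, 1.0, 0.0], 2))
```

Note the contrast with ML upscaling: interpolation only smooths between what's there, while a trained model tries to guess content that was never in the input.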

But yeah, you could theoretically improve audio quality using ML. It would be extremely difficult to train properly, though, and would require a lot of high-quality samples of similar songs.

>Furthermore it's impossible to reconstruct higher frequencies by just having lower ones.
Not true; that's basically the same as saying it's impossible to upscale images, because that too creates new data that wasn't part of the input. The way it works is that the neural network has learned how anime drawings tend to look, so it can guess what the extra data should be from the incomplete input. Similarly, an audio version could learn how certain instruments sound in lossless form, and then fill in the missing frequencies of the instruments it thinks it's hearing, based on the frequencies that are in the input.

Super-resolution of images is actually relatively easy to train because you can just downscale hi-res images to get an input/output pair. You could do the same thing for audio, taking high-quality recordings and making low-quality transcodes of them.
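A sketch of that pair-generation idea, assuming NumPy. The degradation here is a crude FFT low-pass standing in for the transcode; a real pipeline would run an actual lossy encoder or resampler on the clip.

```python
import numpy as np

def make_training_pair(hires, factor=4):
    """Degrade a high-quality clip to build an (input, target) pair,
    the audio analogue of downscaling images for super-resolution.
    The degradation is a crude low-pass via FFT truncation."""
    spec = np.fft.rfft(hires)
    cutoff = len(spec) // factor
    spec[cutoff:] = 0                      # throw away the top frequencies
    lowres = np.fft.irfft(spec, n=len(hires))
    return lowres, hires                   # network input, training target

rng = np.random.default_rng(0)
clip = rng.standard_normal(1024)           # stand-in for a real recording
x, y = make_training_pair(clip)
```

The network would then be trained to map `x` back to `y`; the open question the thread is arguing about is how well that generalizes beyond the kind of material it was trained on.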

It's mostly a problem of structuring the network so that it has the right features to work with. I don't know what the state of the art for audio networks is, but I imagine the network structures used in audio synthesis or translation would be good.

There are already NNs trained to remove noise: people.xiph.org/~jm/demo/rnnoise/

and there's audio-to-audio voice style transfer: github.com/andabi/deep-voice-conversion/

Understand that this is upscaling, not reconstruction.
Two very different things. All waifu2x does is guess what pixels should be where, having seen thousands of shitty 2D girls before.

No information is actually being recovered; priors learned from previous examples just create the illusion that something was reconstructed.

I.e. waifu2x won't magically put back a deleted pixel, it will just guess what pixel should go there. The same can be done with sound; it's all just numerical data to a computer.

Upscaling looks like shit. I don't know why the tech is being pursued as if it will be comparable to a native image of the same resolution.

makes shitty images usable, makes older images usable etc.

I've used gigapixel a few times for printing low res images at a good size and it works well.

Attached: keiku-edit.jpg (5000x5000, 3.47M)

Why would you want cleaner audio files?
Now, AI that could draw the obnoxious in-between frames in animation, that would be GOAT.

that it would

a cs project in the making...

Attached: smooth silky sally1.webm (640x480, 2.78M)

I don't care about the upscaling, but waifu2x is pretty good at removing JPEG compression artifacts (and you can't always find the lossless original of an image).

SVP?

Why doesn't adobe incorporate waifu2x into photoshop

Or webp

FUCK YOUR INTEGER SCALING.

it's niche and limited, gigapixel is better for general use

why doesn't adobe do their own ai based upscaling

probably easier to just buy topaz

why bother when you can just wait and buy the winner

aye, I think it was

Attached: sally2.webm (450x338, 2.95M)

Use a vector tool.

Attached: 21123.jpg (1440x810, 173K)

what is this magic? how to make my animu this smooth?

read above.

Same here.
2 licenses for convenience.

svp-team.com/wiki/Main_Page

does it work with MadVR?

The idea is there, and people are researching it. Ultimately, encodings of sound, images, and video are all the same kind of algorithm working on a noise problem.

Welcome to 2013

Yes.
It's actually pre-configured with it if you pick it during the setup.
You can choose from mpv, mpc-hc and some others I forgot during the install, so you're not limited to mpc-hc.
It also comes with SVPTube, which does the same thing for media players in browsers, and has a built-in downloader so you can back up your still-not-copyright-infringed videos.

>implying I haven't used SVP since that time.
Hey at least you'll get a (You) now.
Be grateful.

Be careful, user. The smoothness comes at a price.

Attached: example.png (640x480, 311K)

It's a small price though.

Dragon Ball
files.catbox.moe/w226b5.webm

Drive
files.catbox.moe/44o7f8.webm

Attached: 1429703044472.png (672x434, 143K)

For slow pans where animation frames repeat, this is definitely ideal.

where was this shit all my life? just installed the evaluation version and god, it is mindblowing. how do i pirate it? because fuck them for only letting lelnux users have it for free.

Depends on your taste. I'd rather take judder than glitchy frames.
Also, congratulations. You're the first person in a long time to post example clips that actually look decent. Usually people try to sell motion-based interpolation by posting the worst and most glitchy clips imaginable.

My sound card from 2006 could upscale with something called the X-Fi Crystalizer. It was like anti-aliasing for sound waves, I think, but I never noticed a change.

waifu2xing an entire anime when?

Not unheard of. It just takes a while.

K-on 4k rerelease

Looks retarded, and the motion-blurred area of the image doesn't catch up.

How can quads be so wrong?

I am right

In fact I am an expert at 60 fps, and I can tell whether a video has been smoothed or not.

Why are windows users so poor and entitled?

>upscaling
if by that you mean "freshening up" the audio quality, then the answer is no.
if you have a lossy source you can re-encode it into a lossless format, but with the source material being lossy, the outcome will still be lossy.

github.com/aaaboypop/N4A-V2/releases/tag/v0.1
+
github.com/aaaboypop/N4A-V2/releases/tag/v0.3
+
github.com/aaaboypop/N4A-V2/releases/tag/v0.5

Run the Video -> Image work first.
Then Waifu2x Vulkan to convert the images.
Then finally Image->Video.

I've already converted a few of the low-res hentai anime myself and made them extra fap-worthy.
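The three steps above could be scripted roughly like this. The file names, frame rate, scale factor and encoder settings are placeholders to adjust for your source, and waifu2x-ncnn-vulkan is assumed to be the Vulkan build being referred to; this just prints the commands rather than running them.

```python
# Sketch of the Video -> Images -> Waifu2x -> Video pipeline
# using ffmpeg and waifu2x-ncnn-vulkan from the command line.
steps = [
    # 1. Video -> images
    ["ffmpeg", "-i", "input.mkv", "frames/%06d.png"],
    # 2. Waifu2x (Vulkan) on the extracted frames, 2x scale
    ["waifu2x-ncnn-vulkan", "-i", "frames", "-o", "upscaled", "-s", "2"],
    # 3. Images -> video, muxing the original audio back in
    ["ffmpeg", "-framerate", "24", "-i", "upscaled/%06d.png",
     "-i", "input.mkv", "-map", "0:v", "-map", "1:a",
     "-c:v", "libx264", "-crf", "18", "output.mkv"],
]
for cmd in steps:
    print(" ".join(cmd))
```

Feed each list to `subprocess.run` (or paste the printed lines into a shell) once the paths match your setup.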

>a waifu2x frontend written in AHK
Now I've seen everything.

They have their own version called Preserve Details 2

Attached: Nausicaa v02p127.png (2800x2048, 1.54M)

Attached: ftfy.png (520x500, 221K)

Based. I'm going to try that with Bible Black.

because there's no native image at the same resolution, and waifu2x sure looks better than nearest neighbor

Audio signals are normally very redundant, both in the mathematical sense and even more so to the brain of a human listener. This gets more pronounced as you add more channels, each giving more knowledge about the same sounds repeated on every channel.
I don't understand what upscaling means in the audio sense. Quantization error is not a serious issue for 16-bit audio, only for lossy codecs.

It's meant as an approximation of the harmonics and noise normally present. The higher audio frequencies are expensive in terms of data. A rough approximation of the energy (squared amplitude) in each of the filter bands is used to scale the harmonics and shape the noise in the filterbank.

DSEE HX

Ain't seen nothing until a brainfuck implementation appears.

yeah, but it would only improve the quality of sounds similar to the training data, while most likely distorting everything else.
in other words, you can't expect to feed in any 128kbps pop song and get out the same song sounding like 320kbps.

Attached: aiportraits_1563521935.jpg (1024x512, 89K)