New features:
>No longer hosted on a literal raspberry pi
>You can upload from a URL without downloading the file
>You can tell me if the guess was correct or not
>The classifier is a bit better
The classifier has been improved somewhat but it's still a bit spotty. I think it's more likely to guess high than low. It has a hard time with western cartoons, smooth blocks of color, and the picture of Stallman from the sticky.
Datasets:
>/co/ about 3000
>/a/ about 3000
>imgur (aka reddit) about 3000
>imagenet about 10000
>safebooru about 12000
Together about 31000 images, which take up about 15 GB in a targz archive.
I got a bunch of images from /co/ hoping to get some western cartoon counterexamples in the dataset, but it turns out they like to dump entire pages of marvel-type comics. There's not enough of the cartoon type of comics, so it's weak there.
Why would you train it on /co/? Or are you just using /co/ to tell it what is not anime? Also, why not also use /c/, /w/, and maybe /e/ and /h/ as sources? There's lots of styles on those boards that aren't on /a/.
Landon Ramirez
>Or are you just using /co/ to tell it what is not anime?
That was the goal.
>Also, why not also use /c/, /w/, and maybe /e/ and /h/ as sources?
I could, Jow Forums is actually pretty easy to scrape. But I need not-anime examples now more than I need anime examples.
Samuel Bell
It's down.
Benjamin Baker
Are you sure? It's hosted on amazon and I can get to it from my end.
Wow, I need to test with more browsers, and probably ship the font too. That looks pretty broken.
That will be the safebooru effect. I'd like a dataset for older stuff, but that's hard to find.
yw user
CNN, built with keras.
Ethan Russell
I'm feeding it all the non-anime pics from my root picture folder. I'll loop back through to feed it the anime pics after.
Lucas Walker
>CNN, built with keras.
What I meant, more precisely: what layers, how large are they, what kernels are you using, what activations, etc.
Sebastian Taylor
I gave it a 7.8MB 3100x1900 image and it's stuck on classifying
Jordan Brown
Can you add a feature to allow batch-uploading? Maybe display a grid with small buttons under each one and what it guessed. Most of my pics are sorted into folders, so I could upload a touhou folder and know it's all anime, or my 3dpd folder and know none of it's anime.
I think AWS sets a 6 MB limit on the kind of requests I'm using. Having a large file like that doesn't matter anyway, it's resized to 512x512 before it gets classified regardless. If I knew how, I'd do that in the browser before upload.
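For what it's worth, the 6 MB cap bites earlier than it looks, since base64 inflates the payload by about a third. A quick stdlib sketch (the 6 MB and 7.8 MB figures are from the posts above; the helper function is just for illustration):

```python
def b64_size(n_bytes):
    # base64 encodes every 3 input bytes as 4 output bytes, padded up
    return 4 * ((n_bytes + 2) // 3)

limit = 6 * 1024 * 1024
original = int(7.8 * 1024 * 1024)  # the 7.8 MB image from the earlier post

encoded = b64_size(original)
print(encoded)           # about 10.4 MB once encoded
print(encoded <= limit)  # False: well over the request limit
```

So anything much past ~4.5 MB on disk is going to blow the limit once it's encoded.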
#! /usr/bin/python3
import requests
from base64 import b64encode
import json

# Load data as base64
with open('test.png', 'rb') as in_f:
    image_data = b64encode(in_f.read())

# You can just POST b64 to the endpoint and it will come back
r = requests.post(url, data=encode_png)
print(r.text)

# If you want to classify a URL, don't send data, but include "url" as a param
r = requests.post(url, params={'url': 'safebooru.org/includes/header.png'})
print(r.text)
# It will return with the b64 of the image you linked

# If you want to tell it what your image was, you send it with "classify" and "key" params
# The "key" param comes from the response of a classification
j = json.loads(r.text)
r = requests.post(url, params={'classify': 'anime', 'key': j['key']})
print(r.text)
# The response here doesn't really mean anything
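A rough client-side sketch of the batch-upload idea from earlier, built on the same endpoint (the folder walk and the injectable `post` hook are hypothetical; `url` is still your endpoint):

```python
#! /usr/bin/python3
import os
from base64 import b64encode

def classify_folder(url, folder, post=None):
    # POST every image in the folder to the classifier, one at a time,
    # and collect {filename: response text}. `post` defaults to requests.post
    # but is injectable so this is easy to dry-run without the service.
    if post is None:
        import requests
        post = requests.post
    results = {}
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith(('.png', '.jpg', '.jpeg', '.gif')):
            continue
        with open(os.path.join(folder, name), 'rb') as in_f:
            r = post(url, data=b64encode(in_f.read()))
        results[name] = r.text
    return results
```

Running a touhou folder through this and eyeballing the results would get you the "it's all anime" check without needing a grid UI server-side.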
Christian Sanchez
Spotted a couple bugs in the code there.
r = requests.post(url, data=encode_png)
Can be changed to
r = requests.post(url, data=image_data)
>variable names
Valid options for "classify" are "anime" and "notanime"
Why do you have, right before the dense layers, three conv2d layers with 5x5, 3x3, and 3x3 kernels feeding into one another?
Kevin Moore
Part of it is because I have no idea what's best practice when it comes to network design. I used a few heuristics to choose those numbers:
>I want to slowly build up depth
>Try to keep the amount of information roughly the same between layers
>Better to have a few smaller kernels stacked than a single larger one
Those layers have 100, 120, and 150 channels, and xy dimensions of 37, 35, 31, which work out to being about the same number of neurons.
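Those numbers check out arithmetically (channels and xy sizes taken from the post above):

```python
# channels and spatial (xy) size of the three conv layers before dense
layers = [(100, 37), (120, 35), (150, 31)]

counts = [channels * xy * xy for channels, xy in layers]
print(counts)  # [136900, 147000, 144150] -- all within about 7% of each other
```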
Why three of them stacked like that? Why a 5x5 after a bunch of 3x3s? I basically guessed that it would work out after looking at alexnet and inception.
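The "stack smaller kernels" heuristic does have a concrete justification: two stacked 3x3 convs cover the same 5x5 receptive field as a single 5x5 conv, with fewer weights. A quick comparison (the channel count is an arbitrary example, not from the actual network):

```python
c = 100  # example channel count, arbitrary

# weights per conv layer, ignoring bias:
# kernel_h * kernel_w * in_channels * out_channels
two_3x3 = 2 * (3 * 3 * c * c)  # two stacked 3x3 convs, same 5x5 receptive field
one_5x5 = 5 * 5 * c * c        # a single 5x5 conv

print(two_3x3, one_5x5)  # 180000 vs 250000 -- stacking saves 28% of the weights
```

The stacked version also gets an extra nonlinearity between the two convs for free.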
Luis Turner
Those are outdated in terms of deep learning. Read up on resnets and R-CNNs; it would improve your algorithm a lot. Also read up on siamese networks, since they're the best networks for checking similarity. That could move your project forward by also telling you which anime an image is from, or something similar, by training siamese networks on different series to recognize similar art styles. You can ask if you have any doubts. I have around 5 years of experience, so I can help you get a little clarity if you want.
John Murphy
Thanks for the advice! Residual networks / less sequential topologies are next on my list of things to learn. I might try shaking up the network a bit for the next round of training.
I had not heard of siamese networks before. They seem to be used for one-shot classification? That seems less than optimal for a binary classifier, but I'll look some more to see if there's something there.
My biggest question is, is there a good way to choose numbers and architectures besides just copying? I chose 7 convolutional layers because I felt like it, and that's a pretty weird way to program.
Anthony Wilson
>Thanks for the advice! Residual networks / less sequential topologies are next on my list of things to learn. I might try shaking up the network a bit for the next round of training.
It is one-shot learning, but you can add multiple (nested) layers of siamese networks to improve the accuracy. It is not more optimal than a binary classifier at all. If you want a better-performing model, a binary classifier like a softmax CNN or an SVM may be faster for large images of around 15-20 MB, but most of the time you won't get images at that size, and accuracy-wise siamese is far better.
>My biggest question is, is there a good way to choose numbers and architectures besides just copying? I chose 7 convolutional layers because I felt like it, and that's a pretty weird way to program.
It is obviously empirically determined most of the time, and more layers is usually better. But remember: more width (nodes) means more context, more height means finer details.