Why the fuck is Python the de facto language in ML? Fucking annoying

Attached: nuralnet.png (1318x862, 67K)

Other urls found in this thread:

wordsandbuttons.online/outperforming_everything_with_anything.html
twitter.com/SFWRedditImages

Because you can learn it in 2 months even if you have an 85 IQ

because it's easy

for retards by retards

Because PhD data scientists don't know how to program.
The real default is Spark BTW, if you're running on a cluster. I usually port all the Python nonsense I get from Data Science to Scala before putting it in production.

because the people who use it aren't programmers, they are scientists

>all the mad salty ctards
Keep getting dabbed on, brainlets

Attached: 1525205529736945759.jpg (1452x1252, 355K)

I like python, it's a nice language for small projects.

Attached: 1514493487651.png (800x750, 106K)

import machine_learning as ml

ml.learn(data)
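Barely an exaggeration, to be fair. A real version is about five lines (a sketch assuming scikit-learn, using the digits dataset that ships with it):

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)                      # 8x8 digit images, flattened
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier().fit(X_train, y_train)   # the whole "learning" step
print(model.score(X_test, y_test))                       # held-out accuracy, roughly 0.97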

To keep brainlets like you out.

>Fucking annoying
Just wait for Python 3.7 support in GraalVM to become mature.

300k starting

you can rewrite libraries in C if you like. I’m sure you would get a big following if you did so.

It’s only the language for prototyping and showcasing algorithms. The final implementations are written in C++ for efficiency.
Really, Python is just the best middle ground for writing algorithms because its syntax is simple and widely understood. It beats looking at a pile of C++ or Java OOP generics vomit.

Because it's convenient, and the code you call is C or C++ anyway, so performance is not an issue either.
And you write a lot of code that is meant to be thrown away, so a language that lets you do stuff fast is convenient.
And many libraries are multilanguage, e.g. you create a model in Python, and can deploy it in a Java application. And even if you don't, Python is a general purpose language, so you can easily make a web service or whatever else.

Spark is not the default for ML; it is for data manipulation at scale, but garbage as it is, pandas is still top for single setups.
Never use MLlib unless you have to; H2O is better, so if you have the option, use it.

because the (((big tech companies))) want to lower the barrier and increase the pool of next-generation ML blue-collar workers, just like they did with web 2.0 and mobile devs 10 years ago

R is much easier to learn. Python is just better than all other languages

Python is to programming what Arduino is for electronics

Because with ML libraries it's literally a case of wordsandbuttons.online/outperforming_everything_with_anything.html: you're just building the ML AST.
So while it's still shit tier software engineering, performance isn't one of the problems.
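To make "building the AST" concrete: the Python you write only describes the computation, and the hot loop never touches the interpreter. A sketch of the idea, assuming TensorFlow 2.x:

import tensorflow as tf

@tf.function  # the Python body is traced once into a graph; later calls bypass Python
def mse_grad(w, x, y):
    with tf.GradientTape() as tape:
        tape.watch(w)                     # w is a plain tensor, so watch it explicitly
        pred = tf.linalg.matvec(x, w)     # graph ops, executed in C++/CUDA
        loss = tf.reduce_mean((pred - y) ** 2)
    return tape.gradient(loss, w)

w = tf.zeros([3], dtype=tf.float64)
x = tf.constant([[1., 2., 3.], [4., 5., 6.]], dtype=tf.float64)
y = tf.constant([1., 2.], dtype=tf.float64)
print(mse_grad(w, x, y))                  # gradient computed by the compiled graph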

It won't.
>b-but my performance
Ruby is as dynamic as Python, and GraalVM can improve its performance by roughly 8x on average (leaving out the trivial meme-number benchmarks for now), which is about the same as PyPy achieves.
Now, let's be generous and assume GraalVM manages to push that to 10x, likely at the cost of startup time and memory consumption.
Congrats, but 10x as fast as CPython/CRuby is still slow garbage.
Just give up on trying to make those retarded dynamic semantics fast, you fucks. Best you can do is write crappy DSL JIT compilers like Numba and hope that nobody will feel the cracks in your foundation.

Attached: 1300044776986.jpg (250x250, 10K)

>anyone in the world
>using spark
next you'll say people use hadoop for ML, top kek.

>why won't R&D companies value low level autism that achieves +10% processing speed over readable, quick to write languages that give actual results?

>data science
>low barrier
that is, if you copy-paste code without knowing the mathematical background. It is all nice and cool until something breaks; if you don't have a solid background in statistics, you won't survive long in the field.

Increasing the worker pool is the whole point of open-sourcing ML frameworks in the first place. Of course you need a solid background, just like in any other field, but sooner or later newcomers will come from everywhere, digging for the gold they heard about from somebody else. Then, like in every software field, you'll find yourself competing with that many more devs, and the salary won't be very high, because under globalism it's always a race to the bottom on prices, labor included.

Echoing the prototype argument, outside of that you have caffe, darknet and others in c++ for deep learning. I'm not aware of how well built out data munging and manipulation libraries are in languages outside python, r and scala. But I'd rather run on a slower language and have properly coded data operations than have a quick language where the functions aren't properly coded.

I'm not talking about the performance here. It's more about being able to use those libraries in your language of choice without actually touching that stuff too much with your bare hands.

what a shit picture I can draw better than that

Attached: 1540742133662.jpg (400x400, 151K)

I am already in the field and I have seen the performance of the beginners. Data science might be "easy" to get into, but if you want to do anything beyond linear regression and basic logit then you need a lot of creative thinking. To be honest I have only a faint idea about software engineering and web development. But data science is the kind of field that requires extensive abstract thinking and the ability to convert real life problems into xyz.
What makes it different from occasional programming (in my opinion anyway) is that it is quite easy to verify the results by using validation sets and looking at the F1 scores, Hamming loss, etc., so one cannot fake their way through it.
just my $0.02

And people will get what they pay for: a shitty webdev can make something that probably works, and when it fails it just visibly doesn't work, so an average person can validate it for quality.
What will probably happen is that cloud providers will just provide the automated ML solutions for companies so that the same developers can just grab something and write off poor performance as it's "just AI, it's not perfect".

The most obvious larp I've ever seen. Meanwhile 99.9999% of the field is pure scam and overfitting the validation set. "you should independently sort the dependent and independent variables, it always gives me better predictive value" t. average data scientist.

>data """science""" is not for brainlets

Attached: 1537423693844.jpg (216x234, 10K)

Just works

I'm in an ml class right now. One of my friends is an engineering major and is taking it with me. Our first assignment was a network of perceptrons that could recognize digits 0-9.
I wrote mine in python, and it was less than 100 lines of code.
He wrote his assignment in C++, and it blew up to 500 lines of code, and it ended up running slower than my script.
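For scale: the mistake-driven core of a one-vs-rest perceptron layer fits comfortably in ~15 lines of numpy. A sketch (not the actual assignment; it assumes the digits come in as flattened feature vectors):

import numpy as np

def train_perceptrons(X, y, classes=10, epochs=20, lr=0.1):
    # one perceptron per digit; X is (n_samples, n_features), y holds labels 0-9
    W = np.zeros((classes, X.shape[1]))
    b = np.zeros(classes)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = np.argmax(W @ xi + b)
            if pred != yi:              # classic perceptron rule: update only on mistakes
                W[yi] += lr * xi; b[yi] += lr
                W[pred] -= lr * xi; b[pred] -= lr
    return W, b

def predict(W, b, X):
    return np.argmax(X @ W.T + b, axis=1)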

Python has a bunch of libraries that make parsing and preprocessing the data a breeze, plus libraries for doing the vector and matrix math efficiently.
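e.g. a typical load-clean-normalize pass is a handful of lines (a sketch; "data.csv" and the "label" column are made-up placeholders, assuming pandas/numpy):

import pandas as pd

df = pd.read_csv("data.csv")                    # parsing handled for you
df = df.dropna()                                # quick-and-dirty cleaning
y = df["label"].to_numpy()
X = df.drop(columns=["label"]).to_numpy()
X = (X - X.mean(axis=0)) / X.std(axis=0)        # vectorized standardization, zero Python loops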

Because it's concise and fast; you can use whatever C reskin you want

It's all C/C++ underneath anyway. Python is just used as glue so it pretty much doesn't have any performance penalties when it comes to ML.
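Easy to check yourself: the interpreted loop is the only slow part, and real ML code never runs one over the data. A rough timing sketch (exact ratio is machine-dependent):

import time
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

t0 = time.perf_counter()
slow = sum(x * y for x, y in zip(a, b))   # element-by-element in the interpreter
t0 = time.perf_counter() - t0

t1 = time.perf_counter()
fast = a @ b                              # one call into compiled BLAS
t1 = time.perf_counter() - t1

print(t0 / t1)                            # typically a 100x+ gap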

>overfitting the validation set.

You don't know what that word really means.

>What will probably happen is that cloud providers will just provide the automated ML solutions for companies so that the same developers can just grab something and write off poor performance as it's "just AI, it's not perfect".
Already happening, look at Azure ML or its equivalents in Amazon and Google. They are basically GUI toys and buttons that any average marketer can use. But again, when the results are poor, there is not much that Automation (TM) can do for you.
There is simply no such thing as one size fits all in data science.

How, of all things, would the Java ecosystem help in this field?

pyspark is pretty decent desu. especially before datasets were introduced.
jupyter notebook running python2 kernel + emr cluster is sick.

It's the easiest and has no performance drawbacks.

Literally what other language would you use

I forget what the Hadoop alternative people actually use is called

It really isn’t but a lot of sjw’s get affirmative actioned into it.

Currently working under a libshit feminazi professor just to get into the recommendations loop - hiding pwr lvl

Why are the brainlets so salty about ML?
All the pretending that it's easy, that it's going to be outsourced, just because it's something they didn't get into.

I'll never understand magical thinking.
As a non-brainlet, I downplay my field and am jealous of other fields (grass is greener, etc.). But every single mong always compensates for his own inadequacies with
>b-but others will fail! you'll see! AI winter soon!
Why do that? You'll inevitably fail and feel bad about predicting things that are optimistic for you and pessimistic for your "enemies".

Attached: 1525231954125.jpg (1663x791, 538K)

They don’t realize AI really isn’t AI, it’s programmer intelligence. “AI” is nowhere near what the name implies.

>pasting python libraries is hard

Fuck off, the future will follow a concept I thought of, which is the best:

Meta-Indian Machine Learning

Basically you hire hundreds of cheap pajeet 'data scientists', choose the best models based on a fitness function, tell them to improve the model, and choose again. It's Indian natural selection: an algorithm that makes algorithms.

>Why are intelligent people so salty about ML?
FTFY.
>software bug, resulting in roughly 10k car crashes
>"Just a software bug, shit happens and we were haxxored by le evil russians."
Expect things to shift even more in this direction because ML "can't be programmed, only taught and corrected afterwards".

Not Java itself, but something like Scala or even Clojure, considering the burning desire of the whole industry to add moar cores.

Attached: tr.jpg (960x643, 67K)

To be fair but not exactly.
What people typically call 'data science' is the most brainlet shit in the world (but still nets you ridiculous salaries so there's nothing wrong with going for it).
Real (i.e. academic) data science is serious business but nobody knows what they're doing so a "smart brainlet" can easily dominate there.
Cutting-edge learning systems research (RL, DL, next gen approaches) is serious business but nobody really cares about these and you can't even get started without a PhD.

It's not. I'm doing a course on Automatic Speech Recognition right now and it's kicking my fucking ass. It's all in C++, fuck I wish it were in python. It's so fucking easy to make mistakes in C++ and the compiler doesn't help you at all with run time errors.

>python
not so bad, just ugly

But other software works in exactly the same way; it experiences the same error classes, just at a much higher rate. It's true, though, that it's harder to test edge cases in AI than in other methods, because you can't cleanly establish the edges of the result space.

Larp harder, pajeet.

Surprisingly there is only one Indian in the course. The rest are German and Russian.

Thank you for indulging my request. That was unexpected. Also very funny.

So far not a single intelligent person in this thread has expressed anything but support for ML.

Masters for data analytics is more than fine. PhD might get you overqualified for some things.

i think you mean Matlab®

Data analytics is codewords for "I give you 100k user queries and you manually label each of them into the closest out of these 10k user intents so that our data engineers can copy-paste a model and learn on this data".

it's because most "data science" work is quick, dirty exploration of data and trying different shit to see what works best for that data set, so asspained software engineering practices with some "enterprise" language like java or c++ are only necessary at the later stages, if at all

even academic shitheads are switching to julia, matlab is fuckin dead and good riddance

I meant data analytics in the broadest sense of the phrase. Including pattern recognition, ai, ML, data mining.

It was enjoyable, but its closed nature finally started to strangle it. It could compete with R decently, but not with what's come since.

/thread

anyone who chooses anything but the easiest viable implementation of a solution is an idiot, and anyone who does it on a company's dime should be fired for it.

>he thinks ml is actually implemented in python

more like python is used as a meta language to describe models that are implemented in a more performant language

imagine being a data scientist/ml engineer and mucking around with manual memory management while your ACTUAL goal is to rapidly prototype and develop models. This may have been the case 10 years ago, but you will not be writing more performant code than an optimized tensorflow model by yourself

and guess what: if you're asking these questions in a butthurt manner (which you are), then you're either 1) a brainlet or 2) someone with no professional experience

Attached: 1529879949711.png (413x374, 199K)

It's much too broad to put it all together under one umbrella, and data analytics would not be the right label for it anyway.

t. clinically retarded

fuck, i need to move to america. starting for a software dev in the uk is 25k gbp and max is like 60-70k gbp

Data niggery is not software faggotry. That said, both are grossly underpaid EU-wide compared to north america as a whole.

Holy fuck are you even 18?
You better be underage, because the only alternative is that you're a NEET manchild.

I like how a mathematician recently looked at the ML pytards' work and said "This is just ordinary differential equations, but less efficient." Now, in an attempt to look less stupid for wasting years of effort reinventing well-known maths, they're calling it ODE and trying to hype it as the next big thing in neural nets.

Attached: thats_the_joke.jpg (500x367, 34K)

Butthurt underageb& redd!tard is butthurt

Just because you didn't understand anything that was going on doesn't mean your magical fairyland ideas are in any way relevant to reality.
Sorry you dropped out of high school, but that doesn't make you an "underachieving genius" or whatever other meme you think you are.

Wrong

>I learned C, so I'll use it for everything
Fucking Jow Forums

this

This, but also python is outrageously unsuitable for ML. Loops are often used to glue the output of tools together and the slowness can most definitely be seen there, but writing C modules for python is hell so it's never worth it.
The lack of interactivity with python environments causes retarded amounts of wasted work when a single typo somewhere loses 2 hours of work.
The syntax itself greatly facilitates impossible to find errors hiding in the code, ready to blow up in your face, while simultaneously making it impossible for tools to automatically refactor code.
All of these are actual legitimate real-world concerns in ML.
On the other hand, compiled languages would never be suitable because they would just exacerbate one of python's own problems: the aforementioned lack of interactivity. And C would be even worse, making memory errors a big issue, since ML involves manipulating massive amounts of data and is hence much more susceptible to these kinds of problems.

Machine Learning is mostly used by Zoomers who can barely Program

>python is outrageously unsuitable for ML
Numpy
>Loops are often used to glue the output of tools together and the slowness can most definitely be seen there
Numpy
>writing C modules for python is hell so it's never worth it
But you don't need that because numpy and numba exist
>The lack of interactivity with python environments
Jupyter
>The syntax itself greatly facilitates impossible to find errors hiding in the code
The syntax does the opposite
The only thing that really makes it more error-prone is the dynamic typing
>All of these are actual legitimate real-world concerns in ML
1. Speed is irrelevant, because ML is research first, polish and deploy later. Once you need speed, you change the slow Python loops to proper numpy/numba or even C module stuff, and then it's fast (sketch at the end of this post).
2. Interactivity with jupyter is top tier, and it automatically "documents" what you did and what the results were the last time you ran this shit. This is one of the reasons python is so good for ML.
3. Research - most ML work - doesn't rely on code quality as much, because you'll be throwing a lot away

So not even one of your points is good.
You could replace your entire post with
>Python is shit because it doesn't force static types like Java
and it would be a much better post.
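And to make the numpy/numba point concrete, here's the kind of loop you'd never leave in plain Python (a sketch, assuming numba is installed; @njit compiles it to machine code on first call):

import numpy as np
from numba import njit

@njit
def pairwise_dist(X):
    # naive O(n^2 * d) loops, but compiled, so they run at native speed
    n, d = X.shape
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(d):
                diff = X[i, k] - X[j, k]
                s += diff * diff
            out[i, j] = s ** 0.5
    return out

print(pairwise_dist(np.random.rand(100, 3)).shape)  # (100, 100)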

If you knew anything about ODEs and neural nets, you'd instantly know that's bullshit.

Neural nets and other kinds of ML are function minimization on steroids. You basically craft a function that describes how wrong your ML algorithm's output is, and the algorithm learns the parameters that, given an input, will give you the output that is statistically the least wrong.
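That whole paragraph is about ten lines of code. A toy sketch, assuming numpy (the linear model and the numbers are made up for illustration):

import numpy as np

def fit_linear(X, y, lr=0.1, steps=1000):
    # loss(w) = mean((Xw - y)^2) measures how wrong the model is;
    # gradient descent walks w toward the least-wrong parameters
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        err = X @ w - y
        w -= lr * (2.0 / len(y)) * (X.T @ err)   # gradient of the mean squared error
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, 3.0, 5.0])
print(fit_linear(X, y))                          # converges to about [2, 3]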

>pandas is still top for single setups.
Aww, lookie.
He still does ML on a single node.
Call me when you graduate college, I'll give you a job cleaning my bathrooms.

I figured out why brainlets hate ML:
It's because it doesn't have an obvious solution that is 100% true.

Brainlets are terrible with ambiguity. They sperg out whenever someone doesn't agree with them on terms in the dictionary.
Since ML approximates "truth" and doesn't give hard rules "this is what is", "this is what is not", low IQ types see it as useless and wrong, because it doesn't let them reason about the results.
Statistical models require thinking in a nearly continuous space of ifs and elses, and this kills the dumbfuck.

Now, almost no non-trivial program really gives a 100% certain answer on all the inputs, but brainlets don't know that since they don't actually program for a living.

Attached: 1545260624383.jpg (1149x891, 139K)

the cars would be trained BEFORE they get released to the market, you fucking spastic

Yup, recent PySpark actually keeps pace with Scala on the dataframe API.
Big difference is, Scala is compiled ahead of time, so it will throw an error right away instead of three hours into runtime.
A tiny annoyance, but it saves a lot of testing time

most of the numerical libraries are c or fortran

you can learn to print hello world in 2 months, but you can't learn the language in 2 months

i have an iq of -150

t. never even looked at python, let alone used it, let alone ml.

>ML
>cpu
My fucking sides!

He's trying to brainletspeech about neural ODEs, one of the best paper award winners at NeurIPS 2018. The problem is that the only thing he knows about neural ODEs is the paper's title.
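For anyone who wants more than the title: a residual block computes x + f(x), which is exactly one explicit Euler step of dx/dt = f(x), and the paper's move is to treat depth as continuous time and hand the integration to an ODE solver. A toy sketch of the Euler view, assuming numpy (W is a random stand-in for learned weights, not anything from the paper):

import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(4, 4))   # toy "layer" weights

def f(x):
    return np.tanh(W @ x)           # the learned dynamics dx/dt = f(x)

def resnet_like(x, depth=10):
    h = 1.0 / depth                 # step size; depth plays the role of integration time
    for _ in range(depth):
        x = x + h * f(x)            # residual block == one explicit Euler step
    return x

print(resnet_like(np.ones(4)))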