Intel Itanium

What went wrong, lads?
Is x86 an unexorcisable evil?

Attached: Itanium.png (800x800, 337K)

Other urls found in this thread:

blog.ensilo.com/one-bit-to-rule-them-all-bypassing-windows-10-protections-using-a-single-bit
twitter.com/NSFWRedditImage

IA-64 wasn't Intel's first effort at killing x86, they'd previously tried with i860 and iAPX 432.

Anyway Itanic's downfall was that it tried to punt the complicated branch-prediction and speculation hardware off to the compiler. An IA-64 instruction word is three instructions and some extra info. Each of those instructions gets fed to a different unit, and they can't depend on each other. The compiler has to work this out ahead of time - inserting nops if it can't prove that the instructions don't depend on each other. The pipelines are programmer-accessible instead of being an internal implementation detail, so the compiler has to take care of instruction ordering and branch prediction too.
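To make the dependency point concrete, here's a tiny C illustration (a hypothetical function, purely for the shape of the problem, not real Itanium output):

/* The two adds below don't depend on each other, so an IA-64 compiler
   could put them in the same instruction group and they'd issue to
   different units in parallel. The multiply needs both results, so it
   has to go in a later group - and if there's nothing else to fill the
   leftover slots, the compiler pads them with nops. */
int bundle_demo(int a, int b, int c, int d)
{
    int x = a + b;   /* independent of the next statement */
    int y = c + d;   /* independent of the previous one   */
    return x * y;    /* depends on both x and y           */
}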

Now when Itanium started development around '93 or so these were considered to be solvable problems, but it turns out they were much less solvable than thought. Later Itaniums started doing the same tricks that x86 was starting to do in the mid-90s and does an awful lot of now, kinda killing what was supposed to be the architecture's unique feature. By then the damage had been done.

Intel had this idea that they were gonna buy themselves time around the turn of the millennium by not extending x86 to 64 bits - so if you needed 64-bit you'd have to go IA-64. Then AMD came along with the backward-compatible x86-64 and put paid to all that. x86-64 also fixed some of x86's warts - it's much less register-starved, for instance, and a few old 16-bit instructions were dropped (there are x86 instructions supported in x86-32 that aren't in x86-64).

One thing IA-64 did do was kill off some high-end RISC players, HP's PA-RISC and Alpha, for instance. But Intel's x86 arm was starting to eat their lunch and trap them at the high end as early as '95 with the Pentium Pro. Turns out it ate the Itanium team's lunch, too.

>An IA-64 instruction word is three instructions and some extra info. Each of those instructions gets fed to a different unit, and they can't depend on each other. The compiler has to work this out ahead of time - inserting nops if it can't prove that the instructions don't depend on each other. The pipelines are programmer-accessible instead of being an internal implementation detail, so the compiler has to take care of instruction ordering and branch prediction too.
This sounds ritualistic as shit. And very inefficient in the long run.

Massively overhyped, endlessly delayed, and when it finally came out it underperformed - the hardware x86 emulation was worse than software, and it was ultimately just yet another family of reasonably performing but ridiculously expensive parts for HPC and enterprise systems, with no tangible edge whatsoever over commodity x86 products in the low end and mid ranges.

By the mid 2000s, processor architecture was nothing more than a sticker on the case, and unlike the big-name RISC/SysV platforms that usually had their own unique experiences and software bases to keep people coming back, a new Itanium machine with Linux or Windows didn't do much that you wouldn't be doing on a PC at a quarter of the price in six more months.

Great post, I don't really see the i860 as an x86 replacement attempt, though. Just a different chip for a totally different market that x86 wasn't really ready to tackle yet.

Several RISC architectures had ideas like that, Intel just took it way too far with IA-64. For instance, x86 has a ton of ways you can use instructions. Say you want to add two numbers. In x86 you can add a register to a register, an immediate to a register, the contents of a memory location to a register, an immediate to the contents of a memory location, or a register to a memory location. Most RISC archs don't do that sort of thing: you can add registers and that's it, and if you want to use main memory you get to pull it in and push it out yourself. The idea being that fewer, simpler instructions make it easier to implement the instruction set in a way that can be clocked higher. You need fewer transistors and therefore less area to do it, so it enables you to spend some of those gains on eliminating the internal bottlenecks that limit speed.
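A small C illustration of the difference (the assembly mentioned in the comment is only approximate, for flavor):

/* One read-modify-write on memory. x86 can express this as a single
   instruction with a memory operand (roughly: add [rdi], esi), while a
   classic load/store RISC has to spell it out as separate load, add,
   and store instructions. */
void bump(int *counter, int delta)
{
    *counter += delta;
}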

Branch prediction and speculation, which were already things some big-iron archs did in the early 90s, consume a lot of transistors and are hard to make fast, even though they improve the efficiency of the processor overall. So Intel thought hmm... if we can push THAT to the compiler too, we can go even faster!
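For a rough feel of what that hardware is doing at run time, here's a toy C sketch of the classic 2-bit saturating-counter predictor - the textbook scheme, not Intel's actual design:

#include <stdio.h>

/* Toy 2-bit saturating-counter branch predictor. States 0-1 predict
   "not taken", 2-3 predict "taken"; every real outcome nudges the
   counter one step. */
static int counter = 2;                      /* start at "weakly taken" */

static int predict_taken(void) { return counter >= 2; }

static void train(int taken)
{
    if (taken)  { if (counter < 3) counter++; }
    else        { if (counter > 0) counter--; }
}

int main(void)
{
    /* A loop's back-edge branch: taken 7 times, then falls through once. */
    int outcome[8] = { 1, 1, 1, 1, 1, 1, 1, 0 };
    int hits = 0;

    for (int i = 0; i < 8; i++) {
        if (predict_taken() == outcome[i])
            hits++;
        train(outcome[i]);
    }
    printf("correct predictions: %d/8\n", hits);  /* prints 7/8 for this pattern */
    return 0;
}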

Note that this was thought to be a more pressing problem then. Netburst hadn't happened yet, and it was widely thought that pushing past, say, 200-300 MHz was going to be very difficult. But the massive x86 juggernaut and the money it produced paid for the research, brains, and new fabs to bulldoze over those problems. This was great for Intel in one sense, they were making trainloads of money, but bad for IA-64, since it was solving a problem that was turning out to be less of a problem than thought, and having to hit an ever-higher bar to look like a compelling speed increase over x86. AMD's competition only sharpened the pace. (you may remember they beat Intel to 1GHz)

Good thread.

Attached: approved.png (720x644, 39K)

>I don't really see the i860 as an x86 replacement attempt, though
well remember the debate that Tanenbaum had with Linus. It was widely thought - certainly in the 80s when the i860 was made, and well into the 90s - that x86 was just a temporary thing. An aberration driven by the unique concerns of the very low-powered "micros" it'd been born on. The large instruction set that was friendly to assembly programmers was viewed as an impediment to future development, since it complicated instruction decoding and created the bottlenecks in your physical transistor layout that I mentioned above. The future was widely held to lie with the people making RISC chips, and Intel wanted one of its own because it presumed that x86 had a shelf life, and that eventually those RISC archs would take over the PC space.

Also, as you kinda allude to, Linux had an impact by obviating most of the need for proprietary Unices. It was free and rapidly got to be as good or better, so it wasn't like you bought SPARC machines just because you wanted to run Solaris really badly; architectures had to compete more on their own merits. And starting with the PPro, x86 got good enough for a lot of the things people bought Unix workstations and servers to do. Maybe not quite as good, but good enough, and plenty cheaper. That took money out of RISC development and shoveled it back into the x86 engine.

In a certain way, Tanenbaum was also right. These days CPUs don't do x86 directly. They decode it into their internal architectures and get the shit done in a RISC-like way.
In a certain way, modern CPUs emulate massively overclocked 486s with a bunch of new instructions.

I don't disagree with that, I'm just saying I don't view the i860 by itself as Intel's attempt to hasten that, rather as a complementary architecture alongside x86 for jobs x86 just wasn't as suited for, like embedded systems or high-end workstations. I don't really feel even Intel took x86 seriously as a truly high-performance architecture until the Pentium Pro really declared their intention to push things that way.

Tanenbaum was also mostly right about kernels. Anyone who wants a monolithic kernel runs Linux or FreeDOS. Everything else is microkernels and hybrids, or descended from even older code bases (BSD, OpenVMS, etc.).

I wonder why GNU didn't settle on L4 and add shit on top of it instead of making Hurd with so little manpower.

>when it finally came out it underperformed
I remember that it actually performed pretty well. It just performed like shit for 32-bit software, and unfortunately for Intel nobody wanted to spend their entire budget just on software upgrades when their 32-bit software worked fine.

1. They tried to promote it with Wintel - which was almost entirely x86, and Windows was no POSIX - porting x86 software to Alpha was hard enough, and Alpha made sense - Itanium was powerful but badly marketed.
2. x86 emulation - this was the death knell for Itanium - instead of just telling everyone to port, they tried to emulate x86 so existing software would work - but the emulation was slower than a P233MMX, so for 90% of users it would be an expensive downgrade until such a time that software was majority IA-64.

Lots of things.
>VLIW sucks for general purpose computation
>compilers can't schedule operations with variable latency like memory reads very well
>very high price
>delayed by several years
>it only barely competed with Pentium 4 on performance
>general Intel fuckery

>it wasn't like you bought SPARC machines just because you wanted to run Solaris really badly
Funny thing: the SPARC port of Linux ran faster than Solaris ever did.
Then again, Slowlaris was many things, but never fast.

>hybrids
This doesn't really exist. The whole "hybrid kernel" thing is a marketing term invented by Microsoft to sell NT to gullible idiots.
In reality NT was never even close to a microkernel; in fact they shoved so many things into kernelspace it should be called a maxikernel (including but not limited to the GUI, font rendering, and a webserver).

L4 didn't exist at the time, and GNU didn't do Hurd alone. They bought the code to Mach and started modifying it and making the userspace servers around that. They still failed, partly because Mach is a trainwreck, and partly because of the usual GNU/incompetence + featuritis. Microkernels are hard to do as it is; it's even harder when you're stupid and want to add everything and the kitchen sink.

>I remember that it actually performed pretty well.
You remember wrong. Merced had terrible problems with the memory controller and it took another year to fix that in Itanium2. It still needed fuckhuge pools of cache to outperform Intel's own x86 though.

You still have the cost of on-the-fly translation and bad code density. If they made a new CISC based on the 40 years of experience gained since x86, it would be a lot better than what we have today.

HURD was always just noodling around. Redox-OS is far, far ahead.

>Anyway Itanic's downfall was that it tried to punt the complicated branch-prediction and speculation hardware off to the compiler
Can you explain why this is bad?
Wouldn't something so complicated and buggy (see the recent CPU bugs) be easier and safer to implement in the compiler?
Also faster because you can optimize it?
And free performance upgrades as compilers get better?

Wouldn't Itanium or something similar end up simplifying the CPU design by removing branch predictors and shit?

It still seems like a good idea to me

>They still failed, partly because Mach is a trainwreck
Why did Apple manage to make Mach work?
I heard that launchd uses Mach messages, which Linux still lacks since an in-kernel dbus (or similar) still doesn't exist

>Windows was no POSIX

NT4 had a POSIX subsystem, but it was a pile of shit - only there because of a government regulation requiring POSIX support, and not actually enough to run most stuff.

>I wonder why GNU didn't settle on L4 and add shit on top of it instead of making Hurd with so little manpower.

I thought Hurd had decided to port to L4. I don't think it was ever intended to be an entirely novel microkernel system, and started off building stuff around Mach, before porting to L4. Been a while since I've looked at it, though.

The brief answer is that it's almost impossible to predict run-time code-paths at compile-time.

Analogy: Imagine you're fighting a boss in an MMO. Deciding things at compile time would be you recording all your keystrokes and attacks and spells and character movement as one big long macro before the fight starts, then starting the fight and playing it back. Deciding things at run-time - that is, having all the complicated branch-prediction, speculation, and instruction-reordering hardware in silicon on the chip - is being able to start off with a script of abilities and movements like above, but to be able to dynamically adjust it on the fly if the boss casts spell A instead of spell B, if he drops a void zone right under your feet, etc.

All that complicated silicon on the chip has a crucial advantage that the compiler doesn't - it knows the actual, not predicted, current state of the processor. There's lots of things that the compiler just plain can't know in advance. Say an instruction accesses memory. If the thing that instruction fetches from RAM is in the processor's cache already, the instruction will complete very quickly, and anything that depends on that value being loaded can be scheduled right after it. If the value isn't in cache and the CPU has to go all the way out to main memory to get it, then it takes hundreds of clocks, and you want to execute instructions that don't depend on that memory value right after you start the memory access. How does the compiler know which to pick? It can't.
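Concretely, in C (a hypothetical function, just to show the shape of the problem):

/* Each iteration does a data-dependent load: whether data[idx[i]] is
   already in cache (a few cycles) or has to come from main memory
   (hundreds of cycles) depends on run-time history the compiler can't
   see. An out-of-order core keeps executing independent work while a
   miss is outstanding; a statically scheduled design has to guess the
   latency at compile time and eat the stall when it guesses wrong. */
long sum_indirect(const long *idx, const long *data, int n)
{
    long s = 0;
    for (int i = 0; i < n; i++)
        s += data[idx[i]];
    return s;
}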

Thanks

They can't design anything from scratch to save their retarded lives.

But Intel is dying anyway so whatever.

Attached: 1532756867725.jpg (988x854, 164K)

>2013
It's 2019, Intel is still around
I want Intel to fall but it's too big to fall (same for Microsoft, Apple, Google)
Consumers put up with shit from these companies that they'd refuse from a smaller company or even Free software
For example Intel CPUs, Apple crap, Gmail and G Suite crap, Office 365, Skype, Windows 10
Especially now that multinationals are considered ethical entities (fuck globalist propaganda), these companies are untouchable and infallible (unless a big unpredictable event happens)

>these companies are untouchable
Apple was made to pay France alone 500 million euros in taxes. Google is actually paying more in EU fines than it is in taxes. Microsoft sees itself as the EU's ATM.
Untouchable, sure.
But don't let facts interrupt your fantasies.

what is general purpose computing, goy.

>Why did Apple manage to make Mach work?
The macOS kernel is based on an early version of Mach, which wasn't a microkernel yet, and at any rate it contains substantial parts of the BSD kernel. It's not a microkernel.

The kind of code that usually runs on CPUs, e.g. lots of branching, mostly integer, etc.
Itanium (and VLIW in general) really shines in scientific computation and parallel code, but gets beaten at everything else.
VLIW is great for GPUs, though, as evidenced by TeraScale, which outperformed everything from Nvidia at the time.
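Roughly, the difference between code that suits VLIW and the general-purpose kind looks like this (illustrative C, not from any real benchmark):

/* VLIW-friendly: a regular loop, no data-dependent branches, and plenty
   of independent multiply-adds a compiler can schedule statically. */
void saxpy(float a, const float *x, float *y, int n)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* "General purpose": pointer chasing plus a data-dependent branch per
   element - nothing for a static scheduler to get its teeth into. */
struct node { struct node *next; int key; };

int count_matches(const struct node *p, int key)
{
    int c = 0;
    for (; p; p = p->next)
        if (p->key == key)
            c++;
    return c;
}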

>It's 2019, Intel is still around
It's a slow death, user. Can't you read? They can't do much about it; their fate is sealed after missing out on the mobile market.

>things into kernelspace it should be called a maxikernel (including but not limited to the GUI, font rendering, and a webserver)
Not anymore
Since vista the graphics driver can crash or update and the screen just blanks and the windows are restored

>in a certain way
>mostly
That's a pity

>open thread
>OP has a question
>see a bunch of wall of text replies

Let me get coffee and get comfy, love when industry veterans go in-depth on Jow Forums

Attached: 1547247757055.gif (163x136, 583K)

This

>Since vista the graphics driver can crash or update and the screen just blanks and the windows are restored
That's unrelated to the fact that there's still a lot of GUI code in the kernel, which makes kernel exploits through a fucking scrollbar possible.
blog.ensilo.com/one-bit-to-rule-them-all-bypassing-windows-10-protections-using-a-single-bit
Literally the first google result.

> What went wrong

they have outsourced it to slavs

just imagine

Windows NT, then called NT OS/2, was originally designed for the i860 processor. It was in late 1990/early 1991 that Microsoft realized Intel's i860 was doomed and they began porting their code to x86.

Had Intel actually taken note of this failure, they wouldn't have spent so many years trying to make IA-64 a thing.

During Windows NT 4.0's development, tons of work was done to port the code from x86 back to the Merced IA-64 arch. That project was never completed due to Merced being delayed.

IA-64 took too long to get to market. Intel managed to kill Itanium before it ever got into customers' hands. Microsoft did end up porting NT to it, albeit for the Itanium 2. Most of those boxes were SQL database servers, and ultimately are still in use in production environments - for better or worse.

NT4 also had versions for PowerPC, Alpha, and MIPS at least. I don't know if they viewed the i860 as special - Windows NT was an attempt to diversify things, CPU-wise.

Most of the other CPU versions got killed with Windows 2000.

>One thing IA-64 did do was kill off some high-end RISC players, HP's PA-RISC and Alpha
HP killed PA-RISC because they were partnering with Intel on Itanium. They're the only company that ever shipped it in any quantity. Alpha died because Intel bought the IP.

One of the reasons Itanium failed is that it got consistently beaten by IBM POWER and later the SPARC T series. At least for roles where it makes sense to have a T series in the first place.

brainlet here so don't get mad if my questions are dumb

would it be more difficult to write compilers that properly take care of the branch prediction and dependencies? (that means it's done in software, right?) does it kill performance at all? does it affect the way people write programs in higher-level languages like C?
it's safer though right? (no speculative execution in hardware, so you can simply turn it off)

>Itanium
Was it dyslexia?

TL;DR: Unbelievably fucking hard

rumor has it they had a version in the works for SPARC too (ported by Intergraph)

call me lame, but I think it's just cool to have variety. not *too much* variety like linux, but some is nice, like BSD.

>It's 2019, Intel is still around
So are sailboats.

There's a reason people in the business call that type of compiler magical: it can't be done for general-purpose code. Technical code can get pretty good results, though - and guess what Itanium was best suited for?

Pretty sure the only way they'd get Itanium to actually work as intended is to stuff the compiler with a complete transistor-and-wire model of the CPU core it's compiling for, and then emulate the code on it.

*Might* be somewhat doable today considering how many cores are available to the end user to speed things up a bit, but even then the compiler would be horrifically complex and hard to code.

He mentions 2013 and 2000 in the pic, but that screencap was taken in 2018 or so.

Itanium wasn't radical enough. VLIW could work if the cache were actually a scratchpad, i.e. a completely separate memory with no automatic mechanisms like a cache has. Then there are no unpredictable memory accesses; you know whether you are going to the scratchpad or to main memory.

I suspect that VLIW will get some attention in the next few years because it solves Spectre.
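A minimal C sketch of the scratchpad idea above, assuming a hypothetical DSP-style explicit on-chip memory (the array and sizes are made up; real hardware would use a DMA engine for the fills):

/* The scratchpad[] array stands in for a dedicated on-chip memory that
   is never filled or evicted behind the program's back. After the
   explicit copy-in, every access in the compute loop has a fixed,
   known latency - exactly what a static VLIW scheduler needs. */
#define SPAD_WORDS 1024
static int scratchpad[SPAD_WORDS];

void double_block(const int *dram_src, int *dram_dst, int n)
{
    if (n > SPAD_WORDS)
        n = SPAD_WORDS;
    for (int i = 0; i < n; i++)          /* explicit fill (DMA on real parts) */
        scratchpad[i] = dram_src[i];
    for (int i = 0; i < n; i++)          /* predictable-latency compute phase */
        scratchpad[i] *= 2;
    for (int i = 0; i < n; i++)          /* explicit write-back               */
        dram_dst[i] = scratchpad[i];
}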